Random samples from R data frames.

Sometimes you simply have too much data. A random sample is a nice way to test assumptions and algorithms first.

In R, you can write a small function that returns a random sample of a data frame's rows for such emergencies.

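randomSample <- function(df, n) {
  # Pick n distinct row indices at random; drop = FALSE keeps the result
  # a data frame even when df has only one column.
  df[sample(nrow(df), n), , drop = FALSE]
}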

And to use:

smallerDF <- randomSample(bigDF, 40)

(40 being the number of rows you want in your sample).
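
As a rough sketch only, here is a slightly more defensive variant with a runnable demo. The name randomSampleChecked, its replace argument, and the size check are illustrative assumptions rather than part of the function above; mtcars is one of R's built-in data sets (32 rows), used here so the example runs without any data of your own.

```r
# Hypothetical variant of randomSample(); the name and extra argument are
# illustrative only.
randomSampleChecked <- function(df, n, replace = FALSE) {
  # Without replacement, a sample cannot be larger than the data frame.
  if (!replace && n > nrow(df)) {
    stop("n exceeds the number of rows in df")
  }
  # sample.int() draws n row indices from 1..nrow(df);
  # drop = FALSE keeps the result a data frame even with a single column.
  df[sample.int(nrow(df), n, replace = replace), , drop = FALSE]
}

set.seed(42)  # fix the seed so the same rows are drawn on every run
smallCars <- randomSampleChecked(mtcars, 10)
nrow(smallCars)  # 10
```

Setting the seed is optional, but it makes a test run repeatable. Passing replace = TRUE allows the same row to be drawn more than once, which is rarely what you want when the goal is simply a smaller data frame.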

3 responses to “Random samples from R data frames.”

  1. Thanks for the post! I was able to use this function. Strange that I hadn’t previously run into the need to randomly sample observations from a data frame!

    To other readers: make sure you use the appropriate value for the replace = argument. For sampling a data frame, the default of replace = FALSE is most often what you want.

  2. Hi Philip,

    Thanks for taking the time to comment. I don’t work with small data sets, so I probably need to create a small random sample to test an algorithm more often than most. And you’re right about replace = FALSE, nice catch!

    Regards
    Jase
