Retrieving Random Records in Rails

What is the best way to retrieve one or more random records in Rails?

Given that there are multiple ways to retrieve random records in Rails and Ruby (take a look at this SO post), it can be confusing and frustrating to decide which course of action to take.

“Is there a technique to retrieve random records that is correct/general/Rails way/cross-database compatible? “

In this post, I’ll cover a couple of simple approaches that you can take to figure out what the best way might be for you.

The first priority that you likely have to keep in mind is speed. Most applications have a threshold for what is considered slow – in web apps for example, anything that takes longer than a few hundred milliseconds to load is considered slow. So you don’t want to be choosing a method that will have your users waiting around forever for your app to do something. That being said, there is also no point in sacrificing code readability, ease of re-use and just time in general, to get faster than you have to.

The second thing you have to consider is the size (number of rows) of the database table you are looking to retrieve a random record from. In general, the bigger your table, the more knowledgable you’ll have to be about your database platform (typically Postgres or MySQL) to get to your desired speed.

Take this common solution to retrieve a random record in Rails:  

random_user = User.all.sample

To use the code above requires no internal knowledge of the database you’re running on. The #all method is a well-known ActiveRecord method and #sample is a Ruby method on Array that retrieves a random value from a given array.

The flip side of this approach is that if your database has a sizable number of rows, it will take a long time to execute. Try it out on your Rails console and see! The reason it does so is because User.all retrieves all the users from the database, and then builds ActiveRecord objects from them. So if you have let’s say 10,000 users, building 10,000 ActiveRecord objects takes a while.  If you’re reasonably certain that the table in question will not grow much in size over the lifetime of your app, and the speed at which the above code executes is agreeable to you, then by all means go for it.

Now, if your table is HUGE (has a million+ rows for example), the above method is not going to work for you. Consider using the ActiveRecord #offset and the Ruby rand methods. The #offset method in ActiveRecord specifies how many rows to skip before returning the rest of the rows. The rand method, as expected, generates a random number less than a maximum value – so when you say rand(10), it will generate a random number less than 10.

So something like this:

count = User.count
random_offset = rand(count)
random_user = User.offset(random_offset).first`

…works better (faster) for big tables, because you’re not making the database scan through every single row in the table.

Resources:

Leave a Reply

Your email address will not be published.