RailsMelon: how to set up Rails 3 Reference Data using Seeds


I’ve started learning Ruby on Rails by building a web application for my own startup (sorry, but it’s still too early to release full details of what it is).

As with everything I do, I like to share what I learn. As I will be doing more Rails work than Watir work, I considered starting a new blog called RailsMelon, but I changed my mind and have decided to blog about Rails here alongside Watir and testing. I will prefix any Rails-related posts with RailsMelon so those of you who aren’t interested can safely ignore them. If the Rails content becomes too dominant or I receive too many complaints, I will move it to a separate RailsMelon blog.


Rails 3 Reference Data using Seeds

I’ve been trying to work out the best way to load reference data in a Rails 3 application. The built-in seeding mechanism (seeds.rb, run via rake db:seed or rake db:setup) seemed the obvious choice to begin with, but then I started seeing examples of people using seeds.rb to load test data as well, which confused me. By ‘reference data’ I mean data that needs to exist for your application to work, whether in development, test, or ultimately production. Test data is different: I would only use it for testing, and I would never want to load it into production.

I decided upon using seeds.rb (and rake db:seed) but only to specify reference data I will use in production. Test data will be loaded in other ways (which I will discuss in a future post).

I was confused initially because the Agile Web Development with Rails book actually tells you to delete all the existing data before creating it again in your seeds:

[sourcecode language="ruby" light="true"]
Product.delete_all
Product.create(title: 'Programming Ruby 1.9')
[/sourcecode]

If I do this for reference data, every run will delete and recreate it. The recreated records get new ids, which causes problems for other models that hold related data.

To avoid this, I thought I could check whether the object already exists, and only create it if it doesn’t.

[sourcecode language="ruby" light="true"]
Product.create(title: 'Programming Ruby 1.9') unless Product.find_by_title('Programming Ruby 1.9')
[/sourcecode]

This is slightly better, but I then realized that if your model has a uniqueness validation on one of its attributes, you don’t even need to check whether the object exists: create fails validation and saves nothing when a matching record is already there. (Without a uniqueness validation, it will create a new record every time.) Note that this relies on a validation in the model; a unique index in the database alone would make create raise an error instead.

[sourcecode language="ruby" light="true"]
Product.create(title: 'Programming Ruby 1.9')
[/sourcecode]

If you ran this ten times against a blank database, you would get only one ‘Programming Ruby 1.9’ product because title is validated as unique, and every run of rake db:seed would still succeed, which is exactly what we want.
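To see the repeatability without a database, here is a minimal plain-Ruby sketch. InMemoryProduct is a hypothetical stand-in I made up for a model with a uniqueness validation on title; it is not Rails code and only mimics the "duplicate create saves nothing and raises nothing" behaviour described above.

```ruby
# Hypothetical stand-in for a model whose title must be unique.
# Plain Ruby, not Active Record: it only imitates a create call that
# quietly refuses to store a duplicate.
class InMemoryProduct
  @records = []

  class << self
    attr_reader :records

    # Stores the attributes unless a record with the same title exists.
    # Returns nil on a duplicate (real Rails would return the unsaved object).
    def create(attrs)
      return nil if @records.any? { |record| record[:title] == attrs[:title] }

      @records << attrs
      attrs
    end
  end
end

# Simulate running the seed file ten times in a row.
10.times { InMemoryProduct.create(title: 'Programming Ruby 1.9') }

puts InMemoryProduct.records.size # prints 1
```

The point is only that the seed step can be run any number of times and the end state is the same.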

Making it prettier for multiples

With reference data I find there are often lots of items, so instead of calling .create methods line after line, you can store the items in an array or hash and iterate over them:

A simple list of colors:

[sourcecode language="ruby" light="true"]
colors = %w(blue red green orange brown)
colors.each { |color| Color.create name: color }
[/sourcecode]

A more detailed list of users:

[sourcecode language="ruby" light="true"]
# An array of attribute hashes, one per user
users = [
  { given_name: 'Admin', surname: 'User', email: 'admin@test.com', password: 'password', password_confirmation: 'password', admin: true },
  { given_name: 'Standard', surname: 'User', email: 'user@test.com', password: 'password', password_confirmation: 'password', admin: false }
]
users.each { |user| User.create user }
[/sourcecode]
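If a model has no uniqueness validation, the check-before-create idea from earlier can be applied to a whole list by picking one attribute as the lookup key. The sketch below is plain Ruby rather than Rails: seed_once and users_table are hypothetical names, and the array stands in for the table's rows. In a real seeds.rb the lookup would be a finder call such as the find_by_title check shown earlier.

```ruby
# Plain-Ruby sketch: `table` is a hypothetical stand-in for a model's rows.
# Adds each item only if no existing row has the same value for `key`.
def seed_once(table, items, key)
  items.each do |item|
    next if table.any? { |row| row[key] == item[key] }

    table << item
  end
  table
end

users_table = []
users = [
  { given_name: 'Admin', surname: 'User', email: 'admin@test.com', admin: true },
  { given_name: 'Standard', surname: 'User', email: 'user@test.com', admin: false }
]

# Running the seed step three times still leaves exactly two users.
3.times { seed_once(users_table, users, :email) }

puts users_table.size # prints 2
```

Email works as the key here because it identifies a user on its own; for the colors list the name would play the same role.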


I have found that Rails 3 seeds provides an easy way to manage reference data, but it shouldn’t be used to manage test data as well. As reference data is needed in all environments, it should be repeatable and therefore shouldn’t be destructive to existing data, so avoid using statements such as delete_all in your seeds.rb file.

Author: Alister Scott

Alister is a Software Quality Engineer for Automattic.

2 thoughts on “RailsMelon: how to set up Rails 3 Reference Data using Seeds”

  1. Just an FYI, Active Record actually has dynamic finders of the form find_or_create_by_[attribute], which do the check and create for you in cases where uniqueness isn’t built into the model layer.
