Rails Schema With LHM

At TaskRabbit, we like MySQL. As with everything, it has its own set of issues, though. One of these issues is that it locks the table while a column is added. This is not the biggest deal in the early days of a table, but once you start getting millions of rows and consistent traffic around the clock, this prevents the site from working as expected.

The first way that we worked around this problem was with the pt-online-schema-change tool. It does the following:

  • Makes a new, empty table with the old table’s schema
  • Adds the column to the new table
  • Sets up triggers on the old table so new writes keep syncing
  • Copies data from the old table to the new table
  • Renames the old table to something else and renames the new table to replace it
  • Deletes the old table
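To make that sequence concrete, here is a rough Ruby sketch that just builds the statements involved. It is not what pt-online-schema-change literally executes: the helper name and the shadow/retired table names are made up for illustration, and the trigger and chunked-copy steps are left as comments because the real tool handles them with a lot more care. Note that the column is added while the copy is still empty, so the expensive ALTER never touches live data.

```ruby
# Simplified sketch of the copy-and-swap approach. Hypothetical helper:
# the table names and statement text are assumptions for illustration only.
def online_add_column_sql(table, column_definition)
  shadow  = "_#{table}_new"  # assumed name for the new copy
  retired = "_#{table}_old"  # assumed name for the renamed original

  [
    "CREATE TABLE #{shadow} LIKE #{table}",
    "ALTER TABLE #{shadow} ADD COLUMN #{column_definition}",
    "-- create INSERT/UPDATE/DELETE triggers on #{table} that mirror writes into #{shadow}",
    "-- copy existing rows from #{table} into #{shadow} in small chunks",
    "RENAME TABLE #{table} TO #{retired}, #{shadow} TO #{table}",
    "DROP TABLE #{retired}"
  ]
end

puts online_add_column_sql("users", "middle_name VARCHAR(191) DEFAULT NULL")
```

The rename is a single statement covering both tables, which is what makes the swap appear atomic to the application.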

This worked quite well, but it had a flaw in our process: it lived outside of the development and deployment workflow. In order to develop, we would still write a Rails migration. Then, just before a deploy, we would run the tool by hand and insert the migration’s version into the schema_migrations table. The deploy runs pending migrations, but this one would be skipped because its version was already recorded. I don’t think we ever messed it up, but the process had a few gaps.

We now use LHM (Large Hadron Migrator) from the good folks at SoundCloud. It does basically the same thing, but lets you do it right there in the migration.

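require 'lhm'

class AddMiddleNameToUsers < ActiveRecord::Migration
  # Rails cannot auto-reverse an LHM block, so define up and down explicitly.
  def up
    Lhm.change_table :users do |m|
      # roughly: add_column :users, :middle_name, :string, limit: 191
      m.add_column :middle_name, "VARCHAR(191) DEFAULT NULL"
    end
  end

  def down
    Lhm.change_table :users do |m|
      m.remove_column :middle_name
    end
  end
end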

So this is great. The schema change now lives in the migration itself, inside our normal development and deploy process.

Schema

One design choice (and it seems like the right one) is that LHM leaves the last step manual. That is, it keeps the old (renamed) table around. We go back and delete these tables from production once we are sure everything worked out. The above migration might leave behind a table named something like this: lhma_2014_10_28_20_41_56_933_users.
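When it comes time to clean up, it helps to know which of those leftover tables are old enough to drop. Here is a small sketch that parses the archive name. The name format (year, month, day, hour, minute, second, milliseconds, then the original table name) is inferred only from the example above, the timestamp is assumed to be UTC, and both helper names and the seven-day default are made up for illustration.

```ruby
# Pattern inferred from the single example name in this post (an assumption).
LHM_ARCHIVE_NAME = /\Alhma_(\d{4})_(\d{2})_(\d{2})_(\d{2})_(\d{2})_(\d{2})_(\d{3})_(.+)\z/

# Hypothetical helper: returns the original table name and archive time,
# or nil when the name does not look like an LHM archive table.
def parse_lhm_archive(name)
  match = LHM_ARCHIVE_NAME.match(name)
  return nil unless match

  year, month, day, hour, minute, second, millis = match.captures.first(7).map(&:to_i)
  {
    table: match[8],
    # Time.utc takes microseconds as its seventh argument.
    archived_at: Time.utc(year, month, day, hour, minute, second, millis * 1000)
  }
end

# Hypothetical helper: archive tables at least min_age_days old as of `now`.
def droppable_archives(table_names, now:, min_age_days: 7)
  cutoff = now - (min_age_days * 24 * 60 * 60)
  table_names.select do |name|
    info = parse_lhm_archive(name)
    info && info[:archived_at] <= cutoff
  end
end

names = ["users", "lhma_2014_10_28_20_41_56_933_users"]
puts droppable_archives(names, now: Time.utc(2014, 11, 11))
```

This only produces a list of candidates. Actually dropping anything in production should stay a deliberate, manual step.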

So that’s fine, but you’ll notice that after you run rake db:migrate, your schema.rb file now has that table in it. This happens because ActiveRecord dumps the schema from the live database right after the migration runs. We try to take really good care of our schema file, so this made us sad.

I went poking around in the ActiveRecord code ready to monkey patch the adapter that reads the table list or the code that generates the schema file. When I got there, though, I found there was already a class setting that did what I wanted. It even took a regex!

So here’s what to add if you don’t want those tables showing up in your schema file:

# config/initializers/active_record_schema.rb
require 'active_record'

# ignore LHM
ActiveRecord::SchemaDumper.ignore_tables << /^lhma_/
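If you want to sanity check which table names that pattern filters out, here is a standalone sketch that mimics the matching without loading Rails. It is not ActiveRecord’s actual code. The helper name and the sample table list are made up, and it only assumes that the ignore list can hold both plain strings and regexes.

```ruby
# Hypothetical stand-in for the ignore list: exact names or patterns.
IGNORE_TABLES = ["schema_migrations", /^lhma_/].freeze

# Case equality (===) compares Strings exactly and tests Regexps for a match.
def ignored_table?(table_name, ignore_list = IGNORE_TABLES)
  ignore_list.any? { |ignored| ignored === table_name }
end

tables = %w[users tasks schema_migrations lhma_2014_10_28_20_41_56_933_users]
dumped = tables.reject { |name| ignored_table?(name) }
puts dumped
```

Because the pattern is anchored at the start of the name, a table that merely contains "lhma_" somewhere in the middle would still be dumped.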

That’s it. Happy migrating.

Copyright © 2017 Brian Leonard