Big Decimal

At TaskRabbit, people have hourly rates for the tasks that they do. They get paid based on the number of hours that they work.

We try to deal in base units to remove ambiguity. For example, we store the hourly rate in cents as an integer instead of in dollars as a double. Reporting time worked occurs in seconds. When that reporting occurs, there is a calculation to do to understand how much the Tasker needs to be paid. It might go something like this:

class FloatCalculator
  def initialize(hourly_rate_in_cents)
    @hourly_rate_in_cents = hourly_rate_in_cents
  end

  def total_in_dollars(seconds_worked)
    hours = seconds_worked / 3600.0 # seconds in an hour
    cents = @hourly_rate_in_cents * hours
    dollars = cents / 100 # make it dollars
    dollars.round(2)
  end
end

Over the years, we saw a few errors because of limitations of floating point math. Since then, we have switched to using BigDecimal to prevent these kinds of errors.

There were no visible performance issues, but I recently got curious because all of the docs say that they are substantial. By writing the same calculator using BigDecimal, we can see the difference.

class BigDecimalCalculator
  def initialize(hourly_rate_in_cents)
    @hourly_rate_in_cents = BigDecimal.new(hourly_rate_in_cents)
  end

  def total_in_dollars(seconds_worked)
    hours = BigDecimal.new(seconds_worked) / BigDecimal.new(3600)
    cents = @hourly_rate_in_cents * BigDecimal.new(hours)
    dollars = cents / BigDecimal.new(100)
    dollars.round(2).to_f
  end
end

This does turn out to be significantly slower, so I also benchmarked marking the constants class level variables.

class BigDecimalStaticCalculator
  CONVERSION_RATE = BigDecimal.new(3600) * BigDecimal.new(100)

  def initialize(hourly_rate_in_cents)
    @hourly_rate_in_cents = BigDecimal.new(hourly_rate_in_cents)
  end

  def total_in_dollars(seconds_worked)
    dollars = @hourly_rate_in_cents * BigDecimal.new(seconds_worked) / CONVERSION_RATE
    dollars.round(2).to_f
  end
end

Results

The results are in!

                     user     system      total        real
floats               0.220000   0.010000   0.230000 (  0.231120)
big decimal          4.130000   0.070000   4.200000 (  4.234744)
big decimal static   2.620000   0.040000   2.660000 (  2.680582)

The cost of using BigDecimal to reduce the error rate is a slowdown of 15x-20x compared to floating point math. This is not a big deal in practice. First of all, we need to get the money correct. Secondly, these are times for 250,000 calculations. Any given request only has 1, which is nothing compared to the other stuff (SQL, template rendering) happening.

It seems that the initialization process takes up a lot of time, though, so one technique to speed it up is to pull out constants in your BigDecimal code to the class level. These would only be initialized once, but may also affect the base memory level.

These classes and the benchmark itself are available as a gist if you are interested.