Singletons, Threads, and Flexibility

In Ruby, we like very simple APIs, but that simplicity often comes at the price of thread safety and code flexibility. I’ve found that if you use a few tricks from the start, you can get the best of it all.

I recently did a project where I tried to use the VCR gem, but it went awry when working in multiple threads. This is a great gem that, like many of my own, falls into the trap of module/class level singleton configuration/execution.

This approach is characterized by things like extend self in the top-level module and then having instance variables at that level. This is not to call out VCR specifically. It’s just my most recent example of hundreds of gems that take this overall approach.

module VCR
  extend self

  def current_cassette
    cassettes.last
  end

  def configure
    yield configuration
  end

  def configuration
    @configuration ||= Configuration.new
  end

  private

  def cassettes
    @cassettes ||= []
  end
end

When operating on multiple threads, things get wacky because every thread shares this one current_cassette and writes to the same associated file. You end up with recordings on top of each other.
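To make the sharing concrete, here is a minimal sketch of the same shape of problem. The Recorder module and its push/current methods are hypothetical stand-ins, not VCR's actual API, and the two queues exist only to force a predictable interleaving of the threads.

```ruby
# Hypothetical module-level singleton, shaped like the VCR excerpt above.
module Recorder
  extend self

  def push(name)
    stack.push(name)
  end

  def current
    stack.last
  end

  private

  # One array for the whole process: every thread sees the same @stack.
  def stack
    @stack ||= []
  end
end

# Queues used purely as signals so the interleaving is deterministic.
a_pushed = Queue.new
b_pushed = Queue.new

thread_a = Thread.new do
  Recorder.push("cassette-a")
  a_pushed << true
  b_pushed.pop      # wait until thread B has pushed its own cassette
  Recorder.current  # thread A wants "cassette-a" here
end

thread_b = Thread.new do
  a_pushed.pop      # wait until thread A has pushed
  Recorder.push("cassette-b")
  b_pushed << true
  Recorder.current
end

seen_by_a = thread_a.value # => "cassette-b"
seen_by_b = thread_b.value # => "cassette-b"
```

Thread A asked for its own cassette and got thread B's, which is the "recordings on top of each other" situation in miniature.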

I am inclined (and some say over-inclined) to use singletons to do something like this:

class VCR::Client
  def current_cassette
    cassettes.last
  end

  def configure
    yield configuration
  end

  def configuration
    @configuration ||= Configuration.new
  end

  private

  def cassettes
    @cassettes ||= []
  end
end

module VCR
  extend self

  # `delegate` is ActiveSupport's Module#delegate; plain Ruby would use Forwardable
  delegate :current_cassette, :configure, :configuration, :to => :default_client

  def default_client
    @default_client ||= Client.new
  end
end
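If ActiveSupport is not available, the same default-client delegation can be sketched with Forwardable from the standard library. Everything named here (the Tape module, its Client, and the library_dir setting) is hypothetical and only illustrates the pattern; it is not how VCR is actually structured.

```ruby
require "forwardable"

# Hypothetical gem namespace, standing in for the VCR sketch above.
module Tape
  Configuration = Struct.new(:library_dir)

  # All state lives on an instance, so callers can create as many as they need.
  class Client
    def configure
      yield configuration
    end

    def configuration
      @configuration ||= Configuration.new
    end
  end

  extend self
  extend Forwardable

  # The simple module-level API forwards to one lazily created default client.
  def_delegators :default_client, :configure, :configuration

  def default_client
    @default_client ||= Client.new
  end
end

# The global style keeps working and configures the default client.
Tape.configure { |c| c.library_dir = "fixtures/default" }

# A separate client has its own configuration and touches nothing global.
own_client = Tape::Client.new
own_client.configure { |c| c.library_dir = "fixtures/other" }

Tape.configuration.library_dir       # => "fixtures/default"
own_client.configuration.library_dir # => "fixtures/other"
```

The design choice is that the module keeps no state of its own beyond the default client, so anything that works globally also works per instance.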

The most common use case of the module doesn’t change at all because I delegate everything to a default client. You can still do:

# global
VCR.configure do |c|
  c.cassette_library_dir = 'fixtures/vcr_cassettes'
end

class Fetcher
  def initialize(path)
    @path = path
  end

  def fetch!
    VCR.use_cassette("#{@path}/fetched") do
      response = Net::HTTP.get_response(URI("http://api.example.com/#{@path}"))
      process(response)
    end
  end
end

class Main
  def process_all
    self.paths.each do |path|
      fetcher = Fetcher.new(path)
      fetcher.fetch!
    end
  end
end

and it will use the default_client.

But this whole scheme now allows my threaded code to do something like this:

class ThreadedFetcher
  def initialize(path)
    @path = path
  end

  def vcr_client
    return @vcr_client if @vcr_client
    @vcr_client = VCR::Client.new
    @vcr_client.configure do |c|
      c.cassette_library_dir = "fixtures/vcr_cassettes/#{@path}"
    end
    @vcr_client
  end

  def fetch!
    # The original use_cassette("#{@path}/fetched") call would probably work here too,
    # but I like it even more separated: the @path moves into the client config above.
    vcr_client.use_cassette('fetched') do
      response = Net::HTTP.get_response(URI("http://api.example.com/#{@path}"))
      process(response)
    end
  end
end

class Main
  def process_all
    mutex = Mutex.new
    queue = self.paths.dup

    self.thread_count.times.map {
      Thread.new do
        while path = mutex.synchronize { queue.pop }
          fetcher = ThreadedFetcher.new(path)
          fetcher.fetch!
        end
      end
    }.each(&:join)
  end
end

Clearly there is more code, but the fetches now run in parallel, so it is 8x (or whatever your thread count allows) faster.
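As a variation on the worker loop, Ruby's built-in Queue is already thread-safe, so the Mutex around the array can be dropped. This is a minimal sketch under that assumption; process_in_threads is a hypothetical helper, and the block stands in for the fetcher call.

```ruby
# Hypothetical helper: run the given block for each item on thread_count threads.
def process_in_threads(items, thread_count)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once the queue is drained, pop returns nil instead of blocking

  results = Queue.new
  Array.new(thread_count) do
    Thread.new do
      # Note: a nil or false item would also end this loop early.
      while (item = queue.pop)
        results << yield(item)
      end
    end
  end.each(&:join)

  # Results arrive in completion order, not input order.
  Array.new(results.size) { results.pop }
end

upcased = process_in_threads(%w[a b c d], 2) { |path| path.upcase }
upcased.sort # => ["A", "B", "C", "D"]
```

Each worker would build its own client inside the block, exactly as ThreadedFetcher does, so nothing is shared between threads except the two queues.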

One example that I’ve seen done really well in this way is the twitter gem. Others seem to follow the same pattern, like octokit, which I used for hubtime in exactly this threaded way.

Again, I’m not calling out VCR or anything and I’m sure I’ve trivialized the complexity involved. I would love to put a pull request link to VCR here, but alas, for another time.

If you do this from the beginning, though, it can be a strong win with minimal overhead. It adds multi-threaded capabilities as well as the ability (such as with twitter) to work with two different users in your app without changing anything global.


Copyright © 2017 Brian Leonard