adrift on a cosmic ocean

Writings on various topics (mostly technical) from Oliver Hookins and Angela Collins. We have lived in Berlin since 2009, have two kids, and have far too little time to really justify having a blog.

Asynchronous MySQL queries with non-blocking readiness checks

Posted by Oliver on the 17th of February, 2013 in category Tech
Tagged with: asynchronousiomysqlmysql2rubyselect

Well, despite my best intentions, here I am again writing Ruby. I decided to automate a small part of some data analysis I've had to do a few times, starting with the database queries themselves. Unfortunately the data is spread over several hosts and databases and the first implementation simply queried them serially. The next iteration used the mysql2 gem's asynchronous query functionality but still naively blocked on the results retrieval rather than polling the IOs to see when they could be read from.

It doesn't actually add anything to my script to do this, but it seemed like a small learning opportunity and somewhat interesting so here is the guts of that code:

# Set up the three different clients to different hosts, and a starting time
client1 =
client2 =
client3 =
start_time =

# Run the first query
query1 = "SELECT SLEEP(5);"
puts "Running Query:\n >> #{query1}\n\n"
client1.query(query1, :async => true)

# Run the second query
query2 = "SELECT SLEEP(10);"
puts "Running Query:\n >> #{query2}\n\n"
client2.query(query2, :async => true)

# Run the third query
query3 = "SELECT SLEEP(15);"
puts "Running Query:\n >> #{query3}\n\n"
client3.query(query3, :async => true)

# Wait for results. Select on query IO sockets to rapidly determine
# when the results are ready
puts "Waiting for results..."
query_ios = [IO::open(client1.socket), IO::open(client2.socket), IO::open(client3.socket)]
orig_ios = query_ios.length
received = 0

while true
  # select with a timeout of 1 second
  ready = IO::select(query_ios, nil, nil, 1)
  next if ready.nil? # next iteration if nothing is ready
  ready_reads = ready[0] # IO::select returns [[rd],[wr],[ex]]

  received += ready_reads.length
  print "\rReceived #{received}/#{orig_ios} results after #{( - start_time).round} seconds" if ready.length > 0

  # Remove any IOs from the list to select from, if they are ready.
  query_ios -= ready_reads
  break if query_ios.length == 0 # Nothing left to select from
puts ""

# Now all results are ready
res1 = client1.async_results
res2 = client2.async_results
res3 = client3.async_results

The code is pretty simple and the comments should reveal the intent of any confusing lines. The only part that was slightly irritating was receiving file descriptor numbers from Mysql2::Client#socket rather than the IO itself, hence having to re-open the same file descriptor.

In this case I haven't done anything fancy after checking when the results are ready, but you can see how this could be trivially turned into a system for querying multiple backends for the same data and returning the fastest result which is a quite popular pattern at the moment.

© 2010-2018 Oliver Hookins and Angela Collins