The Blog

Posts from March 2011

Mar 7

Ensuring balanced reads with PyMongo

By Aaron Westendorf

Over the past year we’ve been evaluating and migrating many of our game services over to MongoDB. Its feature set gives us complete freedom to iterate with developers during game production, its performance is sufficient to power our infrastructure during peak load, our operations team appreciates its powerful administration tools, and 10gen has so far proven itself a reliable technology partner and steward of the Mongo roadmap. One of the most important features of PyMongo that we use is the MasterSlaveConnection, but it’s not without its caveats, one being that it does not guarantee you are balancing your reads across a replica set.

Before we dig in, a few items of note. First, the situation I’m covering here is specifically when you’re connecting directly to a replica set. There may be a similar pattern to follow on a sharded database, but that’s outside the scope of this post and what our team is working with. Secondly, if you’re not familiar with the MasterSlaveConnection, the gist of the class is that it directs all writes to the current master, and randomly chooses a slave for each read-only query. Lastly, we have some pending patches to pymongo 1.9 that I recommend applying before you begin, as they’ll affect your ability to use the MasterSlaveConnection and seamlessly restart or shut down hosts in your replica set.
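To make the routing rule concrete, here is a toy model of it. This is only a sketch and not PyMongo's MasterSlaveConnection: the class name `ToyMasterSlaveRouter`, its methods, and the host labels are all hypothetical, and the fallback to the master when no slaves remain is an assumption of the sketch.

```python
import random


class ToyMasterSlaveRouter(object):
    """Hypothetical stand-in that mimics the routing rule described above.

    'master' and 'slaves' are plain labels here, not live connections.
    """

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self._rng = rng or random.Random()

    def route_write(self):
        # Every write goes to the current master.
        return self.master

    def route_read(self):
        # Each read picks one slave at random. Falling back to the master
        # when the slave list is empty is an assumption of this sketch.
        if not self.slaves:
            return self.master
        return self._rng.choice(self.slaves)


router = ToyMasterSlaveRouter("db1:27017", ["db2:27017", "db3:27017"])
reads = [router.route_read() for _ in range(1000)]
```

If `db3:27017` drops out of `slaves` and is never added back, every read lands on `db2:27017`, which is the imbalance this post is about.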

The situation we’re correcting for here is where we have a replica set, with one or more slaves, and we want to maintain balanced reads across them. Our application is long running, and at any time the operations team may remove a host from the replica set to perform maintenance. Mongo and pymongo will work together to ensure high availability failover, but once you’ve brought your replica set back to full capacity, your application will not connect to that host unless there’s another socket disconnect, and even then there is no guarantee it will connect to the original host. You’ve now lost a substantial amount of performance and scalability until your application is restarted.

{% gist 858617 %}

Read through connection() to understand the basics of instantiating a MasterSlaveConnection, or in this case our subclass ClusterConnection. It is initialized with a single Connection to the master, and then for each host in our list, we create a slave Connection. The Connection class is smart in that it can maintain a list of hosts, both configured and discovered from the replica set, and connect to any one of them at any time. The master Connection will always connect to the current master out of our list of hosts and handle the case where the master changes. The slave connections are configured with slave_okay=True so that they’ll stay connected to whichever host we tell them and fail over to other slaves, and _connect=False is passed so that if a slave is not currently available, Connection and MasterSlaveConnection can kindly follow their AutoReconnect paths without interrupting your application. So long as the master is up, your application will never notice if the replica set is impaired at the time you initialize the connection.

The class ApplicationDatabaseInterface is simply an example of whatever you use to maintain an interface to the database within your application. The only requirement here is that we have some ability to cache an existing MasterSlaveConnection and call validate_slaves() on it before we use it. I recommend validating the slaves only once per application “transaction”, however that may apply to your use case (see query() in the example above).

The validate_slaves() method is really where the important work is done. This example implementation runs its check at most once every 5 minutes, but your criteria and interval can be whatever suits your needs. First it uses the same code as pymongo to parse the slave URIs so you can support host and port schemes. It walks through all of the slaves and, if the current host and port of a Connection are not in the list of configured hosts, closes that connection and removes it from the slave list. If there are any hosts in our list of slaves which do not have a connection, it then re-populates the slave list with a fresh Connection for that host. The new connection also includes the _connect=False flag, so we’ll never get an AutoReconnect exception in case a slave is not available at the time this code is run.
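The steps above can be sketched without a live replica set. What follows is not the code from the gist: `SlavePool`, `FakeConnection`, and `parse_host` are hypothetical names, `parse_host` is a simplified stand-in for pymongo's own URI parsing, and dropping a second connection that points at an already-covered host is an extra assumption of this sketch rather than something stated here.

```python
import time


def parse_host(spec, default_port=27017):
    """Split 'host' or 'host:port' into a (host, port) tuple."""
    host, _, port = spec.partition(":")
    return host, int(port) if port else default_port


class FakeConnection(object):
    """Hypothetical stand-in for a slave connection; tracks only an address."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self.closed = False

    def close(self):
        self.closed = True


class SlavePool(object):
    def __init__(self, slave_specs, connect=FakeConnection,
                 interval=300, clock=time.time):
        self.configured = [parse_host(s) for s in slave_specs]
        self.connect = connect
        self.interval = interval  # seconds between checks (5 minutes)
        self.clock = clock
        self.slaves = [connect(h, p) for h, p in self.configured]
        self.last_check = clock()

    def validate_slaves(self):
        now = self.clock()
        if now - self.last_check < self.interval:
            return False  # checked recently; nothing to do
        self.last_check = now
        kept, seen = [], set()
        for conn in self.slaves:
            addr = (conn.host, conn.port)
            if addr in self.configured and addr not in seen:
                kept.append(conn)
                seen.add(addr)
            else:
                # Drifted to an unconfigured host, or (assumption of this
                # sketch) duplicates a host we already hold: drop it.
                conn.close()
        # Re-populate any configured slave that has no connection.
        for host, port in self.configured:
            if (host, port) not in seen:
                kept.append(self.connect(host, port))
        self.slaves = kept
        return True
```

Passing `clock` and `connect` in as parameters is a design choice of the sketch that keeps the interval logic easy to exercise in a test; a real implementation would build pymongo connections instead.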

The reason we remove connections from the slave list is that the Connection class doesn’t maintain its internal list of possible hosts in any priority order. If we simply closed the connection it would reconnect, but we aren’t guaranteed that it would reconnect to its originally-configured host, and that’s what we’re trying to ensure.

The last important bit of this is the with_reconnect decorator. It’s not strictly required, but for completeness, you should have something like this in your stack when communicating with Mongo. In addition to connection drops, there are situations such as this where the driver will raise an AutoReconnect because the replica set is in transition. I’ve found that 2 seconds is about the maximum amount of time it takes for pymongo and the replica set to agree that all is well with the world, so this example decorator gives enough time for that situation to resolve itself.
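As a rough illustration of that idea, here is a minimal retry decorator. It is a sketch under stated assumptions, not the decorator from the gist: the parameter names, the 0.1 second delay between attempts, and the local `AutoReconnect` class (a stand-in for `pymongo.errors.AutoReconnect` so the example runs on its own) are all mine. Only the 2 second ceiling comes from the observation above.

```python
import functools
import time


class AutoReconnect(Exception):
    """Stand-in for pymongo.errors.AutoReconnect so the sketch runs alone."""


def with_reconnect(max_wait=2.0, delay=0.1, sleep=time.sleep):
    """Retry the wrapped call on AutoReconnect for up to max_wait seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            waited = 0.0
            while True:
                try:
                    return func(*args, **kwargs)
                except AutoReconnect:
                    if waited >= max_wait:
                        raise  # the replica set did not settle in time
                    sleep(delay)
                    waited += delay
        return wrapper
    return decorator
```

Applied to a query function as `@with_reconnect()`, a call made while the replica set is in transition is retried quietly, and the exception only reaches the caller if the set has not settled within the window.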

Mar 2

Chai, the simple Python mock

By David Czarnecki

Here at Agora we take testing seriously, insisting that full test coverage always be part of a deliverable and adjusting our schedules accordingly. We are historically a Rails shop, but for a few years we have been developing an extensive Python code base and infrastructure to power our in-game offerings.

We’ve been using Mox as our mock testing solution since 2009, and though it has met all of our functional requirements, we’ve longed for the simplicity and power of Ruby’s Mocha framework. This past Friday we held our 3rd Hack-A-Thon, and Vitaly Babiy and I developed Chai, a mocking framework patterned after Mocha.

{% gist 848107 %}

You can find the latest source and documentation at GitHub, or install from PyPI. The current release is 0.1.0 and we’re looking for feedback. We’re adding lots of features and improving the API, so check back frequently.