The Blog

Posts from 2011

Oct 19

Up and running with Zappa, Coffeescript & Mongoose

By Brian Corrigan

At Major League Gaming we recently launched an internal profile service using NodeJS. In an effort to get more familiar with Node, I decided to use it on my latest personal project. In addition, since CoffeeScript is now included out of the box in Rails 3.1, I decided it was time to try that too. Here’s a simple app to get you up and running with Node, Zappa, CoffeeScript & Mongoose.

To get it working:

  • Install MongoDB
  • Install NodeJS
  • Install NPM
  • npm install -g mongoose
  • npm install -g zappa
  • Run the gist ‘coffee’
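The gist itself isn’t reproduced here, but a minimal sketch of a Zappa app backed by Mongoose looks something like this (the `Person` model and `demo` database name are made up for illustration; it assumes a local mongod and the 2011-era Zappa and Mongoose APIs):

```coffeescript
# Hypothetical sketch, not the actual gist. Assumes mongod is running locally.
mongoose = require 'mongoose'
mongoose.connect 'mongodb://localhost/demo'

# A trivial model: a person with a name.
Person = mongoose.model 'Person', new mongoose.Schema name: String

require('zappa') ->
  # GET / lists everyone saved so far.
  @get '/': ->
    Person.find (err, people) =>
      @send (p.name for p in people).join ', '

  # GET /add/Brian saves a new person.
  @get '/add/:name': ->
    person = new Person name: @params.name
    person.save => @send "Saved #{@params.name}"
```

Run it with `coffee app.coffee` and visit whatever port Zappa reports on startup.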

Dig it? Follow me @genexp

Oct 4

How to: Create a user on S3 and grant access to a bucket

By Brian Corrigan

In our never-ending quest to spend more time on software creation and less time on software administration, Clarke and I moved our content team from a self-managed FTP site to an S3 account.

Besides, administering an FTP site is so 1971.

It took an hour or two to figure out the proper permissions, so I’m documenting them here for posterity. Read on if you’re interested.

  1. Log in to the S3 management tool and create a bucket. Write down the name!
  2. Log in to the Amazon Identity & Access Management tool and create a group.
  3. Create users and add them to the group you created. Write down the secret key and access key for each user.
  4. Edit the group and attach a group policy. Here’s the one I used:

     { "Statement": [
         { "Action": [ "s3:ListAllMyBuckets" ],
           "Effect": "Allow",
           "Resource": "arn:aws:s3:::*" },
         { "Action": "s3:*",
           "Effect": "Allow",
           "Resource": [ "arn:aws:s3:::my_bucket",
                         "arn:aws:s3:::my_bucket/*" ] }
     ] }

  5. Download Cyberduck. It’s cross-platform, open source, and simple to set up and use. While you’re there, donate a few bucks.
  6. Open up Cyberduck, choose Amazon S3, and test using a Secret Key/Access Key pair.


Sep 30

Make Your Databases Work Smarter, Not Harder

By Jason LaPorte

When optimizing code, you profile it, find out where it’s spending most of its time, and figure out how to make it spend less time there. As a sysadmin, I tend to wear a lot of hats, but the one I wear the most by far is the DBA hat: I spend far more time hacking on MySQL optimizations than on anything else I do. So, if I can find a way to streamline the time I spend with MySQL, I can reap the biggest benefits.

It turns out doing so isn’t so hard. Here are some tricks I’ve learned.

Now, the reason I spend so much time with MySQL is that it’s a prime cause of bottlenecks in our applications. Those bottlenecks tend to fall into three categories:

  1. The queries cause tables to lock, halting further queries until the first completes.

  2. The tables are improperly indexed, forcing queries to examine or sort the entire table when they don’t have to.

  3. The queries act on more data than the system can adequately manage at once.

There are different solutions to each of the above problems. I’ll walk through a few of them, but before I do, let me start with the lowest-hanging fruit common to all three: upgrade to the latest version of MySQL, and then upgrade to Percona Server. Percona Server is a drop-in replacement for MySQL that has tighter optimization and better monitoring features. And when I say drop-in replacement, I really mean drop-in replacement: installing it is literally trivial. We’ve done so on both Ubuntu 10.04 Server and RHEL 5, and the process in both cases was: back up your data (just in case), remove the MySQL Server package, and then install the Percona Server package. That’s it. All of your client libraries will continue to work, you can still use handy performance tools like MySQL Tuner and MyTop, and you can even continue replicating to your non-Percona slaves (which you should also upgrade, of course). It grants you instant benefits for virtually no investment.

After that, we can’t optimize queries without knowing which queries need to be optimized. We at Agora use several tools for tracking them down. The main tools we use are:

  1. Munin, which is trending software: it tracks statistics over time and plots them onto a graph. While we’re not 100% happy with Munin, its issues are easy enough to work around (should you be at the scale that requires working around them), and there’s nothing simpler for getting up and running quickly. It can query MySQL’s statistics counters and plot them, so you can see all manner of data on how your server is behaving: from memory usage, to breakdowns of query types, to number of slow queries, to disk access patterns, and so on.

  2. The MySQL slow query log, which simply logs any queries that took longer to execute than some (easily configured) threshold.

  3. The EXPLAIN SELECT query, which shows how MySQL will execute a given query, letting you see how expensive your queries are and how well-utilized your indices are. (EXPLAIN is very powerful, but also confusing; it’s worth reading up on in further detail.)

  4. The MySQL SHOW PROCESSLIST query, which can help identify locking queries while they’re running. One notable benefit of Percona is that it is much better than stock MySQL about showing the current state of a query and how many rows it is acting upon, which makes it easier to track down the problematic part of a query.

  5. MySQL Tuner, a little Perl script that helps identify common misconfigurations on a MySQL server by running SHOW VARIABLES and comparing the output to several heuristics. While this isn’t as useful when you have a lot of experience tuning MySQL’s variables, it’s enormously useful if you haven’t.

Our usual process is to identify when we hit performance issues with Munin, check the slow query log at those times to identify problem queries, and then use EXPLAIN, SHOW PROCESSLIST, and MySQL Tuner (and/or looking directly at SHOW VARIABLES, if you know what you’re doing) to help identify why the queries are problematic.
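As a sketch of that workflow (the `scores` table, query, and index name here are hypothetical, purely for illustration):

```sql
-- A suspect query from the slow query log: ask MySQL how it would execute it.
EXPLAIN SELECT * FROM scores WHERE player_id = 42 ORDER BY created_at;
-- In the output, type = ALL means a full table scan, and Extra = "Using
-- filesort" means an expensive sort; both usually point at a missing index:
ALTER TABLE scores ADD INDEX idx_player_created (player_id, created_at);

-- Meanwhile, SHOW PROCESSLIST reveals what's running (or locked) right now.
SHOW PROCESSLIST;
```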

While I’d love to break down every possible symptom I’ve seen and how it can be fixed, that could easily fill a book. (And, in fact, a quick search on Amazon reveals that it already does.) So, you’re just going to have to use your intuition (and Google) to track it down yourself. The most important quality is having an inquisitive nature. If you see that something’s wrong, instead of looking for a magic word that fixes it, try to understand what’s going on behind the visible symptoms, or try to understand why the magic word works.

Let me walk through common solutions to the three above types of problems to get you started.

Let’s suppose that you’re finding, via SHOW PROCESSLIST, that a number of processes are piling up due to a locked query. What would you do to fix that? I would first start by looking at the table’s engine (most easily seen by SHOW CREATE TABLE). If the table is MyISAM, try converting it to InnoDB. InnoDB has a lot more overhead, but in exchange it never locks a whole table at once; so while individual queries will require more time, they won’t preclude other queries from running. Let’s suppose that your tables are already InnoDB, or that for some reason they can’t be (for example, you’re relying on FULLTEXT indices, which InnoDB does not support). The next thing I’d check is to ensure your various server tuning variables are adequately sized. This is a very complex topic with lots of things to fiddle with, but I recommend running MySQL Tuner as a starting point. Common problems are setting the MyISAM key buffer or the InnoDB buffer pool too small, causing your database to swap. If you’ve already checked that, then your problem may be the efficiency of your indices.
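For example, checking and converting a table’s engine is a one-liner each (using a hypothetical `scores` table; note the ALTER rebuilds the table and blocks writes to it while it runs):

```sql
SHOW CREATE TABLE scores;          -- the output ends with ENGINE=MyISAM or ENGINE=InnoDB
ALTER TABLE scores ENGINE=InnoDB;  -- rebuild the table as InnoDB
```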

You can identify how your indices are being used by running EXPLAIN to see how the query is being executed. If it’s doing a full table scan, creating a temporary table, or performing a filesort, you should either create a new index or restructure the query. Additionally, be aware that indices add overhead: whenever the table is modified, each index needs to be updated as well. Oftentimes the overhead in doing so can cause locks, especially on expensive indices (such as indices on a large number of columns, on large text columns, or any FULLTEXT indices). You may need to evaluate how the table is being accessed and remove any unnecessary indices, so there’s less overhead needed. For example, we had a number of FULLTEXT indices on Gamebattles that were causing a large number of table locks in order to keep them updated. We ended up moving to Sphinx for FULLTEXT indexing and removing the indices in MySQL.

Finally, the greatest bane of MySQL is hitting disk. If you ever have to swap memory to disk, your database performance will drop like a rock. If Munin or MySQL Tuner indicate that your disk usage is too high, look at increasing your buffer sizes. If you’ve maxed your system’s physical RAM, then you should buy more. If you have adequate RAM but your disks are still too slow, buy faster disks. When in doubt, spring for SSDs.
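As an illustrative my.cnf fragment (the values are placeholders, not recommendations; size them to your actual RAM and workload, along the lines MySQL Tuner suggests):

```ini
[mysqld]
# InnoDB buffer pool: commonly the majority of RAM on a dedicated DB host
innodb_buffer_pool_size = 8G
# MyISAM key buffer
key_buffer_size         = 1G
# Log anything slower than two seconds for later analysis
slow_query_log          = 1
long_query_time         = 2
```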

While this is a pretty quick overview of how to approach debugging inefficiencies in your databases, I hope that you’ll find it helpful. If you’re interested in more detailed reviews of any particular point (as there’s an almost bottomless amount of practical wisdom to be acquired when dealing with MySQL), feel free to let me know via the comments.

Sep 23

The Art of Refactoring

By Jason LaPorte

GameBattles is one of our most popular sites, with an active user base in the millions. While we’ve dealt with sites at that scale before, this particular case has been no walk in the park. While we’re a Rails shop, the fact that the site is written in PHP actually has nothing to do with the difficulties we’ve encountered. The real problems are that the codebase is almost a decade old, has been developed by dozens of developers over that time, has grown organically with a clear roadmap defined only recently, and has almost never been refactored.

So, at various intervals over the last year, I (and others) have been digging into the site and trying to clean up what we can. The process of what we’ve been doing seemed interesting, but I’ve hesitated to talk about it at any length, as I don’t like talking about something without quantifiable statistics. But the fact of the matter is, programming is an art, not a science, and anyone who says different is selling something. And the real trick with refactoring is that it doesn’t meet any immediate business goals: its purpose is entirely human, as it exists to make a clean and sane working environment. The tangible business gains (security, performance, developer velocity, etc.) are all secondary, and so unless your managers “get it,” it can be a hard thing to argue for.

In our case, it’s no secret that GameBattles' stability was going downhill, and so we needed to do something. And the only way for us to be able to audit for problems was to make the codebase manageable.

Let me give you a sense of where we started: the code was in version control, but only in the technical sense of the word; in reality, SVN was being used more as an excessively complicated rsync—it was a means to transport code from one developer machine to many web servers. The repository was over two gigabytes in size, about a third of which was PHP code, and the remainder being static assets. (And this is in addition to static assets housed elsewhere, such as on our CDN.) There were at least four PHP frameworks and three CMSs. Some of the code was object-oriented PHP5, but plotting a class diagram would have required at least four dimensions. If you think you’ve seen spaghetti code, think again: walking into this mess was like walking into the alien hive.

There is no magic in refactoring. When the scale is monumental, you should fully expect nothing other than a long, tedious trek. On any journey, though, you’ll need a guide. Here are some of the rules of thumb that I’ve been going by:

Always, always start with the low-hanging fruit. Your task is both difficult and boring. Don’t make it any harder than it already has to be.

Similarly, start with broad strokes. Try to find where you’ll experience the biggest gains first, and don’t work on things with small returns until you’re reasonably sure there’s nothing bigger.

However, working on anything is better than working on nothing, so don’t dither. Be decisive.

Make lists. I, personally, keep three: alive, dead, and unsettled, each referring to files in the codebase. Alive files are those known to be (at least partially) necessary to the core functionality of the site. Dead files are those known to be not referenced by anything alive. Unsettled files are those you just don’t know enough about. Never fail to keep these lists up to date, even though you’ll be changing them almost constantly. Yes, it’s tedious, but it’s necessary.

Use common sense. Trust your feelings. If something seems important, it probably is. If something seems useless, it probably is. If the files haven’t been updated in over two years, they’re very likely dead.

Sometimes you’re lucky and you can nuke entire files. Sometimes you’re not and you have to perform surgery on files to remove the parts that don’t need to be there. Don’t be afraid to dive in, but remember: don’t get in over your head. If you need to make a lot of changes for a small gain, move on to somewhere else for now. You can always come back later.

If you’re not on a UNIX system, get on one. Your best friends in this process are find and grep, and I can’t even imagine where you’d begin without them. Find is a tool that lets you scan for files on disk based off of a number of parameters—name, type, last updated time, etc. Grep, of course, lets you search the contents of files. You can chain them together, too: I have probably typed find . -type f -name '*.php' | xargs grep 'search term' more times than I can count—that snippet will search all PHP files in (or below) the current directory and tell you which ones contain the search term. (You can also just use grep -R 'search term' .—which does nearly the same thing—if you don’t have a lot of binary files that will take up a lot of search time.) This is particularly handy for seeing if a class or database table is referenced somewhere.

In fact, the above is true pretty much universally. If you’re not an expert with find and grep, and how to use them together, become one. There are many good resources on how to use them.
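A self-contained sketch of that combination (the file names and the `SomeClass` search term are invented for the demo):

```shell
# Build a tiny sample tree to search through.
mkdir -p demo/sub
printf '<?php class SomeClass {}\n' > demo/sub/page.php
printf 'no php here\n' > demo/notes.txt

# Which PHP files mention SomeClass? Prints demo/sub/page.php.
find demo -type f -name '*.php' | xargs grep -l 'SomeClass'

# Nearly equivalent, but searches every file (binaries included):
grep -Rl 'SomeClass' demo
```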

Looking at the database is a great heuristic for what’s important and what’s not. Recent versions of MySQL (which is what we use for GameBattles' backend) actually have a lot of useful metadata hiding in the information_schema.TABLES table—things like how large a table is (both in rows, and bytes), when it was last written to, and so on. If you find tables that haven’t been updated in a year, then they’re probably not used, and any files that reference them probably aren’t either. Additionally, if you find tables that are written to but never read from, they’re dead. Delete the references and then delete the table.
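A sketch of that heuristic (assuming a hypothetical `gamebattles` schema; note that UPDATE_TIME is only reliably populated for MyISAM tables):

```sql
SELECT TABLE_NAME,
       TABLE_ROWS,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS size_mb,
       UPDATE_TIME
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = 'gamebattles'
 ORDER BY UPDATE_TIME;
-- Tables whose UPDATE_TIME is a year in the past are prime candidates
-- for the "dead" list.
```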

Make backups of everything. Date them. Have a rollback plan in writing for when you delete something a little overzealously, since your brain is going to be far too fried when you’re in the thick of it to do anything other than follow instructions.

Once you’ve gotten rid of something, deploy it immediately. Not only will you find if you broke the world while what you did is still fresh in your mind, but it’s also remarkably cathartic to put some closure on what you just accomplished. And you will need the positive reinforcement.

I cannot emphasize the human factor of the above notes enough. The entire point of refactoring is maintaining your sanity. Don’t sacrifice it in the process.

While the labor is toilsome, the rewards are unparalleled. GameBattles’ codebase is currently an eighty-megabyte Git repository, less than four percent of the size of what we started with. The number of potential security holes has dropped like a rock. Our uptime has gone from less than two “nines” to almost four, and is continuing to improve. Our developers are moving faster than ever. Our users are happier. And, most importantly, I haven’t lost my mind. (Yet.)

Aug 22

My grandfather is on Facebook, why aren't you?

By Leslie Brezslin

The world is changing. It’s a blue sphere in constant motion; every second of the day produces a new life-changing thought. Whether that thought is acted upon or publicized, however, is debatable. One of the greatest things about human nature is that we are prone to adaptation. As our environments change (and we are normally the cause of that change), we find ways to change with them. Now that that’s out of the way: marketing is a topic that’s been around since the establishment of trade. How else would you draw people to your wooden establishment and convince them that your corn is better than your neighbor’s?

The reality is that marketing has been an essential part of history, and those who effectively master this far-from-dying art are the most successful. Let’s look at Apple, for example. Their focus has been brand management since CEO Steve Jobs hopped on board. Because of this, they’ve gone from near bankruptcy to one of the top companies in the world. As of August 1, they held more cash on hand than the United States Treasury. Are they selling any products that are necessary for the survival of mankind? Are their products much different from those on the market?

The answer to both questions, in my opinion, is no. I consciously know that there are phones that can do much more than the iPhone can. I also know that there are phones that are much more customizable and free from Apple’s unreasonable restrictions, but I want an iPhone. I know that a Windows computer has the same capabilities, if not more, than a Mac, but I want the Mac even though it costs more. Why do you think that is?

It’s all about the way Apple markets their products. They’ve successfully made it to the point where it’s sort of a social “norm” to have certain items. Apple products are now, for the most part, fashion statements. I remember when the iPod first came out. Everyone at school wore them on their belt buckles with their shirts just above the clip so that it could be seen. Guess what: if you didn’t have one, you felt left out. Fast-forward to a few years later and it’s the same thing on a greater scale, and the main beneficiary is Apple.

So now that we’ve established the importance of marketing, how has it changed over the years? Well, for one thing, you’re dealing with a new generation. Before, you had all the power and control when it came to your audience. Flash the ads on the television or radio and, no matter how reluctant a viewer, the reality was that they had no choice but to watch. That was then; nowadays, people have much more control. With things like TiVo and the Internet, your audience decides what to watch and when to watch it (the same goes for all other types of media). Not only that, this generation isn’t on the floor with mom and dad watching The Brady Bunch anymore. They’re on Facebook, Twitter or YouTube, spending countless hours a month there (I know because I am one of them). So how can you reach an audience that is no longer passive and won’t sit through your boring pitches? It’s simple: change your tactics.

First, go to where they are; don’t wait for them to come to you. Second, ask what they would like to see or what types of products they’d like to see advertised. If your product isn’t included on their lists, then you save time and money. The fact is, people are unlikely to buy a product that they’re not in the market for. There have been several times on YouTube where I’d be watching an ad and, right before the time is up, they ask if I found the ad relevant. I normally said no in hopes of eliminating them completely, but if it were for a product that I was in the market for, I would have definitely said yes.

If you look at some of the ads on those videos, they are often relevant to what the video is showing. The fact of the matter is, if I typed “how to play Blackbird” into YouTube, there is a huge probability that I’m interested in the guitar. Maybe I want a new one or maybe I want local lessons. At this point I don’t even watch television anymore, so I, like many others, won’t sit and endure just any product pitch.

As for Facebook and Twitter, those are respectively the 2nd and 9th ranked sites on the Internet. That being said, there is a high probability that your market is there, so why aren’t you? Every company should have a Facebook and Twitter profile, in my opinion, even if it is just informative. The point of those sites is self-expression. As it is “cool” to own Apple products (I think I might be on my way to becoming a fanboy), I want to “like” Apple on Facebook so that my network knows, “hey, I’m cool”. As dorky as that sounds, that is the simplified version of the image you need to create for your company. People should want to be associated with it. I’ll add, though, that if you constantly pitch products at me on Facebook, I will “de-friend” or “unlike” you.

That brings us to the Twitter regime. Twitter is an amazing way to interact with your customers (the key to modern marketing is interaction). With it, you can easily find out what they like and what their issues are, and inform them of your upcoming products. Like anything, however, it can easily be abused. Consider this situation: you are following 50 people, and one of those is a company. We’ll say that the remaining 49 update their statuses about 5 times a day. If you as a company are updating your status every thirty minutes, you’re spamming that person’s feed. So what’s the result? Your number of followers goes down by one, whether or not they liked you. One of our practices at Agora is to update our status every 2 hours and respond to questions every hour. This way we remain visible and give our community an efficient response time without spamming everyone else.

So what are the lessons to be learned here? Target your market, go to the places where they spend most of their time, and be courteous. They now have the power to choose what they will allow themselves to be exposed to, so keep it relevant. One last piece of advice: don’t use these channels to market your products alone. Give people information that they would like to know about as well. For example, we do not work with nor endorse Game Informer in any way, shape or form. However, they post information on their site that is relevant to our audience, information that we do not specialize in producing. Since the majority of our community are gamers who would like to be kept informed, we link to some of their updates. So now that you know what it takes to formulate a success like Apple’s, what will you do with it? Let us know in the comments below or on Twitter.

Aug 12

What is it like interning with MLG?

By Brian Corrigan

What is it like Interning at Major League Gaming?

“I honestly can’t find the difference between an intern and a regular employee. Employees get one trip to a related conference every year; I was also sent to one. Employees may go to MLG events, and I’ve been to two. I deploy apps to production on a daily basis, make design decisions when required and am involved in the process from top to bottom. That is why I feel MLG has been a great experience: I’ve been completely immersed in the software development process.” – Matt Perry (Intern)

You’ve learned a bit in school, but don’t know if you’re ready for the elite workforce. Well, most of us have been there, so we understand. One of the perks of an internship is that you’re not expected to know everything. School has set the foundation for you, and now it’s time to start building your catalogue of experience. That’s the exact feeling that troubled me at the beginning of this summer, so I figured that my perspective might be useful to anyone considering an internship at Major League Gaming.

To begin, you’re treated and welcomed like any other employee. Aside from compensation, the main difference between an intern and a full-time person is that interns are expected to make basic mistakes. They know that most of us lack experience and probably won’t know exactly what we’re doing, so we’re granted mentors. Additionally, the work that you are given is not very different from the projects that you’d be working on as a full-time employee. So, with that in mind, banish the thought that you’ll be assigned tasks like operating the fax machine, making coffee, or fetching the mail when it arrives. The work that you are assigned is very valuable and is usually necessary for projects to progress.

“When you first start, you are taught the basic processes that allow you to integrate with the MLG workflow. After you’ve become familiar with these processes, depending on your role, you may start tending to issue tracking tickets, working on independent projects or working with your team on high-level projects. One thing is for sure: you will be challenged, so bring your A-game.” – Cameron Cope (Intern)

At some point they were all students, so they understand the pressures and troublesome schedules that you’ll show up with. As long as you’re willing to apply yourself and learn, they are willing to work with you. Also, a lot of the current employees started out as interns, so there’s always the possibility of being hired full time. It all really depends on your aspirations.

Aug 11

What do Zombies, devs and 300 have in common? Hack-a-thon!!

By Brian Corrigan

August 4th, 3:00 PM: Excitement looms in their eyes as they await the fourth coming. An extraordinary combination of producers, interns, devs and more unify to kick off the event of the century: Hack-A-thon 4. What is a Hack-a-thon, you ask? Well, I’m glad you did, because only the foolish can live in ignorance of mankind’s greatest celebration. It is that moment that we’ve all dreamed of and considered impossible as children. The one moment that all of our parents told us about, but never truly believed we’d actually see. The period when you’re granted the highest level of honor and allowed 24 hours to work on any project that should strike your chord of desire. Brilliant, isn’t it? Now, the Agorians are an interesting people. Masters of the event is only the least that could be said of their Hack-a-thon abilities.

Seeking the guiding words that will resound with them through the night’s perils, they sit restlessly as they await the official orders of the commander.

In a flash, he enters and our leader speaks. The speech rebounds off of the walls with power and echoes through everyone’s mind. Simultaneously, he is demanding yet asking, telling yet not, requiring but not really.

Having internalized the words that would hype them up to a sense of self-belief and desire to work, the diverse people of the Agora rejoice. They then disperse to attack the monstrous tasks that they have chosen to resolve.

August 4th, 7:00 PM: Although four hours have passed, the Agorians are still hard at work. Their determination prevents any pesky noise or discomfort from discouraging them and derailing the vision at hand. They have a task to fulfill and intend to do so. Bright-eyed and in high spirits, they continue to bang away at their instruments.

Some have even been inspired to a level where chairs are irrelevant; working while standing up, an unnecessary yet amazing triumph.

August 5th, 1:00 AM: Signs of tiredness begin to appear as the day progresses…

August 5th, 5:00 AM: The Agorians have taken on the ultimate form of humankind: the zombie. Driven by their goals, the Agorians continue, working without the slightest thought of rest.

Having diligently continued regardless of the obstacles, 3:00 PM arrives and the event is concluded. All of the Agorians meet at the same place where their tasks were appointed, to speak of their successes and failures. Another grand Hack-a-thon has been successfully completed. Many accomplishments came out of this event and will be implemented into the natural flow of life soon. Visit our Facebook for more photos and because we’re cool.

Aug 10

Chef Explosion

By Waldo Grunenwald

Here at MLG, we use a product from Opscode called Chef to manage our server environments. Chef allows us to reliably manage our infrastructure by providing us with the ability to write code that describes how a server should be configured. While not perfect, it has served us well.

Chef leverages CouchDB for its datastore. CouchDB is a “NoSQL” database product, similar in concept to MongoDB. CouchDB provides a lot of features and usability, but as a tradeoff for versioning, speed, and convenience it sacrifices disk space. Opscode’s documentation does helpfully point out, in the “CouchDB Administration for Chef Server” page, that you should periodically run a compaction. Basically, what this does is remove some of the older versions of documents.

Following their advice, we set it up as a weekly cron (in our Cron cookbook, naturally), and so it looks like this:

cron 'Compact Chef DB' do
  user 'nobody'
  weekday '1'
  hour '4'
  minute '0'
  command 'curl -X POST http://localhost:5984/chef/_compact'
end

which results in a crontab entry that looks like this:

# Chef Name: Compact Chef DB
0 4 * * 1 curl -X POST http://localhost:5984/chef/_compact

One fine summer morning I came in to several thousand emails saying “Chef Run Failed.” This, as you may understand, severely degraded my opinion of the morning.

Cue the Swedish Chef crying “Bork Bork Bork!”

After determining that a full disk was the problem and deleting an old, unneeded backup file to get some headroom, I found that the biggest contributor was the /var/lib/couchdb/0.10.0/.chef_design directory.

root@chefserver:/var/lib/couchdb/0.10.0/.chef_design# ls -lh
 total 96G
 -rw-rw-r-- 1 couchdb couchdb 30M 2011-07-21 18:26 07ccb0c12664d1f1ca746003182b521a.view
 -rw-r--r-- 1 couchdb couchdb 1.7G 2011-05-11 12:03 178087e2a7c06ff437482555acf60bab.view
 -rw-rw-r-- 1 couchdb couchdb 8.5G 2011-07-22 08:24 18757f7428c465dd0504ac3d5d7ce577.view
 -rw-rw-r-- 1 couchdb couchdb 8.9G 2011-07-22 08:24 367772ed026257ff1f88a1011576c9c3.view
 -rw-rw-r-- 1 couchdb couchdb 6.6M 2011-07-21 15:52 3970d32b6acb424bb4d19684bdf9aff1.view
 -rw-r--r-- 1 couchdb couchdb 8.6M 2011-07-22 08:11 91188e3c7d61bdf079eee6ca719be05c.view
 -rw-rw-r-- 1 couchdb couchdb 6.0G 2011-03-16 16:44 9f39fce5f578a23cc8cad7b3fe9b8ce9.view
 -rw-r--r-- 1 couchdb couchdb 1.4G 2011-07-22 08:24 af280ad217f6edca6276d1d1bcbc069d.view
 -rw-rw-r-- 1 couchdb couchdb 19G 2011-05-11 12:00 b96879fe1377e2b91f228109f3aac384.view
 -rw-rw-r-- 1 couchdb couchdb 565K 2011-07-20 09:31 be708387555557a5b4886292346da6bb.view
 -rw-rw-r-- 1 couchdb couchdb 3.0M 2011-07-20 11:27 d381d1f4b207dc3d9624720a7e88f881.view
 -rw-r--r-- 1 couchdb couchdb 51G 2011-07-22 08:20 fe06cf9119d23dd7fec2492b79e7ebef.view

I was surprised that there was so much disk use, since we had been running the Chef Compactions, and expected this kind of thing to be taken care of. Wondering if it was throwing some kind of error that we weren’t seeing (since it’s running as a cron), I ran it manually:

root@chefserver:~# curl -H "Content-Type: application/json" -X POST http://localhost:5984/chef/_view_cleanup

Which yielded:

root@chefserver:/var/lib/couchdb/0.10.0/.chef_design# ls -lh
 total 70G
 -rw-rw-r-- 1 couchdb couchdb 30M 2011-07-22 08:43 07ccb0c12664d1f1ca746003182b521a.view
 -rw-rw-r-- 1 couchdb couchdb 8.5G 2011-07-22 08:43 18757f7428c465dd0504ac3d5d7ce577.view
 -rw-rw-r-- 1 couchdb couchdb 8.9G 2011-07-22 08:43 367772ed026257ff1f88a1011576c9c3.view
 -rw-rw-r-- 1 couchdb couchdb 6.6M 2011-07-22 08:43 3970d32b6acb424bb4d19684bdf9aff1.view
 -rw-r--r-- 1 couchdb couchdb 51 2011-07-22 08:42 7bbcbf585caef33abc0733282f40a22a.view
 -rw-r--r-- 1 couchdb couchdb 8.6M 2011-07-22 08:43 91188e3c7d61bdf079eee6ca719be05c.view
 -rw-r--r-- 1 couchdb couchdb 1.4G 2011-07-22 08:42 af280ad217f6edca6276d1d1bcbc069d.view
 -rw-rw-r-- 1 couchdb couchdb 573K 2011-07-22 08:43 be708387555557a5b4886292346da6bb.view
 -rw-rw-r-- 1 couchdb couchdb 3.0M 2011-07-22 08:43 d381d1f4b207dc3d9624720a7e88f881.view
 -rw-r--r-- 1 couchdb couchdb 51G 2011-07-22 08:26 fe06cf9119d23dd7fec2492b79e7ebef.view

Well, that was a significant but only partial win. Why do I still have 70GB in .view files?

What Opscode hadn’t told us is that CouchDB has a thing called “views,” and these can, over time, come to take up space. A lot of space. (CouchDB views are the “primary tool used for querying and reporting on CouchDB documents,” according to the CouchDB wiki.) Opscode also hadn’t mentioned that CouchDB says these, too, need to be compacted.

The good folks on the internet came to the rescue, notably the CouchDB docs and a StackOverflow question, “CouchDB .view file growing out of control”.

Among our findings, we came upon the Compaction page in the CouchDB documentation.

My compatriot Jeff Hagadorn and I were both looking into identifying the design view names, and he beat me to the solution:

bash -c 'for x in checksums clients cookbooks data_bags environments id_map nodes roles sandboxes users; do curl -H "Content-Type: application/json" -X POST http://localhost:5984/chef/_compact/$x ; done'

(If you prefer a Python-based solution that doesn’t require you to know your view names, I had found a posting on the couchdbkit Google Group describing a script a user had written to solve this very problem here.)
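In the same spirit, here’s an untested shell sketch of discovering the design-document names from CouchDB’s _all_docs endpoint (ranged over the “_design/” prefix) rather than hardcoding them. The URL and database name mirror the ones above; the compaction loop stays a dry run unless you set RUN=1.

```shell
# Scrape the design-doc names out of _all_docs JSON,
# e.g. "_design/nodes" -> nodes
extract_design_names() {
  grep -o '"_design/[^"]*"' | sort -u | sed 's/"_design\///; s/"$//'
}

# Compact every design document found in the chef database.
compact_views() {
  curl -s 'http://localhost:5984/chef/_all_docs?startkey="_design/"&endkey="_design0"' \
    | extract_design_names \
    | while read -r ddoc; do
        curl -H "Content-Type: application/json" \
             -X POST "http://localhost:5984/chef/_compact/$ddoc"
      done
}

# Only hit the server when explicitly asked to.
if [ "${RUN:-0}" = "1" ]; then compact_views; fi
```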

After doing that, our disk was in a much healthier state, and our chef-db-compact recipe now looks like this:

cron 'Compact Chef DB' do
  user 'nobody'
  weekday '1'
  hour '4'
  minute '0'
  command 'curl -X POST http://localhost:5984/chef/_compact'
end

cron 'Compact Chef Views' do
  user 'nobody'
  weekday '1'
  hour '5'
  minute '0'
  command 'bash -c \'for x in checksums clients cookbooks data_bags environments id_map nodes roles sandboxes users; do curl -H "Content-Type: application/json" -X POST http://localhost:5984/chef/_compact/$x ; done\''
end

which produces a crontab that looks like this:

# Chef Name: Compact Chef DB
 0 4 * * 1 curl -X POST http://localhost:5984/chef/_compact
 # Chef Name: Compact Chef Views
 0 5 * * 1 bash -c 'for x in checksums clients cookbooks data_bags environments id_map nodes roles sandboxes users; do curl -H "Content-Type: application/json" -X POST http://localhost:5984/chef/_compact/$x ; done'

Now, you may suggest that we mount this location on a separate disk. The answer is that we had: /var/lib/couchdb is a separate 100GB physical disk. The problem was that /var/log lives on the / partition, which is only 7GB. Once the views had filled their disk, the couchdb and chef logfiles swelled with errors, and even mighty logrotate could only hold them off for so long.

Bear in mind that there was no impact to Production during this event; the only outcomes were that new changes could not be pushed out via Chef, and a couple of filled inboxes. Nevertheless, this highlighted some of our flaws, the most important of which is that our monitoring of the server was imperfect: we missed the alerts that the CouchDB disk was filling. Had we not missed those alerts, we could have diagnosed this before it was a problem.

As an aside, in addition to alerting when a disk reaches a certain capacity, you should also watch for sudden increases in utilization. If a particular disk normally runs at 20% capacity but a logfile swells it to 73% overnight, that won’t trip your “75% Full” alert, yet there is very likely a problem. One way to catch this is to record the previous percentage, compare it to the current percentage, and alert on any sudden jump.
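A minimal sketch of that check; the state-file location and the 20-point threshold are illustrative choices, not what our monitoring actually used:

```shell
# Remember the usage percentage from the previous run in a state file
# and flag any jump larger than a threshold.
STATE="${TMPDIR:-/tmp}/disk_pct.last"

current_pct() {
  # df -P prints e.g. "42%" in column 5; strip the percent sign
  df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

sudden_growth() {
  # sudden_growth PREV CUR THRESHOLD -> success (0) if the jump is alarming
  [ $(( $2 - $1 )) -ge "$3" ]
}

cur=$(current_pct /)
prev=$(cat "$STATE" 2>/dev/null || echo "$cur")
if sudden_growth "$prev" "$cur" 20; then
  echo "disk usage jumped from ${prev}% to ${cur}%"
fi
echo "$cur" > "$STATE"
```

Run it from the same cron that does your capacity checks and you catch the overnight logfile swell even when the absolute threshold hasn’t been crossed.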

(NOTE: Server Names have been changed to protect the guilty.)

UPDATE: I’ve been notified by the Senior Systems Admin at Opscode that they have added these compactions to the chef-server recipe. Their implementation is quite a bit different from ours, but no matter.

Harold “Waldo” Grunenwald Systems Engineer @gwaldo

Aug 4

Play and Learn

By Brian Corrigan

When I was in grade school, I had your typical learning experience. My teacher stood up in front of us every day and lectured about something that I usually found utterly boring. Fast-forward to college, and it’s still pretty much the same ordeal, except now there are more students. Having said that, you can understand the envy that poured from my eyes as I read about the new ways that classes are being taught, one of which is video games. Imagine going to school excited every day because calculus was no longer a boring system of numbers and formulas. Better yet, you’ve also sparked a desire to learn, because it has become a fun challenge. Well, that’s the situation for the 6th graders who are learning via the Jason Experiment.

Of course, gaming is still dominated by its entertainment persona, but today we’ve seen it evolve and branch into several different categories, one of which is education. Nowadays, a simple Google search will easily lead you to sites that specialize in this area. We’ve also watched it make its way onto mobile phones and become one of the benchmark categories there. Educational gaming has been an aid in the Jason project, which focuses on learning through interactivity. The concept has also been used to support learning in subjects like energy flow, and even in adult-oriented subjects like politics and finance. While it’s still in the early stages of practical, productive use, can you see gaming evolving to serve other aspects of society?

Jul 29

Who Reads the Manuals?

By Leslie Brezslin

So I recently ordered my own copy of Mortal Kombat, and it arrived just a few days ago. Once it was in my impatient hands, I savagely tore open its box without a fragment of mercy, ending with something similar to this:

After that, I proceeded to remove the inferior plastic barrier that stood between me and the future of gory deaths I’d be responsible for. Having taken care of the plastic with my razor-sharp teeth, I pried the casing open to reveal a fairly well-decorated DVD and a booklet on the side. Naturally, the booklet never received the faintest drop of my attention, but the DVD was pampered with the greatest of care. I removed it from its casing, popped it into my PS3, and enjoyed the awkwardly menacing side of me that relished achieving the bloodiest of fatalities.

Today, with most things, you normally learn as you go along. Somehow, I find myself remembering how I used to value those little booklets that I now show no love. In the past, before delicately removing the DVD, I would have carefully removed the manual and used it to avoid the learning curve. Now I realize that it had been a long time since I’d read a game manual, and it will probably be an even longer time before I read another one. So I wonder: are game manuals worth it, or have they become a waste of our limited resources? I personally think they should be available on the website or on the game disc. Of course, I haven’t ruled out that I may be the only one who doesn’t care for the manuals, so let me know your thoughts on our Facebook or below!