- October 13, 2011
Knowledge is Power
Information drives knowledge, therefore information is power.
Earlier this year I started the P90X workout regimen. The program is 90 days long and emphasizes “muscular confusion” through a variety of cross-training exercises. Throughout the program, Tony, your lovable yet demanding trainer, reminds you again and again to write down how many reps you do. Whether it’s 10 pull ups, 30 squats, or 15 push ups, you’ll be reminded to “write it down” each and every time. It gets monotonous, sure. It seems silly at first, yes. But in the end it’s probably the single best way to ensure you get the most out of the program. The worksheets are designed for quick comparisons of your success. With these worksheets, it’s easy to track your progress over time and see that your upper body is getting stronger or that your lower body is remaining stagnant, for example. With this information, you’ll know what areas need work and what areas need rest.
Toning your website
Flash forward several months: I’ve completed P90X and find myself working on scaling a web service from several thousand to a few million visitors. Luckily, scaling has been a hot topic lately. Whether it’s Instagram discussing their sharded databases or WordPress users defending against the ‘Fireball’ effect, there is a wealth of information out there covering web application scaling. So technically, it’s easy to answer the question “how do you scale a database?” or “how do you distribute your web servers?” A much trickier question to answer is “what do you scale?” It’s specific to your application and varies greatly between systems, platforms, applications, and developers. For example, caching may not help Instagram at their traffic levels, just as sharding may not be worth the complexity for a simple WordPress site. So, taking a page out of Tony’s book, “write it down” and with a little luck the answer will jump right out at you.
These are the ways I “write it down” when looking to scale an application:
MySQL Slow Query Logging
Probably the easiest of these suggestions to implement, MySQL slow query logging is a simple flag in your MySQL configuration. Enabling it logs any query that takes longer than `long_query_time` to execute (10 seconds by default; lowering it to one second is a common choice). `tail`-ing this log while the site is live will show you, in real time, exactly which queries are dragging their feet. If a few queries show up more frequently than others, then you’ve got a good place to start your optimization.
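Watching the log scroll by works for a quick look; to see which queries show up most often, it helps to tally them. What follows is a minimal sketch, not a full parser: it assumes the usual slow log layout where metadata lines start with `#` and each statement ends with a semicolon, and the helper names `fingerprint` and `tally_slow_queries` are made up for this example.

```python
import re
from collections import Counter

def fingerprint(sql):
    """Collapse literal values so queries that differ only in data group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings -> ?
    sql = re.sub(r"\b\d+\b", "?", sql)   # bare numbers -> ?
    return re.sub(r"\s+", " ", sql).strip().lower()

def tally_slow_queries(log_text):
    """Count how often each query shape appears in slow query log text."""
    counts = Counter()
    statement = []
    for line in log_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the "# Time / # User@Host / # Query_time" metadata.
        if not stripped or stripped.startswith("#"):
            continue
        # Skip bookkeeping statements written alongside each logged query.
        if stripped.lower().startswith(("set timestamp", "use ")):
            continue
        statement.append(stripped)
        if stripped.endswith(";"):
            counts[fingerprint(" ".join(statement))] += 1
            statement = []
    return counts
```

Feeding it the contents of the log file and calling `.most_common(5)` on the result gives a short list of query shapes to look at first. The fingerprinting is deliberately crude, so treat the counts as a starting point.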
Lesson Learned: Logging slow queries quickly identifies trouble spots as you scale.
Cron logging
Not every site uses a cron script (running periodically, accumulating data), but if yours does, don’t send the output to `/dev/null` as is so common in `cron` examples. In a recent project, we had a `cron` script running every half hour. This script logged its execution time every time it ran. Watching the log, I noticed that the script was taking around 15 seconds longer every time it ran (as it pored over larger and larger data sets). With 48 runs a day, that adds up to an extra 12 minutes of execution time every day. Obviously, this is a problem. If the script takes 12 minutes one day and 24 minutes the next, on the third day it’s going to take 36 minutes to execute. As a result, on the third day, for 6 minutes, I’ll actually have two copies of the script executing at the same time doing, mostly, the same thing. Luckily, logging showed me this and I was able to optimize the script well before any overlap happened.
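The arithmetic above is easy to automate once the durations are being logged. Here is a rough sketch in Python, not the script from that project: `timed_run` and `runs_until_overlap` are hypothetical helpers, and the projection assumes the growth per run stays constant.

```python
import math
import time
from datetime import datetime

def timed_run(job, log_path):
    """Run job() and append its start time and duration to log_path."""
    started = datetime.now().isoformat(timespec="seconds")
    t0 = time.monotonic()
    try:
        return job()
    finally:
        elapsed = time.monotonic() - t0
        with open(log_path, "a") as log:
            log.write(f"{started} duration_seconds={elapsed:.2f}\n")

def runs_until_overlap(current_seconds, growth_per_run, interval_seconds):
    """Further runs before a single run lasts as long as the cron interval."""
    if growth_per_run <= 0:
        return None  # not growing, so no overlap is projected
    remaining = interval_seconds - current_seconds
    return max(0, math.ceil(remaining / growth_per_run))

# The numbers from the story: a job every 30 minutes, growing 15 seconds per run.
runs_per_day = 24 * 60 // 30                      # 48 runs a day
daily_growth_minutes = 15 * runs_per_day / 60     # 12.0 extra minutes per day
runs_left = runs_until_overlap(0, 15, 30 * 60)    # 120 runs
days_left = runs_left / runs_per_day              # 2.5 days, i.e. during day three
```

Wrapping the job in something like `timed_run` gives you the log; the projection is only worth running once a few days of durations show a trend.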
Lesson Learned: Never send cron output to `/dev/null`.
Page load time
Monitoring load time is another simple test. At Happy Cog, we prefer Pingdom for all of our monitoring needs. Pingdom will not only track page loads but also graph the data (over time). This allows us to monitor code deployments against load time to determine if updates are causing load times to go up, down, or remain the same. Between Pingdom’s graphs and our SCM’s date stamps, we can pinpoint exactly what code caused a change in our load time.
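If you export or collect the timings yourself, the same before-and-after comparison takes only a few lines. This is a sketch with made-up numbers, not Pingdom’s API or data format: `load_time_shift` is a hypothetical helper that takes (timestamp, milliseconds) pairs and the time of a deployment.

```python
from datetime import datetime
from statistics import mean

def load_time_shift(samples, deployed_at):
    """Return (mean_before, mean_after, delta) in ms around one deployment.

    samples: iterable of (datetime, load_time_ms) pairs.
    """
    samples = list(samples)
    before = [ms for ts, ms in samples if ts < deployed_at]
    after = [ms for ts, ms in samples if ts >= deployed_at]
    if not before or not after:
        raise ValueError("need samples on both sides of the deployment")
    mean_before, mean_after = mean(before), mean(after)
    return mean_before, mean_after, mean_after - mean_before

# Illustrative samples only: three checks before a deploy, two after.
samples = [
    (datetime(2011, 10, 13, 9, 0), 800),
    (datetime(2011, 10, 13, 9, 30), 820),
    (datetime(2011, 10, 13, 10, 0), 780),
    (datetime(2011, 10, 13, 11, 0), 1150),
    (datetime(2011, 10, 13, 11, 30), 1250),
]
deployed_at = datetime(2011, 10, 13, 10, 30)  # taken from the SCM date stamp
before_ms, after_ms, delta_ms = load_time_shift(samples, deployed_at)
```

A large positive `delta_ms` right after a deployment is a hint, not proof; traffic spikes and other changes can move the average too.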
Lesson Learned: Page load time and code deployments can be directly related.
Page load weight
A bit more difficult to set up but incredibly useful. Web servers can be memory hogs as they loop through entries, iterate over users, and parse complex template languages. Within PHP, there’s an excellent function called `memory_get_usage` to tell us how much memory went into outputting a specific page. We log this data on each page load so we can track changes in weight over time. This tells us, primarily, where our web servers are spending most of their memory as well as how lengthening comment threads and archiving pages affect load.
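The approach described here is PHP’s `memory_get_usage`; the same idea carries over to other stacks. As a rough analogue, and only an analogue, here is a Python sketch using the standard `tracemalloc` module. `measure_page_weight` and `render_comments` are hypothetical names, with `render_comments` standing in for a real template render.

```python
import tracemalloc

def measure_page_weight(render, *args):
    """Return (output, peak_bytes) for a single call to render(*args)."""
    tracemalloc.start()
    try:
        output = render(*args)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return output, peak

def render_comments(count):
    """Hypothetical stand-in for rendering a page with `count` comments."""
    return "\n".join(f"<li>comment {i}</li>" for i in range(count))

# A longer comment thread should weigh more than a short one.
_, short_peak = measure_page_weight(render_comments, 10)
_, long_peak = measure_page_weight(render_comments, 10_000)
```

Logging the peak next to the page URL and a timestamp on each request is what turns a one-off measurement into a trend you can watch as content grows.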
Lesson Learned: Page weight can change over time as content changes.
What about…
There are a million and one other metrics you can capture, analyze, and dissect. Some people prefer to monitor CPU usage over time, others, available RAM. More important than any single metric, however, is that you have some metrics. These metrics will allow you to answer: Why are we looking at sharding our database? What benefit will we gain by adding another web server? Unless you have that data in front of you, you’re just shooting in the dark.