Last week we had a unique opportunity to observe two web sites experiencing traffic spikes. One came through with flying colors, having planned for the spike and handled it with aplomb. The other crashed and burned under heavy traffic.
Let’s look at this tale of two spikes, starting with the positive story.
Last week Instagram, the photo-sharing site, launched Instagram for Android. According to an LA Times article, Instagram boasts 30 million iOS users and had an Android waiting list in the neighborhood of 430,000. CBS News reported a million downloads on the first day.
And these people aren’t just grabbing the application and leaving. They are very likely uploading and sharing photos, which is the whole purpose of the Instagram app, and that puts pressure on the servers beyond the app downloads themselves.
Well, according to a post on the Instagram web site, in spite of all that pressure the site took a licking and kept on ticking. In fact, Instagram is kind enough to list the tools it uses to track traffic data, including statsd and Dogslow, among others. It makes for interesting reading if you’re a web site stats geek, and the company even encourages conversation with like-minded geeks at the end of the post.
Regardless, Instagram says this arsenal of tools let it stay on top of the traffic and ride it out without major delays or a total shutdown.
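For the stats geeks, here is a rough idea of what feeding a tool like statsd looks like. StatsD accepts plain-text lines of the form `name:value|type` over UDP, conventionally on port 8125. The sketch below is only an illustration of that format, not Instagram's actual setup: the metric names, the helper functions and the local host address are all assumptions made up for this example.

```python
import socket


def format_metric(name: str, value: float, metric_type: str) -> str:
    """Build one StatsD line, e.g. 'photos.uploaded:1|c'.

    metric_type is 'c' for a counter, 'ms' for a timer, 'g' for a gauge.
    """
    return f"{name}:{value}|{metric_type}"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget one metric over UDP; no reply is expected or read."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, assuming a StatsD daemon on localhost.
    send_metric(format_metric("photos.uploaded", 1, "c"))     # count one upload
    send_metric(format_metric("upload.duration", 320, "ms"))  # time one request
```

Because the metrics travel over UDP, the application never waits on the stats server, which is part of why this style of tracking is cheap enough to leave on during a spike.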
Unfortunately, the long-awaited 1940 Census data web site didn’t fare nearly as well. According to an LA Times article, the site, which is run by the National Archives and Records Administration, crashed minutes after its launch, obviously unprepared for the curiosity surrounding the data.
To be fair, the article reports that the 1940 census site had over 23 million hits in three hours, making it, by its own account, a victim of its own success.
While Instagram stayed up through a million downloads, the 1940 census site was unprepared for the onslaught.
Every time I hear a site administrator say he or she was unprepared for the traffic, I shake my head, because there are ways to deal with unanticipated traffic spikes. You can use a service like Amazon Web Services to supply any extra capacity you need, and pay only for what you use.
So why wouldn’t you err on the side of safety and do that instead of crossing your fingers and hoping for the best?
Before you slam me for making an unfair comparison, I realize 23 million hits is a lot more than a million downloads. Still, the point of the comparison stands: one small company was ready to deal with a significant product launch without crashing, while the well-publicized government site was not.
Let this be a lesson for all IT pros charged with keeping a site running. When you anticipate a spike (or you’re launching a new site), assume the worst (actually best) case: that you’ll get tons more traffic than you ever dreamed of. If you have virtual servers lined up, you should be able to ride out the spike and keep your visitors happy.
If not, you’ll crash and burn like the 1940 census data site did.
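To put a number on "tons more traffic," here is a back-of-the-envelope sketch using the reported figure of 23 million hits in three hours. The per-server throughput (200 requests per second) and the 2x headroom factor are assumptions picked purely for illustration, not measurements from either site.

```python
import math


def average_rate(hits: int, seconds: float) -> float:
    """Average requests per second over the given window."""
    return hits / seconds


def servers_needed(rate: float, per_server_rps: float, headroom: float = 2.0) -> int:
    """Servers required to absorb `rate`, padded by a safety factor."""
    return math.ceil(rate * headroom / per_server_rps)


if __name__ == "__main__":
    # Reported load: 23 million hits in three hours.
    rate = average_rate(23_000_000, 3 * 3600)
    print(f"average load: {rate:.0f} requests/sec")

    # Assumed: each virtual server sustains 200 requests/sec.
    print(f"servers to line up: {servers_needed(rate, per_server_rps=200)}")
```

An average hides the peak, and the first few minutes after a launch are almost certainly the worst, so treat the result as a floor rather than a target.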
It’s worth noting the 1940 census data site is working fine today.