When Target’s web site went down last Tuesday for the second time in six weeks, it had to be a red flag for the company’s IT staff that something was very wrong, and that they had better find a way to fix it before Black Friday later this month.
When RIM had a three-day outage earlier this month, it was embarrassing and damaged the company’s reputation, but I’m sure one outcome it never expected was a lawsuit.
When the UNC Blackboard site went down last week, a traffic spike on an unrelated site and a load balancer turned out to be at the root of the problem.
When the federal government launched an updated version of the USAJobs.gov web site recently, the site was hampered by performance issues and search engine problems that should have been picked up in pre-launch performance testing.
Scott Noteboom, the man who helped build Yahoo!’s cloud infrastructure, recently left the company and joined Apple just in time to help build up iCloud.
It’s possible that when we have too much monitoring information, we lose sight of actual problems in the sea of data, and we need to find a way to make sense of it all.
Don’t wait until your web site is under stress from holiday shoppers to make your adjustments. Now is the time to do some testing to make sure your web site is tuned and ready when those shoppers show up.
When BlackBerry went down last week, it hurt the company’s diminishing reputation, and the failure to communicate for more than two days didn’t help matters. It’s worth noting that in my informal research, the few users I asked weren’t affected by the outage.
While application and web site monitoring tools are great for understanding how well your system is performing, you can also use these tools for monitoring resource usage for chargeback purposes in a private cloud environment.
The applications that should be keeping you up at night aren’t the ones you’re monitoring. They’re the ones you’re not.