Last week, the University of North Carolina's Blackboard site, which students use to access all of their class information, went down while students were on break. A load balancer issue took the site offline, leaving students without access to this crucial information.
According to an article on Dailytarheel.com, the college’s newspaper web site, the problem happened the week of October 23rd, when students were on Fall break. When they tried to access the site, students encountered error messages instead of their usual class information.
Interestingly enough, the problem wasn’t caused by a traffic spike on the Blackboard site itself, which is usually able to handle its own traffic, but by traffic to an unrelated site that shared a load balancer with it.
The load balancer forced some of that other site’s traffic onto the Blackboard server, causing it to overload. According to the Dailytarheel.com article, the university is implementing a new system next year, which should handle disruptions like this better, but the older Blackboard system couldn’t deal with the connectivity problems.
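To make the failure mode concrete, here is a minimal sketch of how overflow from one site can overload a server it shares a load balancer with. It is an illustration only: the article does not describe UNC's actual configuration, so the function name, the capacity figures and the "spill the overflow onto the other server" rule are all assumptions.

```python
def simulate(blackboard_traffic: int, other_traffic: int,
             blackboard_capacity: int, other_capacity: int) -> dict:
    """Toy model of two sites behind one load balancer (all numbers hypothetical).

    Assumption: when the other site receives more requests than its own
    server can handle, the balancer sends the excess to the Blackboard server.
    """
    # Requests the other site's server cannot absorb.
    spilled = max(other_traffic - other_capacity, 0)
    # Blackboard now serves its own users plus the spilled requests.
    blackboard_load = blackboard_traffic + spilled
    return {
        "spilled": spilled,
        "blackboard_load": blackboard_load,
        "blackboard_overloaded": blackboard_load > blackboard_capacity,
    }


# A normal day: both sites stay within capacity.
normal = simulate(blackboard_traffic=600, other_traffic=400,
                  blackboard_capacity=1000, other_capacity=500)

# Break week: Blackboard itself is quiet, but the other site spikes.
spike = simulate(blackboard_traffic=300, other_traffic=1500,
                 blackboard_capacity=1000, other_capacity=500)

print(normal)
print(spike)
```

In the second case Blackboard's own traffic is lower than usual, yet the server is still overloaded, which matches the article's point that the cause was independent of Blackboard.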
Michael Barker, assistant vice chancellor for infrastructure and operations and chief technology officer, admitted in the article that this was an unusual problem. “An unusually large load of traffic on a website that uses the same load balancer as Blackboard made its connectivity suffer,” he said. “The cause of the problem was completely independent of Blackboard,” Barker added, according to the Dailytarheel.com web site.
This is an interesting case for a number of reasons. First of all, a tool meant to help balance traffic levels across servers ended up taking down an unrelated system. That’s a fairly unusual problem and one that could be hard to track down.
Second, Blackboard is about as crucial a system as you could hope to find from a student perspective, because it includes all the lecture material, handouts, class notes and other materials a student needs to participate in a class. And with many classes today being online only, access to this system is even more crucial.
When a system goes down unexpectedly like this, it can leave the IT staff unprepared, because the root of the problem lies not in the performance of the Blackboard site itself but in another site.
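One way a monitoring team might narrow down a problem like this is to keep a record of which sites share a load balancer and, when one site fails, check its neighbors for unusual traffic. The sketch below assumes such a record exists; the site names, balancer labels, traffic figures and the "twice the baseline" threshold are hypothetical and are not taken from the article.

```python
def find_suspects(failing_site: str, balancer_of: dict, current: dict,
                  baseline: dict, factor: float = 2.0) -> list:
    """Return sites on the failing site's load balancer whose traffic is spiking.

    A site counts as spiking when its current traffic exceeds
    `factor` times its baseline (an arbitrary, illustrative threshold).
    """
    shared_balancer = balancer_of[failing_site]
    suspects = []
    for site, balancer in balancer_of.items():
        # Skip the failing site itself and anything on a different balancer.
        if site == failing_site or balancer != shared_balancer:
            continue
        if current[site] > factor * baseline[site]:
            suspects.append(site)
    return sorted(suspects)


# Hypothetical inventory and traffic readings (requests per minute).
balancer_of = {"blackboard": "lb-1", "events": "lb-1", "library": "lb-2"}
baseline = {"blackboard": 600, "events": 400, "library": 500}
current = {"blackboard": 300, "events": 5000, "library": 9000}

print(find_suspects("blackboard", balancer_of, current, baseline))
```

Here the library site is also spiking, but it sits behind a different load balancer, so only the events site is flagged as a likely cause.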
These are the types of challenges that monitoring professionals face on a daily basis as they try to keep a company’s, or in this case a major university’s, mission-critical systems running.
When an unrelated system is the problem, it can be difficult for the staff to track it down and solve the issue, but to their credit, the IT pros at UNC found the problem and fixed it.