Lessons from SharePoint Fail-over Demo

Earlier this week at the SharePoint Conference keynote address, Richard Riley, director of technical product management for SharePoint at Microsoft, gave an impressive demonstration of SharePoint’s fail-over capabilities.

Now to be sure, this was not a typical scenario. The rack that Microsoft showed off on stage was of mini-Watson proportions. It boasted, for example, 4 TB of storage and 1 TB of memory (yes, memory). Microsoft pulled together quite a package. They created an index of more than a million documents. You can get full details on a similar test Microsoft did last June in a white paper on the Microsoft SharePoint Server website.

But all this was not just to impress a room filled with geeks (although there was an element of that to be sure; this was a user conference, after all). It was to prove SharePoint’s fail-over capabilities. And to do this, they literally pulled the plug on the main server, and we waited along with them with bated breath to see how long it would take for the fail-over server to take over.

At first, Riley refreshed the browser and saw a message that the database was not available. This was live and real and it was quite dramatic, but less than a minute later, the monitors showed the database had failed over to the backup and everything was up and running again.

If you’re a monitoring professional, this was a demo to make you stand up and take notice, because chances are your goal is to keep those mission-critical applications going. If you have a database failure, you want it to be over so quickly that your users never even know it happened.

Now, Riley fessed up later on that they had shortened the database monitor polling time from 15 seconds to two seconds to speed things up for the purposes of the demo. They had also used a beta version of the SQL database server that facilitates this type of fail-over more easily. But they really pulled the plug and watched with the rest of us (although I’m sure they rehearsed it a number of times beforehand to have at least a sense that it would work as planned).
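
To see why the polling change matters, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not taken from the demo: the monitor checks the database at a fixed interval and declares a failure after some number of consecutive failed checks. The function name and the `failed_polls_required` parameter are hypothetical; only the 15-second and two-second intervals come from Riley’s account.

```python
def worst_case_detection_seconds(poll_interval_s: float,
                                 failed_polls_required: int = 1) -> float:
    """Upper bound on the time between a failure and its detection.

    Simplified, assumed model: a failure that happens just after a
    successful poll is first noticed one full interval later, and each
    additional failed poll required before declaring an outage adds
    another interval.
    """
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    if failed_polls_required < 1:
        raise ValueError("failed_polls_required must be at least 1")
    return poll_interval_s * failed_polls_required


if __name__ == "__main__":
    # 15 seconds was the original polling time, 2 seconds the demo setting.
    for interval in (15, 2):
        bound = worst_case_detection_seconds(interval)
        print(f"{interval}s polling: failure noticed within {bound:.0f}s")
```

Under this model, noticing the failure could take up to 15 seconds at the original setting versus two at the demo setting. The total fail-over time also includes whatever the backup needs to take over, which this sketch does not model.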

But as one marketing VP I spoke to put it, if he had millions of dollars’ worth of hardware, he could fail over that quickly too. His feeling was that it wasn’t a very realistic demonstration because most data centers aren’t dealing with hardware configurations anywhere close to what Riley demonstrated at the SharePoint Conference.

That’s fine, but the lesson here is not that you have to invest in millions of dollars’ worth of hardware to protect your most critical applications from failing. The lesson is that you should have these types of systems in place to get your critical applications up and running automatically as quickly as possible.

It sure would be fun to have all that hardware at your disposal, but for mere mortals, just making sure you take reasonable precautions is going to be good enough most of the time.

