Wouldn’t it be great if, instead of watching your systems all day, spotting an issue, tracking it down and fixing it, the problem simply identified itself and the system found a way to correct itself? This is precisely what VMware is trying to do with its new Automated Service Assurance idea.
But I’m sure it’s a vision that’s easier to conceive on paper than it is to implement in practice. Writing on the Virtualization Practice blog, Bernd Harzog had this to say about it:
While the idea of a system that automatically heals itself, or automatically delivers the right level of service to your most important applications and customers is incredibly appealing, the reality is that this is very hard to accomplish.
No doubt, it does sound appealing. It would make monitoring a whole lot easier, but in reality, there is a dizzying array of things to monitor, including hardware, software and various systems that work in different ways and call for different contingencies. Is it the database causing the problem, or maybe the web server, or maybe a bad part on one of your servers?
What’s more, as Harzog points out, what causes a given problem on one system might not be the cause on another, nor will it require the same solution. How do you program for all those contingencies?
I look at how bad automated translation services are as an example of just how hard this is to do. Translation deals with fixed languages. Granted, there are huge variations in grammar, syntax, idioms, cultural touchpoints and so forth, but if programmers have problems with translation, I can only imagine the challenge of building self-healing computer systems, where far greater complexity is involved.
Perhaps, if you were to drink the VMware Kool-Aid and go all VMware, all the time, with most of your systems virtualized, it would be an easier nut to crack, but I doubt there are many shops out there doing that. And even IT departments that have bought into VMware’s vision are bound to have critical systems that have yet to be virtualized.
And just because they *say* it’s self-healing, I’m betting it would take a fair amount of work to make that happen, and that there will be times when it just doesn’t work as planned. If you doubt that, just look at yesterday’s post regarding the role of backup systems in last week’s Southwest blackout. They were supposed to kick in when something went wrong, only they failed too.
But the fact that VMware is talking about this, and has reportedly made some progress toward making it happen, is an impressive feat, and I don’t want to minimize that. I just want to apply a little good old-fashioned skepticism here, because this is going to be very difficult to pull off, and to do consistently well enough that you can count on it.