Bloat costs millions
“‘Application Bloat’ Costs Businesses Millions, Survey Says” is a headline that just appeared this week. My experience with surveys leaves me very wary about the correspondence between what the respondents had in mind when answering, and the conclusions drawn by survey analysts. Still, when 30% of all respondents “estimate that slow, crashed or unresponsive applications cost their businesses at least” a million dollars daily, something serious is surely going on.
“Bloat” has several meanings. Harris Interactive seems to blame at least part of the problem on “unused or little-used apps”. My speculation is that someone has in mind what I call services, more than “apps”. It’s common to find whole servers or at least virtual machines configured to provide services that are no longer needed, have been obsoleted, and often are active security risks. While I wouldn’t have thought to label that situation “application bloat”, it certainly deserves attention.
It also illustrates a value of application performance management (APM) beyond the simple-minded model of measuring the response time of a single Web page retrieval. As the “Real User Monitoring” blog frequently emphasizes, APM is both deeper and broader than that: it’s deep in encompassing the entire end-user experience (EUE), and ideally as broad as the whole of information technology’s (IT’s) responsibilities. A good APM dashboard should be the locus of control for all services and applications. In that sense, one effective plan for trimming bloat is to adopt the rule: if an application’s not under APM, then it’s not supported, and it’s time to disconnect, decommission, delete, and end-of-life that application.
The first step in attacking bloat is to map all resources to the applications consuming them, and from the applications to the business values those applications promote. Any unattached branches of that diagram are candidates for removal.
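As a rough illustration of that mapping, here is a minimal sketch in Python. It assumes the inventory is already available as two plain dictionaries; the function name `find_unattached` and all the server, application, and business-value names are hypothetical, invented only for this example.

```python
def find_unattached(resource_to_apps, app_to_values):
    """Return (orphaned_resources, orphaned_apps).

    resource_to_apps: resource name -> set of applications consuming it
    app_to_values:    application name -> set of business values it promotes
    """
    # Collect every application mentioned on either side of the map.
    all_apps = set(app_to_values)
    for apps in resource_to_apps.values():
        all_apps |= set(apps)

    # An application is orphaned if it promotes no business value.
    orphaned_apps = {app for app in all_apps if not app_to_values.get(app)}

    # A resource is orphaned if it serves no application that still
    # promotes a business value (including resources serving nothing).
    orphaned_resources = {
        resource
        for resource, apps in resource_to_apps.items()
        if not (set(apps) - orphaned_apps)
    }
    return orphaned_resources, orphaned_apps


# Hypothetical inventory, for illustration only.
resource_to_apps = {
    "vm-01": {"billing"},
    "vm-02": {"legacy-report"},
    "vm-03": set(),
    "vm-04": {"billing", "legacy-report"},
}
app_to_values = {
    "billing": {"invoicing"},
    "legacy-report": set(),
}

orphaned_resources, orphaned_apps = find_unattached(resource_to_apps, app_to_values)
```

With this sample data, `vm-02` and `vm-03` come back as candidates for removal, while `vm-04` survives because it still serves `billing`. The hard part in practice is building an accurate inventory, not walking it.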
That’s far from the only contribution of APM. Some of the most difficult problems I’ve had in DevOps have been with data-dependent hanging applications: applications or services that behave “in-spec” most of the time, but occasionally encounter a request or input that sends them churning through grossly disproportionate system resources. These can be both difficult and frustrating to diagnose, and detrimental to all the other services in the same resource pool. Such a case effectively demands not just monitoring strong enough to determine that something has gone wrong and shut it down, but also enough traceability or introspection to relate the bad behavior to the inputs that trigger it.
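The traceability half of that requirement can be sketched in a few lines of Python. This is only an assumption about how one might do it, not a description of any particular APM product: the class `InputTracer`, its CPU budget, and the injectable `clock` parameter are all hypothetical. The idea is to fingerprint each input and record the fingerprint whenever handling it exceeds a CPU budget.

```python
import hashlib
import time


class InputTracer:
    """Record which inputs exceeded a CPU budget (hypothetical helper)."""

    def __init__(self, cpu_budget_seconds, clock=time.process_time):
        self.cpu_budget_seconds = cpu_budget_seconds
        self.clock = clock    # injectable so the tracer can be tested
        self.offenders = []   # list of (input fingerprint, CPU seconds)

    def run(self, handler, payload):
        # A short hash identifies the input without logging its contents.
        fingerprint = hashlib.sha256(payload).hexdigest()[:12]
        start = self.clock()
        try:
            return handler(payload)
        finally:
            # Runs even if the handler raises, so failures are traced too.
            elapsed = self.clock() - start
            if elapsed > self.cpu_budget_seconds:
                self.offenders.append((fingerprint, elapsed))
```

A service would wrap its request handler as `tracer.run(handle_request, body)` and review `tracer.offenders` later. Note that this only records the offending input after the handler returns; actually shutting down a hung request requires a separate mechanism, such as running the handler in a child process with a timeout.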
Once enough APM is in place to remove the noise and distraction of the kind of application bloat described here, it often becomes evident that well-behaved applications are over-provisioned. They can perform adequately with smaller and less expensive resources. In any case, the relation between resources and business requirements is clearer, and more easily managed.
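To make the over-provisioning point concrete, here is a hedged sketch of a right-sizing check in Python. The function `suggest_size`, the 95th-percentile rule, the 30% headroom, and the list of instance sizes are all assumptions chosen for illustration; real sizing policies vary by workload.

```python
def suggest_size(samples, sizes, percentile=95, headroom=0.3):
    """Suggest the smallest size that covers observed peak usage.

    samples:    observed usage, in the same units as sizes (e.g. vCPUs)
    sizes:      available instance sizes
    percentile: which usage percentile to treat as the peak (nearest rank)
    headroom:   extra fraction of capacity to keep above that peak
    Returns None if no available size is large enough.
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile, computed with integer ceiling division.
    rank = max(1, -(-percentile * len(ordered) // 100))
    peak = ordered[rank - 1]
    needed = peak * (1 + headroom)
    for size in sorted(sizes):
        if size >= needed:
            return size
    return None


# Hypothetical measurements: a service that almost always uses about
# one vCPU, with a single brief spike.
usage = [1.0] * 19 + [6.0]
suggested = suggest_size(usage, sizes=[2, 4, 8, 16])
```

With this sample data the suggestion is 2 vCPUs, because the one spike falls above the 95th percentile. Whether such spikes can safely be ignored is a business decision, which is exactly the kind of relation between resources and requirements that good monitoring makes visible.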
Another trend promoting the attack on application bloat is improved chargeback. With departments paying explicit usage costs in public or private clouds, there is plenty of incentive to make sure that computations and the services they support are efficient.
In many organizations, it’s been rational until now to focus on new work. DevOps savvy is in such short supply that the people who can plan a successful clean-up of bloat and waste already have their hands full with more important matters, and a few orphaned servers in the datacenter have been an acceptable price to pay. Better APM is exactly the tool that can change this situation.