Error-handling: a fable in code
One of my favorite domains to review in existing applications, because it tends to be so error-ridden, is … error-handling. Too many programmers regard a language’s exception-handling syntax as a solution rather than just a mechanism, so error-handling tends to be misguided or at least neglected. A little more attention in this area often pays off with far greater end-user satisfaction.
Perhaps the hardest part of handling errors is simply to remember that it is programming. I encounter many coders who appear to believe that it’s someone else’s job. In fact, handling errors should be a routine part of defining and fulfilling requirements. Here’s a parable about what often happens with even a single line of code:
An application needs to read a configuration file:
fp = open(CONF, "r")
… until the day CONF goes missing, and an end-user sees a traceback on her screen. That is clearly not acceptable, and someone quickly rushes
try:
    fp = open(CONF, "r")
except:
    pass
into production while hunting down CONF. It turns out that the user had launched the application from a bookmark no one had considered (or disabled cookies, or had customized the installation in an unexpected way, or done any of the other things end-users do). Folklore within the organization concluded that the error was “fixed”, and someone elsewhere coded in protection against the bookmarking …
… until the next time an end-user was clever enough to re-create a similar situation. This time, instead of a traceback appearing, a distant part of the application broke down. Eventually, after too much debugging effort, the code in the vicinity of CONF was upgraded to
try:
    fp = open(CONF, "r")
except:
    alert_user()
    return
Business returns to normal …
… until the day an end-user sees a warning on his screen about bookmarks (or cookies, or missing initialization), and is more frustrated than ever, because he already did what the warning advises. After more painfully difficult debugging, someone discovers there’s a rare possibility that CONF hasn’t been properly assigned. The coders begin to realize the hazard of a “naked except”, and qualify more carefully:
try:
    fp = open(CONF, "r")
except NameError:
    alert_user_about_initialization()
    return
except IOError:
    alert_user()
    return
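To make the hazard of the naked except concrete, here is a minimal sketch. It assumes Python 3, which the fragments in this fable predate; the function names and the warning text are hypothetical, and CONF is deliberately left undefined to stand in for the rare case where it hasn’t been assigned.

```python
def read_conf_naively():
    """Mimic the earlier reader: a naked except and one generic warning."""
    try:
        # CONF is never defined in this sketch, so the name lookup raises
        # NameError before open() is even called.
        fp = open(CONF, "r")
    except:  # swallows NameError along with every other exception
        return "warning: check your bookmarks"
    fp.close()
    return "ok"


def read_conf_qualified():
    """Same lookup, but only the I/O failure we expect is caught."""
    try:
        fp = open(CONF, "r")
    except OSError:  # IOError is an alias of OSError in Python 3
        return "warning: check your bookmarks"
    fp.close()
    return "ok"


# The naked except reports a bookmark problem that does not exist.
print(read_conf_naively())

# The qualified version lets the programming error surface instead.
try:
    read_conf_qualified()
except NameError as exc:
    print("programming error surfaced:", exc)
```

The qualified version is not more robust in any deep sense; it merely refuses to mislabel a mistake it was never written to handle.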
Problem solved …
… until the day a sysadmin rationalizes networking in the back-office, and a critical file-share ends up with unexpected permissions. An end-user sees a warning about a condition that has nothing to do with firewalls, and is utterly frustrated until someone recognizes that IOError covers a multitude of causes. Soon our CONF reader looks something like
try:
    fp = open(CONF, "r")
except NameError:
    alert_user_about_initialization()
    return
except IOError as e:
    if e.errno == 13:    # EACCES: permission denied
        alert_about_networking()
    elif e.errno == 2:   # ENOENT: no such file or directory
        alert_user()
    return
except:
    last_ditch_alert()
It’s still far from done. The last iteration above of what started as a single line would eventually turn up at least two more as-yet-undiagnosed problems.
Something is clearly wrong. To reach this point involved multiple upset end-users and too many late-night debugging sessions, and the “hot spot” of the initial open still is not “bullet-proof”.
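The point that IOError covers a multitude of causes can be sketched in a few lines. This assumes Python 3.3 or later, where IOError is an alias of OSError and the common causes have named subclasses; the function name and the messages are hypothetical. It is an illustration of telling causes apart, not a bullet-proof reader.

```python
import errno
import os
import tempfile


def describe_open_failure(path):
    """Return None if path opens for reading, else a cause-specific message."""
    try:
        with open(path, "r"):
            return None
    except FileNotFoundError:   # errno.ENOENT, the "2" tested for above
        return "missing"
    except PermissionError:     # errno.EACCES (the "13" above) or errno.EPERM
        return "permission denied"
    except OSError as exc:      # any other I/O cause: name it, do not guess
        name = errno.errorcode.get(exc.errno, "unknown")
        return "other I/O failure: " + name


with tempfile.TemporaryDirectory() as folder:
    # A path that does not exist falls into the first branch.
    print(describe_open_failure(os.path.join(folder, "no-such.conf")))
    # Opening a directory fails too; which branch it takes depends on the platform.
    print(describe_open_failure(folder))
```

Even here the final branch only names the cause; someone still has to decide what the user should be told about it.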
This is the point in a tale where I like to present a solution with almost miraculous powers. For this problem, though, there isn’t one; in fact, “error-handling” is so thorny that I’ve already collected a book’s-worth of material on the subject and its remedies. While there are plenty of tips along the way (no bare except-s, for instance), and articles like “Robust exception handling” do a good job of explaining the basics, the general problem simply lacks a magical solution. IT organizations need to recognize that “error-handling” demands its own analysis, requirements definition, testing, and maintenance.
Customers pay for positive features, of course, not for nicely-handled errors. Features-and-functionality need to come first; still, in any particular session with an application, a majority of the user’s time, or at least attention, can lie within its error-handling. Improvements in error-handling represent a great opportunity to eliminate distractions so that users can appreciate functionality. Often, the best way to help users see the value of the features in your programs is to make sure errors are handled professionally.
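As one small illustration of giving error-handling its own testing, here is a sketch in Python 3. The load_conf function, its warn callback, and the test names are all hypothetical; the tests are plain functions with assert statements, so a runner such as pytest can collect them, or they can simply be called directly. The permission failure is simulated with unittest.mock rather than a real misconfigured file-share.

```python
import os
import tempfile
from unittest import mock


def load_conf(path, warn):
    """Hypothetical reader: return the file's text, or None after warn(kind)."""
    try:
        with open(path, "r") as fp:
            return fp.read()
    except FileNotFoundError:
        warn("missing")
    except PermissionError:
        warn("permissions")
    return None


def test_missing_file_warns_about_missing_file():
    warnings = []
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "absent.conf")
        assert load_conf(path, warnings.append) is None
    assert warnings == ["missing"]


def test_permission_problem_gets_its_own_warning():
    warnings = []
    # Make open() fail the way a badly configured share might.
    denied = PermissionError(13, "Permission denied")
    with mock.patch("builtins.open", side_effect=denied):
        assert load_conf("any.conf", warnings.append) is None
    assert warnings == ["permissions"]


# Direct invocation; each function raises AssertionError on failure.
test_missing_file_warns_about_missing_file()
test_permission_problem_gets_its_own_warning()
```

Each test pins down one warning for one cause, which is exactly the requirement the fable’s coders kept discovering after the fact.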