This month we’ll look at several examples of low-level performance analysis related to Oracle and other toolsets common in Operations. First is a nearly invisible change that impacted performance by a factor of ten.
That’s the “incredible behavior” Oracle Certified Master Laurent Schneider reported in a recent posting: two straightforward Oracle SQL queries differed in performance by this rather shocking multiple. Schneider eventually tracked the difference to the NLS_LANG setting, which configures locale behavior for Oracle software: one client spoke Swiss German, the other American English.
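NLS_LANG bundles language, territory, and client character set into a single environment variable. A minimal sketch in Python, for illustration only; the specific values shown are examples, not a recommendation:

```python
import os

# NLS_LANG has the form <language>_<territory>.<characterset>.
# A Swiss German client might run with, say,
# "GERMAN_SWITZERLAND.WE8ISO8859P1", while an American English client
# might use the value below. Mismatched client and server character
# sets force conversion work on every fetch.
print("before:", os.environ.get("NLS_LANG", "(not set; client defaults apply)"))

# The variable must be set before the Oracle client library loads
# for it to take effect.
os.environ["NLS_LANG"] = "AMERICAN_AMERICA.AL32UTF8"
print("after: ", os.environ["NLS_LANG"])
```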
Is this realistic? In my experience, yes. Some encodings amount to an identity operation, while others require look-ups through two levels of indirection. While I wasn’t able to reproduce Schneider’s exact results, I’ve certainly seen plenty of cases where encoding matters, sometimes by an order of magnitude or more. This is important because it’s so easy to overlook, yet it has the potential for such large consequences.
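The identity-versus-lookup difference is easy to observe outside Oracle as well. A small sketch in Python, comparing a single-byte encoding that maps bytes straight to code points against a multi-byte legacy charset whose decoder consults mapping tables (the exact ratio will vary by platform and data):

```python
import timeit

data = b"x" * 100_000  # pure ASCII payload

# latin-1 maps each byte directly to the code point of the same value:
# effectively an identity operation.
t_latin1 = timeit.timeit(lambda: data.decode("latin-1"), number=50)

# shift_jis is a multi-byte charset; its decoder must examine byte
# sequences and consult mapping tables, so more work per byte is possible.
t_sjis = timeit.timeit(lambda: data.decode("shift_jis"), number=50)

print(f"latin-1:   {t_latin1:.4f}s")
print(f"shift_jis: {t_sjis:.4f}s")
```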
One lesson to draw from the experience is the importance of disciplined testing and deployment. Most of us are old hands by now at teaming up with colleagues in different time zones, and we expect our information technology (IT) artifacts to travel more or less seamlessly around the globe. Without a good understanding of localization issues, though, this can easily break down across national borders. An application, monitor, or measurement that works perfectly in the lab might stumble when installed in a datacenter with a different locale. Development and testing teams rarely emphasize localization, so localization problems are too often caught late, in deployment. It behooves Operations to recognize the range of difficulties typical of localization confusion.
Ignorance is certainly a factor: too many programmers treat encoding, for example, as a domain best handled by trial and error (“it seemed to work with ASCII characters, so I must have the right encoding”). Worse, some languages or toolkits actively impose clumsiness or even error. Python, for instance, has long been attentive to encoding issues and has generally been a leader in Unicode handling. Still, the language mishandles several situations that arise when a folder name includes non-ASCII characters. In concrete terms: a team develops a perfectly valid application, packages it according to the latest Python prescription, tests it thoroughly, and then is shocked when it immediately fails in a distant datacenter that habitually uses non-English characters in folder names.
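The failure mode is easy to reproduce in miniature. A sketch (the folder name is invented for illustration) of the round-trip that naive, ASCII-assuming path handling gets wrong:

```python
import os
import tempfile
from pathlib import Path

# Create a folder whose name contains non-ASCII characters, as a
# datacenter in a German-speaking locale might.
with tempfile.TemporaryDirectory() as base:
    folder = Path(base) / "übung-größe"  # hypothetical folder name
    folder.mkdir()
    (folder / "config.txt").write_text("ok", encoding="utf-8")

    # os.fsencode/os.fsdecode round-trip names through the filesystem
    # encoding (sys.getfilesystemencoding()); code that assumes
    # ASCII-only path bytes breaks exactly here.
    raw = os.fsencode(folder)
    assert os.fsdecode(raw) == str(folder)

    content = (folder / "config.txt").read_text(encoding="utf-8")

print("read back:", content)
```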
The point is not that Python or Oracle Forms is hopeless; far from it. These difficulties are simply typical of the subtle and sometimes maddeningly difficult challenges that arise when an application moves from the calm harbor of a developer’s desktop to full-time, round-the-clock, mission-critical deployment in a typical operations datacenter.
What can we do about them? Plan for maintenance: build functional testing and performance measurement into your applications. Thoroughly automate Operations, so it costs as little as possible to deploy, test, correct, and re-deploy solutions.
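Building measurement into the application itself can be as simple as a timing wrapper around each functional check, so every deployment reports its own numbers. A minimal sketch, where health_check is a hypothetical stand-in for a real functional test:

```python
import time
from functools import wraps

def timed(fn):
    """Report wall-clock time for each call, so measurement travels
    with the application into every deployment."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"{fn.__name__}: {elapsed_ms:.2f} ms")
        return result
    return wrapper

@timed
def health_check():
    # Stand-in for a real end-to-end check against the deployed service.
    return "ok"

status = health_check()
```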
Over the next couple of weeks, the Real User Monitoring blog will, among other things, look at a few more examples of subtle technical details in time and validation with dramatic consequences.