Wide Awake Developers

On the Widespread Abuse of SLAs

Technical terminology sneaks into common use. Terms such as "bandwidth" and "offline" get used and abused, slowly losing touch with their original meaning. ("Bandwidth" has suffered multiple drifts. It started out in radio, not computer networking, let alone the idea of "personal attention space".) It is the nature of language to evolve, so I would have no problem with this linguistic drift, if it were not for the way that the mediocre and the clueless cling to these seemingly meaningful phrases.

The latest victim of this linguistic vampirism is the "Service Level Agreement". This term, birthed in IT governance, sounds wonderful. It sounds formal and official.

An example of the vulgar usage: "I have a five-day SLA."

It sounds so very proactive and synergistic and leveraged, doesn't it? Theoretically, it means that we've got an agreement between our two groups; I am your customer and you commit to delivering service within five days.

A real SLA has important dimensions that I never see addressed with internal "organizational" SLAs.

First, boundaries.

When does that five-day clock begin ticking? Is it when I submit my request to the queue? Or, is it when someone from your group picks the request up from the queue? If the latter, then how long do requests sit in queue before they get picked up? What's the best case? Worst case? Average?

When does the clock stop ticking? If you just say, "not approved" or "needs additional detail", does that meet your SLA? Do I have to resubmit for the next iteration, with a whole new five-day clock? Or, does the original five-day SLA run through resolution rather than just response?

An internal SLA must begin with submission into the request queue and end when the request is fully resolved.
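To make the boundary question concrete, here is a minimal Python sketch of where the clock starts and stops. The Request record and its field names are hypothetical, not taken from any real ticketing system, and the five-day limit is just the figure from the example above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical request record; field names are illustrative assumptions.
@dataclass
class Request:
    submitted_at: datetime             # entered the request queue
    picked_up_at: Optional[datetime]   # someone started working on it
    resolved_at: Optional[datetime]    # fully resolved, not merely answered

SLA_LIMIT = timedelta(days=5)  # the "five-day SLA" from the example

def elapsed(req: Request, now: datetime) -> timedelta:
    """Clock runs from submission to full resolution; open requests keep ticking."""
    end = req.resolved_at if req.resolved_at is not None else now
    return end - req.submitted_at

def queue_wait(req: Request, now: datetime) -> timedelta:
    """Time spent sitting in the queue before anyone picked the request up."""
    start = req.picked_up_at if req.picked_up_at is not None else now
    return start - req.submitted_at

def resolved_on_time(req: Request, now: datetime) -> bool:
    """True only if the request is fully resolved within the limit."""
    return req.resolved_at is not None and elapsed(req, now) <= SLA_LIMIT

# Three days in the queue, then three days of work.
req = Request(
    submitted_at=datetime(2007, 8, 1, 9, 0),
    picked_up_at=datetime(2007, 8, 4, 9, 0),
    resolved_at=datetime(2007, 8, 7, 9, 0),
)
now = datetime(2007, 8, 10, 9, 0)
print(elapsed(req, now), queue_wait(req, now), resolved_on_time(req, now))
```

Measured from pickup, this request took three days and looks comfortably on time. Measured from submission, it took six and missed. That gap is exactly what the boundary definition has to pin down.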

Second, measurement and tracking.

How often do you meet your internal SLA? 100% of the time? 95% of the time? 50% of the time? Unless you can tell me your "on-time performance", there's no way for me to have confidence in your SLA.

How many requests have to be escalated or prioritized in order to meet SLA? Do any non-escalated requests actually get resolved within the allotted time?

How well does your on-time performance correlate with the incoming workload? If the request volume goes up by 25%, but your on-time performance does not change, then your SLA is too loose.

An SLA must be tracked and trended. It must be correlated with demand metrics.
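As a rough illustration of tracking and trending, the Python sketch below computes on-time performance per week and correlates it with request volume. The weekly figures are made up for the example, the function names are my own, and Pearson correlation is only one plausible way to relate the two series.

```python
from math import sqrt

def on_time_rate(days_to_resolve, limit_days=5):
    """Fraction of requests resolved within the limit (submission to resolution)."""
    if not days_to_resolve:
        raise ValueError("no requests to measure")
    on_time = sum(1 for d in days_to_resolve if d <= limit_days)
    return on_time / len(days_to_resolve)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Made-up data: days from submission to resolution, grouped by week.
weekly_days = [
    [2, 3, 4, 5],
    [3, 4, 5, 6, 4],
    [4, 5, 6, 7, 5, 6],
    [5, 6, 7, 8, 6, 7, 9, 4],
]
volumes = [len(week) for week in weekly_days]
rates = [on_time_rate(week) for week in weekly_days]
print(volumes, rates, pearson(volumes, rates))
```

In this invented data, on-time performance falls as volume rises, so the correlation is strongly negative. If the on-time rate never moved at all while volume climbed, the correlation would be undefined (the function raises), and that flat line is itself the signal that the SLA is too loose.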

Third, consequences.

If there is no penalty, then there is no SLA. In fact, the IT Infrastructure Library considers penalties to be the defining characteristic of SLAs. (Of course, ITIL also says that SLAs are only possible with external suppliers, because it is only with external suppliers that you can have a contract.)

When was the last time that an internal group had its budget dinged for breaking an SLA? What would that even mean? How would the health and performance of the whole company be aided by taking resources away from a unit that already cannot perform? The Theory of Constraints says that you devote more resources to the bottleneck, not less. Penalizing you for breaking SLA probably makes your performance worse, not better.

(External suppliers are different because a) you're paying them, and b) they have a profit margin. I doubt the same is true for your own internal groups.)

If there's no penalty, then it's not an SLA.

Fourth, consent.

SLAs are defined by joint consent of both the supplier and consumer of the service. As a subscriber to your service, I can make economic judgments about how much to pay for what level of service. You can make economic judgments about how well you can deliver service at the required level for the offered payment.

When are internal "service level agreements" actually an "agreement"? Never. I always see SLAs being imposed by one group upon all of their subscribers.

An SLA must be an agreement, not a dictum.

If any of these conditions are not met, then it's not really an SLA. It's just a "best effort response time". As a consumer, and sometimes victim, of the service, I cannot plan to the SLA time. Rather, I must manage around it. Calling a "best effort response time" an "SLA" is just an attempt to deceive both of us.