Wide Awake Developers


ITIL and Extreme Programming

Esther Schindler asked if I'd be willing to post my earlier article on staying agile in the face of ITIL at CIO.com.  How could I say no?  The piece is here.

 

ITIL and XP

The Agile Manifesto is explicit about it. "We value individuals and interactions over processes and tools." How should an Agile team---more specifically, an XP team---respond to the IT Infrastructure Library (ITIL), then? After all, ITIL takes seven books just to define the customizable framework for the actual practices. An IT organization usually takes at least seven more binders to define its actual processes.

Can XP and ITIL coexist in the same building, or is XP just incompatible with ITIL? In short: no, it isn't incompatible.

ITIL and XP (or agile in general) are not fundamentally incompatible, but there will definitely be an interface between the XP world and the ITIL world. Whether this interface becomes an impedance barrier or not depends entirely on the way that your company chooses to implement ITIL.

I'll run down the Service Support processes and identify some of the problems I've encountered. (I'm focusing on Service Support because businesses tend to implement these processes first. Few of them get far enough down the road to really attack the Service Delivery processes. It's a shame, because I see a lot of value in the Service Delivery approach.) I will cover the Service Delivery processes in a future article.

Service Desk

An effective service desk can be a great asset to any team, including an XP team. Getting accurate feedback on issues your users are having can only benefit your development efforts and ultimately, the users themselves. The key here is to make sure that the service desk is well-prepared to accept responsibility for support calls on your app.

I strongly recommend that you start working with the service desk at least six weeks before your first application release. If the service desk is mature, they'll have job aids for capturing app support needs. These will provide the minimum initial information needed for the knowledge base. The service desk personnel will augment that knowledge base over time with whatever solutions, rumors, superstitions and folk remedies they come up with. Be sure you have access to the knowledge base, so you can help weed out the "false solutions."

You also want to get on the distribution list for ticket reports from the service desk. These will tell you what issues your users are encountering. Commonly recurring or high-impact issues should become cards for consideration in your next iteration. This feeds your interface to the Problem Management process.
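As a rough illustration of turning a ticket report into card candidates, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the report format (an issue key plus an impact level), the threshold of three occurrences, and the function name `card_candidates`. Your service desk's reports will look different.

```python
from collections import Counter

# Hypothetical ticket report rows: (issue_key, impact), where impact is "low" or "high".
def card_candidates(tickets, min_count=3):
    """Return issue keys that recur at least min_count times or have any high-impact ticket."""
    counts = Counter(key for key, _ in tickets)
    high_impact = {key for key, impact in tickets if impact == "high"}
    return sorted(key for key in counts if counts[key] >= min_count or key in high_impact)

report = [
    ("login-timeout", "low"),
    ("login-timeout", "low"),
    ("login-timeout", "low"),
    ("export-crash", "high"),
    ("typo-on-help-page", "low"),
]
print(card_candidates(report))  # ['export-crash', 'login-timeout']
```

The output is only a list of candidates. Whether any of them actually becomes a card is still a decision for the customer and the team.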

If the service desk is not mature, you haven't prepared them well, or they do not perform resolution for application incidents, you will be looped in as part of the Incident Management process, below. This has some special challenges.

Incident Management

ITIL defines an "incident" as any disruption to the normal operation of a system or application.  This includes bugs, outages, and even "PEBKAC" problems.  The Incident Management process begins with notification of an incident.  This can be logged by the service desk in response to a user call.  It can even be automatically created by a monitoring system.  It ends when normal functioning of the system is restored.

Note that this does not include root cause analysis or correction!  Incident Management is all about restoring service.

Ideally, the service desk handles the entire Incident Management process and your team will not even need to be involved.  In less ideal cases, you may be called on to help resolve "novel" incidents---ones that do not have a solution in the service desk's knowledge base.

When incidents come into the development room, you have some negative forces to deal with. By definition, the incident needs to be resolved expeditiously, making it both interrupt-driven and urgent. Therefore, every incident will automatically split a pair and take somebody off their card. This is damaging to flow.

In worse cases, the entire team may get derailed and start huddling around the incident. Fire-fighting is exciting quadrant I work. It's natural to get a rush from being the hero. The problem is obvious, though.  If the entire team is chasing the incident, nobody is making forward progress on the iteration. If you have a large user community or a lot of incidents, you can lose an entire day---or an entire iteration---before you realize it.

This can be exacerbated if your service desk never resolves application support incidents. In such cases, I recommend the "Designated Sacrifice" pattern. Assign one member of the team to handle the "Bat-Phone" calls and be the primary point of contact for incident resolution. This is a crappy job---you get pulled away constantly, can't maintain focus, get almost no card work done---so you'll want to rotate that position frequently. (On the other hand, there is that hero factor that provides some consolation.) Even doing it for one full iteration can be very demoralizing.

Problem Management

Recurring incidents can be identified as Problems that require correction. This is the job of the Problem Management process.

Identifying a Problem is often done by the service desk, but it can also come from other quarters. The decision about which Problems require correction often becomes very slow and bureaucratic. This is a process you want to work with very closely. Problem Management typically tolerates a much higher level of outstanding defects than an XP team wants to allow. I've seen teams get chewed out for fixing Problems that weren't scheduled to be addressed for a couple of iterations! Imagine how surreal that meeting feels!

Problem managers should be encouraged to write cards. Your team should even reserve a fraction of your velocity in each iteration just to handle Problems. You also need to communicate back to the problem managers when Problem cards are completed. Really good Problem Management identifies a few problem states such as "known problem", "known workaround", and "known solution". An XP team will typically move through these states pretty quickly.
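To make the problem states and the reserved velocity concrete, here is a small Python sketch. The three state names come from the paragraph above. The allowed transitions, the 20 percent default, and the names `advance` and `reserved_points` are my own assumptions for illustration, not anything ITIL prescribes.

```python
from enum import Enum

class ProblemState(Enum):
    KNOWN_PROBLEM = "known problem"
    KNOWN_WORKAROUND = "known workaround"
    KNOWN_SOLUTION = "known solution"

# Assumed transitions: a Problem only moves forward, and may skip the workaround.
ALLOWED = {
    ProblemState.KNOWN_PROBLEM: {ProblemState.KNOWN_WORKAROUND, ProblemState.KNOWN_SOLUTION},
    ProblemState.KNOWN_WORKAROUND: {ProblemState.KNOWN_SOLUTION},
    ProblemState.KNOWN_SOLUTION: set(),
}

def advance(current, target):
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target

def reserved_points(velocity, percent=20):
    """Whole story points held back for Problem cards out of an iteration's velocity."""
    return velocity * percent // 100

state = advance(ProblemState.KNOWN_PROBLEM, ProblemState.KNOWN_WORKAROUND)
print(state.value)          # known workaround
print(reserved_points(30))  # 6
```

Each state change is also the natural moment to report back to the problem managers.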

Bear in mind that the ITIL definition of Problem Management is all about oversight, not the actual changes needed to fix the problem.  The actual changes are deployed as part of Release Management.

Change Management

No part of ITIL gives more people cold sweats than Change Management.  This is the process that so easily slips into heavyweight bureaucracy or, worse, meaningless CAB meetings.

Change Management as defined simply means tracking changes and their impact on configuration items, and ensuring that changes are applied in an orderly way.  It doesn't have to hurt.

In reality, however, XP teams will spend a lot of time preparing for change advisory board meetings. Beware: the XP team may get a bad reputation for creating "too much" change.

I recommend standardizing your change and deployment process. Get into a regular rhythm of releases and deployments so the CAB just knows to expect that every third Tuesday (or whenever), your team will have a deployment. Standardize the deployment mechanics and system impact statement so you can templatize and re-use your change requests. Familiarity will create confidence with the CAB. Constantly showing them change requests they've never seen before will raise their level of scrutiny.
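A templatized change request can be as simple as the following Python sketch. The field names, the wording of the template, and the default impact statement are all hypothetical. Your CAB will have its own required form, so treat this as a shape rather than a format.

```python
from string import Template

# Hypothetical change request template; only the release-specific fields vary.
CHANGE_REQUEST = Template(
    "Change request: $app release $version\n"
    "Deployment window: $window\n"
    "Deployment steps: standard procedure (unchanged)\n"
    "System impact: $impact\n"
)

def change_request(app, version, window, impact="none beyond the application's own servers"):
    """Fill in the standard template for one release."""
    return CHANGE_REQUEST.substitute(app=app, version=version, window=window, impact=impact)

print(change_request("storefront", "2.4", "third Tuesday, 22:00-23:00"))
```

The point is that the CAB sees the same document every time, with only a handful of values changed.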

Failed changes also trigger more scrutiny. Your XP team will have an advantage here, because your rigorous approach to automated testing will reduce the incidence of failed changes, right?

Configuration Management

Configuration Management is *not* the act of changing configuration items. It's the process for tracking planned, executed, and retired configurations. As you plan each release, you should identify the CIs that will be affected by the release.

In a well-executed ITIL rollout, configuration management is vital for change management, incident management, the service desk, and release management. In a poorly-executed ITIL rollout, configuration management doesn't exist, or it only addresses servers or network devices.

CM should cover servers, network topology, applications, business processes, documentation, and the dependencies among all of them. That way, proposed changes to one CI (e.g., upgrade to front-end firewalls) can be analyzed for their impact. This is CM nirvana, seldom achieved.
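Here is a minimal sketch, in Python, of what that impact analysis amounts to once the dependencies are recorded. The CI names and the dependency map are invented for the example, and a real configuration management database would hold far more than a dictionary. The idea is simply to walk the dependencies backwards from the changed CI.

```python
from collections import deque

# Hypothetical dependency map: each CI lists the CIs it depends on.
DEPENDS_ON = {
    "order-entry process": ["storefront app"],
    "storefront app": ["app server", "front-end firewall"],
    "app server": ["database server"],
    "database server": [],
    "front-end firewall": [],
}

def impacted_by(changed_ci, depends_on):
    """Return every CI that directly or indirectly depends on changed_ci."""
    # Invert the map so we can look up "who depends on this CI?"
    dependents = {}
    for ci, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(ci)

    # Breadth-first walk outward from the changed CI.
    seen = set()
    queue = deque([changed_ci])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)

print(impacted_by("front-end firewall", DEPENDS_ON))
# ['order-entry process', 'storefront app']
```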

The XP team should have an advantage here again, because you've already broken story cards down to tasks at the beginning of an iteration. That means you already know which applications and servers will be changed in that iteration. Roll up a few iterations into a release, and the CIs affected by the release should be well known.
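The roll-up itself is little more than a set union. In this Python sketch, the task names, the CI names, and the idea of recording affected CIs next to each task are my assumptions about how a team might track this, not a prescribed practice.

```python
# Hypothetical record: one dict per iteration, mapping each task to the CIs it touches.
def release_cis(iterations):
    """Return the sorted set of CIs touched by any task in any iteration of the release."""
    affected = set()
    for tasks in iterations:
        for cis in tasks.values():
            affected |= set(cis)
    return sorted(affected)

iteration_1 = {"add gift cards": ["storefront app", "database server"]}
iteration_2 = {"tune checkout": ["storefront app"], "rotate logs": ["app server"]}
print(release_cis([iteration_1, iteration_2]))
# ['app server', 'database server', 'storefront app']
```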

On the other hand, if you've taken XP to its "no documentation" extreme, then you will not have tracked the CIs touched by each iteration. This underscores a common misinterpretation of XP; it doesn't eschew all documentation, just the documentation that doesn't add value from the customer's perspective. So, does tracking changes against CIs add value from the customer's perspective? Not directly, no. There is an indirect benefit, in that the customer will receive better uptime and performance, but that may seem remote to the team. The best I can say is that this is one place where you'll have to chalk it up to "necessary overhead".

Release Management

This is an easy one to integrate with your XP team. Release Management dovetails quite naturally with XP's release planning cycle. Engage early, though, because the ITIL process will likely require longer lead times than your team is used to.