Don’t Hit the (Software) Wall!

April 17, 2012

Since the Boston Marathon was run yesterday, I thought this would be an appropriate metaphor.

Imagine running a 26.2-mile race… It’s a long race, but you’re prepared. You’ve run 20 miles every Sunday along with hill workouts and a variety of other distances during the rest of each week to be in the best shape possible. You start the race a little faster than you anticipated based on your split, but you shrug it off. You feel comfortable, so you keep going. You even ignore the early water stations. It’s a crisp fall day, so heat is not a factor. Everything is in place for a great run.

Almost. The events described above were real. They happened to me. And I should have done things differently. I should have backed off my early pace when I got my split. I should have been taking in water. But when you are young and stupid (I was 20 years old at the time)…

I hit the wall. I started feeling odd at 18 miles, and by mile 20 I was done. The fatigue and loss of energy were sudden, and I had NOTHING left. I walked the remaining 6 miles, trying to find the strength to run every now and then, but I couldn’t. I finished in the middle of the pack, disappointed that I had run myself into the ground. I knew better, but in the moment of the race I thrust common sense aside and went for it.

Despite knowing better, organizations routinely run into the software wall. We’re guilty of approaching work in a way that has proven to fail more often than it works, yet expecting a different result.

We start out great. We feel good. Our emphasis is on “productivity,” measured by how fast we can crank out features. Early on, we meet our schedule. Our estimates may have been off, but this is a critical project (aren’t they all?), so management decrees that the team must put in overtime to compensate. This position is justified because the team isn’t meeting its commitment, right?

The problem is, overtime starts taking its toll. A week or two of overtime to get over a hump is one thing, but continual overtime coupled with strong schedule pressure starts driving behaviors that can kill you later. And it won’t be obvious, not at first. But the problems will sneak up on you.

Studies have demonstrated that projects with aggressive schedules and overtime have significantly more defects than more sustainable projects. (Check out the paper “Impact of Overtime and Stress on Software Quality” by Balaji Akula and James Cusick.)

The danger for software projects lies deeper than the defect counts. There’s more to it than tired people making simple mistakes. Demands of continual overtime for weeks on end, plus relentless schedule pressure, create compromises that shouldn’t exist.

Developers get too busy to conduct design or code reviews, or to refactor code when they should. Say it’s 2:00 am. You can check in the code now and get some sleep before starting on the next feature first thing in the morning, or you can unit test it a little more and refactor it like you know you should. What call are you going to make?

Code review at 2:00 am? Not a chance. Code review the next day? Not likely either; the schedule doesn’t permit it. Developers start hoping that they can “come back later to clean up the code.” The big danger is that code craftsmanship takes a back seat, which does not reveal itself as a problem until later in the project. But you will get warning signs.

One is that features will start taking even longer to add because the code base is becoming more complex – and unmanageable. And when one feature is updated, one or two other things break. As defects increase, even more time is diverted away from adding features because developers need to fix more and more problems. And the schedule starts to slide out even further.

Eventually you manage to deliver (most likely not when you thought you would), but the system has become more complicated than it should be. If you leave things as they are, features take more time to add than they should, and you’ll always be faced with quality issues. If this goes unchecked, you will hit the wall:

The wall that you hit is the developers telling you – after you ask them to add a new feature or two – that “We can’t add any more features until we rewrite the entire system.” Hitting the wall hurts.

Next post, I’ll illustrate how agile development is a “go slower to go faster” approach that prevents us from hitting the wall.