Volume 24 Number 12
Code Cleanup - Using Agile Techniques to Pay Back Technical Debt
By David Laribee | December 2009
In every codebase, there are the dark corners and alleys you fear. Code that’s impossibly brittle; code that bites back with regression bugs; code that, when you attempt to follow it, will drive you mad.
Ward Cunningham created a beautiful metaphor for the hard-to-change, error-prone parts of code when he likened it to financial debt. Technical debt prevents you from moving forward, from profiting, from staying “in the black.” As in the real world, there’s cheap debt, debt with an interest lower than you can make in a low-risk financial instrument. Then there’s the expensive stuff, the high-interest credit card fees that pile on even more debt.
Technical debt is a drag. It can kill productivity, making maintenance annoying, difficult, or, in some cases, impossible. Beyond the obvious economic downside, there’s a real psychological cost to technical debt. No developer enjoys sitting down at the computer in the morning knowing the day holds impossibly brittle, complicated source code. The frustration and helplessness thus engendered are often a root cause of more systemic problems, such as developer turnover, which is just one of the real economic costs of technical debt.
Every codebase I’ve worked on or reviewed contains some measure of technical debt. One class of debt is fairly harmless: byzantine dependencies among bizarrely named types in stable, rarely modified recesses of your system. Another is sloppy code that is easily fixed on the spot, but often ignored in the rush to address higher-priority problems. There are many more examples.
This article outlines a workflow and several tactics for dealing with the high-interest debt. The processes and practices I’ll detail aren’t new. They are taken straight from the Lean/Agile playbook.
The Case for Fixing Debt
The question, “should we fix technical debt,” is a no-brainer in my book. Of course you should. Technical debt works against your goals because it slows you down over time. There’s a well-known visualization called the cost of change curve (see Figure 1) that illustrates the difference between the 100-percent-quality-test-driven approach and the cowboy-coder-hacking-with-duct-tape approach.
Figure 1 Cost of Change Curve
The cost of change curve illustrates that high-quality, simple, easy-to-follow designs may cost more initially but incur less technical debt: subsequent additions and modifications to the code are less costly over time. In the quality curve (blue), you can see the initial cost is higher, but it’s predictable over time. The hack-it curve (red) gets a lower cost of entry, but future development, maintenance, and the total cost of owning a product and its code become ever more expensive.
Ward Cunningham’s “First Law of Programming” (c2.com/cgi-bin/wiki?FirstLawOfProgramming) states, “lowering quality lengthens development time.”
“Quality software takes the least amount of time to develop. If you have code that is simple as possible, tests that are complete and a design that fits just right, additions and changes happen in the fastest possible way because the impact is lowest. Consequently, if you hack something out, the more you hack the slower you go because the cost of addition or change grows with each line of code.”
Simply put, technical debt will decrease the throughput of your team over time.
One of my great joys in software development is relishing the feeling of raw productivity. The competing and converse feeling, for me at least, is pain. It’s painful when I’m not productive and it’s pain that robs me of potential productivity, the so-called “good days at work.” There are many sources of pain in software development, but none more obvious than a rigid and chaotic codebase. This psychological effect takes a toll on team morale which, in turn, causes productivity to lag.
In order to fix technical debt, you need to cultivate buy-in from stakeholders and teammates alike. To do this, you need to start thinking systemically. Systems thinking is long-range thinking. It is investment thinking. It’s the idea that effort you put in today will let you progress at a predictable and sustained pace in the future.
Perhaps it’s easiest to explain systems thinking with an analogy. I live in downtown Atlanta, Georgia, in a quaint little neighborhood called Inman Park. I’m mostly very happy there. I reserve, however, some irritation related to the seemingly complete ignorance of urban planning. The streets in Atlanta are byzantine, maze-like, madness-provoking. When you miss your turn, you can’t simply loop back. If you do, you’ll be sent on a spiraling path to who-knows-where. There seems to be little rhyme or reason to the planning of roads in this otherwise very pleasant corner of the world.
Contrast this with the orderly streets and avenues of Manhattan in New York City (most of it, anyway). It’s as if a Marine Corps drill instructor designed the city. Avenues stretch the length of the island, north to south, and streets form tidy, latitudinal markers down its length. Furthermore, both streets and avenues are named in numerical sequence: First Avenue, Second Avenue, 42nd Street, 43rd Street, and so on. You’ll rarely walk more than a block in the wrong direction.
What are the root causes for the difference between Atlanta and New York City in this dimension of comparison?
In Atlanta the streets were formed by cattle wearing down paths. You heard me right, cattle paths. Some need arose to frequently move between the urban center and the suburbs, at which point some cowboy thought, “golly, wouldn’t it be easiest to turn these here cattle paths into roads?”
The New York State Legislature applied vision and forethought to the design of the ever-growing and largest city in the state. They chose the gridiron plan, with orderly, predictable streets and avenues. They were thinking of the future.
This story gets to the essence of systems thinking. While legislative processes are slow, investment in time and commitment to vision pays the greatest dividend for the lifetime of a system. True you’ll have to deal with crazy cabs on the mean streets of Manhattan, but you’ll be able to find your way around in a matter of days. In Atlanta, it’s been a year of getting lost, and I thank the system thinkers responsible for the Global Positioning System (GPS) each and every day.
Products over Projects
The idea that you have a development team that completes a project and then throws it over the wall to a maintenance team is fundamentally flawed. Make no mistake, you are working on a product, and if it succeeds, it’s going to live a long, long time.
If you have even a couple of years of experience as a professional developer, you’ve probably experienced the increasing gravity effect. You develop a piece of software that isn’t meant to last or be complicated or change. And six months later, what are you doing? Modifying it? Extending it? Fixing bugs?
Useful software has a sometimes nasty habit of sticking around for a very long time. It’s up to you to pick the metaphor you want to roll with. Will you tend a forest of beautiful California Redwoods, living entities enduring the centuries and reaching the highest heights, or will you allow the relentless Kudzu vine to starve your forest of light?
At this point, I hope I’ve convinced you that technical debt can take an awful toll on both your mental health and your customer’s bottom line. I hope, also, that you accept the need to take a longer-range view on the products you’re creating.
Now let’s figure out how you can dig yourself out of this hole.
No matter the shop, my sense is that the basic workflow for tackling technical debt—indeed any kind of improvement—is repeatable. Essentially, you want to do four things:
- Identify where you have debt. How much is each debt item affecting your company’s bottom line and team’s productivity?
- Build a business case and forge a consensus on priority with those affected by the debt, both team and stakeholders.
- Fix the debt you’ve chosen head on with proven tactics.
- Repeat. Go back to step 1 to identify additional debt and hold the line on the improvements you’ve made.
It’s worth mentioning, for the software process nerds out there, that this workflow is adapted from a business management approach called the Theory of Constraints (ToC) created by Eliyahu Goldratt (goldrattconsulting.com). ToC is a systems-thinking model that provides a framework for improving the overall throughput of the system. This is a gross simplification, but ToC is predicated on the idea that a system (a manufacturing facility, for example) is only as productive as its biggest bottleneck. Value, such as a feature request or automobile or any sellable item, is conceived of, designed, produced, and deployed. A feature may be requested by a customer, internal or external, and that feature flows through your business (the system), transforming from an idea to a tangible result. What happens when these features pile up in front of your quality assurance team? What happens when there’s more demand for development than a development team can fulfill? You get a bottleneck and the whole system slows down.
It’s very likely that you have many areas of debt (many bottlenecks) in your codebase. Finding and fixing the debt that slows you down the most will have the greatest net effect on increasing your throughput. Understanding and then tackling debt and the resulting improvements as a team, as a system, is the most effective way to make positive change, because more eyes and hands on the code equate to less risk and better designs.
Identify Areas of Debt
It’s important that you be able to point at the problem areas. If you haven’t been keeping track of them on a wiki or a shared list or in code comments, your first task is to find the debt.
If you’re working on a team, I suggest calling a meeting to develop a concrete list of the top areas of debt in your code. An exhaustive list isn’t important. Focus on capturing the big-ticket items. This meeting is your first opportunity, as a leader on your team, to start forging consensus. A more-than-simple majority of members should agree and understand an item for it to make the list.
Once you have the list, make it durable. Create a wiki topic, write it on a whiteboard (with “DO NOT ERASE” written prominently in one corner), or whatever works in your situation. The medium you choose should be visible, permanent and easy to use. It should be in your face on a regular basis. You need to return to this list and groom it. Human beings have a limited amount of short-term memory, so I suggest keeping a list of between five and nine of the most bothersome items. Don’t worry so much about keeping an inventory—important items will surface again if they’re really, well, important.
Using Metrics to Find Trouble Areas
Sometimes it’s hard to find debt, especially if a team is new to a codebase. In cases where there’s no collective memory or oral tradition to draw on, you can use a static analysis tool such as NDepend (ndepend.com) to probe the code for the more troublesome spots.
Tools are, at best, assistive or perhaps even a second choice. Tools won’t tell you what to do. They will, however, give you inputs to decision-making. There is no single metric for code debt, but people who work on a product day in and day out can surely point to those dark corners that cause the most pain. Static analysis tools will tell you where you have implementation debt. Sadly, they will not tell you where you have debt due to factors like poor naming, discoverability, performance, and other more qualitative design and architectural considerations.
Knowing your test coverage (if you have tests) can be another valuable tool for discovering hidden debt. Clearly, if there’s a big part of your system that lacks solid test coverage, how can you be certain that a change won’t have dramatic effects on the quality of your next release? Regression bugs are likely to appear, creating bottlenecks for QA and potential embarrassment and loss of revenue due to customer-found defects.
Use the log feature of your version control system to generate a report of changes over the last month or two. Find the parts of your system that receive the most activity, changes or additions, and scrutinize them for technical debt. This will help you find the bottlenecks that are challenging you today; there’s very little value in fixing debt in those parts of your system that change rarely.
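As a rough illustration of the change report described above, here is a minimal sketch that assumes your version control system is Git (the article doesn't prescribe one). The function names `count_changes` and `recent_log` are hypothetical; the only external call is `git log` with its `--since`, `--name-only`, and `--pretty=format:` options.

```python
import subprocess
from collections import Counter


def count_changes(log_text: str) -> Counter:
    """Count how many commits touched each path.

    Expects the output of `git log --name-only --pretty=format:`,
    which lists one changed path per line with blank lines between commits.
    """
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if path:  # skip the blank separator lines
            counts[path] += 1
    return counts


def recent_log(since: str = "2 months ago") -> str:
    """Assumption: run inside a Git working copy with git on the PATH."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example use inside a repository:
#   for path, changes in count_changes(recent_log()).most_common(10):
#       print(f"{changes:4d}  {path}")
```

The files at the top of the resulting list aren't necessarily debt, but they are where debt would hurt the most today, so they are a sensible place to start looking.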
You might have a bottleneck if there’s only one developer capable of dealing with a component, subsystem, or whole application. Individual code ownership and knowledge silos (where “Dave works on the Accounts Receivables module,” and there’s a painful memory) can block delivery if that person leaves the team or has a pile of other work to do. Finding places in your project where individual ownership is happening lets you consider the benefits and scope of improving the design so other individuals can share the load. Eliminate the bottleneck.
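One way to spot candidate silos is to ask the version control history which files only one person has ever changed. The sketch below is an illustration, not a definitive tool. It assumes Git and a log produced with `git log --name-only --pretty=format:AUTHOR:%an`, where the `AUTHOR:` prefix is a marker I've made up to separate author lines from path lines. The function names are hypothetical too.

```python
from collections import defaultdict

AUTHOR_PREFIX = "AUTHOR:"  # hypothetical marker, must match the --pretty format


def authors_by_file(log_text: str) -> dict:
    """Map each path to the set of authors who committed changes to it."""
    owners = defaultdict(set)
    author = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith(AUTHOR_PREFIX):
            author = line[len(AUTHOR_PREFIX):]
        elif line and author is not None:
            owners[line].add(author)
    return dict(owners)


def single_owner_files(log_text: str) -> dict:
    """Return path -> author for files that only one person has touched."""
    return {
        path: next(iter(names))
        for path, names in authors_by_file(log_text).items()
        if len(names) == 1
    }
```

A file with a single author isn't automatically a problem. Treat the result as a list of questions to raise with the team, in the same spirit as the static analysis tools mentioned earlier.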
There are tremendous benefits that derive from the eXtreme Programming practice of collective ownership (extremeprogramming.org/rules/collective.html). With collective ownership, any developer on your team is allowed to change any code in your codebase “to add functionality, fix bugs, improve designs or refactor. No one person becomes a bottleneck for changes.”
Ah! There’s that word again, “bottleneck.” By enabling collective ownership, you eliminate the dark parts of your system that only a single programmer—who may walk off the job or get hit by a bus—knows about. There is less risk with a codebase that’s collectively owned.
In my experience, the design is also much better. Two, three, or four heads are almost certainly better than one. In a collectively owned codebase, a team design ethos emerges and supplants individual idiosyncrasies and quirks.
I call collective code ownership a practice, but collective ownership is really an emergent property of a well-functioning team. Think about it—how many of you show up and work on “your code” versus code shared by an entire team? What are often called teams in software development are really workgroups with an assignment editor where programming tasks are doled out based on who’s worked on a particular feature, subsystem or module in the past.
Prioritize as a Team
I’ve said before that it’s important you involve the whole team in efforts to improve. As an Agile Coach, I hold closely to the mantra that people support a world they help to create. If you don’t have a critical mass of support, an effort to foster a culture of continuous improvement can be very difficult to get off the ground, much less sustain.
Obtaining consensus is key. You want the majority of team members to support the current improvement initiative you’ve selected. I’ve used with some success Luke Hohmann’s “Buy a Feature” approach from his book Innovation Games (innovationgames.com). I’ll attempt a gross over-simplification of the game, and urge you to check out the book if it seems like something that’ll work in your environment.
- Generate a short list (5-9 items) of things you want to improve. Ideally these items are in your short-term path.
- Qualify the items in terms of difficulty. I like to use the abstract notion of a T-shirt size: small, medium, large or extra-large (see the Estimating Improvement Opportunities sidebar for more information on this practice).
- Give your features a price based on their size. For example, small items may cost $50, medium items $100, and so on.
- Give everyone a certain amount of money. The key here is to introduce scarcity into the game. You want people to have to pool their money to buy the features they’re interested in. You want to price, say, medium features at a cost where no one individual can buy them. It’s valuable to find where more than a single individual sees the priority since you’re trying to build consensus.
- Run a short game, perhaps 20 or 30 minutes in length, where people can discuss, collude, and pitch their case. This can be quite chaotic and also quite fun, and you’ll see where the seats of influence are in your team.
- Review the items that were bought and by what margins they were bought. You can choose to rank your list by the purchased features or, better yet, use the results of the Buy a Feature game in combination with other techniques, such as an awareness of the next release plan.
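The bookkeeping for the game steps above can be sketched in a few lines. This is my own simplification, not Hohmann's rules. The $50 and $100 prices come from the example above, while the large and extra-large prices, the function names, and the sample data in the usage note are assumptions for illustration.

```python
from collections import defaultdict

# Small and medium prices follow the example in the text;
# the L and XL prices are assumed for this sketch.
PRICES = {"S": 50, "M": 100, "L": 200, "XL": 400}


def budget_forces_pooling(budget: int) -> bool:
    """True when no single player can afford a medium item alone."""
    return budget < PRICES["M"]


def purchased(items: dict, bids: list) -> list:
    """Return (item, margin, backers) for every item whose pooled bids
    meet its price, ranked by margin and then by number of backers.

    items: item name -> T-shirt size
    bids:  list of (player, item name, amount)
    """
    raised = defaultdict(int)
    backers = defaultdict(set)
    for player, item, amount in bids:
        raised[item] += amount
        backers[item].add(player)

    bought = []
    for item, size in items.items():
        price = PRICES[size]
        if raised[item] >= price:
            bought.append((item, raised[item] - price, len(backers[item])))
    bought.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return bought
```

For example, if every player gets $75, two people who each put $60 on a medium item buy it with a $20 margin and two backers. The backer count is worth recording alongside the margin because it shows where more than one person sees the priority.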
Estimating Improvement Opportunities
I mentioned estimating debt items or improvement opportunities roughly in terms of T-shirt sizes. This is a common technique used in Agile development methodologies. The idea is that you’re collecting things in terms of relative size. The smalls go together as do the mediums, larges, and so on.
It’s not super-important that you bring a lot of accuracy to the table here. Remember: these are relative measures and not commitments. You want to get a rough idea of the difficulty, and the theory is that after estimating a number of items, things will start to even out. Even though one medium item actually takes a pair of developers two weeks to complete while another takes a month, on average a medium will take about three weeks.
Over time, however, you’ll start to gather good examples of what a large or small item really is, and this will aid you in future estimates because you’ll have a basis of comparison. I’ve used several examples of the various sizes in the past as an aid for estimating a new batch of work to good effect.
This can be a tough pill to swallow for management. They’ll initially want to know exactly how long a thing might take and, truth be told, you might need to invest more time in a precise, time-based estimate.
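To make the averaging in this sidebar concrete, here is a small sketch that groups completed items by T-shirt size and reports the mean duration for each size. The function name and the record format are hypothetical, and the usage note treats "a month" as four weeks, which is my assumption.

```python
from collections import defaultdict
from statistics import mean


def average_weeks_by_size(completed):
    """Return size -> mean duration in weeks.

    completed: iterable of (size, weeks) pairs for finished items.
    """
    by_size = defaultdict(list)
    for size, weeks in completed:
        by_size[size].append(weeks)
    return {size: mean(weeks) for size, weeks in by_size.items()}
```

With one medium item that took two weeks and another that took four, the average for a medium comes out at three weeks. The more finished items you record, the more useful these averages become as a basis of comparison, though they remain relative measures and not commitments.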
Sell the Plan
Now that you’ve got a plan, it’s time to communicate the value of eliminating debt to your project sponsors. In reality, this step can happen in parallel with identification. Involve your customer from the very beginning. After all, development of the plan is going to take time, effort, and (ultimately) money. You want to avoid, at all costs, questions about whose time and dime you spent developing a cohesive plan.
Any successful and sustained effort to remove a large amount of debt absolutely requires the support of your project’s financiers and sponsors. The folks who write the checks need to understand the investment you’re making. This can be a challenge; you’re asking people to think in long range, into the future, and to move away from the buy now, pay later mentality. The explanation “just because” simply doesn’t cut it.
The problem with this is that executives will inevitably ask, “Aren’t you professionals?” You might feel up against the ropes when probed along these lines. After all, weren’t they paying you, the pro, to deliver a quality product on time and on budget?
This is a tough argument to counter. I say don’t bother. Have the courage and honesty to present the facts as they are. This seemingly risky approach boils down to the very human issues of accountability and trust.
Couch your argument like this: you’ve fielded successful software in the requested amount of time for a provisioned amount of money. In order to achieve this you’ve had to make compromises along the way in response to business pressures. Now, to go forward at a predictable and steady rate, you need to deal with the effects of these compromises. The entire organization has bought them, and now it’s time to pay back.
Your next challenge is to prove to non-technical folks where the debt is causing the most damage. In my experience, business executives respond to quantitative, data-driven arguments supported by “numbers” and “facts.” I put numbers and facts in quotes because we all really know we’re living in a relative world and no single number (cyclomatic complexity, efferent coupling, lines of code, test coverage, what have you) sells a change. Compounding this difficulty, you’ll need to communicate the areas of biggest drain in economic terms: why is this slower than you’d like; why did this feature cost so much?
Evidence DEFEATS Doubt
When building your case, there’s an immensely useful tool from the Dale Carnegie management training system embodied in a pithy phrase, “evidence defeats doubt.” As is common with such management systems (and our discipline in general), the DEFEATS part is an acronym. I’ll detail some of the ways in which this applies to software development. Note, however, that I’ve omitted the second E which stands for Exhibit because it seems to repeat the first E, which stands for Example.
D is for Demonstration. There’s nothing better than show and tell and this is what the demonstration is all about. If you’re tracking your velocity, this should be easy. Show the dip over time (see Figure 2) while drawing the connection to increasingly inflexible and hard-to-change code. Once you sell, you need to keep selling.
Figure 2 Tracking Development Velocity
If you’re using an Agile process such as Scrum or eXtreme Programming, customer feedback events are an essential practice. At the end of an iteration, demonstrate new features to your customer. While the quality and quantity of features will dip when you encounter the technical debt tar pits and while you ramp up your improvement efforts, you should be able to demonstrate gains over time. Less debt means greater output and greater output yields more stuff to demonstrate.
As the idiom goes, before you criticize someone, walk a mile in their shoes. If you have a more-technical manager, encourage her to work with developers on some of the more difficult sections of the codebase to develop empathy for the difficulty of change. Ask her to look at some code. Can she follow it? Is it readable? There’s no quicker way to win your champion.
E is for Example. There’s nothing like a concrete example. Find some stories or requirements that were either impossible to complete because of technical debt, or created significant regression. Pick a section of code that’s unreadable, byzantine, riddled with side-effects. Explain how these attributes of the code led to a customer-found defect or the need for massive effort from QA.
Another powerful tool the Agile processes give you is the retrospective. Choose a story that went south in the last couple of iterations and ask the question “why?” Get to the root cause of why this particular story couldn’t be completed, took twice as long as your average story, or spanned more than a single iteration. Often, inflexible software will be the culprit or perhaps you had to revert changes because regression bugs were insurmountable. If you find the last “why” ends up being a technical debt-related reason, capture the analysis in a short, direct form. It’s another feather in your cap, another point in your argument.
F is for Fact. Facts are very easy to come by. Did you release a project on time? What was the post-release defect rate? What is the team’s average velocity over time? Were customers satisfied with the software as delivered? These are the kind of facts you’ll want to bring to the business table, and I believe it’s these facts that speak most directly to the business-minded.
Collaboration is a key element here. As a developer, you can more readily supply technical facts. Look for assistance from the people that own the budgets. Chances are they’ll have a much clearer picture and easier access to the business facts that demonstrate the damage that technical debt is causing.
A is for Analogy. I find this especially important. Business people sometimes find software development confusing, even esoteric. If you go to your sponsors with talk of coupling and cohesion and Single Responsibility Principle, you stand a very good chance of losing them. But these are very important concepts in professional software development and, ultimately, it’s how you’re building a data-driven case for tackling debt. My suggestion is to avoid jargon and explain these items with an analogy.
You could describe coupling as a house of cards, for example. Tell your sponsors that the reason your velocity has dropped is that making a change to the code is like adding a wall, ceiling, or story to an already established and very elaborate house of cards: a surgical operation that requires an unusually steady hand and significant time and patience, and that is ultimately an uncertain and anxiety-provoking event. Sometimes the house of cards collapses.
When employing metaphor and simile, it’s a good idea to state you are doing so. Justify your analogy with a brief description of the more-technical concept you are trying to convey. Using the house of cards example, you might say, “this is the effect that coupling has on our ability to respond to change and add new features.”
T is for Testimonial. Sometimes hearing the same message from a third party can have a more powerful effect. This third party may be an industry leader or a consultant. The reason their word might go farther than yours is that they’re perceived as an objective expert.
If you don’t have the money to hire an outside consultant, consider collecting anecdotes and insight freely available from industry thought leaders. While generic testimonials about so-called best practices are unlikely to seal the deal, they will add to the gestalt of your overall argument.
S is for Statistics. Numbers matter. There’s a common phrase in management, “if you can’t measure it, you can’t manage it.” I’m not sure this conventional wisdom applies wholly, but you can certainly present a case. Coupling and complexity are two metrics that can be used to show a root-cause relationship between a declining throughput (how much work is being delivered) and a codebase increasingly calcified with debt.
I find that composite statistics are usually the best bet here; it’s much easier to understand the importance of code coverage if you can overlay a code coverage metric that decreases over time with a decrease in velocity, thus implying, if not showing, a relationship.
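If you want a number to go with the overlaid chart, one option is a correlation coefficient between the two series. The sketch below computes Pearson's r by hand so it has no dependencies. The function name is mine, and the per-iteration coverage and velocity figures in the usage note are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return covariance / (spread_x * spread_y)


# Made-up example: code coverage (%) and velocity (points) per iteration.
coverage = [72, 68, 61, 55, 50]
velocity = [30, 28, 25, 21, 19]
r = pearson(coverage, velocity)
```

A value of r close to 1 says the two measures fell together over the period. It implies a relationship without proving one, so present it as supporting evidence alongside examples and facts, not as the whole case.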
Appoint a Leader
Your efforts to procure a green light for fixing technical debt will go a lot further with an effective leader, a champion who can communicate in business terms and who has influence with the decision makers in your organization. Often, this will be your manager, her director, the CTO, the VP of Engineering, or someone in a similar position of perceived authority.
This brings up an interesting chicken and egg problem. How do you sell this person? The process of “managing up” is a developer’s responsibility, too. Your first challenge is to sell the seller. How exactly do you do that? Evidence defeats doubt!
So far I’ve covered identifying debt as a team and building a case for fixing that debt. I’ll reiterate: consensus among your team and buy-in with your customers are key factors in these steps.
Make the steps small and don’t invest a lot of time. The first time you identify debt, it will necessarily take longer than when you iterate over new opportunities for improvement, but when you build your case for management, only include those items you plan to work on. Keeping an eye on productivity can be a huge energy saver.
In a future issue I’ll look at the rest of the workflow, including tactics for eliminating debt, and I’ll cover how you can make this process iterative and capture the lessons learned from previous debt removal efforts.
Dave Laribee coaches the product development team at VersionOne. He is a frequent speaker at local and national developer events and was awarded a Microsoft Architecture MVP for 2007 and 2008. He writes on the CodeBetter blog network at thebeelog.com.
Thanks to the following technical experts for reviewing this article: Donald Belcham