The big rewrite

I remember once having two back-to-back clients who had each just rewritten a significant system in their environment. I asked each of them why they’d chosen to rewrite the system from scratch rather than fix the one they had.

At the first, they said “The old system was written in Java and Java is unusable so we rewrote it in C# and now it’s great.”

At the second, they said “The old system was written in C# and C# is unusable so we rewrote it in Java and now it’s great.”

These statements can’t both be true at the same time. In fact, both Java and C# are reasonable languages and either one would have done the job. It wasn’t the language/platform that was at fault, no matter how much these companies told themselves that.

Does this mean that the platform is never the problem? No, there are legitimate cases. They usually arise when we’ve built on top of a proprietary vendor system and the ongoing licensing or support costs from that vendor make it untenable to keep funding that code base. These cases are exceptionally rare, however. I’ve seen two in my entire career.

What’s significantly more common is that we ignored quality in the product for so long that eventually we gave up and decided to rewrite it from scratch. We often call this technical bankruptcy: the point where the existing system is so painful to work with that we abandon it.

You might be thinking that we would never consciously choose to ignore quality, and usually we don’t. It’s all those unconscious moments where we prioritize something else over quality that get us in trouble.

Any time we defer fixing bugs in favour of shipping new features, we’re ignoring quality.
Any time we defer fixing technical debt, we’re ignoring quality.
Any time we ignore feedback from customers about usability issues, we’re ignoring quality.
Any time we have many items in progress (WIP) rather than staying focused on a few, we’re ignoring quality.
Any time we have people going off and working by themselves rather than actively collaborating, we’re ignoring quality.

I’ve written about quality before, so today I want to stay focused on the notion of a big rewrite.

I’ve seen cases where people are starting to talk about rewriting a system before the first system is even in production. They’ve made such a mess of the first product that they realize already that it’s not usable over the long haul and needs to be replaced.

Then what happens is that we rewrite the product on a new platform, and at the beginning, the new code is really easy to work with and easy to extend. However, by the time this new system is ready for production, it’s often in just as bad shape as the first system.

How could this possibly be? How could the new system be as bad as the one we’re replacing?

The simple answer is that all we changed was the code itself. We didn’t change our practices around quality. We didn’t change the way we write code. We didn’t do anything different on this second attempt than we had done on the first. We might have a different design approach and a different language, but all the things that led to poor quality code are still in place.

Since we never tried to improve the original code in place, we never developed the practices that lead to high quality code. We just did the same thing all over again, and got the same result.

As the quote often attributed to Einstein goes: “The definition of insanity is doing the same thing over and over again and expecting different results.”

If we want better results, we need to build the skills that will lead to those better results. Throwing away a large code base and replacing it does not teach the skills for building high quality code.

To develop those skills that will lead to better quality, we need to actively improve the code we have. The next time someone proposes throwing some code away and rewriting it, we need to say no. Improve it gradually instead.

We also need to be constantly practicing the skills of improving code. We should be refactoring constantly to improve that code, not just touching it when there is a new feature to add.

The Scouting Rule says that we should always leave the campsite cleaner than when we arrived. Applied to software, that means always leaving the code a little bit better. Did you see some duplication? Remove it. Did you see poor variable naming? Fix it.
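
To make that concrete, here is a small sketch of what one such cleanup might look like. Everything in it is made up for illustration: the function names, the customer tiers and the discount rates are my assumptions, not code from any real system. The "before" version has duplicated logic and names that tell you nothing; the "after" version does exactly the same thing with the duplication removed and the names fixed.

```python
# Hypothetical example: all names, tiers and rates are assumptions for illustration.

# Before: the discount calculation is duplicated and the names explain nothing.
def calc_before(a, t):
    if t == "gold":
        x = a - a * 0.10
        return round(x, 2)
    if t == "silver":
        x = a - a * 0.05
        return round(x, 2)
    return round(a, 2)


# After: same behaviour, one copy of the calculation, names that say what they mean.
DISCOUNT_RATE_BY_TIER = {"gold": 0.10, "silver": 0.05}


def discounted_price(price, customer_tier):
    # Unknown tiers get no discount, matching the fall-through case above.
    rate = DISCOUNT_RATE_BY_TIER.get(customer_tier, 0.0)
    return round(price - price * rate, 2)
```

Notice how small the change is. Nothing about the behaviour moved, which is what makes it safe to do in passing while working on something else, and the next person who adds a tier now changes one line instead of copying a block.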

If we do this regularly then we’ll never get to the point of a rewrite. I’ve seen ten-year-old code bases that were a pleasure to work with, and six-month-old code bases that were a disaster. Consider which one you want.