Test Driven Development (TDD) is often thought of as a testing activity but it really isn’t. It’s a design activity that leaves a suite of automated tests behind in its wake. It may sound as if we’re spending a lot of time on “tests” but we’re really focusing on the behaviour we want and allowing that behaviour to drive the design of our system.

The expected behaviour, specified in the form of tests, will drive the design of our code.

From a mechanical perspective, we write a test first and then write the production code that allows that test to pass. It sounds simple, and it really is, but there are some important subtleties.

Note that the testing community has long distinguished between testing (exploring the behaviour) and checking (following a list of instructions to verify an output). What we’re doing with TDD is very much checking, not testing, but unfortunately the agile community continues to use the words “unit test,” so that’s what we’ll use here.

There are three main stages to the TDD workflow, commonly referred to as red-green-refactor. Each of those in turn has two parts, and there are some steps outside that primary loop. The diagram below puts this all together.

TDD cycle

Start

Run all your tests up front. If any of them fail, you aren’t ready to start the TDD cycle. A failing test tells you that your production code is broken, and that needs to be fixed before you start adding anything new.

Note: We often talk about tests being broken, and yet that’s misleading. The tests are usually fine; if they’re failing (red), that’s telling you that your production code is broken.

If any tests are failing, your highest priority should be getting them all back to a green state. Failing tests are feedback that something is broken, and we should not continue to build on top of something that is broken.

Identify the smallest slice of behaviour that we want the code to exhibit. Typically this will be a single path through a single method. It might be a “happy path” where the code behaves as expected, or it could be an error condition. It’s one specific slice, though, not a collection of behaviours all at once.

Red - Write a failing check

If we’re writing completely new functionality, the methods we want to test may not even exist yet and that’s OK. Write the test as if they did exist. For example, I can write this test even if the FizzBuzz class isn’t present. It won’t compile but at this point, I don’t care. What we’re focused on is how we choose to use the code, not how it’s going to be implemented.

Unit tests should verify behaviour, not implementation.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class FizzBuzzTest {
    @Test
    void shouldReturn2For2() {
        assertEquals("2", FizzBuzz.convert(2));
    }
}

In the red step, we need to ensure that the code is failing for the right reason and at the moment it isn’t. It’s failing because the code doesn’t even compile. So add just enough production code to allow it to compile. Note that we’re not implementing the actual functionality so the test should still be failing.

public class FizzBuzz {
    public static String convert(int number) {
        return "";
    }
}

Run the tests again and verify that it’s failing for the right reason, which it is.

shouldReturn2For2 [X] expected: <2> but was: <>

We always want to run the tests at this point. All too often someone will write the test and then immediately move on to writing the matching production code without having run it first. Sometimes the test doesn’t fail on the first attempt, and that’s feedback from the system that we didn’t understand something or we made a mistake. Always run the test and watch it fail before moving on.

If the test passes on the first attempt, we’ve likely made a mistake.

Many people are tempted to create many failing tests up front. Don’t do this; it just makes so many things harder. There should only be one failing test at any given time. If at any point in the process we find a whole bunch of tests broken at once, we need to go back and fix those before continuing.

We never have more than one failing test at a time.

Now that we have a single test that is failing, we can move on to green.

Green - Make it pass

Change the production code to make the automated test pass. Do this in the simplest way you can possibly think to do it.

public class FizzBuzz {
    public static String convert(int number) {
        return "2";
    }
}

All too often, we try to make the code perfect in this step, and that’s wrong. The goal here is just to make the test pass, in the easiest way possible. We don’t care how pretty or how fast the code is if it doesn’t work. Make it work before doing anything else.

A common mistake people make is thinking that “works” means it has to handle every possible case. The only cases we care about are the ones that we have tests for. In this case we only have one test and hard coding the return statement satisfies that one test. It clearly won’t work for any other number but we don’t care about other numbers at this point. We really don’t.

You Aren’t Gonna Need It (YAGNI) is a common phrase from the XP community. It means that we only build what we need right now, not what we anticipate we might need tomorrow or next week or next year. If we don’t have a test for it, we don’t write it.

Run the check to verify that it really is working. Run all the tests, if you can, to ensure you didn’t break anything else as you were making this one work.

When all the tests are running green again, move on to refactor.

Refactor - Make it beautiful

Now that we have working code, let’s make it beautiful. Refactor it to keep the code clean. Remove duplication. Make the code easy to read. Make it something you can be proud of.

Refactoring is the act of changing the implementation without changing the behaviour. We’re improving the code in some way without changing how it behaves. If you’re changing behaviour at this point, you aren’t refactoring.

If you’re adding new functionality, you aren’t refactoring either. At this step, we do not change behaviour.
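To make this concrete, suppose we’ve looped through the cycle a few more times in the FizzBuzz kata and the green steps have left behind a chain of hard-coded special cases for “Fizz”, “Buzz”, and “FizzBuzz”. This is a sketch of what a refactoring pass might then produce (it assumes tests for multiples of 3, 5, and 15 are already green):

```java
public class FizzBuzz {
    public static String convert(int number) {
        // Build the result from its parts; the duplication between the
        // "Fizz", "Buzz", and "FizzBuzz" branches disappears.
        StringBuilder result = new StringBuilder();
        if (number % 3 == 0) {
            result.append("Fizz");
        }
        if (number % 5 == 0) {
            result.append("Buzz");
        }
        // No divisor matched: fall back to the number itself.
        return result.length() > 0 ? result.toString() : String.valueOf(number);
    }
}
```

The behaviour is identical to the if-chain it replaces, which is exactly what the passing test suite confirms after the refactor.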

Now run all the checks again to ensure that nothing got broken as we were cleaning the code. If something did get broken then fix it now, before we move on.

Check it in!

Yes, really check it in to version control. You now have a tiny sliver of production code that is working and verified. By definition, we wouldn’t be at this point if anything were broken. Check it in.

A surprising number of developers are really uncomfortable with checking in this frequently. Get over it. The more often we integrate with the rest of the code, the easier it will be. If you’re not checking in several times an hour then either you’re very new to the technique or you’re doing something wrong.

Start again

Start all over at the top. Find the next tiny slice of functionality and work through the cycle again.

You’re probably already thinking that hard coding the return of “2” is horrible, so perhaps your next test will be for 3, to force us to change that code. We must have a failing test before we change that production code. That’s the essence of “test driving”.

We do not change any production code until a test forces us to do that. If there’s something about the code that bothers you, write a test first, to force you to change that behaviour. Then change the behaviour to make the test pass.
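Continuing the kata as a sketch: suppose the next test is shouldReturnFizzFor3, asserting that convert(3) returns “Fizz”. We watch it fail against the hard-coded “2”, then make the smallest change that satisfies both tests. It’s still embarrassingly naive, and that’s fine:

```java
public class FizzBuzz {
    public static String convert(int number) {
        // Just enough to satisfy the two tests we have so far:
        // convert(2) -> "2" and convert(3) -> "Fizz".
        if (number == 3) {
            return "Fizz";
        }
        return "2";
    }
}
```

Each new test pinches the implementation a little harder, until hard coding is no longer the simplest option and the real algorithm emerges.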

Examples

Let’s look at a more complete example that cycles through the TDD cycle multiple times. We’ll use the Prime Factors kata and we’ll do it in multiple languages so you can see that while there are subtle variations based on the languages/frameworks, the flow is fundamentally the same for all of them.
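As a taste of that kata in Java (the method name here is my own choice, not necessarily the one used in the examples): the first slice might be “1 has no prime factors”, and the green step is, once again, the simplest thing that could possibly work:

```java
import java.util.ArrayList;
import java.util.List;

public class PrimeFactors {
    public static List<Integer> of(int n) {
        // Simplest code that passes the first test: of(1) -> [].
        return new ArrayList<>();
    }
}
```

Subsequent tests for 2, 3, 4, and so on would then drive out the real factorisation loop, one failing test at a time.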

Other benefits

While it may not have been obvious as we were stepping through the mechanics, TDD has given us a couple of benefits for free.

  1. Writing the test first carries a lower cognitive load¹ than writing the tests after. As mentioned above, we always want to test the expected behaviour and not the implementation. When testing after the fact, it’s actually quite difficult to ignore the implementation, which means that writing behaviour tests is cognitively harder than it should be.
  2. We get very high code coverage for no extra work. If we follow the TDD process strictly, then we never write any production code until we have a failing test, resulting in coverage at, or very close to, 100%.
  3. Code written with TDD has fewer bugs. Several studies show that code written with TDD has a lower defect density (number of bugs per line of code). One study showed reductions in defects ranging from 40% to 90%.

See also

  1. Cognitive load is an indication of how hard the brain has to work to perform specific actions. In this case, the existing implementation is an example of Extraneous Load: an outside distraction that adds to the problem. See this article for more.