Automated software testing, part 4: unit tests and test doubles


In part 3 I discussed some of the good habits surrounding unit testing, such as determining how much to cover in your tests and avoiding the temptation to test internal implementation details. In this part we'll continue to explore unit testing with a focus on verifying external interactions in a controlled and isolated environment.

First, I'd like to clarify some terminology. When I'm talking about "external interactions", I'm mostly referring to a unit interacting with another unit. Generally, this does not include a unit interacting with an underlying framework or language feature because most of the time that kind of interaction falls under the scope of "internal implementation details" and, as such, shouldn't matter to our tests.

Verify external interactions

In the last post, I showed how to test the outputs of a unit based on its inputs. An external interaction is just another type of input or output (or both), so it's just as important to verify. Units should be able to handle their defined inputs and produce the expected outputs, but you need to be certain of what those inputs and outputs actually are. While integration tests verify that multiple units are communicating correctly with each other, they are limited in two important ways: they cannot - and should not - see the exact data that flows between the units, and they cannot - and should not - cover all possible inter-unit communication scenarios. The second point is especially important because it's easy to get stuck trying to create the "perfect" scenario where the data transforms and flows in just the right way to reach a very specific outcome, but that shouldn't be your focus when crafting integration tests. More on that in the next post.

Both of the above limitations can be addressed to a sufficient extent with unit testing. Instead of instantiating real dependencies, a unit test should create test doubles that sit in place of the real external units and act just enough like them that the unit you're testing doesn't know the difference. You're effectively isolating your unit, which allows you to verify the exact data that goes into and comes out of it. This gives you enough flexibility and control to cover the primary interactions as well as edge cases with regard to inputs and outputs.
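To make the idea of a test double concrete, here is a minimal hand-rolled sketch. It is not taken from the series' code: createFormatterStub and Greeter are hypothetical names invented for illustration, and a test framework's spy or stub facilities would normally replace the helper.

```javascript
// Hypothetical helper (not from the article): builds a test double that
// exposes a `format` function, records every call, and returns a canned value.
function createFormatterStub(cannedValue) {
  const stub = {
    calls: [], // one entry per invocation, holding that call's arguments
    format(...args) {
      stub.calls.push(args);
      return cannedValue;
    },
  };
  return stub;
}

// Hypothetical unit under test: it trims its input and then delegates to
// whatever formatter it was constructed with.
class Greeter {
  constructor(nameFormatter) {
    this.nameFormatter = nameFormatter;
  }

  greet(name) {
    return this.nameFormatter.format(name.trim());
  }
}

// The unit cannot tell that it is talking to a stand-in.
const nameFormatterStub = createFormatterStub('canned result');
const greeter = new Greeter(nameFormatterStub);
const greeting = greeter.greet('  Ada  ');

console.log(greeting);                // the canned value, passed through untouched
console.log(nameFormatterStub.calls); // the exact data the unit sent to its dependency
```

Because the double records its arguments, the test can see the data crossing the boundary between the two units, which is exactly what an integration test cannot (and should not) inspect.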

Time formatting, revisited

It's time (no pun intended) to once again dive into some code and tests.

Relative time formatter v2

The relative time formatter was first introduced in part 3. In this upgraded version, the single "class" has been split into three: the primary formatter, the hour formatter, and the day formatter. While the first version could only format days, this one can format hours as well. Getting the desired result string still consists of instantiating a RelativeTimeFormatter and calling its format function. What's different is that you now need to pass an instance of the hour formatter and an instance of the day formatter to the primary formatter's constructor. I should also mention that this isn't necessarily idiomatic JavaScript. The concepts demonstrated and discussed here are generic and applicable to many languages and frameworks, so the code doesn't take advantage of some of the strengths unique to JavaScript.

Looking at the test section, you'll notice that there are three test suites, since we now have three classes. If you click the run button, you'll see their outputs and that they all pass. The most interesting suite to look at is the one that exercises the RelativeTimeFormatter itself. In addition to the mocked-out clock that was discussed in the last post, it has an hourFormatterStub and a dayFormatterStub.

Setup, execution, verification

Let's look at the "should format less than 8 hours in the future using the hour formatter" test. After defining a couple of variables, we have the following line:


This line configures the hourFormatterStub to return expectedValue whenever that stub is called. Since expectedValue is defined earlier in the test as a generic string (one that happens to look very different from what the real formatter might return), we can check for it later in the test to verify that this string was returned to us from our stubbed-out hour formatter via the real RelativeTimeFormatter. Naturally, this line of code must be executed before we try calling RelativeTimeFormatter.format, since it sets up required functionality on a dependency. Hence, it could be called a setup step.

Once the necessary setup steps are taken, we call into our instance of the real RelativeTimeFormatter, and save the return value for later. This is the execution step.

After calling the format function on our instance of the RelativeTimeFormatter, we have some expect lines, which check that the stub was called the correct number of times, that its invocations contained the expected arguments, and finally, that the RelativeTimeFormatter returned what it got from the hourFormatterStub. Collectively, these lines could be called the test's verification step.

You may have noticed that the relative time formatter tests tend to conform to the setup-execution-verification structure. This is a pretty typical way to write unit tests that are more complex than a single-line return value verification test.
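The three steps can be sketched without any test framework. Everything below is an approximation under stated assumptions rather than the code from this post: the class is reduced to its hour path, now is passed in explicitly instead of coming from a mocked clock, and createStub and expectEqual are hypothetical helpers standing in for a framework's stubs and expect calls.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const EIGHT_HOURS_MS = 8 * HOUR_MS;

// Simplified stand-in for the unit under test (assumption: only the hour
// path is sketched, and `now` is a parameter rather than a mocked clock).
class RelativeTimeFormatter {
  constructor(hourFormatter, dayFormatter) {
    this.hourFormatter = hourFormatter;
    this.dayFormatter = dayFormatter;
  }

  format(target, now) {
    const diffValue = target.getTime() - now.getTime();
    if (diffValue < EIGHT_HOURS_MS) {
      return this.hourFormatter.format(diffValue);
    }
    throw new Error('day path omitted from this sketch');
  }
}

// Hypothetical hand-rolled stub: records arguments, returns `returnValue`.
function createStub() {
  const stub = {
    calls: [],
    returnValue: undefined,
    format(...args) {
      stub.calls.push(args);
      return stub.returnValue;
    },
  };
  return stub;
}

// Hypothetical assertion helper: throws when the values differ.
function expectEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// --- Setup: build the doubles and configure what they return ---
const hourFormatterStub = createStub();
const dayFormatterStub = createStub();
const expectedValue = 'value from the hour formatter stub';
hourFormatterStub.returnValue = expectedValue;
const now = new Date('2020-01-01T12:00:00Z');
const target = new Date('2020-01-01T15:00:00Z'); // three hours later
const formatter = new RelativeTimeFormatter(hourFormatterStub, dayFormatterStub);

// --- Execution: call the real unit and keep its return value ---
const result = formatter.format(target, now);

// --- Verification: call count, arguments, and the returned value ---
expectEqual(hourFormatterStub.calls.length, 1, 'hour formatter call count');
expectEqual(hourFormatterStub.calls[0][0], 3 * HOUR_MS, 'hour formatter argument');
expectEqual(dayFormatterStub.calls.length, 0, 'day formatter call count');
expectEqual(result, expectedValue, 'return value');
```

Keeping the three phases visually separate makes it obvious which lines prepare the dependencies, which single line exercises the unit, and which lines decide whether the test passes.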

Refactoring tests

After writing your tests, you may notice that many of them have the same setup steps. That's a good indication that you should extract those steps into the beforeEach (or equivalent) of the test suite. Similarly, if common teardown steps are needed, those can go into afterEach. Such refactoring will simplify the tests, making them easier to read, modify, and debug. You can see how this common setup and teardown was done in the relative time formatter test suite. Common verification steps can also be refactored into separate functions that you call from your tests, but you should be careful with this, as it is entirely possible to over-optimize and make the tests more complicated instead of simplifying them.
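Here is a rough sketch of what extracting common setup buys you. To keep it runnable on its own, a tiny loop plays the part of the test framework and a hypothetical commonSetup function plays the part of beforeEach; the simplified class and the createStub and expectEqual helpers are likewise assumptions made for illustration.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical hand-rolled stub: records arguments, returns a canned value.
function createStub(returnValue) {
  const stub = {
    calls: [],
    format(...args) {
      stub.calls.push(args);
      return returnValue;
    },
  };
  return stub;
}

// Simplified, hypothetical unit under test; only the hour path is sketched.
class RelativeTimeFormatter {
  constructor(hourFormatter) {
    this.hourFormatter = hourFormatter;
  }

  format(target, now) {
    return this.hourFormatter.format(target.getTime() - now.getTime());
  }
}

function expectEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// Fixture shared by all tests, rebuilt before each one so no state leaks.
let now;
let hourFormatterStub;
let formatter;

// Plays the role of beforeEach: the setup every test used to repeat.
function commonSetup() {
  now = new Date('2020-01-01T12:00:00Z');
  hourFormatterStub = createStub('stubbed value');
  formatter = new RelativeTimeFormatter(hourFormatterStub);
}

// With the setup extracted, each test contains only what is unique to it.
const tests = [
  {
    name: 'passes a 1 hour difference to the hour formatter',
    run() {
      formatter.format(new Date(now.getTime() + HOUR_MS), now);
      expectEqual(hourFormatterStub.calls.length, 1);
      expectEqual(hourFormatterStub.calls[0][0], HOUR_MS);
    },
  },
  {
    name: 'returns whatever the hour formatter returns',
    run() {
      const result = formatter.format(new Date(now.getTime() + 7 * HOUR_MS), now);
      expectEqual(hourFormatterStub.calls.length, 1); // fresh stub, so exactly one call
      expectEqual(result, 'stubbed value');
    },
  },
];

// Minimal runner: a failing expectation throws and stops the loop.
const passed = [];
for (const test of tests) {
  commonSetup();
  test.run();
  passed.push(test.name);
}
console.log(`${passed.length} of ${tests.length} tests passed`);
```

Note that the second test relies on getting a fresh stub: if the stub were created once for the whole suite, its call count would carry over from the first test.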

An uncaught bug

Although all the tests pass, there is actually a major bug in the code of the new relative time formatter. The problem lies in the fact that the hour formatter and the day formatter take different arguments in their respective format functions: the hour formatter takes a single diffValue in milliseconds, while the day formatter takes now and target Date objects. Individually, these differences make sense, since the hour formatter and the day formatter perform different calculations. However, the primary formatter that we've been testing with the hour and day formatter stubs isn't calling the day formatter correctly. We assumed that the calls are identical and wrote both the implementation and its tests with this faulty assumption in mind.
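The sketch below reproduces this kind of mistake on purpose, to show why a stub-based test stays green. It is an approximation, not the code from this post: the class is simplified, now is passed in explicitly rather than coming from a mocked clock, and createStub is a hypothetical hand-rolled helper.

```javascript
const EIGHT_HOURS_MS = 8 * 60 * 60 * 1000;
const TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000;

// Simplified stand-in for the primary formatter, with the bug left in.
class RelativeTimeFormatter {
  constructor(hourFormatter, dayFormatter) {
    this.hourFormatter = hourFormatter;
    this.dayFormatter = dayFormatter;
  }

  format(target, now) {
    const diffValue = target.getTime() - now.getTime();
    if (diffValue < EIGHT_HOURS_MS) {
      return this.hourFormatter.format(diffValue);
    }
    // BUG (intentional): the day formatter expects (now, target) Date
    // objects, but it is called like the hour formatter, with milliseconds.
    return this.dayFormatter.format(diffValue);
  }
}

// Hypothetical hand-rolled stub: accepts ANY arguments and returns a canned
// value, so it has no way to object to being called incorrectly.
function createStub(returnValue) {
  const stub = {
    calls: [],
    format(...args) {
      stub.calls.push(args);
      return returnValue;
    },
  };
  return stub;
}

const hourFormatterStub = createStub('unused');
const dayFormatterStub = createStub('value from the day formatter stub');
const formatter = new RelativeTimeFormatter(hourFormatterStub, dayFormatterStub);

const now = new Date('2020-01-01T12:00:00Z');
const target = new Date('2020-01-03T12:00:00Z'); // two days later
const result = formatter.format(target, now);

// The test shares the implementation's faulty assumption, so each check
// agrees with the buggy call and the test passes.
const checks = [
  dayFormatterStub.calls.length === 1,
  dayFormatterStub.calls[0][0] === TWO_DAYS_MS,
  result === 'value from the day formatter stub',
];
console.log(checks.every(Boolean)); // true
```

The stub does exactly what it was told to do. It only encodes what the test author believes the dependency's interface to be, and here that belief is wrong.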

The important takeaway here is that this is not strictly a unit testing problem. While it's true that the "should format more than 8 hours [...] using the day formatter" tests are flawed, as is the implementation that is being tested, this kind of mistake is easy to make. An act as innocent as copy-pasting the previous hour formatter tests and trivially modifying them to use the day formatter would cause this problem, and copy-pasting is done a lot more often in the software development world than we like to admit.

The solution to the problem does not lie in disabling Ctrl-C and Ctrl-V. Rather, it is in testing the interactions between real units. This is integration testing, and it will catch such bugs. We'll discuss it in more detail in part 5.
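As a small preview, here is a sketch of how wiring real units together exposes the mismatch. All of it is hypothetical: the formatter bodies, the output strings such as "in 2 days", and the Buggy/Fixed class names are assumptions made for illustration, and now is passed in explicitly instead of coming from a mocked clock.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;
const EIGHT_HOURS_MS = 8 * HOUR_MS;

// Hypothetical real hour formatter: takes a difference in milliseconds.
class HourFormatter {
  format(diffValue) {
    return `in ${Math.round(diffValue / HOUR_MS)} hours`;
  }
}

// Hypothetical real day formatter: takes two Date objects.
class DayFormatter {
  format(now, target) {
    const days = Math.round((target.getTime() - now.getTime()) / DAY_MS);
    return `in ${days} days`;
  }
}

// Primary formatter with the mismatch left in on purpose.
class BuggyRelativeTimeFormatter {
  constructor(hourFormatter, dayFormatter) {
    this.hourFormatter = hourFormatter;
    this.dayFormatter = dayFormatter;
  }

  format(target, now) {
    const diffValue = target.getTime() - now.getTime();
    if (diffValue < EIGHT_HOURS_MS) {
      return this.hourFormatter.format(diffValue);
    }
    return this.dayFormatter.format(diffValue); // wrong argument shape
  }
}

// Same unit, calling the day formatter the way it expects to be called.
class FixedRelativeTimeFormatter extends BuggyRelativeTimeFormatter {
  format(target, now) {
    const diffValue = target.getTime() - now.getTime();
    if (diffValue < EIGHT_HOURS_MS) {
      return this.hourFormatter.format(diffValue);
    }
    return this.dayFormatter.format(now, target);
  }
}

// Integration-style check: real units on both sides, no stubs involved.
function integrationCheck(FormatterClass) {
  const formatter = new FormatterClass(new HourFormatter(), new DayFormatter());
  const now = new Date('2020-01-01T12:00:00Z');
  const target = new Date('2020-01-03T12:00:00Z'); // two days later
  try {
    return formatter.format(target, now) === 'in 2 days' ? 'pass' : 'fail';
  } catch (error) {
    // The real day formatter blows up when handed a number and no target.
    return 'fail';
  }
}

const buggyOutcome = integrationCheck(BuggyRelativeTimeFormatter);
const fixedOutcome = integrationCheck(FixedRelativeTimeFormatter);
console.log(buggyOutcome); // fail
console.log(fixedOutcome); // pass
```

A real day formatter has actual expectations about its arguments, so the incorrect call surfaces immediately instead of being silently accepted the way a stub accepts it.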
