Back to Basics: Why Unit Testing is Hard

Written By John Sonmez

Lately, I’ve been questioning the value of unit testing more and more.  I’ve started to wonder whether all the work we put into testing at the unit level, and the extra scaffolding we add to our applications to support it, is worth the cost.

I’m not going to talk about that subject yet.  Instead, I want to look at some of the costs of unit testing and ask the question “why is unit testing hard?”

After all, if unit testing weren’t hard, we wouldn’t have to question whether it is worth it.  It makes sense, then, to look first at why it is hard and what makes it hard.



The ideal scenario

Unit testing itself is rather easy once you understand how to do it.  Even test driven or behavior driven development is easy once mastered… at least for the ideal scenario.

What is the ideal scenario then?

It is a unit test where the class under test has no external dependencies.

When a class we are writing unit tests for doesn’t have any external dependencies, we don’t need mocks or stubs or anything else.  We can just write code that tests our code.

Let’s look at an example of this.  Suppose I had a class called Calculator.  This Calculator class has some very simple methods.  Specifically, let us talk about testing a method Add.  Add takes two single digit integers and returns their sum.  If either integer passed in is more than a single digit, it throws an exception.

It is a pretty stupid method, with little use, but it will serve the point well here.

We can TDD or BDD this baby with minimal effort.

Let’s start thinking of test cases:

  • When I add 0 and a single digit number it should return the single digit number
  • When I add 0 and 0 it should return 0
  • When I add two single digit numbers it should return the sum of those numbers
  • When I add one two digit number it should throw an exception

Pretty easy to come up with test cases, just as easy to implement them:

We can then implement the code that will make this test pass pretty easily.  I won’t show it here since it is so trivial.
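As a rough sketch, the four test cases above could be written in Python something like this.  The snake_case `add` stands in for Add, the choice of `ValueError` and the reading of "more than a single digit" as an absolute value above 9 are my assumptions, and the throwaway implementation is included only so the example runs on its own.

```python
class Calculator:
    """Hypothetical stand-in for the Calculator class described above."""

    def add(self, a: int, b: int) -> int:
        # Assumption: "more than a single digit" means an absolute value above 9.
        for n in (a, b):
            if abs(n) > 9:
                raise ValueError(f"{n} is not a single digit integer")
        return a + b


def test_adding_zero_and_a_single_digit_returns_that_digit():
    assert Calculator().add(0, 7) == 7


def test_adding_zero_and_zero_returns_zero():
    assert Calculator().add(0, 0) == 0


def test_adding_two_single_digits_returns_their_sum():
    assert Calculator().add(3, 4) == 7


def test_adding_a_two_digit_number_raises():
    try:
        Calculator().add(10, 1)
    except ValueError:
        return  # the exception is the expected outcome
    raise AssertionError("expected ValueError for a two digit argument")
```

Each test is a plain function, so a runner such as pytest can collect them, or they can simply be called directly.  Notice that there is no setup at all: every test is one input and one expected output.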

This is the ideal scenario, or what I will call Level 1 Unit Testing.

Level 1 Unit Testing is where we have a single class with no external dependencies and no state.  We are just testing an algorithm.

Taking it up a notch

The next level of unit testing is reached if we add state to the class under test.

Level 2 Unit Testing is where we have a single class with no external dependencies but it does have state.  We are setting up an object and testing it as a whole.

If we take our existing example, and now we want to add a new method called GetHistory, it is still not difficult to implement the tests, but it gets harder, because we have to make sure we are setting up some state for our object as part of the test.

Let’s look at one of the test cases we might implement for this functionality:
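Here is a minimal sketch of one such test in Python.  I am assuming that `get_history` (standing in for GetHistory) returns the results of earlier `add` calls in order; the post does not pin down what the history contains, so treat that shape as hypothetical.

```python
class Calculator:
    """Hypothetical Calculator that remembers the results it has produced."""

    def __init__(self):
        self._history = []  # internal state: results of earlier add calls

    def add(self, a, b):
        if abs(a) > 9 or abs(b) > 9:
            raise ValueError("only single digit integers are supported")
        result = a + b
        self._history.append(result)
        return result

    def get_history(self):
        # Return a copy so callers cannot change the calculator's state.
        return list(self._history)


def test_history_lists_results_in_the_order_they_were_calculated():
    # Set up the state the test depends on.
    calculator = Calculator()
    calculator.add(1, 2)
    calculator.add(4, 5)

    # Exercise the method under test.
    history = calculator.get_history()

    # Check the outcome.
    assert history == [3, 9]
```

The test now has three distinct parts, and the first of them exists only to put the object into the right state.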

Again, not too hard.  But state does make this a bit more difficult.  Here we can see the value of a Behavior Driven Development (BDD) style of unit testing: it helps us clearly divide the test into the different parts we now have.

The real complexity we have added here is that we now have to deal with a setup step before we can execute our test.  BDD deals with this by having a special step for defining the context in which the actual test takes place.  It is called a few different things in different BDD circles, but let’s stick with AAA (Arrange, Act, Assert) for this post, since it is easy to remember.

The major difference between Level 1 Unit Testing and Level 2 Unit Testing is that in Level 1, we were really testing only one method.  In Level 2 we are testing at the class level.  Really we could call Level 1 Unit Testing method testing, since the unit we are testing is the method.  The class that method existed in didn’t matter.

Enter dependencies

Let’s see what happens when we throw dependencies into the Calculator class.

Imagine that our Calculator class has to keep an audit trail of our calculations.  We have a service that we can use to put calculations from the Add method into a storage location, like a database, and our GetHistory method can query that storage location for the history.

As I was thinking about this, an important point occurred to me.  Were this an integration test, our example test method above wouldn’t change at all.

But since we are talking about unit tests here, we need to isolate the testing down to the class level.

So let’s think about what our test should do now.  Here are some possible tests we might have.

  • When I add two numbers the result is returned and the Store method is called on the StorageService with that result.
  • When I get the history, the RetrieveHistory method is called on the StorageService and its results are returned back.

Let’s see what one of these tests might look like:
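A sketch of the first test in Python, using the standard library’s `unittest.mock`.  The method names `store`, `retrieve_history` and `is_service_online` mirror Store, RetrieveHistory and IsServiceOnline; their signatures, and the rule that `add` only stores when the service is online, are my assumptions.  Python’s `Mock` does not strictly need the abstract base class; I include it to mirror the interface this post talks about.  The single digit check is left out to keep the focus on the interaction.

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock


class StorageService(ABC):
    """Interface that exists mainly so a test can substitute a mock."""

    @abstractmethod
    def is_service_online(self) -> bool: ...

    @abstractmethod
    def store(self, result: int) -> None: ...

    @abstractmethod
    def retrieve_history(self) -> list: ...


class Calculator:
    def __init__(self, storage: StorageService):
        # The dependency is passed in so that a test can replace it.
        self._storage = storage

    def add(self, a, b):
        result = a + b
        # Assumption: results are only stored while the service is online.
        if self._storage.is_service_online():
            self._storage.store(result)
        return result

    def get_history(self):
        return self._storage.retrieve_history()


def test_add_returns_the_sum_and_stores_it():
    # Arrange: a mock restricted to the StorageService interface.
    storage = Mock(spec=StorageService)
    storage.is_service_online.return_value = True
    calculator = Calculator(storage)

    # Act
    result = calculator.add(2, 3)

    # Assert: both the return value and the interaction with the service.
    assert result == 5
    storage.store.assert_called_once_with(5)
```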

I call this Level 3 Unit Testing.

Level 3 Unit Testing is when we have a single class with at least one external dependency, but it does not depend on its own internal state.

Things really start to get complicated here, because we have to start thinking not just about inputs and outputs and sequences, but now have to think about interactions.

It really starts to get blurry here about what the expectations of our unit tests should be.  In the example code above, do we need to check to make sure IsServiceOnline is called on the StorageService or do we only check that Store was called?

You’ll also notice here that we had to use a mock and pass our dependency into our class so that we could change its behavior.  Along with that came the burden of creating an interface, so that we could have a mock implementation.

If you’re paying attention right now, you may be thinking to yourself that the example is bad.  You may be thinking that the Calculator class now has two responsibilities.

  • It calculates things and returns the result
  • It stores calculation results

Right you are, but we can’t wish away this problem.  Let’s suppose we refactor and move the StorageService dependency out of the Calculator class.  We have several options.  We could make a decorator and use it like this:
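A minimal sketch of the decorator option in Python.  `StoringCalculator` and `InMemoryStorageService` are hypothetical names I made up for illustration; the latter is just a stand-in for the real StorageService.

```python
class Calculator:
    """Pure logic again: no dependencies, no state."""

    def add(self, a, b):
        return a + b


class InMemoryStorageService:
    """Hypothetical stand-in for the real StorageService."""

    def __init__(self):
        self.stored = []

    def store(self, result):
        self.stored.append(result)


class StoringCalculator:
    """Decorator: exposes the same add method, and also stores each result."""

    def __init__(self, calculator, storage):
        self._calculator = calculator
        self._storage = storage

    def add(self, a, b):
        result = self._calculator.add(a, b)  # delegate the calculation
        self._storage.store(result)          # add the storage behavior
        return result


# Usage: callers see something that looks like a Calculator.
storage = InMemoryStorageService()
calculator = StoringCalculator(Calculator(), storage)
total = calculator.add(2, 3)
```

Calculator itself is back to Level 1, but `StoringCalculator` still takes a storage dependency, so its own unit test would need a mock or a fake.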

Or we could do something more like a mediator pattern like so:
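Roughly, and again with hypothetical names (`CalculationMediator`, `add_and_store`, `InMemoryStorageService`), a mediator in Python might look like this.  Unlike the decorator, it does not pretend to be a Calculator; it is a third class that coordinates the other two.

```python
class Calculator:
    def add(self, a, b):
        return a + b


class InMemoryStorageService:
    """Hypothetical stand-in for the real StorageService."""

    def __init__(self):
        self.stored = []

    def store(self, result):
        self.stored.append(result)


class CalculationMediator:
    """Coordinates Calculator and storage; neither knows about the other."""

    def __init__(self, calculator, storage):
        self._calculator = calculator
        self._storage = storage

    def add_and_store(self, a, b):
        result = self._calculator.add(a, b)
        self._storage.store(result)
        return result


# Usage: the caller talks to the mediator, not to the calculator directly.
storage = InMemoryStorageService()
mediator = CalculationMediator(Calculator(), storage)
total = mediator.add_and_store(2, 3)
```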

However we attempt to solve this problem, some class will still need a mock in its unit test.

There is a simple fact that we cannot get around.  If we are going to use the StorageService to store calculations, either Calculator will depend on it, or something else will depend on both Calculator and the StorageService.  There is no alternative to those two options.

There is another simple fact we can’t get around either.  If we are going to depend on another class in our unit test, we either need an interface that we can use for the mock class, or we need a mocking framework that will support mocking concrete classes.

So with Level 3 Unit Testing we are stuck with needing to mock at least one dependency and either creating a bogus interface, or using a mocking library that will let us mock concrete classes.

It gets worse

It only gets worse from here.  At Level 3 we didn’t worry about state inside our Calculator class; we worried about an external dependency that pretty much handled state for us.  In many cases, though, we will have to worry about both state and dependencies.

Level 4 Unit Testing is when we have a single class that has at least one external dependency and also depends on its own internal state.

In our calculator example, we can simply add the requirement that we only want to get the history for a particular session of calculations.  We need to keep track of the calculations so that we can ask the StorageService for the history for our session.

Let’s look at a possible test for that scenario:
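One way such a test might look in Python.  To have something to track, I am assuming that `store` returns an id for each stored calculation and that `retrieve_history` accepts a list of those ids; neither detail comes from the real StorageService, so treat both as hypothetical.

```python
from unittest.mock import Mock


class Calculator:
    def __init__(self, storage):
        self._storage = storage
        self._session_ids = []  # internal state: ids stored during this session

    def add(self, a, b):
        result = a + b
        # Assumption: store returns an id for the stored calculation.
        self._session_ids.append(self._storage.store(result))
        return result

    def get_history(self):
        # Assumption: retrieve_history takes the ids we are interested in.
        return self._storage.retrieve_history(self._session_ids)


def test_history_is_requested_for_this_sessions_calculations_only():
    # Arrange: program the mock, then drive the calculator into the right state.
    storage = Mock()
    storage.store.side_effect = [101, 102]  # ids handed back by successive calls
    storage.retrieve_history.return_value = [3, 9]
    calculator = Calculator(storage)
    calculator.add(1, 2)
    calculator.add(4, 5)

    # Act
    history = calculator.get_history()

    # Assert: the returned value and every interaction along the way.
    assert history == [3, 9]
    storage.store.assert_any_call(3)
    storage.store.assert_any_call(9)
    storage.retrieve_history.assert_called_once_with([101, 102])
```

The class under test has three short methods.  The test has to script two return values, set up state through two calls, and verify three interactions.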

Consider for a moment how fragile and complex this unit testing code is.  Consider how simple the functionality of our class is.

We have a major problem here.  Our unit testing code is more complex than the code it is testing!  It’s ok if the unit testing code has more lines than the code it is testing; that is usually the case.  But I consider it a big problem when our unit testing code is more complex, because you have to ask yourself a very real question:

Where is there more likely to be a bug?

I’m not saying anything yet

My point is not to make a point, at least not yet.  My real goal here is to help us to change the way we think about unit testing.

We need to stop asking the general question of whether or not unit testing is worth the cost and instead ask the more specific question of what level of unit testing is worth the cost.

Level 3+ has a very steep cost as mocking is unavoidable and adds considerable complexity to even the most trivial of implementations.

From that we can draw a bit of wisdom.  If we are going to unit test we should strive to encapsulate as much of our pure logic into classes without dependencies and if possible without state.

The other thing to consider is that as the difficulty and complexity of the unit tests increase with each level, the goal and value of the test start to get lost as well.

What I mean by this is that when we start testing that our class properly calls another class with certain parameters, we are crossing over into testing the implementation details of the class.

If I say a class should be able to add 2 numbers and return the result, I am not talking about how it has to do it.  As long as the result is correct, how doesn’t matter.

When I add a mock and say a class needs to add 2 numbers and store the result using a StorageService by calling the method Store on it, I have now tied how into the test.  Changing how breaks the test.
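To make that concrete, here is a small Python sketch with two hypothetical versions of the class.  Both return the same result; the second simply stores it through a made-up `store_many` method instead of `store`.  A test on the result passes for both, while a test on the interaction fails for the second.

```python
from unittest.mock import Mock


class CalculatorV1:
    def __init__(self, storage):
        self._storage = storage

    def add(self, a, b):
        result = a + b
        self._storage.store(result)
        return result


class CalculatorV2:
    """Same result as V1, but stored through a hypothetical store_many call."""

    def __init__(self, storage):
        self._storage = storage

    def add(self, a, b):
        result = a + b
        self._storage.store_many([result])
        return result


def result_test_passes(calculator_class):
    # Only cares about what comes back, not how it was produced.
    return calculator_class(Mock()).add(2, 3) == 5


def interaction_test_passes(calculator_class):
    # Cares that store was called exactly once with the result.
    storage = Mock()
    calculator_class(storage).add(2, 3)
    try:
        storage.store.assert_called_once_with(5)
    except AssertionError:
        return False
    return True
```

Calling `interaction_test_passes(CalculatorV2)` returns `False` even though nothing a caller can observe through the return value has changed.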

That’s all we are going to look at for now.  If you’ve read some of my other back to basics posts, you can see the progression up to this point.  I’ve been discounting using interfaces and dependency injection for the sake of unit testing, but I have yet to offer an alternative.  I still don’t.  To be honest, I don’t have one yet.  But, I do believe by breaking down this problem to its roots we can evaluate what we are doing and determine what our true problems are.

By the end of this series I hope to have a solution and a recommendation for tackling these kinds of problems.