Category Archives: Unit Testing

The More I Know, the Less I Know

I used to be very confident in my abilities as a software developer.

I used to be able to walk up to a group of software developers and tell them exactly what they were doing wrong and exactly what was the “right” way to do things.

I used to be sure of this myself.


It wasn’t even that long ago.  Heck, when I look at the blog posts I wrote 3 years ago I have to slap myself upside my head in realization of just how stupid I was.

Not only was my writing bad, but some of my thoughts seem so immature and uneducated that it feels like a completely different person wrote them.

And I wrote those posts back when I knew it all.

The more I learn, the less I know

Lately I’ve been running into situations more and more often where I don’t have a good answer for problems.

I’ve found myself much more often giving 3 pieces of advice attached with pros and cons rather than giving a single absolute—like I would have done perhaps 3 years ago.

I’ve been finding, as I have been learning more and more (the past 3 years have been a period of accelerated growth for me), that I am becoming less and less sure of what I know and more and more convinced that there is no such thing as a set of best practices.

I’ve even spent some time postulating on whether or not commonly held beliefs of best practices would be thrown completely out the window given a significant enough motivation to succeed.

My point is that the more doors I open, the more aware I become of the multitude of doors that exist.


It is not just the realization of what I don’t know, but also the realization of the weakness of the foundation I am already standing on.

Taking it out of the metaphysical

Let’s drop down out of the philosophical discussion for a bit and talk about a real example.

Perhaps the biggest quandary I struggle with is whether or not to unit test or practice TDD and its variants.

The 3 years ago know-it-all version of me would tell you emphatically “yes, it is a best practice and you should definitely do it all the time.”

The more pragmatic version of me today says, in a much more uncertain tone, “perhaps.”

I don’t want to delve into the topic in this post since I am sure I could write volumes on my ponderings in this area, but I’ve come to a conclusion that it makes sense to write unit tests for code that has few or no dependencies and that it does not make sense to do so for other code.

From that I’ve also derived that I should strive to write code that separates algorithms from coordinators.

I still feel, even today, that my advice is not wholly sound.  I am convinced it is a better approach than 100% TDD and unit tests, or no TDD and unit tests at all, but I am not convinced there isn’t a deeper understanding and truth that supersedes my current thoughts on the matter.

As you can imagine this is quite frustrating and unsettling.

Silver bullets and best practices

What I am coming to realize more and more is that there are no silver bullets and more surprisingly there are not even any such things as best practices.


Now I’ve heard the adage of there being no silver bullets so many times that it makes me physically sick when I hear someone say it, because it is so cliché.

But, I’ve had a hard time swallowing the no best practices pill.

I feel like when I abandon this ship then I am sailing on my life raft in the middle of a vast ocean with no sails and no sense of what direction to go.

A cornerstone of my development career has been the learning, applying and teaching of best practices.  If these things don’t exist, have I just been peddling snake oil and drinking it myself?

No.

Best practices are simply concrete applications of abstract principles in software development that we cannot directly grasp or see clearly enough to absolutely identify.

Breaking this down a bit, what I am saying is that best practices are not the things themselves to seek, but through the pursuit of best practices we can arrive at a better understanding of the principles that actually are unchanging and absolute.

Best practices are optimal strategies for dealing with the problems of software development based on a particular context.  That context is primarily defined by:

  • Language and technology choice
  • Meta-game (what other software developers are doing, which practices are generally perceived as best, and how software development is viewed and understood at a given time)
  • Individual skill and growth (what keeps me on track might slow you down; it depends on where you are in your journey)

There is a gentle balance between process and pragmatism.

When you decide to make your cuts without the cutting guide, it can make you go faster, but only if you know exactly what you are doing.

Where I am now

Every time I open my mouth I feel like I am spewing a bunch of bull crap.

I don’t trust half of what I say, because I know so much of it is wrong.

Yet I have perhaps 10 times more knowledge and quite a bit more experience in regards to software development than I did just 3 years ago.

So what gives?

Overall, I think I am giving better advice based on more practical experience and knowledge; it is just that I am far more aware of my own shortcomings and how stupid I am even today.

I have the curse and blessing of knowing that only half of what I am saying has any merit and the other half is utter crap.

Much of this stems from the realization that there are no absolute right ways to do things and no best answers for many of the problems of software development.

I used to be under the impression that someone out there had the answer to the question of what is the right way to develop software.


I used to think that I was picking up bits of knowledge, clues that were unraveling the mystery of software development, and that someday I would have all the pieces of understanding and could tell others exactly how they should be developing software.

What I found instead was that not only does nobody know the “right” way to develop software, but that it is perhaps an unknowable truth.

The best we can do is try to learn from obvious mistakes we have made before, start with a process that has had some level of success, and modify what we do based on our observations.

We can’t even accurately measure anything about software development and to think we can is just foolishness.

From story points, to velocity, to lines of code per defect and so on and so forth, all of those things are not only impossible to accurately measure, but they don’t really tell us if we are doing better or not.

So, what is my point?

My point is simple.

I have learned that not only do I not have all the answers, but I never will.

What I have learned is always subject to debate and is very rarely absolute, so I should have strong convictions, but hold onto them loosely.

And most importantly, don’t be deceived into thinking there is a right way to develop software that can be known.  You can improve the way you develop software and your software development skills, but it will never be based on an absolute set of rules that come down from some magical process or technology.


There Are Only Two Roles of Code

All code can be classified into two distinct roles; code that does work (algorithms) and code that coordinates work (coordinators).

The real complexity that gets introduced into a code base is usually directly related to the creation of classes that group both of these roles under one roof.

I’m guilty of it myself.  I would say that 90% of the code I have written does not nicely divide my classes into algorithms and coordinators.

Defining things a bit more clearly

Before I dive into why we should be dividing our code into clear algorithmic or coordinating classes, I want to take a moment to better define what I mean by algorithms and coordinators.

Most of us are familiar with common algorithms in Computer Science like a Bubble Sort or a Binary Search, but what we don’t often realize is that all of our code that does something useful contains within it an algorithm.

What I mean by this is that there is a clear, distinct set of instructions or steps by which some problem is solved or some work is done.  That set of steps does not require external dependencies; it works solely on data, just as a Bubble Sort does not care what it is sorting.

Take a moment to wrap your head around this.  I had to double check myself a couple of times to make sure this conclusion was right, because it is so profound.

It is profound because it means that all the code we write is essentially just as testable, as provable and potentially as dependency free as a common sorting algorithm if only we can find the way to express it so.

What is left over in our program (if we extract out the algorithms) is just glue.

Think of it like a computer.  Computer electronics have two roles: doing work and binding together the stuff that does the work.  If you take out the CPU, the memory and all the other components that actually do some sort of work, you’ll be left with coordinators.  Wires and busses that bind together the components in the system.

Why dividing code into algorithms and coordinators is important.

So now that we understand that code could potentially be divided into two broad categories, the next question of course is why?  And can we even do it?

Let’s address why first.

The biggest benefit to pulling algorithmic code into separate classes from any coordinating code is that it allows the algorithmic code to be free of dependencies.  (Practically all dependencies.)

Once you free this algorithmic code of dependencies you’ll find 3 things immediately happen to that code:

  1. It becomes easier to unit test
  2. It becomes more reusable
  3. Its complexity is reduced

A long time ago, when mocks were not yet widely used and IoC containers were rare, TDD was hard.  It was really hard!

I remember when I was first standing on the street corners proclaiming that all code should be TDD with 100% code coverage.  People thought I was pretty crazy at the time, because there really weren’t any mocking frameworks and no IoC containers, so if you wanted to write all your code using TDD approaches, you’d actually have to separate out your algorithms.  You’d have to write classes that had minimal dependencies if you wanted to be able to truly unit test them.

Then things got easier by getting harder.  Many developers started to realize that the reason why TDD was so hard was because in the real world we usually write code that has many dependencies.  The problem with dependencies is that we need a way to create fake versions of them.  The idea of mocking dependencies became so popular that entire architectures were based on the idea and IoC containers were brought forth.

We, as a development community, essentially swept the crumbs of difficult unit testing under the rug.  TDD and unit testing in general became synonymous with writing good code, but one of the most important values of TDD was left behind: the forced separation of algorithmic code from coordinating code.

TDD got easier, but only because we found a way to solve the problems of dependencies interfering with our class isolation by making it less painful to mock out and fake the dependencies rather than getting rid of them.

There is a better way!

We can still fix this problem, but we have to make a concerted effort to do so.  The current path of least resistance is to just use an IoC container and write unit tests full of mocks that break every time you do anything but the most trivial refactoring on a piece of code.

Let me show you a pretty simple example, but one that I think clearly illustrates how code can be refactored to remove dependencies and clearly separate out logic.

Take a look at this simplified calculator class:

 public class Calculator
    {
        private readonly IStorageService storageService;
        private List<int> history = new List<int>();
        private int sessionNumber = 1;
        private bool newSession;

        public Calculator(IStorageService storageService)
        {
            this.storageService = storageService;
        }

        public int Add(int firstNumber, int secondNumber)
        {
            if(newSession)
            {
                sessionNumber++;
                newSession = false;
            }

            var result = firstNumber + secondNumber;
            history.Add(result);

            return result;
        }

        public List<int> GetHistory()
        {
            if (storageService.IsServiceOnline())
                return storageService.GetHistorySession(sessionNumber);

            return new List<int>();
        }

        public int Done()
        {
            if (storageService.IsServiceOnline())
            {
                foreach(var result in history)
                    storageService.Store(result, sessionNumber);
            }
            newSession = true;
            return sessionNumber;
        }
    }

 

This class does simple add calculations and stores the results in a storage service while keeping track of the adding session.

It’s not extremely complicated code, but it is more than just an algorithm.  The Calculator class here requires a dependency on a storage service.

But this code can be rewritten to extract out the logic into another calculator class that has no dependencies and a coordinator class that really has no logic.

 public class Calculator_Mockless
    {
        private readonly StorageService storageService;
        private readonly BasicCalculator basicCalculator;

        public Calculator_Mockless()
        {
            this.storageService = new StorageService();
            this.basicCalculator = new BasicCalculator();
        }

        public int Add(int firstNumber, int secondNumber)
        {
            return basicCalculator.Add(firstNumber, secondNumber);
        }

        public List<int> GetHistory()
        {
            return storageService.
                   GetHistorySession(basicCalculator.SessionNumber);
        }

        public void Done()
        {
            foreach(var result in basicCalculator.History)
                storageService
                     .Store(result, basicCalculator.SessionNumber);

            basicCalculator.Done();
        }
    }

    public class BasicCalculator
    {
        private bool newSession;

        public int SessionNumber { get; private set; }

        public IList<int> History { get; private set; }

        public BasicCalculator()
        {
            History = new List<int>();
            SessionNumber = 1;
        }
        public int Add(int firstNumber, int secondNumber)
        {
            if (newSession)
            {
                SessionNumber++;
                newSession = false;
            }

            var result = firstNumber + secondNumber;
            History.Add(result);

            return result;
        }

        public void Done()
        {
            newSession = true;
            History.Clear();
        }
    }

 

Now you can see that the BasicCalculator class has no external dependencies and thus can be easily unit tested.  It is also much easier to tell what it is doing because it contains all of the real logic, while the Calculator class has now become just a coordinator, coordinating calls between the two classes.
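
To make that concrete, here is a minimal sketch of what unit tests for BasicCalculator might look like (assuming NUnit, which the earlier examples appear to use).  Notice that no mocks, stubs, or interfaces are needed because the class has no dependencies.

[TestFixture]
public class BasicCalculatorTests
{
    [Test]
    public void Add_TwoNumbers_ReturnsSumAndRecordsItInHistory()
    {
        // no mocks needed; BasicCalculator has no external dependencies
        var calculator = new BasicCalculator();

        var result = calculator.Add(2, 3);

        Assert.AreEqual(5, result);
        Assert.AreEqual(5, calculator.History[0]);
    }

    [Test]
    public void Add_AfterDone_IncrementsTheSessionNumber()
    {
        var calculator = new BasicCalculator();
        calculator.Add(1, 1);

        calculator.Done();
        calculator.Add(2, 2);

        Assert.AreEqual(2, calculator.SessionNumber);
    }
}

Every test is just construct, act, assert; there is no arrange step full of stubs and no fragile verification of method calls.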

This is of course a very basic example, but it was not contrived.  What I mean by this is that even though this example is very simple, I didn’t purposely create this code so that I could easily extract out the logic into an algorithm class.

Parting advice

I’ve found that if you focus on eliminating mocks or even just having the mindset that you will not use mocks in your code, you can produce code from the get go that clearly separates algorithm from coordination.

I’m still working on mastering this skill myself, because it is quite difficult to do, but I believe the rewards are very high for those that can do it.  In code where I have been able to separate out algorithm from coordination, I have seen much better designs that were more maintainable and easier to understand.

I’ll be talking about and showing some more ways to do this in my talk at the Warm Crocodile conference next year.

Back to Basics: Becoming BAT Man

In my Back to Basics post on my conclusions about blackbox automated tests (BATs) and unit testing, I said that we should:

Spend a majority of your effort, all the time you would have spent writing unit tests, instead writing what I will call blackbox automated tests or BATs.

In this post, I am going to outline what I think it takes for a team to go from BAT-less to BAT-man (or BAT-woman) without going batty.

I want to take a practical approach to getting from 0 blackbox automated tests to building a sustainable process of developing BATs as an integral component of the acceptance criteria for calling a backlog item “done.”

Let me paint a picture for you


I want to paint a picture of the end goal we are seeking to achieve by walking through a single backlog item from start to finish.

  1. A backlog is selected for work.
  2. Developers, QA and product owner work together to clearly define the acceptance criteria which will enable that backlog to be signed off or called “done.”
  3. QA and developers work together to transform the acceptance criteria into a high level description of BATs which will be created to verify each aspect of the acceptance criteria.
  4. The developer begins coding with this high level set of BATs in mind (BATs drive the development), while QA begins developing the BATs.
  5. As the developer completes enough functionality to pass a BAT without breaking existing BATs, that functionality is checked into trunk.
  6. Each check-in kicks off the entire battery of BATs, split across multiple VMs, which run hundreds of hours worth of automated tests in minutes.
  7. Once all BATs for a backlog are passing, that backlog is done and those BATs become part of the battery of tests which are run with each continuous integration build.

We should keep this simple process in mind and strive to reach it, knowing that when we are able to reach this process we will be able to have a HUGE amount of confidence in the functionality and non-regression of our system.

Put aside for a second your notion that unit testing is the correct thing to do.  Put aside the idea that blackbox automated testing is too hard and too fragile, and imagine the world of software development that flows like the picture I painted above.

Now listen up, because this part is important…

Before you put forth the effort to write one more unit test, before you dip your test double pen in the ink well of “mock,” make sure you are first taking efforts to develop a process to create BATs.

The value proposition

We really need to take the effort to understand the value proposition being presented here.

I’m going to use some real fake numbers to try and really convey this important point that I think is likely to be overlooked.

Consider first the amount of time that you spend doing two things:

  1. Creating infrastructure to allow your code to be unit tested in isolation.  (Dependency injection, interfaces, etc.)
  2. Time spent writing unit tests.

Now, before we go any further, I just want to reiterate.  I am not advocating completely abolishing the writing of unit tests.  See my original conclusions post, or my post on unit testing without mocks for a better understanding of what I am advocating in regards to unit testing.

So back to my point.  Think about all the time it takes to do this.  To properly isolate and unit test code most developers would probably estimate that for every hour worth of work writing the code there is about another hour to two hours worth of unit testing and unit test prep time.

There is a reason why unit tests take longer to write than the code they test; it takes MORE lines of code to test a line of code in isolation.

BATs developed using a good automation framework are just the opposite.  While unit test code might have a ratio of 4 lines of unit test code to 1 line of production code or a 4:1 ratio, BAT code has a completely opposite ratio.  It is very likely that 1 line of BAT code can test 20 lines or more of production code, a 1:20 ratio.  (Where do I get these numbers?  Looking at some of my real production code being tested with unit tests and BATs.)
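
To put rough numbers on that: at a 4:1 ratio, fully covering 1,000 lines of production code means writing on the order of 4,000 lines of unit test code, while at a 1:20 ratio the same 1,000 lines might be exercised by roughly 50 lines of BAT code.  Those are back-of-the-envelope figures taken from my own code, not universal constants, but the difference of nearly two orders of magnitude is the point.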

Even if unit tests and BATs were equally effective in preventing the regression of your software system, and equally effective at providing an absolute assurance of meeting the acceptance criteria (which I would argue that BATs are much more effective at both), you can see easily that you are going to get a much higher return on your investment by investing your time in BATs vs. investing your time in unit tests.

I don’t make these statements lightly or in a theoretical tone.  I have real world experience with successfully implementing automation frameworks for writing BATs that reinforce these conclusions.

How to get there

If I have done my job, you are now at least convinced that getting to the point of having BATs for all backlogs is a good goal, but if you are like most people I talk to, you are skeptical of the costs and feasibility of doing it.

What you need is a good guide.  A step by step guide on how to do it.

I am going to conclude the Back to Basics posts and segue into a new series with the goal of providing a step by step guide to getting to fully automated blackbox testing.

Let’s take a look at the steps that I will cover in the next series of posts.

  1. Hiring an automation lead – it’s important to either find someone who has done this before or dedicate a developer to this role.
  2. Deciding on the underlying browser driver or browser automation tool – assuming web apps here, there are several to choose from: WatiN, Selenium and more.
  3. Designing an automation framework – building a framework tailored specifically to your application under test will create an effective domain specific language for testing your application.
  4. Creating your first smoke tests – building some smoke tests will help you prove and build out your framework and provide you the highest value tests first (a rough sketch of such a test follows this list).
  5. Adding smoke tests to your build – with smoke tests being run as part of the build, you can immediately start seeing value.
  6. Adding BATs to your acceptance criteria – we need to start out slowly here, but eventually make all backlogs require BATs for each acceptance criterion.
  7. Scaling out – as you start to get a large amount of BATs you’ll need to figure out a way to virtualize more servers and parallelize the test runs to be able to run all the BATs in a short amount of time.
  8. Building a true DSL – once you get this far, it may be time to start thinking about creating a domain specific language that even business analysts can write tests with.
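
To give a rough feel for what steps 2 through 4 lead to, here is a minimal sketch of a first smoke test written in C# against Selenium WebDriver (one of the browser automation options mentioned above).  The URL, element ids, and credentials are hypothetical placeholders; in a real framework they would live behind page objects that form your testing DSL.

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class SmokeTests
{
    [Test]
    public void UserCanLogInAndSeeTheDashboard()
    {
        // hypothetical URL and element ids for the application under test
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl("https://myapp.example.com/login");

            driver.FindElement(By.Id("username")).SendKeys("smoketest");
            driver.FindElement(By.Id("password")).SendKeys("secret");
            driver.FindElement(By.Id("loginButton")).Click();

            // a smoke test only asserts that the happy path is alive
            Assert.IsTrue(driver.PageSource.Contains("Dashboard"));
        }
    }
}

Even a handful of tests like this, run on every build, starts delivering the value described in step 5; the framework and DSL layers grow out of refactoring these tests as they multiply.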

Back to Basics: Mock Eliminating Patterns

In my previous post I talked about unit testing without mocks.  I gave some examples of how I had done this in some of my real code from PaceMaker.

This time I want to take a look at some of the common patterns we can use to extract parts of our code into dependency-lite or dependency-less classes that we can unit test without needing mocks.

I’ll have to admit, I haven’t really been practicing this approach for very long, so my list of patterns is probably lacking, but these are some of the ones I kept seeing as I was working through the code.

If you have some ideas for other patterns or know of some already or have better names than what I have chosen, please let me know.

Pattern 1: Pull out state

In many cases we can have complex objects that contain their own state often split across member variables.

If you find that a class has many methods that check some member variable and do one thing or another based on what value that member variable is set to, you might be able to benefit by extracting all the state changing logic into a separate class and creating events to notify the original class when the state changes.

We can then keep the new class as the single source of state for the class it came from and easily write dependency free (level 2) unit tests for the state class.

From the perspective of the class you are pulling the state out of, we might turn:

private bool isSpecial = false;
private int timeToLive = 50;
private int timeToLove = 1;
private bool penguinsHaveAttacked = true;

public ApocalypseScenario CreateNew()
{
    if(isSpecial && timeToLive < timeToLove || penguinsHaveAttacked)
        return new ApocalypseScenario("Penguin Apocalypse, Code Blue");

    return null; // nothing triggered, no scenario
}

Into:

private ApocalypseStateMachine apocalypseStateMachine = new ApocalypseStateMachine();

public ApocalypseScenario CreateNew()
{
    if(apocalypseStateMachine.State == States.PENGUIN_SLAUGHTER)
        return new ApocalypseScenario("Penguin Apocalypse, Code Blue");

    return null; // nothing triggered, no scenario
}

Obviously in this case there would be many ways the states get changed.  I haven’t included those here, but you can imagine how those would become part of the new state class.
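
Here is a minimal sketch of what the extracted state class might look like.  The transition methods (PenguinsAttacked, MarkSpecial) are invented purely for illustration; the important parts are that the class owns all of the state, raises an event when the state changes, and can be tested without a single mock.

using System;

public enum States { CALM, PENGUIN_SLAUGHTER }

public class ApocalypseStateMachine
{
    private bool isSpecial;
    private bool penguinsHaveAttacked;
    private int timeToLive = 50;
    private int timeToLove = 1;

    public States State { get; private set; }

    // the original class subscribes to this to find out when the state changes
    public event Action<States> StateChanged;

    public void PenguinsAttacked()
    {
        penguinsHaveAttacked = true;
        Recalculate();
    }

    public void MarkSpecial(int newTimeToLive, int newTimeToLove)
    {
        isSpecial = true;
        timeToLive = newTimeToLive;
        timeToLove = newTimeToLove;
        Recalculate();
    }

    private void Recalculate()
    {
        var newState = (isSpecial && timeToLive < timeToLove) || penguinsHaveAttacked
            ? States.PENGUIN_SLAUGHTER
            : States.CALM;

        if (newState == State) return;

        State = newState;
        if (StateChanged != null)
            StateChanged(State);
    }
}

The original class now only reads apocalypseStateMachine.State, and every state transition rule can be covered by a plain level 2 unit test against this class.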


Pattern 2: Method to class

Often I have found that I can’t find a way to extract all of the logic out of a class into one cohesive class, because of multiple interactions inside the class.

When I encounter this problem, I have found that I can usually identify a single large method that is using data from dependencies in the class to perform its logic.

These methods can usually be extracted to be their own new class.  We can pass in the data from the dependencies the method was originally using instead of using the dependencies directly.

Good candidates for this kind of pattern are methods which use multiple dependencies to get data in order to compute a single result.

When applying this pattern we may end up with a very small class, containing only 1 or 2 methods, but we are able to easily unit test this class with state and dependency free unit tests (level 1.)
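
A small sketch of the idea, with names made up purely for illustration: imagine an order-processing class that computes a total by pulling a subtotal from a pricing service and a tax rate from a tax service.  The calculation itself can be lifted into its own class that only ever sees data.

// Extracted class: pure logic that operates only on the data passed in.
public class OrderTotalCalculator
{
    public decimal CalculateTotal(decimal subtotal, decimal taxRate)
    {
        return subtotal + subtotal * taxRate;
    }
}

// The original class keeps its dependencies and simply feeds the data in:
//
//   var subtotal = pricingService.GetSubtotal(order);
//   var taxRate  = taxService.GetTaxRate(order.Region);
//   var total    = new OrderTotalCalculator().CalculateTotal(subtotal, taxRate);

OrderTotalCalculator can now be covered by level 1 unit tests that are nothing more than a few asserts.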

Pattern 3: Pull data sources up

It is often the case that we have dependencies in a class that exist only to provide some sort of data to the actual logic in the class.

In these situations it is often possible to pull the source of the data up and out of the methods that use the data and instead pass in just the used data.

Performing this refactoring allows us to have methods in our class that do not use the dependency in the class directly, but instead have the data that dependency class provided passed in.

When we do this, it opens up the ability to write clean and simple unit tests without mocks for those methods in the class or to apply the “Method to class” pattern to extract the method into its own testable class.

As a first step to refactoring my code to make it easily unit testable, I will often scan the methods in the class to find methods that are only using data from dependencies, but not manipulating the state or sending commands to the dependencies.  As I am able to identify these methods, I can apply this pattern to remove the dependencies from the methods.
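
As a tiny before-and-after illustration (the repository and method names here are invented), pulling a data source up might look like this:

// Before: the method reaches into a dependency for its data,
// so testing it means mocking userRepository.
//
//   public string BuildGreeting()
//   {
//       var user = userRepository.GetCurrentUser();
//       return "Hello, " + user.FirstName + " " + user.LastName;
//   }

// After: the caller looks up the user and passes in just the data,
// so the method can be tested with plain strings and no mocks.
public string BuildGreeting(string firstName, string lastName)
{
    return "Hello, " + firstName + " " + lastName;
}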

Pattern 4: Gather then use

Often we find that we cannot pull a dependency out of a method because that dependency is being manipulated inside of a loop or in the middle of some process.

In cases like these, we can often gather all the data that will be used to manipulate the dependency in some form of a collection and then reiterate through that collection to perform the manipulation on the dependency.

A good way to envision this is to think about two people mailing letters.  In one scenario we might have the first guy stuffing the envelope and the 2nd guy sealing the envelope.

We could carry on this process of 1st guy stuffs, 2nd guy seals for each letter we want to mail, but the process is dependent on both guys being present and working.

If we change it so that the first guy stuffs all the envelopes first and gives that stack of stuffed envelopes to the 2nd guy, they can do their jobs independently.

By applying the same idea to algorithms in our code, we can take a single method and break off the parts that don’t rely on dependencies to test independently or even move to their own classes.
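
In code, a rough sketch of the same idea (the order and email names are made up for illustration) might look like this:

// Before: the dependency is used in the middle of the loop, so the
// formatting logic cannot be tested without mocking emailService.
//
//   foreach (var order in orders)
//       emailService.Send(FormatInvoice(order));

// After: gather the results first with pure logic...
public List<string> FormatInvoices(IEnumerable<string> orders)
{
    var invoices = new List<string>();
    foreach (var order in orders)
        invoices.Add("Invoice for: " + order);   // stand-in for the real formatting work
    return invoices;
}

// ...then a trivial coordinating loop hands the gathered results to the dependency:
//
//   foreach (var invoice in FormatInvoices(orders))
//       emailService.Send(invoice);

The gathering half is now a level 1 unit test target, and the only code still touching the dependency is a loop so simple it hardly needs a test at all.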


Back To Basics: Unit Testing Without Mocks

In my last post, I revealed my conclusions regarding what to do instead of misusing IoC containers and interfaces all over your code mostly for the purpose of unit testing.

One of the approaches I suggested was to focus on writing level 1 or level 2 unit tests and occasionally level 3 unit tests with 1 or possibly 2 dependencies at most.

I want to focus on how this is possible, using real code where I can.

First let’s talk about the approach

When I first talk about writing mainly level 1 or level 2 unit tests, most people assume that I mean to cherry pick the few classes in your code base that already qualify and only write unit tests for those classes.

That is not at all what I am advocating.

Instead, the approach I am suggesting is to find ways to make most of the actual logic in your code become encapsulated into classes that depend mostly on data.

What I mean by this is that our goal should be to refactor or write our code in such a way that logic is grouped into classes that only depend on primitive types and data classes, not other classes that contain logic.

This of course is not fully achievable, because something will have to tie all of these logic containing classes together.  We do need these tie-together classes, but if we can make their job simply to execute commands and tie other classes together, we can feel pretty confident about not unit testing them, and we make our job a whole lot easier.

So to summarize, the basic strategy we are going to employ here is to eliminate the need for mocks by designing our classes to be much smaller, having much tighter roles and responsibilities, and operating on data that is passed in rather than manipulating methods on other classes.

There are many patterns we can use to achieve this goal.  I’ll show you an example, then I’ll try to cover some of the major patterns I have discovered so far.

A real world example

I recently released a Java based Android application called PaceMaker.  When I started out building this application, I set out with the high and mighty goal of using the Google Guice framework for dependency injection and BDD style unit tests using JMock to mock passed in dependencies.  It wasn’t a horrible approach; I wrote about it here.

What I found with this approach though was that I was spending a large amount of time creating mocks, and I wasn’t getting much benefit from it.  So, I abandoned writing unit tests for the code altogether, and I pulled out the now almost useless Guice.

The past couple of nights, I decided to use this project as a test project to demonstrate some of the ideas I have been talking about and thinking about.

I wanted to take my real code, refactor the logic out of it into small classes with little or no dependencies, and write BDD style unit tests that would be simple and easy to understand.

The big challenge here was trying to find a small enough piece of code to use as an example.  For this example, I am going to use a bit of code that I was using to generate the name of a file that I write to disk for saving the data from a run.

This code was originally inside of a presenter class that handled generating a file name to pass to a serializer class in order to serialize the run.


Here is the original private method that existed in the presenter.

private String getRunFileName()
{
     String completeFileName = StorageManager.getDataStorageDirectory();
     Location firstLocation = locationTracker
          .getLocations().iterator().next();
     Date firstPointTime = new Date(firstLocation.getTime());

     SimpleDateFormat dateFormat = new
         SimpleDateFormat("MMddyyyy_HHmmss");
     dateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
     String fileName = dateFormat.format(firstPointTime) + ".gpx";

     completeFileName = completeFileName + fileName;
     return completeFileName;
}

 

This method was a perfect candidate for some easily testable logic that could be put into its own class, but there are a few problems we should notice here.

  • We are dependent on StorageManager to get the data storage directory used as the base directory.
  • We are dependent on the locationTracker object to get the time of the first location.
  • There is some real logic here in the form of a transformation.  (It is important to note that we are dealing with logic, not just commands, because testing execution of commands is not as important as testing logic.)

My approach to this refactor is actually pretty simple.  The first thing we need to do is see the dependencies for what they are.  It looks like our logic is dependent on StorageManager and locationTracker, but in reality the logic is dependent on the string which is the base directory for the file and the time to use for the file name.

We can change this code to reflect that pretty easily.

 

private String getRunFileName(String baseDirectory, Calendar time)
{
     String completeFileName = baseDirectory;
     SimpleDateFormat dateFormat = new
         SimpleDateFormat("MMddyyyy_HHmmss");
     dateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
     String fileName = dateFormat.format(time.getTime()) + ".gpx";

     completeFileName = completeFileName + fileName;
     return completeFileName;
}

What we have done here is small, but it is critical.  We have eliminated dependencies that would otherwise have to be mocked to test this logic.  Sure, we will still need to use those dependencies to get the data to pass into this method, but we can leave that code in the presenter class and move this code into its own class.  The class will be small for now, but it will be easily testable with a level 1 unit test (our favorite kind.)

More examples

I didn’t cherry pick this example from the source code in PaceMaker, but I did cherry pick it for this blog post, because it was one of the shorter examples I could use.

I have several other areas of code in my presenter class in PaceMaker where I used a similar approach: extract out the logic, pull it into its own class with little or no dependencies, and write unit tests for it.

Here are two other examples:

  • Pulled the state logic for starting, stopping, pausing, resuming and calculating length of time paused during a run into its own class.  I elected to add one dependency (the presenter itself as an observer) to the refactored class in order to allow the class to notify the presenter when the state changed.  In C# I would have just used an event to do this, but in Java we use the observer pattern.
  • Pulled out the logic that created the GPX data files into a GPXDataModelBuilder which instead of depending on the LocationTracker class depended only on the data from that class.

The result ended up being very clear, easy to write, level 1 and level 2 unit tests with no mocking.  In addition, my class structure is now much more tightly cohesive, with a much tighter and clearer responsibility.  Before my presenter was doing several things, but now many of those things are broken up into very small testable classes.

In my next post, I’ll go into the patterns you can use to create classes that are able to be tested by level 1 and level 2 unit tests.


Back to Basics: Unit Testing, Automated Blackbox Testing, and Conclusions!

If you’ve been following me from the beginning of the Back to Basics series, you’ll know that I set out to reevaluate some of the commonly held truths of what best practices are, especially in regards to unit testing, dependency injection and inversion of control containers.

We’ve talked about what an interface is, cohesion and coupling, and even went a little bit off track to talk about sorting for a bit.

One of the recurring themes that kept showing up in most of the posts was unit testing.  I talked about why unit testing is hard, and I defined four levels of unit testing.

  • Level 1 – we have a single class with no external dependencies and no state.  We are just testing an algorithm.
  • Level 2 – we have a single class with no external dependencies but it does have state.  We are setting up an object and testing it as a whole.
  • Level 3 – we have a single class with at least one external dependency, but it does not depend on its own internal state.
  • Level 4 – we have a single class with at least one external dependency and depends on its own internal state.

Throughout this series I ended up tearing down the use of interfaces that have only a single non-unit-test implementation.  I criticized the overuse of dependency injection for the sole purpose of unit testing.  I attacked a large portion of best practices that I felt were only really being used in order to be able to unit test classes in isolation.

But, I never offered a solution.  I told you what was bad, but I never told you what was good.

I said don’t create all these extra interfaces, use IoC containers all over your app, and mocks everywhere just for the purpose of being able to isolate a class you want to unit test, but when you asked me what to do instead, I said “I don’t know, I just know what we are doing is wrong and we need to stop.”

Well, that is no answer, but I intend to give one now.  I’ve been thinking about this for months, researching the topic and experimenting on my own.


I finally have an answer

But, before I give it to you, I want to give you a little background on my position on the subject matter.

I come from a pretty solid background of unit testing and test driven development.  I have been preaching both for at least the last 7 years.

I was on board from the beginning with dependency injection and IoC containers.  I had even rolled my own as a way to facilitate isolating dependencies for true unit tests.

I think unit testing and TDD are very good skills to have.  I think everyone should learn them.  TDD truly helps you write object oriented code with small concentrated areas of responsibility.

But, after all this time I have finally concluded, for the most part, that unit tests and practicing TDD in general do more good for the coder than the software.

What?  How can I speak such blasphemy?

The truth of the matter is that I have personally grown as a developer by learning and practicing TDD, which has led me to build better software, but not because the unit tests themselves did much.

What happened is that while I was feeling all that pain of creating mocks for dependencies and trying to unit test code after I had written it, I was learning to reduce dependencies and how to create proper abstractions. 

I feel like I learned the most when the IoC frameworks were the weakest, because I was forced to minimize dependencies for the pain of trying to create so many mocks or not being able to unit test a class in isolation at all.

I’ve gotten to the point now where two things have happened:

  1. I don’t need the TDD training wheels anymore.  I don’t pretend to be a coding god or demi-god of some sort, but in general the code I write that is done in a TDD or BDD style is almost exactly the same as the code I write without it.
  2. The IoC containers have made it so easy to pass 50 dependencies into my constructor that I am no longer feeling the pain that caused my unit tests to cause me to write better code.

What I find myself ending up with now when I write unit tests is 70% mocking code that verifies that my code calls certain methods in a certain order.

Many times I can’t even be sure if my unit test is actually testing what I think it is, because it is so complex.

Umm, did you say you had an answer, dude?

Yes, I do have an answer.  I just wanted to make sure you understand where I am coming from before I throw out all these years of practical knowledge and good practices.

I am not the enemy.

My answer to the problem of what to do if you shouldn’t be using IoC containers and interfaces all over your code base just for the purpose of unit testing, is to take a two pronged approach.


  1. Mostly only write level 1 or level 2 unit tests.  Occasionally write level 3 unit tests if you only have 1 or possibly 2 dependencies.  (I’ll talk about more how to do this in my next post)
  2. Spend a majority of your effort, all the time you would have spent writing unit tests, instead writing what I will call blackbox automated tests or BATs.  (I used to call this automated functional tests, but I think that name is too ambiguous.)

I intend to drill really deep into these approaches in some upcoming posts, but I want to briefly talk about why I am suggesting these two things in place of traditional BDD or TDD approaches.

What are the benefits?

The first obvious benefit is that you won’t be complicating your production code with complex frameworks for injecting dependencies and other clever things that really amount to making unit testing easier.

Again, I am not saying you shouldn’t ever use dependency injection, interfaces or IoC containers.  I am just saying you should use them when they provide a real tangible value (which most of the time is going to require alternate non-unit test implementations of an interface.)

Think about how much simpler your code would be if you just went ahead and new’d up a concrete class when you needed it, instead of creating an extra interface for it and passing it in through the constructor.  You just used it where you needed it and that was that.

The second benefit is that you won’t spend so much time writing hard unit tests.  I know that when I am writing code for a feature I usually spend at least half the amount of time writing unit tests.  This is mostly because I am writing level 3 and level 4 unit tests, which require a large number of mocks.

Mocks kill us.  Mocking has a negative ROI.  Not only is creating them expensive in terms of time, but it also strongly couples our test classes to the system and makes them very fragile.  Plus, mocking adds huge amounts of complexity to unit tests.  Mocking usually ends up causing our unit test code to become unreadable, which makes it almost worthless.

I’ve been writing mocks for years.  I know just about every trick in the book.  I can show you how to do it in Java, in C#, even in C++.  It is always painful, even with auto-mocking libraries.

By skipping the hard unit tests and finding smart ways to make more classes only require level 1 and level 2 unit tests, you are making your job a whole lot easier and maximizing on the activities that give you a high ROI.  Level 1 and level 2 unit tests, in my estimation, give very high ROIs.

The third benefit is that blackbox automated tests are the most valuable tests in your entire system, and now you’ll be writing more of them.  There are many names for these tests; I am calling them BATs now, but basically this is what most companies call automation.  Unfortunately, most companies leave this job to a QA automation engineer instead of the development teams.  Don’t get me wrong, QA automation engineers are great, but there aren’t many of them, good ones are very expensive, and the responsibility shouldn’t lie squarely on their shoulders.

BATs test the whole system working together.  BATs are your automated regression tests for the entire system.  BATs are automated customer acceptance tests, and the ROI for each line of code in a BAT can be much higher than the ROI of each line of production code.

Why?  How is this even possible?  It’s all about leverage, baby.  Each line of code in a BAT may be exercising anywhere from 5 to 500 lines of production code, which is quite the opposite of a unit test, where each line of unit test code might only be testing 1/8th or 1/16th of a line of production code on average (depending on the code coverage numbers being reached).

I’ll save the details for later posts, but it is my strong opinion that a majority of a development team’s effort should be put into BATs, because BATs:

  • Have high value to the customer
  • Regression test the entire system
  • Have a huge ROI per line of code (if you create a proper BAT framework)

Imagine how much higher quality your software would be if you had a BAT for each backlog item in your system which you could run every single iteration of your development process.  Imagine how confident you would be in making changes to the system, knowing that you have an automated set of tests that will catch almost any break in functionality.

Don’t you think achieving that is worth giving up writing level 3 and level 4 unit tests, which are already painful and not very fun to begin with?

In my future posts on the Back to Basics series, I will cover in-depth how to push more of your code into level 1 and level 2 unit tests by extracting logic out to separate classes that have no dependencies, and I will talk more about BATs, and how to get started and be successful using them.  (Hint: you need a good BAT framework.)


Back to Basics: Why Unit Testing is Hard

More and more lately, I’ve been beginning to question the value of unit testing.  I’ve really been starting to wonder if all the work we put into being able to actually test at the unit level and the extra scaffolding we put into our applications to support it is worth the cost.

I’m not going to talk about that subject yet.  Instead, I want to look at some of the costs of unit testing and ask the question “why is unit testing hard?”

After all, if unit testing weren’t hard, we wouldn’t have to question whether or not it was worth it.  It makes sense then to look at first why it is hard and what makes it hard.


The ideal scenario

Unit testing itself is rather easy once you understand how to do it.  Even test driven or behavior driven development is easy once mastered… at least for the ideal scenario.

What is the ideal scenario then?

It is a unit test where the class under test has no external dependencies.

When a class we are writing unit tests for doesn’t have any external dependencies, we don’t need mocks or stubs or anything else.  We can just write code that tests our code.

Let’s look at an example of this.  Suppose, I had a class called Calculator.  This Calculator class has some very simple methods.  Specifically, let us talk about testing a method Add.  Add takes two single digit integers and returns the result.  If either integer passed in is more than a single digit, it throws an exception.

It is a pretty stupid method, with little use, but it will serve the point well here.

We can TDD or BDD this baby with minimal effort.

Let’s start thinking of test cases:

  • When I add 0 and a single digit number it should return the single digit number
  • When I add 0 and 0 it should return 0
  • When I add two single digit numbers it should return the sum of those numbers
  • When I add one two digit number it should throw an exception

Pretty easy to come up with test cases, just as easy to implement them:

[Test]
public void ZeroAndANumber_IsANumber()
{
   var calculator = new Calculator();
   var result =  calculator.Add(0, 5);
   
   Assert.AreEqual(5, result);
}

 

We can then implement the code that will make this test pass pretty easily.  I won’t show it here since it is so trivial.

This is the ideal scenario, or what I will call Level 1 Unit Testing.

Level 1 Unit Testing is where we have a single class with no external dependencies and no state.  We are just testing an algorithm.

Taking it up a notch

The next level of unit testing is reached if we add state to the class under test.

Level 2 Unit Testing is where we have a single class with no external dependencies but it does have state.  We are setting up an object and testing it as a whole.

If we take our existing example, and now we want to add a new method called GetHistory, it is still not difficult to implement the tests, but it gets harder, because we have to make sure we are setting up some state for our object as part of the test.

Let’s look at one of the test cases we might implement for this functionality:

[Test]
public void When3AddOperationsThenGetHistory_ShouldReturnThose3Results()
{
   var calculator = new Calculator();

   // Arrange   
   calculator.Add(1, 3);
   calculator.Add(2, 5);
   calculator.Add(3, 6);

   // Act
   var result = calculator.GetHistory();

   // Assert
   Assert.AreEqual(4, result[0]);
   Assert.AreEqual(7, result[1]);
   Assert.AreEqual(9, result[2]);
   Assert.AreEqual(3, result.Count);
}

 

Again, not too hard.  But, state does make this a bit more difficult.  Here the value of a Behavior Driven Development (BDD) style of unit testing can be seen as it helps us to clearly divide the test up into the different parts we now have.

The real complexity we have added here is that we now have to deal with a setup step before we can execute our test.  BDD deals with this by having a special step for defining the context in which the actual test takes place.  It is called a few different things in different BDD circles, but let’s stick with AAA for this post, since it is easy to remember.

The major difference between Level 1 Unit Testing and Level 2 Unit Testing is that in Level 1, we were really testing only one method.  In Level 2 we are testing at the class level.  Really we could call Level 1 Unit Testing method testing, since the unit we are testing is the method.  The class that method existed in didn’t matter.

Enter dependencies

Let’s see what happens when we throw dependencies into the Calculator class.

Imagine that our Calculator class has to keep an audit trail of our calculations.  We have a service that we can use to put calculations from the Add method into a storage location, like a database and our GetHistory method can query the storage location for the history.

As I was thinking about this, an important point occurred to me.  Were this an integration test, our example test method above wouldn’t change at all.

But, as it turns out, we are talking about unit tests here, so we need to isolate the testing down to the class level.

So let’s think about what our test should do now.  Here are some possible tests we might have.

  • When I add two numbers, the result is returned and the Store method is called on the StorageService with that result.
  • When I get the history, the RetrieveHistory method is called on the StorageService and its results are returned back.

Let’s see what one of these tests might look like:

[Test]
public void WhenAdding2NumbersAndServiceOnline_SumIsReturnedAndStored()
{
   // Arrange
   IStorageService storageServiceMock = Mocker.Mock<IStorageService>();
   storageServiceMock.Stub(service => service.IsServiceOnline())
          .Return(true);

    var calculator = new Calculator(storageServiceMock);   

   // Act
   var result = calculator.Add(3, 4);

   // Assert
   storageServiceMock.AssertWasCalled(service => service.Store(7));
   Assert.AreEqual(7, result);
}

I call this Level 3 Unit Testing.

Level 3 Unit Testing is when we have a single class with at least one external dependency, but it does not depend on its own internal state.

Things really start to get complicated here, because we have to start thinking not just about inputs and outputs and sequences, but now have to think about interactions.

It really starts to get blurry here about what the expectations of our unit tests should be.  In the example code above, do we need to check to make sure IsServiceOnline is called on the StorageService or do we only check that Store was called?

You’ll also notice here that we had to use a mock and pass our dependency into our class so that we could change its behavior.  Along with that came the burden of creating an interface, so that we could have a mock implementation.

If you’re paying attention right now, you may be thinking to yourself that the example is bad.  You may be thinking that the Calculator class now has two responsibilities.

  • It calculates things and returns the result
  • It stores calculation results

Right you are, but we can’t wish away this problem.  Let’s suppose we refactor and move the StorageService dependency out of the Calculator class.  We have several options.  We could make a decorator and use it like this:

var calculator = new Calculator();
var storingCalculator = new StoringCalculator(calculator);

 

Or we could do something more like a mediator pattern like so:

var calcMediator = new CalculatorMediator(calculator, storageService);

However we attempt to solve this problem, we are still going to end up with some class that needs a mock in its unit test.

There is a simple fact that we cannot get around: if we are going to use the StorageService to store calculations, either Calculator will depend on it, or something else will depend on both Calculator and the StorageService.  There is no alternative to those two options.

There is another simple fact we can’t get around also.  If we are going to depend on another class in our unit test, we either need an interface that we can use for the mock class, or we need a mocking framework that will support mocking concrete classes.

So with Level 3 Unit Testing we are stuck with needing to mock at least one dependency and either creating a bogus interface, or using a mocking library that will let us mock concrete classes.

It gets worse

It only gets worse from here.  At Level 3 we didn’t worry about state inside our Calculator class; we worried about an external dependency that pretty much handled state for us.  In many cases, though, we will have to worry about both state and dependencies.

Level 4 Unit Testing is when we have a single class with at least one external dependency and depends on its own internal state.

In our calculator example, we can simply add the requirement that we only want to get the history for a particular session of calculations.  We need to keep track of the calculations so that we can ask the StorageService for the history for our session.

Let’s look at a possible test for that scenario:

[Test]
public void When3AddsThenGetHistory_ShouldReturnOnlyThose3Results()
{
   // Arrange   
   IStorageService storageServiceMock = Mocker.Mock<IStorageService>();
   storageServiceMock.Stub(service => service.IsServiceOnline())
          .Return(true);
   storageServiceMock.Stub(service => service.GetHistorySession(1))
          .Return(new List<int>{4, 7, 9});
   var calculator = new Calculator(storageServiceMock);


   calculator.Add(1, 3);
   calculator.Add(2, 5);
   calculator.Add(3, 6);

   // Act
   var result = calculator.GetHistory();

   // Assert
   storageServiceMock.AssertWasCalled(service =>
              service.GetHistorySession(1));
   Assert.AreEqual(4, result[0]);
   Assert.AreEqual(7, result[1]);
   Assert.AreEqual(9, result[2]);
   Assert.AreEqual(3, result.Count);
}

 

Consider for a moment how fragile and complex this unit testing code is.  Consider how simple the functionality of our class is.

We have a major problem here.  Our unit testing code is more complex than the code it is testing!  It’s OK if the unit testing code is more lines of code than the code it is testing; that is usually the case.  But I consider it a big problem when our unit testing code is more complex, because then you have to ask yourself a very real question.

Where is there more likely to be a bug?

I’m not saying anything yet

My point is not to make a point, at least not yet.  My real goal here is to help us to change the way we think about unit testing.

We need to stop asking the general question of whether or not unit testing is worth the cost and instead ask the more specific question of what level of unit testing is worth the cost.

Level 3+ has a very steep cost as mocking is unavoidable and adds considerable complexity to even the most trivial of implementations.

From that we can draw a bit of wisdom: if we are going to unit test, we should strive to encapsulate as much of our pure logic as possible into classes without dependencies and, if possible, without state.
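
To make that concrete, here's a minimal sketch of what that separation could look like for the calculator example.  The class names are made up; the point is that the arithmetic has no dependencies and no state, while the coordinating class has almost no logic of its own.

// Pure logic: no dependencies, no state.  This is cheap to test:
// call Add and assert on the return value.
public class Arithmetic
{
    public int Add(int firstNumber, int secondNumber)
    {
        return firstNumber + secondNumber;
    }
}

// Coordinator: owns the dependencies and the wiring, but contains
// almost no logic of its own, so there is very little left to mock.
public class CalculatorCoordinator
{
    private readonly Arithmetic arithmetic = new Arithmetic();
    private readonly IStorageService storageService;

    public CalculatorCoordinator(IStorageService storageService)
    {
        this.storageService = storageService;
    }

    public int Add(int firstNumber, int secondNumber)
    {
        int result = arithmetic.Add(firstNumber, secondNumber);
        storageService.Store(result);
        return result;
    }
}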

The other thing to consider is that as the difficulty and complexity of the unit tests increase with each level, the goal and value of the tests start to become lost as well.

What I mean by this is that when we start testing that our class properly calls another class with certain parameters, we are crossing over into testing the implementation details of the class.

If I say a class should be able to add 2 numbers and return the result, I am not talking about how it has to do it.  As long as the result is correct, how doesn't matter.

When I add a mock and say a class needs to add 2 numbers and store the result using a StorageService by calling the method Store on it, I have now tied how into the test.  Changing how breaks the test.

That’s all we are going to look at for now.  If you’ve read some of my other back to basics posts, you can see the progression up to this point.  I’ve been discounting using interfaces and dependency injection for the sake of unit testing, but I have yet to offer an alternative.  I still don’t.  To be honest, I don’t have one yet.  But, I do believe by breaking down this problem to its roots we can evaluate what we are doing and determine what our true problems are.

By the end of this series I hope to have a solution and a recommendation for tackling these kinds of problems.


Basic to Basics: What is Dependency Inversion? Is it IoC? Part 2

In my previous post on dependency inversion, I talked about what dependency inversion is and gave some examples in the real world.

This post is going to focus much more on the details and how it relates to code.

Back to your code…

Now let’s look at a code example to see how dependency inversion helps us out. Let’s say you are creating a high level module for parsing log files and storing some basic information into a database.

In this case you want to be able to handle several different log files from a number of different sources and write some common data they all share to a database.

One approach to this kind of problem is to have your module handle each kind of log file based on what kind of data and format it contains and where it is. Using this approach, in your module you would handle various kinds of log files based on the interface those individual log files present to you. (When I use interface here, I am not talking about the language construct, but the concept of how we interface with something.)

Using this approach, in our module we might have a switch statement or series of if-else statements that lead us to a different code path depending on what kind of log file we are processing. For one log file we might open up a file on disk, and read a line, then split that line based on some delimiter. For another perhaps we open a database connection and read some rows.
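
To make the problem concrete, here's a rough sketch of what that switch-based approach might look like.  The log file types, file paths, and field names here are made up purely for illustration.

using System;
using System.IO;

// The "log files are in control" approach: the module has to know the
// format and location of every kind of log file it handles.
public class LogFileParser
{
    public void ParseAndStore(string logFileType)
    {
        switch (logFileType)
        {
            case "csvFile":
                // Open a file on disk, read each line, split on a delimiter.
                foreach (var line in File.ReadLines(@"C:\logs\app.csv"))
                {
                    var fields = line.Split(',');
                    WriteCommonDataToDatabase(fields[0], fields[1]);
                }
                break;

            case "databaseLog":
                // Open a database connection and read some rows instead.
                break;

            default:
                throw new NotSupportedException(logFileType);
        }
    }

    private void WriteCommonDataToDatabase(string timestamp, string message)
    {
        // Store the common data every log file shares.
    }
}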

The problem is the log files are defining the interface our higher level code has to use. They are in effect “in control” of our code, because they are dictating the behavior our code must conform to.

We can invert this control, and invert the dependencies by specifying an interface that the log files we process must conform to. We don’t even have to use a language level interface.

We could simply create a data class called LogFile that is the input to our module. Anyone who wanted to use our module would first have to convert their files to our format.

We could also create an ILogFileSource interface that classes could implement to contain the logic of parsing log files from different sources. Our module would depend on ILogFileSource and specify what kind of methods and data it needs to parse the log files instead of the other way around.
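
Sketching that out, it might look something like this.  The interface and member names are just for illustration, not a prescribed design.

using System;
using System.Collections.Generic;

// Our module now owns LogFile and ILogFileSource; each source has to conform.
public class LogFile
{
    public DateTime Timestamp { get; set; }
    public string Message { get; set; }
}

public interface ILogFileSource
{
    IEnumerable<LogFile> GetLogFiles();
}

// The high level module depends only on the interface it specified.
public class LogFileParser
{
    private readonly ILogFileSource logFileSource;

    public LogFileParser(ILogFileSource logFileSource)
    {
        this.logFileSource = logFileSource;
    }

    public void ParseAndStore()
    {
        foreach (var logFile in logFileSource.GetLogFiles())
        {
            // Write the common data to the database.
        }
    }
}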

The key point here is that our high level module should be controlling the interface (non language construct kind) that the lower level modules need to adhere to instead of being at the whim of the interfaces of each lower level module.

One way to think of this is that lower level modules provide a service to higher level modules. The higher level module specifies the interface for that service and the lower level module provides it.


One thing I want to point out in this example is that we knew there would be more than one log file source. If we were writing a log file parsing module that was only ever going to work against one source, it might not be worth trying to invert this dependency, because we wouldn't see any benefit from doing so. It isn't very hard for us to write our code as cleanly as possible working with one source and then refactor it later to invert the dependencies once we have additional sources.

Just because you can invert dependencies doesn’t mean you should.

In this case, since we are always writing to a database, I don't feel any particular need to invert our dependency for writing the log data out to the database. However, there is some real value in encapsulating all of our code that interacts with the database in one place, but that is for another post.

Notice we haven’t talked about unit testing yet

 

You see, dependency inversion and inversion of control have nothing specifically to do with unit testing.

Simply slapping an interface on top of a class and injecting it into another class may help with unit testing, but it doesn’t necessarily invert control or dependencies.

I want to use the log parsing example to illustrate my point. Let’s say we had created our log parser to have a switch statement to handle each type of log file, and now we want to unit test the code.

There is no reason why we can't create IDatabaseLogFile, ICSVFileSystemLogFile, IEventLogLogFile and IAmNotReallyDoingIoCLogFile, pass them all into the constructor of our LogFileParser as dependencies and then write our unit tests passing in mocks of each.

That is an extreme example for sure, but the point is slapping an interface onto a class does not an IoC make.

We shouldn’t be trying to implement this principle to make it easier to write unit tests. Unit tests that are difficult to write should give us hints like:

  • Our class is trying to do too much
  • Our class has lots of different dependencies
  • Our class requires a lot of setup to do work
  • Our class is just like this other class that does the same thing only for a different input

All of these kinds of hints tell us that we might want to invert control and invert dependencies to improve the overall design of our class, not because it makes it easier to test. (Although it should also make it easier to test.)

Ok, ok, so is dependency inversion the same as inversion of control or what?

 

Short answer: yes.

It depends on what you mean by control. There are three basic “controls” that can be inverted.

  1. The control of the interface. (How do these two systems, modules, or classes, interact with each other and exchange data?)
  2. The control of the flow. (What controls the flow of the program? This control inversion happens when we go from procedural to event driven.)
  3. The control of dependency creation and binding. (This is the kind of inversion of control IoC containers do. This inversion is passing the control of the actual creation of and selection of dependencies to a 3rd party which is neutral to either of the other 2 involved.)

Each of these 3 is a specific form of dependency inversion and may even involve multiple kinds of dependencies being inverted.

So when someone says “inversion of control”, you should be thinking “what control is being inverted here?”

Dependency inversion is a principle that we use in architecting software.

Inversion of control is a specific pattern that is applied to do so.

Most people only think of inversion of control as #3 above, inverting the control of dependency creation and binding. This is where IoC containers and dependency injection take root.
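
Here is a minimal sketch of just that third kind of inversion, without any particular IoC container; a hand-rolled composition root stands in for whatever neutral third party does the creation and binding. The CsvLogFileSource class is made up for the example, and ILogFileSource and LogFile are the ones from the earlier sketch.

using System.Collections.Generic;

// A made-up implementation of the ILogFileSource interface from earlier.
public class CsvLogFileSource : ILogFileSource
{
    public IEnumerable<LogFile> GetLogFiles()
    {
        yield break;
    }
}

// Not inverted: the class controls the creation and selection of its own dependency.
public class SelfWiringLogFileParser
{
    private readonly ILogFileSource logFileSource = new CsvLogFileSource();
}

// Inverted: the class only declares what it needs...
public class LogFileParser
{
    private readonly ILogFileSource logFileSource;

    public LogFileParser(ILogFileSource logFileSource)
    {
        this.logFileSource = logFileSource;
    }
}

// ...and a neutral third party (a composition root, or an IoC container)
// decides what actually gets created and bound to it.
public static class CompositionRoot
{
    public static LogFileParser BuildParser()
    {
        return new LogFileParser(new CsvLogFileSource());
    }
}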

What can we learn from this?

My goal is that we stop grouping the concepts of inversion of control and dependency inversion automatically with dependency injection.

We have learned that dependency inversion is the core principle that guides many of the other practices that have derived from it.

Whenever we apply a pattern we should be looking for the core principle it is tied to and what problem it is helping us solve.

With this base understanding of dependency inversion and inversion of control, we have the prerequisite knowledge to look at dependency injection and understand better what specific problem it tries to solve. (Which I will cover in another post.)


The Purpose of Unit Testing

I was reminded yesterday that there are still many people out there who don’t really understand the purpose of unit testing.

A funny shift happened in the last 5 or so years.

About 5 years ago, when I would suggest TDD or just doing some unit testing when creating code, I would get horrible responses back.  Many developers and managers didn’t understand why unit testing was important and thought it was just extra work.

More recently when I have heard people talking about unit testing, almost everyone agrees unit testing is a good idea, but not because they understand why, but because it is now expected in the programming world.

Progress without understanding is just moving forward in a random direction.


Getting back to the basics

Unit testing isn’t testing at all.

Unit testing, especially test driven development, is a design or implementation activity, not a testing activity.

You get two primary benefits from unit testing, with a majority of the value going to the first:

  1. Guides your design to be loosely coupled and well fleshed out.  If doing test driven development, it limits the code you write to only what is needed and helps you to evolve that code in small steps.
  2. Provides fast automated regression for refactors and small changes to the code.

I’m not saying that is all the value, but those are the two most important.

(Unit testing also gives you living documentation about how small pieces of the system work.)

Unit testing forces you to actually use the class you are creating and punishes you if the class is too big and contains more than one responsibility.

By that pain, you change your design to be more cohesive and loosely coupled.

You consider more scenarios your class could face and determine the behavior of those, which drives the design and completeness of your class.

When you are done, you end up with some automated tests that do not ensure the system works correctly, but do ensure the functionality does not change.

In reality, the majority of the value is in the act of creating the unit tests when creating the code.  This is one of the main reasons why it makes no sense to go back and write unit tests after the code has been written.

The flawed thinking

Here are some no-nos that indicate you don’t understand unit testing:

  • You are writing the unit tests after the code is written and not during or before.
  • You are having someone else write unit tests for your code.
  • You are writing integration or system tests and calling them unit tests just because they directly call methods in the code.
  • You are having QA write unit tests because they are tests after all.

Unit tests are a lot of work to write.  If you wanted to cover an entire system with unit tests with a decent amount of code coverage, you are talking about a huge amount of work.

If you are not getting the first primary value of unit testing, improving your design, you are wasting a crap load of time and money writing unit tests.

Honestly, what do you think taking a bunch of code that you or someone else already wrote and having everyone start writing unit tests for it will do?

Do you think it will improve the code magically just by adding unit tests without even changing the code?

Perhaps you think the value of having automated regression is so high that it will justify this kind of cost?

I’m not saying not to add unit tests to legacy code.  What I am saying is that when you add unit tests to legacy code, you better be getting your value out of it, because it is hard work and costs many hours.

When you touch legacy code, refactor that code and use the unit tests to guide that refactored design.

Don’t assume unit tests are magic.


Unit tests are like guidelines that help you cut straight.  It is ridiculous to try to add guidelines to a woodworking project after you have already cut the wood.

Living Dangerously: Refactoring without a Safety Net

It’s usually a good idea to have unit tests in place before refactoring some code.

I’m going to go against the grain here today though and tell you that it is not always required.

Many times code that should be refactored doesn’t get refactored due to the myth that you must always have unit tests in place before refactoring.

In many cases the same code stays unimproved over many revisions because the effort of creating the unit tests needed to refactor it is too high.

I think this is a shame because it is not always necessary to have unit tests in place before refactoring.


Forgoing the safety net

If you go to the circus, you will notice that some acts always have a safety net below because the stunt is so dangerous that there is always a chance of failure.

You’ll also notice that some acts don’t have a safety net, because even though there is a risk of danger, that risk is extremely small thanks to the training of the performers.

Today I’m going to talk about some of the instances where you don’t necessarily need to have a safety net in place before doing the refactor.

Automatic refactoring

This is an easy one that should be fairly obvious.  If you use a modern IDE like Visual Studio, Eclipse, or IntelliJ, you will no doubt have seen what I call “right-click refactor” options.

Any of these automatic refactors are pretty much safe to do anytime without any worry of changing functionality.  These kinds of automated refactors simply apply an algorithm to the code to produce the desired result and in almost all cases do not change functionality.

These refactoring tools you can trust because there is not a chance for human error.

Any time you have the option of using an automatic refactoring, do it!  It just makes sense, even if you have unit tests.  I am always surprised when I pair up with someone and they are manually refactoring things like “extract method” or “rename.”

Most of the time everything you want to do to some code can be found in one of the automatic refactoring menus.
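
As a hypothetical example, this is roughly the transformation an automated “extract method” performs for you.  The invoice code and the LineItem class here are made up; the tool applies this change mechanically.

using System.Collections.Generic;

public class LineItem
{
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

// Before: the subtotal loop is buried inline.
public class InvoiceCalculatorBefore
{
    public decimal GetTotal(IEnumerable<LineItem> lineItems)
    {
        decimal subtotal = 0;
        foreach (var lineItem in lineItems)
        {
            subtotal += lineItem.Price * lineItem.Quantity;
        }
        return subtotal * 1.08m;  // made-up tax multiplier
    }
}

// After an automated "extract method": same behavior, better named pieces.
public class InvoiceCalculatorAfter
{
    public decimal GetTotal(IEnumerable<LineItem> lineItems)
    {
        return GetSubtotal(lineItems) * 1.08m;
    }

    private decimal GetSubtotal(IEnumerable<LineItem> lineItems)
    {
        decimal subtotal = 0;
        foreach (var lineItem in lineItems)
        {
            subtotal += lineItem.Price * lineItem.Quantity;
        }
        return subtotal;
    }
}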

Small step refactors

While not as safe as automatic refactors, if you have a refactor that is a very small step, there is a much higher chance your brain can understand it and prevent any side effects.

A good example of this would be my post on refactoring the removal of conditions.

The general idea is that if you can make very simple small steps that are so trivial that there is almost no chance of mistake, then you can end up making a big refactor as the net effect of those little changes.

This one is a judgment call.  It is up to you to decide if what you are doing is a small step or not.

I do find that if I want to do a refactor that isn’t a small step refactor, I can usually break it down into a series of small steps that I can feel pretty confident in.  (Most of the time these will be automated refactors anyway.)

Turning methods into classes

I hate huge classes.  Many times everyone is afraid to take stuff out of a huge class because it is likely to break and it would take years to write unit tests for that class.

One simple step, which greatly improves the architecture and lets you eventually create unit tests, is to take a big ol’ chunk of that class, move it to a new class, and keep all the logic in there exactly how it is.

It’s not always totally clean; you might have to pass in some dependencies to the new method or the new class’s constructor, but if you can do it, it can be an easy and safe refactor that will allow you to write unit tests for the new class.

Obviously this one is slightly more dangerous than the other two I have mentioned before, but it also is one that has a huge “bang for your buck.”
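
Here is a rough sketch of the idea with made-up names.  The logic moves verbatim; whatever it used from the big class becomes a constructor or method parameter on the new class, which can now get unit tests of its own.

// Before: a chunk of logic buried in a huge class.
namespace Before
{
    public class OrderManager
    {
        // ...hundreds of other lines...

        public decimal CalculateShipping(decimal weightInPounds, string destination)
        {
            // a big ol' chunk of shipping logic lives here
            return weightInPounds * 0.5m;
        }
    }
}

// After: the chunk is moved, logic unchanged, into a class we can unit test.
namespace After
{
    public class ShippingCalculator
    {
        public decimal CalculateShipping(decimal weightInPounds, string destination)
        {
            // exactly the same logic as before, just moved
            return weightInPounds * 0.5m;
        }
    }

    public class OrderManager
    {
        private readonly ShippingCalculator shippingCalculator = new ShippingCalculator();

        public decimal CalculateShipping(decimal weightInPounds, string destination)
        {
            return shippingCalculator.CalculateShipping(weightInPounds, destination);
        }
    }
}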

Unit tests, or test code themselves

Another obvious one.  Unless you are going to write meta-unit tests, you are going to have to live a little dangerously on this one.  You really have no choice.

I think everyone will agree that refactoring unit tests is important though.   So, how come no one is afraid to refactor unit tests?

I only include this example to make the point that you shouldn’t be so scared to refactor code without unit tests.  You probably do it pretty frequently with your unit tests.

I’m not advocating recklessness here

I know some of you are freaking out right now.

Be assured, my message is not to haphazardly refactor code without unit tests.  My message is simply to use temperance when considering a refactor.

Don’t forgo a refactor just because you are following a hard and fast rule that you need unit tests first.

Instead, I am suggesting that some refactorings are so trivial and safe that if it comes down to a choice between leaving the code as it is because unit testing will take too long, or refactoring the code without a safety net, don’t be a… umm… pu… wimp.  Use your brain!

Things that will bite you hard

There are a few things to watch out for, even with the automatic refactoring.  Even those can fail and cause all kinds of problems for you.

Most of these issues won’t exist in your code base unless you are doing some crazy funky stuff.

  • If you’re using dynamic in C#, or some kind of PInvoke, unsafe (pointer manipulation) or COM interop, all bets are off on things like rename.
  • Reflection.  Watch out for this one.  This can really kick you in the gonads.  If you are using reflection, changing a method name or a type could cause a failure that is only going to be seen at runtime.  (See the sketch after this list.)
  • Code generation.  Watch out for this one also.  If generated code is depending on a particular implementation of some functionality in your system, refactoring tools won’t have any idea.
  • External published interfaces.  This goes without saying, but it is so important that I will mention it here.  Watch out for other people using your published APIs.  Whether you have unit tests or not, refactoring published APIs can cause you a whole bunch of nightmares.
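
Here is a minimal sketch of the reflection pitfall mentioned above, with made-up classes.  The compiler and the refactoring tool only see the string "Add", so renaming Calculator.Add leaves this code compiling happily and failing at runtime.

public class Calculator
{
    public int Add(int firstNumber, int secondNumber)
    {
        return firstNumber + secondNumber;
    }
}

public static class ReflectionCaller
{
    public static int CallAdd(Calculator calculator)
    {
        // The method is looked up by name at runtime, not checked at compile time.
        var method = typeof(Calculator).GetMethod("Add");

        // If Add was renamed, method is null here and this line throws at runtime.
        return (int)method.Invoke(calculator, new object[] { 1, 2 });
    }
}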

This list isn’t to scare you off from refactoring, but if you know any of the things in this list are in your code base, check before you do the refactor.  Make sure that the code you are refactoring won’t be affected by these kinds of things.