Category Archives: Legacy Code

Tie Your Shoes and Pull Up Your Pants

What slows down the development of software?

Think about this question for a bit.  Why is it that as most software evolves it gets harder and harder to add features and improve its structure?

Why is it that tasks that would have at one point been simple are now difficult and complex?

Why is it that teams that should be doing better over time seem to get worse?

Seeking answers

Don’t feel bad if you don’t have an immediate answer to those questions.  Most software practitioners don’t.  They are hard questions after all.

If we knew all the answers, we wouldn’t really have these problems to begin with.

Regardless though, you’ll find many managers, business owners, customers and even software developers themselves looking for the answers to these questions, but often looking in the wrong place.

Process is almost always the first to be blamed. It stands to reason that a degradation of process or problems with the software development process are slowing things down.

Often there is some merit to this proposition, but I’ve found that it is often not the root cause. If your team is not sitting idle and the work that is important is being prioritized, chances are your process is not slowing you down.

Now don’t get me wrong here.  I am not saying that these are the only two important aspects by which to judge a software development process, but I am saying that if your team is generally working hard on important stuff most of the time, you can’t magically improve process to the point of increasing productivity by an order of magnitude.  (In most cases.)

Often questions are asked like:

  • Should we pair program or not pair program?
  • Should we be using Scrum instead of Kanban?
  • Should we be changing the way we define a backlog?
  • Should we use t-shirt sizes or story points or make all backlog items the same size?
  • Do we need more developers or more business analysts?
  • Do we need to organize the team differently?

Now these are all great questions that every software project should constantly evaluate and ask themselves, but I’ve found over and over again that there is often a bigger problem staring us in the face that often gets ignored.

The code!

Let’s do a little experiment.

Forget about process.  Forget about Scrum and backlogs and story points and everything else for a moment.

You are a developer.  You have a task to implement some feature in the code base.  No one else is around, there is no process, you just need to get this work done.

It might help to think about a feature you recently implemented or one that you are working on now.  The important thing with this experiment is that I want to take away all the other “stuff” that isn’t related directly to designing and implementing that feature in the code base.

You will likely come to one of these conclusions:

1. The feature is easy to implement, you can do it quickly and know where to go and what to modify.

Good!  That means you don’t really have a problem.

2. It is unclear what to do.  You aren’t sure exactly what you are supposed to implement and how it fits into the way the system will be used.

In this case, you may actually have somewhat of a process problem.  Your work needs to be more clearly defined before you begin on it.  It may be that you just need to ask more questions.  It may be that half baked ideas are ending up in your pipeline and someone needs to do a bit more thinking and legwork, before asking a developer to work on them.

3. It’s hard to change the code.  You’ve got to really dig into multiple areas and ask many questions about how things are working or are intended to work before you can make any changes.

This is the most likely case.  Actually usually a combination of 2 and 3.  And they both share a common problem—the code and system do not have a design or have departed from that design.

I find time and time again with most software systems experiencing a slow down in feature development turnaround that the problem is the code itself and the system has lost touch with its original design.

You only find this problem in successful companies though, because…

Sometimes you need to run with your shoelaces untied

I’ve consulted for several startups that eventually failed.  There was one thing in common with those startups and many other startups in general—they had a well maintained and cared for codebase.

I’ve seen the best designs and best code in failed startups.

This seems a bit contradictory, I know, but let me explain.

The problem is that often these startups with pristine and well maintained code don’t make it to market fast enough.  They are basically making sure their shoelaces are nicely tied as they stroll down the block carefully judging each step before it is taken.

What happens is they have the best designed and most maintainable product, but one of two things goes wrong.  Either it doesn’t get out there fast enough and the competition comes in with some VB6 app that two caffeine fueled not-really-programmers-but-I-learned-a-bit-of-code developers wrote overnight, or they don’t actually build what the customer wants, because they don’t iterate quickly enough.

Now am I saying that you need to write crap code with no design and ship it or you will fail?

Am I saying that you can’t start a company with good software development practices and a clean well maintainable codebase and succeed?

No, but what I am saying is that a majority of companies that are successful are the ones that put the focus on customers and getting the product out there first and software second.

In other words if you look at 10 successful companies over 5 years old and look at their codebase, 9 of them might have some pretty crappy or non-existent architecture and a system that departed pretty far from the original design.

Didn’t you say something about pulling up your pants?

Ok, so what am I driving at with all this?

Time for an analogy.

So these companies that are winning and surviving past year 5, they are usually running.  They are running fast, but in the process of running their shoelaces come untied.

They might not even notice the shoelaces are untied until the first few times they step on one and trip.  Regardless, they keep running.  And to some degree, this is good.  This is what makes them succeed, while the competitors who do take the time to tie their shoelaces end up far behind in the race.

The problem comes shortly after that 5 year mark, when they want to take things to the next level.  All this time they have been running with those shoelaces untied and they have learned to do this kind of wobble run where they occasionally trip on a shoelace, but they try to keep their legs far enough apart to not actually step on one.

It slows them down a bit, but they are still running.  Still adding those features fast and furious.

After some time though, their pants start to fall down.  They don’t really have time to stop running and pull up those pants, so as they are running those pants slip further down.

Now they are really running funny.  At this point they are putting forth the effort of running, but the shoelaces and pants are just too much, and they are moving quite slowly.  An old woman with ankle weights power walks past them, but they can’t stop now to tie the shoelaces and pull up those pants, because they have to make up for the time they lost earlier when the pants first fell down.

At this point they start looking for ways to fix the problem without slowing down and pulling up the pants.  At this point they try running different ways.  They try skipping.  Someone gets the idea that they need more legs.

I think you get the idea.

What they really need to do at this point though is…

Stop running, tie your shoes and pull up your pants!

Hopefully you’ve figured out that this analogy is what happens to a mature system’s code base and overall architecture.

Over time when you are running so fast, your system ends up getting its shoelaces undone, which slows you down a little.  Soon, your system’s pants start to fall down and then you really start to slow down.

It gets worse and worse until you are moving so slowly you are actually moving backwards.

Unfortunately, I don’t have a magic answer.  If you’ve gotten the artificial speed boost you can gain from neglecting overall system design and architecture, you have to pay the piper and redesign that system and refactor it back into an architecture.

This might be a complete rewrite, or it might be a concerted effort to get things back on track.  But regardless, it is going to require you to stop running.  (Have you ever tried to tie your shoelaces while running?)

Don’t feel bad, you didn’t do anything wrong.  You survived where others who were too careful failed.  Just don’t ignore the fact that your pants are at your ankles and you are tripping over every step, do something about it!

Going Backwards to Go Forwards

I worked on an interesting problem this week that might have looked like I was running around in circles if you just looked at my SVN commits.

The problem, and the eventual solution, reminded me of an important part of software development—of building anything really.

Sometimes you must tear it down!

No really, sometimes you build a structure only to tear it down the very next day.

It’s not a mistake.  It is intentional and productive and if you are not doing it, you very well might be making a real mistake.

Skeptical?

That is highly illogical

Imagine for a moment that you are tasked with the job of repairing the outside walls of a 2 story building.

There are of course many ways you could go about doing something like this.

  • Use a ladder and just reach the part of the wall the ladder allows you to.  Then move that ladder, repeating the process as needed to complete the repair to the entire wall.
  • Try to lower yourself down from different windows to reach as much of the wall as possible.
  • Tear down the entire building and rebuild the building and walls.

I am sure there are plenty of other methods besides what I listed here.

Yet, a very simple approach would be to build a scaffolding.

A scaffolding is basically a temporary structure, used just to help repair or initially build a building, that is built for the very purpose of eventually being torn down.

Software is different, we don’t contend with physics!

You are absolutely right!

We contend with a much more powerful force…

LOGIC!

Conceptually anything you can create in software could be created without any kind of software scaffolding.  Unfortunately though, the complexity of a system’s logic and our ability as humans to hold only so much of it in our brains impose a very real limitation on our ability to even see what the final structure of a metaphysical thing like software should be.

So what am I saying here then?

I’m just saying that it sometimes helps to remember that you can temporarily change some code or design in a way that you know will not be permanent to get you to a place in the codebase where your head is above the clouds and you can look down and see the problem a little better.

Just like there are physical ways to repair a wall on a 2 story building that don’t involve wasting any materials by building something that will be torn down, there are ways to build software without ever building scaffoldings, but in either case it is not the most efficient use of time or effort.

Don’t be afraid to take a blind hop of faith

Source control is awesome!

Source control lets us make changes to a code base, track those changes and revert them if needed.

Working on a code base without source control is like crawling into a small crevice in an underground cave that has just enough room to fit your shoulders—you are pretty sure you can go forward, but you don’t know if you can crawl back out.

But with source control, it is like walking around a wide open cave charting your trail on a map as you go.  You can get back to wherever you want at any time.

So as long as you have source control on your side, and especially if you are using a distributed source control where you can make local commits all day long, you don’t have to be afraid to step out on a software ledge and see where things go.

Oftentimes I will encounter some problem in code that I know needs to be fixed and I know what is wrong with it, but I just don’t know how to fix it.

When I encounter a problem like this, I will often have an idea of at least one step I could take that should take me in the direction I believe the code needs to go.

A real example

To give you a real example, I was working recently on some code that involved creating a lot for a product.  (Think about the lot number you might see on a box of Aspirin that identifies what batch of ingredients it came from and what machine produced it when.)

Up to the point at which I had been working with my teammate on refactoring this code, there had been only one way to produce the lots.

We were adding some code that would add an additional way to create the lots and the labels for those lots.  We also knew that more ways would need to be added in the future.

Well, we knew we wanted to have a generic way of specifying how lots should be created and labeled, but we didn’t know what the code that would do that would look like.

We could have rewritten the entire lot handling code and made it generic, but it was quite complex code and that would be equivalent to tearing down a building to fix a wall.

It was apparent there were some parts of the existing code that were specific to the way that the current lots were being produced, so we knew those parts of the code had to be changed.

So what did we do?

We started with what we knew needed to be changed and what we thought would move us in the direction we wanted to go, and we built some structures to allow us to refactor those parts of the code, knowing that we would probably end up deleting those structures.

The scaffolds that we created allowed us to have a midway point in the long journey across the Pacific ocean of complexity in our refactor trip.

The point here is that we had to take a bit of a blind hop in a direction we thought things needed to go, and part of that hop involved creating some scaffolding type structures to get that code to a place where it still worked and we could examine it again for the next refinement.

The final refinements ended up deleting those scaffolds and replacing them with a more elegant solution, but it is doubtful we would have been able to see the path to the elegant solution without building them in the first place.

Show me the code!?

You may wonder why I am talking about code in such an abstract way instead of just showing you a nice little “scaffold pattern” or a real code example.

I’m not going to show the code, because it isn’t relevant to my point and it is so situational that it would detract from what I am trying to say here.

The point is not what our problem or solution was, but how we got there.

There isn’t a “scaffolding” pattern that you can just apply to your code.  Rather, it is a tool that you can use and shouldn’t be afraid to use in order to move your code forward to a better design.  (To its happy place.)

Refactoring Switches to Classes

I’ve talked about refactoring switch statements several times before.

I’ve even created a defaultable dictionary for refactoring a switch statement into a dictionary of actions.

This time, I am going to talk about refactoring switches when you have switch statements operating on the same set of data, but have different actions in different circumstances.

First let’s recap

When I talked about refactoring switches before, we were mainly dealing with a single switch statement somewhere in code.

In the case where you have only a single switch statement, or multiple switch statements that do the same thing based on the data, using a dictionary is still a great way to go.
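
As a reminder of what that looks like, here is a minimal sketch of a dictionary of actions standing in for a single switch.  It is not the defaultable dictionary mentioned above, just the plain idea; the ClassType enumeration, the AttackTable class and the returned strings are all assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enumeration standing in for the classType values used in this post.
public enum ClassType { Warrior, Mage, Thief }

public static class AttackTable
{
    // One entry per case label; each value is the action that case would run.
    // Thief is left out on purpose to show the fallback path.
    private static readonly Dictionary<ClassType, Func<string>> Attacks =
        new Dictionary<ClassType, Func<string>>
        {
            { ClassType.Warrior, () => "swings sword" },
            { ClassType.Mage,    () => "casts spell" }
        };

    // A missing key plays the role of the switch's default branch.
    public static string AttackFor(ClassType classType)
    {
        Func<string> attack;
        return Attacks.TryGetValue(classType, out attack) ? attack() : "does nothing";
    }
}
```

Calling AttackTable.AttackFor(ClassType.Mage) runs the Mage entry, while a value with no entry falls through to the default string.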

However, there are going to be circumstances where you are going to be switching on the same data, but in different contexts.  In these cases, you will want to perform different actions.

Let’s look at an example.

// In fighting code
switch(classType)
{
    case WARRIOR:
          swingSword();
          break;
    case MAGE:
          castSpell();
          break;
    case THIEF:
          backstab();
          break;
}

// In wear armor code
switch(classType)
{
    case WARRIOR:
          return CAN_WEAR;
    case MAGE:
          return isConsideredLightArmor(armor);
    case THIEF:
          if(isSneaking)
              return NOT_NOW;
          return isConsideredLightArmor(armor);
}

In this example, we are switching on the same enumeration, but we are doing it in different locations of the code.

Using a dictionary would not work well here because we would need multiple dictionaries.

We still don’t want to leave this as it is though, because the code is pretty messy and fragile.

Separation of concerns

The problem is the code that contains these switch statements has too much responsibility.  It is being asked to handle logic for each one of our character class types.

What we need to do to improve this code is refactor the enumerations into their own classes.  Each switch statement will become a method that will be implemented by our enumeration based class.

If we are using Java, we can use Java’s enumeration implementation that allows for methods on an enumeration.  If we are using a language like C#, we still have to map the enumeration value to each class.

Let’s start by making our classes.

First we need a base class, or interface.

public interface CharacterClass
{
    void Attack();
    // Armor is the (assumed) type of the armor being tried on.
    ArmorResponse WearArmor(Armor armor);
}

Now we can create classes that implement this interface, that contain the logic that was in each switch statement.

public class Warrior : CharacterClass
{
    // Interface members must be public when implemented implicitly.
    public void Attack()
    {
       swingSword();
    }

    public ArmorResponse WearArmor(Armor armor)
    {
       return CAN_WEAR;
    }
}

public class Mage : CharacterClass
{
    public void Attack()
    {
       castSpell();
    }

    public ArmorResponse WearArmor(Armor armor)
    {
       return isConsideredLightArmor(armor);
    }
}

public class Thief : CharacterClass
{
    public void Attack()
    {
       backstab();
    }

    public ArmorResponse WearArmor(Armor armor)
    {
       if(isSneaking)
           return NOT_NOW;
       return isConsideredLightArmor(armor);
    }
}

Next we can map our enumeration to our class.

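// ClassType is an assumed name for the enumeration's type.
public Dictionary<ClassType, CharacterClass> characterDictionary =
    new Dictionary<ClassType, CharacterClass> {
    { WARRIOR, new Warrior() },
    { MAGE, new Mage() },
    { THIEF, new Thief() }
};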

We could also get rid of the enumeration if we wanted, and just create the appropriate class.  It will depend on what your existing code looks like.

No more switches!

Now let’s take a look at what we end up with in the two locations where we had switches.

// In fighting code
myCharacter.Attack();

// In wear armor code
var armorResponse =  myCharacter.WearArmor(armor);

If we want to add a new character class type, we just add a new class that implements the CharacterClass interface and put a mapping in our dictionary, or in our character initialization code.

If we end up having other places in our logic where different character class types should have different behavior, we just add a method to our CharacterClass interface and implement it in any classes that implement CharacterClass.
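
To see the pieces working together, here is a rough, self-contained sketch of the refactored design.  Several things are assumptions made so the example can stand alone: the ClassType and ArmorResponse enumerations, the Armor class with its IsLight flag, the CharacterClasses lookup, and Attack returning a string (in the code above it returns nothing) so that the result is observable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical supporting types, invented for this sketch.
public enum ClassType { Warrior, Mage, Thief }
public enum ArmorResponse { CanWear, CannotWear, NotNow }

public class Armor
{
    public bool IsLight { get; }
    public Armor(bool isLight) { IsLight = isLight; }
}

public interface CharacterClass
{
    string Attack();                        // returns a description instead of acting
    ArmorResponse WearArmor(Armor armor);
}

public class Warrior : CharacterClass
{
    public string Attack() => "swings sword";
    public ArmorResponse WearArmor(Armor armor) => ArmorResponse.CanWear;
}

public class Mage : CharacterClass
{
    public string Attack() => "casts spell";
    public ArmorResponse WearArmor(Armor armor) =>
        armor.IsLight ? ArmorResponse.CanWear : ArmorResponse.CannotWear;
}

public class Thief : CharacterClass
{
    public bool IsSneaking { get; set; }
    public string Attack() => "backstabs";

    public ArmorResponse WearArmor(Armor armor)
    {
        if (IsSneaking) return ArmorResponse.NotNow;
        return armor.IsLight ? ArmorResponse.CanWear : ArmorResponse.CannotWear;
    }
}

public static class CharacterClasses
{
    // The one remaining place that knows about the enumeration.
    private static readonly Dictionary<ClassType, Func<CharacterClass>> Factories =
        new Dictionary<ClassType, Func<CharacterClass>>
        {
            { ClassType.Warrior, () => new Warrior() },
            { ClassType.Mage,    () => new Mage() },
            { ClassType.Thief,   () => new Thief() }
        };

    public static CharacterClass For(ClassType classType) => Factories[classType]();
}
```

The calling code asks CharacterClasses.For(classType) once, keeps the result as myCharacter, and from then on never looks at the enumeration again.  The dictionary holds factories rather than shared instances here because Thief carries state (IsSneaking); if your classes are stateless, storing instances as in the mapping above works just as well.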

Our code is much more maintainable, and easier to understand.

The Purpose of Unit Testing

I was reminded yesterday that there are still many people out there who don’t really understand the purpose of unit testing.

A funny shift happened in the last 5 or so years.

About 5 years ago, when I would suggest TDD or just doing some unit testing when creating code, I would get horrible responses back.  Many developers and managers didn’t understand why unit testing was important and thought it was just extra work.

More recently when I have heard people talking about unit testing, almost everyone agrees unit testing is a good idea, not because they understand why, but because it is now expected in the programming world.

Progress without understanding is just moving forward in a random direction.

Getting back to the basics

Unit testing isn’t testing at all.

Unit testing, especially test driven development, is a design or implementation activity, not a testing activity.

You get two primary benefits from unit testing, with a majority of the value going to the first:

  1. Guides your design to be loosely coupled and well fleshed out.  If doing test driven development, it limits the code you write to only what is needed and helps you to evolve that code in small steps.
  2. Provides fast automated regression for refactors and small changes to the code.

I’m not saying that is all the value, but those are the two most important.

(Unit testing also gives you living documentation about how small pieces of the system work.)

Unit testing forces you to actually use the class you are creating and punishes you if the class is too big and contains more than one responsibility.

By that pain, you change your design to be more cohesive and loosely coupled.

You consider more scenarios your class could face and determine the behavior of those, which drives the design and completeness of your class.

When you are done, you end up with some automated tests that do not ensure the system works correctly, but do ensure the functionality does not change.

In reality, the majority of the value is in the act of creating the unit tests when creating the code.  This is one of the main reasons why it makes no sense to go back and write unit tests after the code has been written.

The flawed thinking

Here are some no-nos that indicate you don’t understand unit testing:

  • You are writing the unit tests after the code is written and not during or before.
  • You are having someone else write unit tests for your code.
  • You are writing integration or system tests and calling them unit tests just because they directly call methods in the code.
  • You are having QA write unit tests because they are tests after all.

Unit tests are a lot of work to write.  If you wanted to cover an entire system with unit tests with a decent amount of code coverage, you are talking about a huge amount of work.

If you are not getting the first primary value of unit testing, improving your design, you are wasting a crap load of time and money writing unit tests.

Honestly, what do you think taking a bunch of code you already wrote or someone else did and having everyone start writing unit tests for it will do?

Do you think it will improve the code magically just by adding unit tests without even changing the code?

Perhaps you think the value of having regression is so high that it will justify this kind of a cost?

I’m not saying not to add unit tests to legacy code.  What I am saying is that when you add unit tests to legacy code, you better be getting your value out of it, because it is hard work and costs many hours.

When you touch legacy code, refactor that code and use the unit tests to guide that refactored design.

Don’t assume unit tests are magic.

Unit tests are like guidelines that help you cut straight.  It is ridiculous to try and add guidelines to a woodworking project after you have already cut the wood.

Refactoring Static Methods Step-Wise vs Wrapping and Delegating

In working with legacy code, I often come across the problem of having to refactor classes that contain static methods or are entirely static methods.

I talked about refactoring helper classes before, but this is slightly different.

In this case I want to talk about refactoring classes that you want to keep around, but have all or many static members.  A good example of this is some kind of service class that returns data from the database.

It’s not always very clear whether that kind of class really is some sort of helper class.  It is a bit of a judgment call.  If you do find a helper class though, don’t just leave it there.

So, basically if you have determined the class you are working with is going to stay, but it does have static methods and you need to get rid of them because you are doing dependency injection or mocking, read on.

Defining the two approaches

What do I mean by step-wise refactoring?

Here is the basic outline of the first approach:

  1. Make the method you need non-static.
  2. Add an interface with just that one method in it.
  3. Implement the interface.
  4. Change the references to that method to use the interface instead.

Let’s take a look at an example:

public static class LostOrderService
{
    public static IEnumerable<Orders> GetLostOrders(int customerId)
    {
    }

    public static IEnumerable<Customer> GetCustomerWithLostOrders()
    {
    }
}

If we are interested in the GetLostOrders method, we can apply steps 1-3 to get:

public interface ILostOrderService
{
    IEnumerable<Orders> GetLostOrders(int customerId);
}

public class LostOrderService : ILostOrderService
{
    public IEnumerable<Orders> GetLostOrders(int customerId)
    {
    }

    [Obsolete("If you touch this, refactor it to make it not static. " +
              "Be a man, do the right thing.")]
    public static IEnumerable<Customer> GetCustomerWithLostOrders()
    {
    }
}

Now we can go in and change references in our code for just that one method.

public void ProcessLostOrders()
{
    var orders = LostOrderService.GetLostOrders(_customerId);
    foreach(var order in orders)
    {
        ReprocessOrder();
    }

}

// Refactored becomes:

public void ProcessLostOrders(ILostOrderService lostOrderService)
{
    var orders = lostOrderService.GetLostOrders(_customerId);
    foreach(var order in orders)
    {
        ReprocessOrder();
    }

}
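
The payoff of passing in the interface is that the calling code can now be exercised without a database.  Here is a minimal sketch of that, under some assumptions: the OrderProcessor class, the FakeLostOrderService test double, the Id property on Orders and the ReprocessedOrderIds list are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the article's Orders type; the Id property is an assumption.
public class Orders
{
    public int Id { get; set; }
}

public interface ILostOrderService
{
    IEnumerable<Orders> GetLostOrders(int customerId);
}

// Hand-rolled fake: returns canned orders instead of hitting the database.
public class FakeLostOrderService : ILostOrderService
{
    private readonly List<Orders> _orders;

    public FakeLostOrderService(params Orders[] orders)
    {
        _orders = orders.ToList();
    }

    public IEnumerable<Orders> GetLostOrders(int customerId)
    {
        return _orders;
    }
}

// Hypothetical owner of ProcessLostOrders.
public class OrderProcessor
{
    private readonly int _customerId;

    // Records what was reprocessed so a test can inspect it.
    public List<int> ReprocessedOrderIds { get; } = new List<int>();

    public OrderProcessor(int customerId)
    {
        _customerId = customerId;
    }

    public void ProcessLostOrders(ILostOrderService lostOrderService)
    {
        var orders = lostOrderService.GetLostOrders(_customerId);
        foreach (var order in orders)
        {
            ReprocessOrder(order);
        }
    }

    private void ReprocessOrder(Orders order)
    {
        ReprocessedOrderIds.Add(order.Id);
    }
}
```

A test can construct an OrderProcessor, hand it a FakeLostOrderService holding a couple of orders, and check ReprocessedOrderIds afterwards.  None of that was possible while the method reached straight for the static LostOrderService.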

Now let’s look at the 2nd technique, wrapping and delegating.  Here is the outline of the wrapping and delegating approach:

  1. Create a wrapper class that’s going to be used to wrap the static class’s calls.
  2. Implement all the methods in the static class as non-static methods in the wrapper class.  Each method just delegates to the static method in the static class.
  3. Create an interface which contains all the methods.
  4. Have the wrapper class implement the interface.

Here is an example of doing it this way, given the same original code as the first example:

public interface ILostOrderService
{
    IEnumerable<Orders> GetLostOrders(int customerId);
    IEnumerable<Customer> GetCustomerWithLostOrders();
}

public class LostOrderServiceWrapper : ILostOrderService
{
    // Each method simply delegates to the matching static method.
    public IEnumerable<Orders> GetLostOrders(int customerId)
    {
        return LostOrderService.GetLostOrders(customerId);
    }

    public IEnumerable<Customer> GetCustomerWithLostOrders()
    {
        return LostOrderService.GetCustomerWithLostOrders();
    }
}

The references to the LostOrderService will be refactored exactly the same as in the first example, so I won’t include it here.

You can see in this example that we didn’t touch LostOrderService itself, although you probably want to put an Obsolete attribute on the class to tell users not to call it directly, but to use the wrapper class instead.

Which is more bettah?

I’ve tended to use the wrapping and delegating approach in the past, but I am starting to think the step-wise approach is better for a few reasons.

  • The step-wise approach is a bit more obvious to someone later using the class.  When you wrap and delegate, someone has to know there is a wrapper class.  With the step-wise approach, there is no choice.
  • With the step-wise approach, you are actually getting rid of the bad and evil static methods.  When you wrap and delegate, you are still leaving them there, just hiding them behind a wrapper.

To me, the wrapping approach feels more like I am working around things rather than cleaning them up.  With the step-wise approach, I also feel like someone can see what I started and pick it up from there step-by-step, whereas with the wrapping approach, the mess may never get cleaned up, because there is a workaround.

Where wrap and delegate shines

There is a place that wrap and delegate wins hands down though.

If you don’t have control over the source code of the static class or static calls, you cannot do the step-wise approach.

The wrap and delegate approach can be a lifesaver when you are dealing with static references in your code to an external library that you cannot change.  You can simply wrap the external library calls and instead reference the wrapper in your code.  Now you can actually unit test that code.

Anytime you are using an external library, you should consider putting some kind of protective wrapper around it.  You never know when you may want to replace it or upgrade the library.  You don’t want to go hunting through all your code looking for references.
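
As a small illustration of wrapping a static you don’t own, here is a sketch that puts the framework’s static DateTime.UtcNow property behind an interface.  The IClock, SystemClock, FixedClock and LostOrderPolicy names, and the 30 day rule, are assumptions invented for this example and are not part of any library.

```csharp
using System;

// Hypothetical interface that hides the static call from the rest of the code.
public interface IClock
{
    DateTime UtcNow { get; }
}

// Production wrapper: delegates straight to the static framework property.
public class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// Test double: always reports the time it was given.
public class FixedClock : IClock
{
    public FixedClock(DateTime utcNow) { UtcNow = utcNow; }
    public DateTime UtcNow { get; }
}

public class LostOrderPolicy
{
    private readonly IClock _clock;

    public LostOrderPolicy(IClock clock) { _clock = clock; }

    // Assumed rule for the sketch: an order is lost once it is over 30 days old.
    public bool IsLost(DateTime placedUtc)
    {
        return (_clock.UtcNow - placedUtc).TotalDays > 30;
    }
}
```

Production code constructs LostOrderPolicy with a SystemClock, while a test passes a FixedClock and gets the same answer every time it runs.  The only code that still touches the static is the one-line wrapper.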

So, while either way will work, I prefer to use the step-wise method if I have access to the source code of the static class.

What do you think?  Do you have any other solutions?

How to Refactor the Helper Class

In my previous post, I posed the question Should I Leave that Helper Class?  Hopefully I’ve convinced you that you should not leave, but should refactor the helper class.

Now, I’m going to detail some of the techniques I have used to eliminate helper classes in legacy code.

First, let’s set a ground rule: We are not going to just jump into legacy code and eliminate helper classes for the heck of it.  Why?

  1. It doesn’t have a good return on investment (ROI) for the time you spend doing it.
  2. Your manager or general overlord will probably look at you with a disapproving frown, since you’re not adding any tangible value to the product.
  3. If you break something, you will give refactoring a bad name, and be shunned by other developers.  You will have to wear a big red scarlet “R”.
  4. It is not exactly fun.  I mean, it shouldn’t be fun…  Let me put it this way.  If this kind of thing is fun for you, then I’ve got a bunch of  other “fun” stuff you can do around my house.

So what are we going to do then?  We are going to refactor the helper class into real classes or existing classes when we are modifying or adding functionality to it.  Let’s get started…

Modifying a method

If you have to modify a method that is in a helper class, the very first step is to move the logic as it is into a concrete class that we can write a unit test for.  Here is an example:

public static int getDependantCount(MonsterObject stuff)
{
    int dependantCount = 0;
    List<Relation> relations = stuff.getPerson().getRelations();
    foreach(Relation relation in relations)
    {
        if(RelationshipHelper.isDependant(stuff.getPerson(), relation))
        {
            dependantCount++;
        }
    }

    return dependantCount;
}

Looking at this example, the first thing we need to do is to figure out what real class this helper class’s method belongs to.  (Quick side note here:  notice the helper method in question also uses another helper class.  This is likely to be the case in the real world.)

One technique I use to figure this out is to look and see what data this helper method is using.  In this simple example it is pretty obvious that the data it is operating on belongs to Person, even though the method is passing in MonsterObject.  Usually the correct place to move a helper method is the class where the most references in the method become references to “this”.

In this case, let’s move the helper method to Person.  Here is what it would look like after:

private int getDependantCount()
{
    int dependantCount = 0;
    List<Relation> relations = this.getRelations();
    foreach(Relation relation in relations)
    {
        if(RelationshipHelper.isDependant(this, relation))
        {
            dependantCount++;
        }
    }

    return dependantCount;
}

Notice what we did here?

  • We eliminated a parameter being passed in.
  • We replaced a bunch of calls on the parameter with “this”.
  • We made the method non-static and private.
  • And of course we moved it onto Person, where it belonged.

We still have a reference to the helper method it was originally calling, but we can eliminate that later down the road.  If this logic ends up being complex, we might create a DependantCounter class that takes in a list of relations; our Person method would instantiate it and call it to get the dependant count.
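Here is a rough sketch of what that DependantCounter idea could look like in Java.  Everything beyond the names Person, Relation and DependantCounter is an assumption made for illustration: the Relation type is stripped down to an age, and the “under 18” rule is an invented stand-in for whatever RelationshipHelper.isDependant really checks.

```java
import java.util.List;

// Hypothetical stand-in for the real Relation type; it only carries an age.
class Relation {
    private final int age;

    Relation(int age) {
        this.age = age;
    }

    int getAge() {
        return age;
    }
}

// Owns the counting logic once it has outgrown Person.
class DependantCounter {
    private final List<Relation> relations;

    DependantCounter(List<Relation> relations) {
        this.relations = relations;
    }

    int count() {
        int dependantCount = 0;
        for (Relation relation : relations) {
            if (isDependant(relation)) {
                dependantCount++;
            }
        }
        return dependantCount;
    }

    // Assumed rule, standing in for RelationshipHelper.isDependant.
    private boolean isDependant(Relation relation) {
        return relation.getAge() < 18;
    }
}

class Person {
    private final List<Relation> relations;

    Person(List<Relation> relations) {
        this.relations = relations;
    }

    List<Relation> getRelations() {
        return relations;
    }

    // Person keeps the method but hands the work to the counter.
    int getDependantCount() {
        return new DependantCounter(getRelations()).count();
    }
}
```

Callers still ask the Person for the count; they never need to know the counter exists.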

Our next step here is to write a unit test that tests the current functionality, then check in our code.  Finally, after we have that done, we can write a unit test that will fail for the changes we want to make to the method, and then modify the method.
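A test that pins down the current behavior might look roughly like the sketch below.  It assumes no test framework (in a real project this would probably be a JUnit test or similar), it uses a cut-down Relation with a simple dependant flag, and it makes getDependantCount package-visible so the test can call it.  All of those are assumptions for the sake of a small example.

```java
import java.util.List;

// Hypothetical minimal types; the real Relation and Person carry far more.
class Relation {
    private final boolean dependant;

    Relation(boolean dependant) {
        this.dependant = dependant;
    }

    boolean isDependant() {
        return dependant;
    }
}

class Person {
    private final List<Relation> relations;

    Person(List<Relation> relations) {
        this.relations = relations;
    }

    // Package-visible here (an assumption) so that a test can call it directly.
    int getDependantCount() {
        int dependantCount = 0;
        for (Relation relation : relations) {
            if (relation.isDependant()) {
                dependantCount++;
            }
        }
        return dependantCount;
    }
}

// Pins down what the method does today, before any change is made.
class PersonCharacterizationTest {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void countsOnlyDependantRelations() {
        Person person = new Person(
                List.of(new Relation(true), new Relation(false), new Relation(true)));
        check(person.getDependantCount() == 2, "two of three relations are dependants");
    }

    static void noRelationsMeansZero() {
        Person person = new Person(List.of());
        check(person.getDependantCount() == 0, "no relations, no dependants");
    }

    public static void main(String[] args) {
        countsOnlyDependantRelations();
        noRelationsMeansZero();
        System.out.println("current behavior pinned");
    }
}
```

Once these pass and are checked in, the failing test for the new behavior goes in next to them.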

It is much cleaner and easier to do things this way, and we have just eliminated a method in a helper class!

Adding a method

Adding a method to a helper class is much easier.  JUST DON’T DO IT!!!  Instead figure out what data that method is going to operate on and move it to the class that contains that data.

If the functionality you are going to add is large and seems to have its own responsibility, then go ahead and create a new class.
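As a small illustration, suppose the new functionality is building a person’s full name.  The sketch below is hypothetical (the two name fields and the getFullName method are made up for this example), but it shows the method landing on the class that holds the data instead of in something like a PersonHelper.

```java
// Hypothetical Person with two name fields.
class Person {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // The new behavior lives next to the data it reads,
    // rather than in a static PersonHelper.getFullName(person).
    String getFullName() {
        return firstName + " " + lastName;
    }
}
```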

As you are modifying code and bringing the helper methods into real classes, or adding new methods to classes that by convention would have gone into helper classes, you may start to see some of the classes these methods are being moved into grow.   That is OK; you are discovering that you need more classes.  A helper class is not a spillover class for long methods that you don’t want to put into the class they actually belong in.  Instead, the appropriate thing to do is to break up the class based on responsibility.

To stick with the current example, imagine a Person class that has some data on it for money.  Perhaps there is a private variable called cashOnHand.  As you add to the class you may end up bringing in data on their savings account and their outstanding loans.  You might bring in methods that operate on their savings account information and their cash on hand.  It is OK and good to discover that a Person is a separate thing from a person’s financial data.  At that point you might create a class called Financials and a person would have a reference to it.
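A minimal sketch of that split could look like this.  Only Person, Financials and cashOnHand come from the description above; the other field names and the two methods are assumptions chosen to make the example concrete.

```java
import java.math.BigDecimal;

// Groups the money data with the methods that operate on it.
class Financials {
    private final BigDecimal cashOnHand;
    private final BigDecimal savingsBalance;    // hypothetical field
    private final BigDecimal outstandingLoans;  // hypothetical field

    Financials(BigDecimal cashOnHand, BigDecimal savingsBalance, BigDecimal outstandingLoans) {
        this.cashOnHand = cashOnHand;
        this.savingsBalance = savingsBalance;
        this.outstandingLoans = outstandingLoans;
    }

    // Cash plus savings.
    BigDecimal getLiquidAssets() {
        return cashOnHand.add(savingsBalance);
    }

    // Liquid assets minus what is owed.
    BigDecimal getNetWorth() {
        return getLiquidAssets().subtract(outstandingLoans);
    }
}

// Person no longer knows the details of the money, only where to find them.
class Person {
    private final String name;
    private final Financials financials;

    Person(String name, Financials financials) {
        this.name = name;
        this.financials = financials;
    }

    String getName() {
        return name;
    }

    Financials getFinancials() {
        return financials;
    }
}
```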

Refactoring helper classes is about figuring out where things belong.

It is just like cleaning out the junk drawer in your kitchen.  You have to go through each item and find out where its real home should be.  If there isn’t one, you might have to make one.

If a method operates on a piece of data, it belongs as close to that data as possible.  Don’t try to tackle the huge helper class all at once, but rather eliminate the helper class piece by piece as you change or add functionality.

Should I Leave That Helper Class?

The project I am working on is riddled with “helper” classes.  What is a helper class?

Good question.  I don’t really know.  Neither does the helper class.

When you ask the helper class, “What do you do?” he half smiles, looks down at his over-sized feet and replies with a squirrely “stuff”.

How to identify helper classes

There are a few common attributes we can look at that will tell us if we are dealing with a helper class, in no particular order:

  • Doesn’t have a clear responsibility of any kind.
  • Doesn’t hold any of its own state data.
  • Has mostly or all static methods.
  • Class name ends in helper.  (This is a good tip off!)
  • If it does get newed up somewhere, it gets passed all around afterwards.
  • Lives in a package or namespace called “utilities”.

A helper class is a class that contains auxiliary methods for other classes, but isn’t really a thing in and of itself.  A helper class is the opposite of object-oriented programming.  I wrote about the dangers of static methods before, and helper classes usually are the result of proliferation and breeding of static methods.
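To make those attributes concrete, here is a made-up specimen in Java.  None of these names come from a real code base; the class is a hypothetical example of the smell, not something to copy.

```java
import java.time.DayOfWeek;

// Hypothetical helper class: name ends in "Helper", holds no state, all static.
final class OrderHelper {
    private OrderHelper() {
        // Never instantiated.
    }

    // Works on customer data, so it arguably belongs on a customer class.
    static String formatCustomerName(String first, String last) {
        return last + ", " + first;
    }

    // Works on pricing data, so it arguably belongs with the price.
    static double applyDiscount(double price, double percent) {
        return price - price * percent / 100.0;
    }

    // Has nothing to do with orders at all.
    static boolean isWeekend(DayOfWeek day) {
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }
}
```

Three methods, three unrelated sets of data, and no answer to “what do you do?” other than “stuff”.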

We are going to skip going any further into why they are bad and go straight into the burning question…  When you see one of these in your code base…

Should you just leave it there?

[Image: “Just say no”]

(The above picture means “No”)

When you see a car accident on the freeway that no one has reported, should you just drive on and not dial 911?

When you see an old woman being beaten on the street, should you walk right on by?

When you open your fridge, and you open the vegetable drawer and you see rotting cucumber mush in a bag, do you just forget you ever saw it?

I’m not suggesting you should start diving into your legacy code base and start removing all the helper methods right now.  But what I am saying is that if you are working inside of a helper class to change some functionality and you think it is OK to just add one more method using some lame excuse like “it’s the convention,” I’d like to take a big boat paddle and teach you some single responsibility.  Don’t be part of the problem.  Be part of the solution.

Here are some lame excuses for leaving helper classes and propagating them:

  • I am just making a small change to the code.
  • I don’t want to break this stuff that is already working.
  • I am just following the convention of the architecture.
  • I don’t understand how it works.
  • There is no class this functionality belongs to.
  • I’m a lazy bastard and I don’t care about making the world a better place.
  • The world is going to end in 2012 anyway.

If you’re using one of these lame excuses… STOP IT!  3,000-line helper classes weren’t born overnight.  Some idiot first created the class, then more idiots added methods to it.  Don’t be just another idiot.  I implore you.  We have enough.

John, I want to do the right thing… help me.

What?  You do?  I’ll assume you are being sincere… even though I have my reservations.

First take this oath.  Place your hand on The Art of Computer Programming and repeat after me.

I, <your name>, solemnly swear to not propagate the aberration of pure evil and generally sucky code known as the helper class.

I promise to uphold the values of single responsibility, data abstraction, and the open/closed principle.

I will vanquish helper classes, and helper methods and properly put them in associated classes where they belong, under no less penalty than having my arms and legs removed with a butter knife.

Welcome, initiates.  In my next post I’ll tell you some techniques I use to eliminate helper classes.