Category Archives: Refactoring

Going Backwards to Go Forwards

I worked on an interesting problem this week that might have looked like I was running around in circles if you just looked at my SVN commits.

The problem, and the eventual solution, reminded me of an important part of software development—of building anything really.

Sometimes you must tear it down!

No really, sometimes you build a structure only to tear it down the very next day.

It’s not a mistake.  It is intentional and productive and if you are not doing it, you very well might be making a real mistake.

Skeptical?

data and spock thumb Going Backwards to Go Forwards

That is highly illogical

Imagine for a moment that you are tasked with the job of repairing the outside walls of a 2 story building.

There are of course many ways you could go about doing something like this.

  • Use a ladder and just reach the part of the wall the ladder allows you to.  Then move that ladder, repeating the process as needed to complete the repair to the entire wall.
  • Try to lower yourself down from different windows to reach as much of the wall as possible.
  • Tear down the entire building and rebuild the building and walls.

I am sure there are plenty of other methods besides what I listed here.

Yet, a very simple approach would be to build a scaffolding.

A scaffolding is basically a temporary construction used just to help repair or initially build a building which is built for the very purpose of eventually being torn down.

Software is different, we don’t contend with physics!

You are absolutely right!

We contend with a much more powerful force…

LOGIC!

Conceptually anything you can create in software could be created without any kind of software scaffolding.  Unfortunately though, the complexities of the logic of a system and our abilities as humans to only contain so much of it in our brains impose a very real limitation on our ability to even see what the final structure of a metaphysical thing like software should be.

So what am I saying here then?

I’m just saying that it sometimes helps to remember that you can temporarily change some code or design in a way that you know will not be permanent to get you to a place in the codebase where your head is above the clouds and you can look down and see the problem a little better.

Just like there are physical ways to repair a wall on a 2 story building that don’t involve wasting any materials by building something that will be torn down, there are ways to do build software without ever building scaffoldings, but in either case it is not the most efficient use of time or effort.

Don’t be afraid to take a blind hop of faith

Source control is awesome!

Source control lets us make changes to a code base, track those changes and revert them if needed.

Working on a code base without source control is like crawling into a small crevice in an underground cave that has just enough room to fit your shoulders—you are pretty sure you can go forward, but you don’t know if you can crawl back out.

morrils cave crevice thumb Going Backwards to Go Forwards

But, with source control, it is like walking around a wide open cave charting your trail on a map as you go.  You can get back to where you want to any time.

So as long as you have source control on your side, and especially if you are using a distributed source control where you can make local commits all day long, you don’t have to be afraid to step out on a software ledge and see where things go.

Oftentimes I will encounter some problem in code that I know needs to be fixed and I know what is wrong with it, but I just don’t know how to fix it.

When I encounter a problem like this, I will often have an idea of at least one step I could take that should take me in the direction I believe the code needs to go.

A real example

To give you a real example, I was working recently on some code that involved creating a lot for a product.  (Think about the lot number you might see on a box of Aspirin that identifies what batch of ingredients it came from and what machine produced it when.)

kas 9012 thumb Going Backwards to Go Forwards

Up to the point in which I had been working with my teammate on refactoring this code, there had only been only one way to produce the lots.

We were adding some code that would add an additional way to create the lots and the labels for those lots.  We also knew that there would be more additional ways in the future that would need to be added.

Well, we knew we wanted to have a generic way of specifying how lots should be created and labeled, but we didn’t know what the code that would do that would look like.

We could have rewritten the entire lot handling code and made it generic, but it was quite complex code and that would be equivalent to tearing down a building to fix a wall.

It was apparent there were some parts of the existing code that were specific to the way that the current lots were being produced, so we knew those parts of the code had to be changed.

So what did we do?

We started with what we knew needed to be changed and what we thought would move us in the direction we wanted to go, and we built some structures to allow us to refactor those parts of the code, knowing that we would probably end up deleting those structures.

The scaffolds that we created allowed us to have a midway point in the long journey across the Pacific ocean of complexity in our refactor trip.

The point here is that we had to take a bit of hop blindly in a direction we thought things needed to go and part of that hop involved creating some scaffolding type structures to get that code to a place where it still worked and we could examine it again for the next refinement.

The final refinements ended up deleting those scaffolds and replacing them with a more elegant solution, but it is doubtful we would have been able to see the path to the elegant solution without building them in the first place.

Show me the code!?

You may wonder why I am talking about code in such an abstract way instead of just showing you a nice little “scaffold pattern” or a real code example.

I’m not going to show the code, because it isn’t relevant to my point and it is so situational that it would detract from what I am trying to say here.

The point is not what our problem or solution was, but how we got there.

There isn’t a “scaffolding” pattern that you can just apply to your code, rather it is a tool that you can use and shouldn’t be afraid to use in order to move your code forward to a better design.  (To its happy place.)

Types of Duplication in Code

One of the biggest reasons to refactor code is to eliminate duplication.  It is pretty easy to introduce duplication in our code either unintentionally or because we don’t know how to prevent or get rid of it.

The three types of duplication

I’ve found that there are three basic types of duplication that we can eliminate from our code that successfully build on each other.

  • Data
  • Type
  • Algorithm

Most developers tend to get stuck at the data level, but in this post, I will show you how to recognize type and algorithm duplication and refactor it out of your code.

duplicate content thumb Types of Duplication in Code

Data duplication

The most basic type of duplication is that of data.  It is also very easily recognizable.

Take a look at these methods:

public Position WalkNorth()
{
   var player = GetPlayer();
   player.Move("N");
   return player.NewPosition;
}

public Position WalkSouth()
{
   var player = GetPlayer();
   player.Move("S");
   return player.NewPosition;
}

public Position WalkEast()
{
   var player = GetPlayer();
   player.Move("E");
   return player.NewPosition;
}

public Position WalkWest()
{
   var player = GetPlayer();
   player.Move("W");
   return player.NewPosition;
}

 

Pretty easy to see here what needs to be refactored.

Most developers don’t need any help to realize that you should probably refactor this code to a method like the following:

public Position Walk(string direction)
{
   var player = GetPlayer();
   player.Move(direction);
   return player.NewPosition;
} 

 

In this example data is duplicated.  To be specific the string data of the direction passed into move is duplicated.  We can eliminate that duplication by creating a method that parameterizes the differences represented by that data.

Type duplication

Now, data duplication is where a majority of developers stop, but we can go much farther than that.  In many cases the difference between two methods is only the type in which they operate on.

With the use of generics in C#, we can refactor out type and parameterize this concept as well.

Look at this example:

public int FindIntMatch(int i)
{
   var match = (int)container.Get(i);
   return match;
}

public string FindStringMatch(string s)
{
   var match = (string)container.Get(s);
   return match;
}

 

Here we have two method that do pretty much the same thing, but they just differ on the type they operate on.  Generics gives us the ability to actually refactor out that type information just like we would with data.

public T FindMatch(T t)
{
   var match = (T)container.Get(t);
   return match;
}

 

By refactoring to the above method we have eliminated duplication. We have achieved this by refactoring out type.

Algorithm duplication

Without a good understanding of delegates and functional programming, few developers ever even consider refactoring out algorithm duplication, but it can be done fairly easily.

Take a look at this example:

public void GoForRun()
{
   GetDressed();
   Run();
   Shower();
}

public void LiftWeights()
{
   GetDressed();
   Lift();
   Shower();
}

 

It is a pretty basic example, but it highlights the kind of duplication that I often see left in many code bases.  Delegates in C# allow us to treat functions like data.  With this ability we can easily refactor out the commonality in these two method to get something like this:

public void DoFitnessActivity(Action activity)
{
   GetDressed();
   activity();
   Shower();
}

 

We could have also refactored out this duplication by using an abstract base class and having the inherited classes definite their own fitness activity, but using delegates creates a much simpler approach and casts the problem in the same light as refactoring any other type of data.

Combining them together

Often I find that several different types of duplication are present across several methods in a class.

When this is the case, it is often possible to apply data, type and algorithm duplication refactoring techniques to find the most simple and elegant solutions.

I’ve also found this is a skill that must be practiced.  When I first really started using generics and delegates in C#, I had a hard time finding uses for them, because I could not easily recognize the patterns of duplication that called for them.  But, I found over time that it became easier and easier to recognize where these techniques could be applied to reduce duplication in my methods.

I’ve also found the key to eliminating duplication is sometimes to first exaggerate it.  Often I will purposely take two methods that I know have some duplication and make them look even more duplicated in order to be able to clearly see where the duplication lies.  I may do several small refactoring steps to get to the point where it is easy to identify what data, type or algorithm is being repeated.

Small Refactorings are OK

Many programmers seems to get caught up on the idea of refactoring.

Most of us are familiar with the Boy Scout rule which says:

Always leave code better than when you found it

But do you actually apply it in your day to day work?

I’ve found that for myself the answer to this question is sometimes “no.”

boyscout thumb Small Refactorings are OK

Why we don’t follow the rule

Personally, I know that there are many reasons why I have failed to follow the Boy Scout rule in my own day to day coding activities.

How often do you say to yourself something like:

“Yeah, I’ve just got to get this code checked in.  I don’t have time to clean it up.”

Or

“Refactoring this correctly would take too long, and I want to make sure I do it right.”

At first glance, these seem like perfectly valid excuses, but the problem is that the cumulative effect of this kind of thinking is exactly what causes code rot.

“Code rot” is when code from your application begins to become brittle and hard to maintain.

As software developers we should really strive to prevent code rot, and regularly refactoring and cleaning up code, is like brushing the teeth of our application.

There are definitely a large number of excuses I come up with for not refactoring code, but I would say that the number one mental block is this idea of perfection and needing to do it right.

Small refactorings are good!

One thing I try to tell myself is that small refactorings are good and I don’t need to solve the whole problem all at once.

We shouldn’t let the fact that we can’t completely clean up a section of code or refactor it to the final structure we want, prevent us from putting that code on a bus headed that direction.

Many programmers tend to have the perfect solution mindset which requires us to find the 100% best solution to a problem and think that the 95% effective one is no good.

This mindset can be a huge stumbling block to productivity and it can also be a big hindrance to keeping our campsite clean.

It is often helpful to embrace that a series of small changes can be more beneficial than one large change of the same resulting magnitude, even if the small changes end up requiring more total work.

The reason for this is two-fold:

  1. Big changes rarely actually get done, so they are put off
  2. Small changes usually are more natural and evolve the code in an organically correct direction.

Going backwards to go forwards

I even find that many times I take one step backwards in order to go two forwards.  Refactoring sometimes has to just progress naturally as you make something clearer, only to undue it a bit later with another change that ends up making more sense once you can actually see the problem being solved more clearly.

Think about solving a Rubix Cube.  If you have ever attempted to solve one of these things, you know that sometimes you have to wreck that perfect wall of green in order to start getting the white blocks in place.  Many times it is impossible to solve a Rubix Cube without traversing backwards in the solution first.

rubix thumb Small Refactorings are OK

The point is, don’t be afraid to get out there and make something clearer, or go a direction that seems like it will at least improve the code.

You don’t have to find the perfect design pattern to refactor the code into in order to start making changes.

  • Start by changing this variable name to be a bit more clear.
  • Extract those lines of code into a method that makes that functionality more clear.
  • Get rid of some duplication here and there.

The active code reader

When I am in my “zone” and doing things right, I am even refactoring code while I am reading it.

There is no better way to understand some code than to refactor it.

Think about it, how do we learn?

We read something or are taught it and then we rephrase it differently to confirm understanding.

“Let me get this straight, are you saying… blah blah blah?”

“Oh, now I get it, if I do blah and blah then blah blah?”

Why shouldn’t we do this with code?

I know some of you are really scared by this idea, and you’re saying “nope, don’t just go touching code you don’t understand, John.  You are not getting anywhere near my code base.”

But, give it a shot, what is the worst that is going to happen?  You are going to refactor yourself into a dead end and have to revert your changes?

More likely than not, you will end up learning the code better and improving it.  Two for one!

One more analogy then I’m done

I promise!

Ever solved a crossword puzzle?

Did you sit there and immediately fill in all the answers one by one?

Perhaps you filled in some answers that you knew.  Perhaps the short ones first, then you went back over all the clues again and suddenly found that with some letters filled in you could better guess the clues.

Most likely you made several sweeps like this until you finally solved the puzzle or gave up in disgust wondering why you wasted an hour of your life and who the heck studies books on geography that could actually solve this puzzle.

Do you think it would be any different with code?  Making small refactorings is like filling in the clues you know the answer to in a crossword puzzle.  As you keep refactoring, the answers to other puzzles about the code and which way it should go become clearer and clearer.

Don’t try and solve the whole puzzle one by one in a single pass.

Refactoring Switches to Classes

I’ve talked about refactoring switch statements several times before.

I’ve even created a defaultable dictionary for refactoring a switch statement into a dictionary of actions.

This time, I am going to talk about refactoring switches when you have switch statements operating on the same set of data, but have different actions in different circumstances.

switch thumb Refactoring Switches to Classes

First let’s recap

When I talked about refactoring switches before, we were mainly dealing with a single switch statement somewhere in code.

In the case where you have only a single switch statement, or multiple switch statements that do the same thing based on the data, using a dictionary is still a great way to go.

However, there are going to be circumstances where you are going to be switching on the same data, but in different contexts.  In these cases, you will want to perform different actions.

Let’s look at an example.

// In fighting code
switch(classType)
{
    case WARRIOR:
          swingSword();
          break;
    case MAGE:
          castSpell();
          break;
    case THIEF:
          backstab():
          break;
}

// In wear armor code
switch(classType)
{
    case WARRIOR:
          return CAN_WEAR;
    case MAGE:
          return isConsideredLightArmor(armor);
    case THIEF:
          if(isSneaking)
              return NOT_NOW;
          return isConsideredLightArmor(armor);
}

In this example, we are switching on the same enumeration, but we are doing it in different locations of the code.

Using a dictionary would not work well here because we would need multiple dictionaries.

We still don’t want to leave this as it is though, because the code is pretty messy and fragile.

mage thumb Refactoring Switches to Classes

Separation of concerns

The problem is the code that contains these switch statements has too much responsibility.  It is being asked to handle logic for each one of our character class types.

What we need to do to improve this code is refactor the enumerations into their own classes.  Each switch statement will become a method that will be implemented by our enumeration based class.

If we are using Java, we can use Java’s enumeration implementation that allows for methods on an enumeration.  If we are using a language like C#, we still have to map the enumeration value to each class.

Let’s start by making our classes.

First we need a base class, or interface.

public interface CharacterClass
{
    void Attack();
    ArmorResponse WearArmor(armor);
}

Now we can create classes that implement this interface, that contain the logic that was in each switch statement.

public class Warrior : CharacterClass
{
    void Attack()
    {
       swingSword();
    }

    ArmorResponse WearArmor(armor)
    {
       return CAN_WEAR;
    }
}

public class Mage : CharacterClass
{
    void Attack()
    {
       castSpell();
    }

    ArmorResponse WearArmor(armor)
    {
       return isConsideredLightArmor(armor);
    }
}

public class Thief : CharacterClass
{
    void Attack()
    {
       backstab();
    }

    ArmorResponse WearArmor(armor)
    {
       if(isSneaking)
           return NOT_NOW;
       return isConsideredLightArmor(armor);
    }
}

Next we can map our enumeration to our class.

public Dictionary characterDictionary =
    new Dictionary {
    { WARRIOR, new Warrior() },
    { MAGE, new Mage() },
    { THIEF, new Thief() }
};

We could also get rid of the enumeration if we wanted, and just create the appropriate class.  It will depend on what your existing code looks like.

No more switches!

Now let’s take a look at what we end up with in the two locations where we had switches.

// In fighting code
myCharacter.Attack();

// In wear armor code
var armorResponse =  myCharacter.WearArmor(armor);

If we want to add a new character class type, we just add a new class that implements the CharacterClass interface and put a mapping in our dictionary, or in our character initialization code.

If we end up having other places in our logic where different character class types should have different behavior, we just add a method to our CharacterClass interface and implement it in any classes that implement CharacterClass.

Our code is much more maintainable, and easier to understand.

A Simple Wrapper To Make Things More Fluent

This post is really a continuation from my last post on using a method that takes an Action to address cross cutting concerns, like logging, without having to go to a full blown AOP implementation.

Someone mentioned in the comments that it wasn’t very clear exactly what was going on with the final code.  I tend to agree that this:

LogOnError(_riceCooker.Cook);

… is not very clear.

Really, there are two problems with this code that I can see.

  1. It is not clear what this is going to do or whether or not LogOnError or Cook is the method we are concerned about.
  2. It’s not very self-discoverable at all.  If we had a library of useful wrapper methods like these, we wouldn’t have a good intellisense way to know what they are.

I can solve both of those issues, but doing so starts to move us into a weird zone where I am not quite sure I feel comfortable.  But, nevertheless, in the name of science…

wrapped thumb A Simple Wrapper To Make Things More Fluent

Let’s start backwards

Liking fluent interfaces, here is the kind of syntax that I would prefer to be able to use:

Wrapper.Wrap(_riceCooker.Cook).With.LogOnError();

It is a little bit longer syntax, but I like it for a few reasons:

  1. It clearly indicates what is going on here.  We are wrapping a method call using a wrapper.  We are wrapping with a method called LogOnError.
  2. You get intellisense all the way.  The correct implementation of this, should let me Type With + ‘.’ and then see a list of all the possible wrapping methods I have implemented.  This makes the wrapping set of methods self-discoverable.

I really like the idea of being able to easily change the functionality of the wrapping just by changing the last part of the line.  For example, if we had implemented a wrapping method that was LogAndAbortOnError(), we could change our code to use that pretty easily.

Wrapper.Wrap(_riceCooker.Cook).With.LogAndAbortOnError();

If we implement this correctly, intellisense will give us our options.

Making it so

Creating a fluent syntax in C# can often involve quite a bit of magic and voodoo.  I always like to gather my reagents before embarking on such a journey.

So grab a live chicken, a stapler, and a sharp knife and let’s go!

First step, let’s simplify this.  The With is nice, but it is just for flow, we don’t really need it.  So let’s figure out how to implement our syntax without the With and add it in afterwards.

Wrapper.Wrap(_riceCooker.Cook).LogOnError();

First the easy way.

  1. Create a static Wrapper class with a Wrap method that takes an Action and returns an Action.  (We’ll use this to convert whatever we pass in to an Action, so that we can use a Lambda expression or any method call there.)
  2. Create a static extension method that operates on an Action.  Call it LogOnError.
public static class Wrapper
{
    public static Action Wrap(Action action)
    {
        return action;
    }

    public static void LogOnError(this Action action)
    {
        try
        {
            action();
        }
        catch (Exception exception)
        {
            // Log the exception here
        }
    }
}

 

Not too bad.  Not a large amount of magic going on here.  Just using an extension method.

But, we already have a problem.  Using a plain old Action is going to give us too many choices in the intellisense drop down.  It could make it hard to know what our real options are and when we try and add the With syntax later we will need to use a property off of an object we return from the Wrap method.

Making it better

We can fix this by actually wrapping the Action with a custom type that we can add our methods to.

Instead of Wrap returning an Action, it will return a WrappedAction.

public static class Wrapper
{
    public static WrappedAction Wrap(Action action)
    {
        return new WrappedAction() {Action = action};
    }

    public static void LogOnError(this WrappedAction wrappedAction)
    {
        try
        {
            wrappedAction.Action();
        }
        catch (Exception exception)
        {
            // Log the exception here
        }
    }

    public class WrappedAction
    {
        public Action Action { get; set; }
    }
}

Looking better.  Now when we put a ‘.’ at the end of our Wrap call we only see LogOnError as an option.

We can be sure now that if we create an extension method for a WrappedAction, we will make sure that method is self-discoverable.  Before, the generic Action extension method could make our method show up places that we don’t want it to and can get lost in the other methods on Action.

Making it done

The last thing we need to do is add the With.

Ideally, when we hit the ‘.’ on the end of the Wrap method, we want to see With as an option.  When we hit the ‘.’ on the end of the With property, we want to see LogOnError as an option.

In order to accomplish this we need to:

  1. Add a With property to the WrappedAction.
  2. Have the With property be of a new type (WrappedActionTarget) so that we can add our extension methods for that new type.
  3. Change the extension method to operate on the new type.

Here is what we end up with:

public static class Wrapper
{
    public static WrappedAction Wrap(Action action)
    {
        return new WrappedAction() { Action = action };
    }

    public static void LogOnError(this WrapperMethodTarget target)
    {
        try
        {
            target.WrappedAction.Action();
        }
        catch (Exception)
        {
            Console.Write("Logging this error");
        }
    }
}

public class WrapperMethodTarget
{
    public WrappedAction WrappedAction { get; set; }
}

public class WrappedAction
{
    public Action Action { get; set; }

    public WrapperMethodTarget With
    {
        get
        {
            return new WrapperMethodTarget() { WrappedAction = this };
        }
    }
}

 

Now we can use the syntax of:

Wrapper.Wrap(_riceCooker.Cook).With.LogOnError();

We can move that LogOnError method out to another class, or create new extension methods somewhere else.  I just put it in there to avoid creating another class.

Is this really practical?

I don’t know.  To be honest, I was playing around with creating extension methods that work on Actions and I came up with this way to use them.

I could see making a wrapping library that had different kinds of ways you would wrap method calls built into it.  It could allow you to specify how you log in a configuration and then you would get all of this common stuff automatically.

Even if it is not practical, it’s pretty fun, and it demonstrates the power of Action, or rather functional programming in general.

Living Dangerously: Refactoring without a Safety Net

It’s usually a good idea to have unit tests in place before refactoring some code.

I’m going to go against the grain here today though and tell you that it is not always required.

Many times code that should be refactored doesn’t get refactored due to the myth that you must always have unit tests in place before refactoring.

In many cases the same code stays unimproved over many revisions because the effort of creating the unit tests needed to refactor it is too high.

I think this is a shame because it is not always necessary to have unit tests in place before refactoring.

manonwire3 thumb Living Dangerously: Refactoring without a Safety Net

Forgoing the safety net

If you go to the circus, you will notice that some acts always have a safety net below because the stunt is so dangerous that there is always a chance of failure.

You’ll also notice that some acts don’t have a safety net because even though there is risk of danger, it is extremely small, because of the training of the performers.

Today I’m going to talk about some of the instances where you don’t necessarily need to have a safety net in place before doing the refactor.

Automatic refactoring

This is an easy one that should be fairly obvious.  If you use a modern IDE like Visual Studio, Eclipse, or IntelliJ, you will no doubt have seen what I call “right-click refactor” options.

Any of these automatic refactors are pretty much safe to do anytime without any worry of changing functionality.  These kinds of automated refactors simply apply an algorithm to the code to produce the desired result and in almost all cases do not change functionality.

These refactoring tools you can trust because there is not a chance for human error.

Any time you have the option of using an automatic refactoring, do it!  It just makes sense, even if you have unit tests.  I am always surprised when I pair up with someone and they are manually refactoring things like “extract method” or “rename.”

Most of the time everything you want to do to some code can be found in one of the automatic refactoring menus.

Small step refactors

While not as safe as automatic refactors, if you have a refactor that is a very small step, there is a much higher chance your brain can understand it and prevent any side effects.

A good example of this would be my post on refactoring the removal of conditions.

The general idea is that if you can make very simple small steps that are so trivial that there is almost no chance of mistake, then you can end up making a big refactor as the net effect of those little changes.

This one is a judgment call.  It is up to you to decide if what you are doing is a small step or not.

I do find that if I want to do a refactor that isn’t a small step refactor, I can usually break it down into a series of small steps that I can feel pretty confident in.  (Most of the time these will be automated refactors anyway.)

Turning methods into classes

I hate huge classes.  Many times everyone is afraid to take stuff out of a huge class because it is likely to break and it would take years to write unit tests for that class.

One simple step, which greatly improves the architecture and lets you eventually create unit tests, is to take a big ol’ chunk of that class, move it to a new class, and keep all the logic in there exactly how it is.

It’s not always totally clean, you might have to pass in some dependencies to the new method or new class constructor, but if you can do it, it can be an easy and safe refactor that will allow you to write unit tests for the new class.

Obviously this one is slightly more dangerous than the other two I have mentioned before, but it also is one that has a huge “bang for your buck.”

Unit tests, or test code themselves

Another obvious one.  Unless you are going to write meta-unit tests, you are going to have to live a little dangerously on this one.  You really have no choice.

I think everyone will agree that refactoring unit tests is important though.   So, how come no one is afraid to refactor unit tests?

I only include this example to make the point that you shouldn’t be so scared to refactor code without unit tests.  You probably do it pretty frequently with your unit tests.

I’m not advocating recklessness here

I know some of you are freaking out right now.

Be assured, my message is not to haphazardly refactor code without unit tests.  My message is simply to use temperance when considering a refactor.

Don’t forgo a refactor just because you are following a hard and fast rule that you need unit tests first.

Instead, I am suggesting that some refactorings are so trivial and safe that if it comes between the choice of leaving the code as it is because unit testing will take too long, or to refactor code without a safety net, don’t be a… umm… pu… wimp.  Use your brain!

Things that will bite you hard

There are a few things to watch out for, even with the automatic refactoring.  Even those can fail and cause all kinds of problems for you.

Most of these issues won’t exist in your code base unless you are doing some crazy funky stuff.

  • If you’re using dynamic in C#, or some kind of PInvoke, unsafe (pointer manipulation) or COM interop, all bets are off on things like rename.
  • Reflection.  Watch out for this one.  This can really kick you in the gonads.  If you are using reflection, changing a method name or a type could cause a failure that is only going to be seen at runtime.
  • Code generation.  Watch out for this one also.  If generated code is depending on a particular implementation of some functionality in your system, refactoring tools won’t have any idea.
  • External published interfaces.  This goes without saying, but it is so important that I will mention it here.  Watch out for other people using your published APIs.  Whether you have unit tests or not, refactoring published APIs can cause you a whole bunch of nightmares.

This list isn’t to scare you off from refactoring, but if you know any of the things in this list are in your code base, check before you do the refactor.  Make sure that the code you are refactoring won’t be affected by these kinds of things.

Explaining What Action And Func Are

In C#, Action and Func are extremely useful tools for reducing duplication in code and decreasing coupling.

It is a shame that many developers shy away from them because they don’t really understand them.

Adding Action and Func to your toolbox is a very important step in improving your C# code.

It’s not really that hard to understand what they do and how to use them, it just takes a little patience…

A simple way of thinking about Action<>

Most of us are pretty familiar with finding sections of repeated code, pulling that code out into a method and making that method take parameters to represent the differences.

Here is a small example, which should look pretty familiar:

public void SteamGreenBeans()
{
    var greenBeans = new GreenBeans();
    Clean(greenBeans);
    Steam(greenBeans, Minutes.Is(10));
    Serve(greenBeans);
}

public void SteamCorn()
{
    var corn = new Corn();
    Clean(corn);
    Steam(corn, Minutes.Is(15));
    Serve(corn);
}

public void SteamSpinach()
{
    var spinach = new Spinach();
    Clean(spinach);
    SteamVegetable(spinach, Minutes.Is(8));
    Serve(spinach);
}

Each one of these methods pretty much does the same thing.  The only difference here is the type of vegetable and the time to steam it.

It is a simple and common refactor to refactor that code to:

public void SteamGreenBeans()
{
   SteamVegetable(new GreenBeans(), 10);
}

public void SteamCorn()
{
    SteamVegetable(new Corn(), 15);
}

public void SteamSpinach()
{
    SteamVegetable(new Spinach(), 8);
}

public void SteamVegetable(Vegetable vegetable, int timeInMinutes)
{
    Clean(vegetable);
    Steam(vegetable, Minutes.Is(timeInMinutes));
    Serve(vegetable);
}

Much better, now we aren’t repeating the “actions” in 3 different methods.

Now let’s imagine we want to do something more than steam.  We need to be able to fry or bake the vegetables.  How can we do that?

Probably we will have to add some new methods for doing that.  So we will end up with something like this:

public void SteamVegetable(Vegetable vegetable, int timeInMinutes)
{
    Clean(vegetable);
    Steam(vegetable, Minutes.Is(timeInMinutes));
    Serve(vegetable);
}

public void FryVegetable(Vegetable vegetable, int timeInMinutes)
{
    Clean(vegetable);
    Fry(vegetable, Minutes.Is(timeInMinutes));
    Serve(vegetable);
}

public void BakeVegetable(Vegetable vegetable, int timeInMinutes)
{
   Clean(vegetable);
   Bake(vegetable, Minutes.Is(timeInMinutes));
   Serve(vegetable);
}

Hmm, lots of duplication again.  No problem.  Let’s just do what we did to the first set of methods and make a CookVegetable method.  Since we always clean, then cook, then serve, we should be able to just pass in the method of cooking we will use.

Oh wait, how do we do that?  We can’t just extract out Bake or Fry or Steam, because the Bake, Fry and Steam methods are logic and not data.

Unless… unless we can make them data.  Can we do that?

We sure can, check this out:

public void SteamVegetable(Vegetable vegetable, int timeInMinutes)
{
    CookVegetable(vegetable, Steam, timeInMinutes);
}

public void FryVegetable(Vegetable vegetable, int timeInMinutes)
{
    CookVegetable(vegetable, Fry, timeInMinutes);
}

public void BakeVegetable(Vegetable vegetable, int timeInMinutes)
{
    CookVegetable(vegetable, Bake, timeInMinutes);
}

public void CookVegetable(Vegetable vegetable,
   Action<Vegetable, int> cookingAction,
   int timeInMinutes)
{
    Clean(vegetable);
    cookingAction(vegetable, Minutes.Is(timeInMinutes));
    Serve(vegetable);
}

We got rid of the duplicated code the same way we did when we did our first refactor, except this time we parameterized method calls instead of data.

If you understood this, you understand Action.  Action is just a way of treating methods like they are data. Now you can extract all of the common logic into a method and pass in data that changes as well as actions that change.

Congratulations, you are doing the strategy pattern without having to create an abstract base class and a huge inheritance tree!

So when you see Action, just think “ah, that means I am passing a method as data.”

It really is as simple as that.

Action<Vegetable, CookingTime> translated to English is: “A method that takes a Vegetable and a CookingTime as parameters and returns void.”

What about Func<>?

If you understand Action, you understand Func.

Func<X, Y, Z> translated to English is: “A method that takes an X, and a Y as parameters and returns a Z”.”

The only difference between Action and Func is that Func’s last template parameter is the return type.  Funcs have non-void return values.

Bonus: Predicate is a Func that always returns a boolean.

That’s all there is to it.  There really isn’t a need to know much more than that to make sure of Action and Func in order to start using them.

Further Resources for Action and Func

If you are interested in some other ways to apply Action and Func, here are some posts I have written which focus on them.

Clean Code, Saves Money or Is Art?

Lately there has been a lot of chatter about whether writing clean code actually saves money or whether it is more about art, i.e. making things pretty.

(See John MacIntyre’s post here if you are interested, and Uncle Bob’s response here.)

Well, as Forest Gump would say,  “Maybe it’s both.”

gump thumb Clean Code, Saves Money or Is Art?

How can it be both?

I think overall writing clean code saves you money to build software.  (Unless the time you are going to spend maintaining the software is minimal or non-existent.)

The reason why it saves you money is where the both part comes into play.

If we just extracted the money part from the practice of writing clean code, we can make a pretty solid argument that overall it saves money by looking at what costs the most time and money in software development.

Go ahead take a guess.  What do you think it is?

That’s right.  Fixing a production bug. It may take awhile to write unit tests.  It may take awhile to refactor you code to be “clean.” But, if spending an extra 3 hours on a 3 hour coding task ends up saving you just 1 production issue, you’ve made your time back and then some.

The actual time expense of a production issue, all the way down the line, from project managers to QA to programmers to back to QA and back to deployment can easily cost 10 or more hours per issue, easily.

It’s really hard to argue against that logic by itself, but there is another element that comes into play here.

The human element.

You see it’s not all about dollars and money and what makes logical or practical sense when you throw humans into the mix.  I believe if the non-human benefits writing clean code didn’t save you any money, overall it would still save you money.

Let me explain.

No one takes pride in crap

I don’t care if the software works or if it looks pretty on the outside.  The person maintaining that code is going to have a different view of it.

If the internal code is crap, if it is nothing to feel good about, if it is a big pile of spaghetti code, it is going to severely demotivate developers.

And what do demotivated developers do?

All kinds of horrible things.  They look for new jobs.  They write more crap code.  They waste time and stall.  They do what they have to to get by until they can either get the heck out of this stupid profession, or that dream job comes along where they can write ASP.NET MVC code using an auto-mocking container and BDD.

Sometimes you have to ask yourself, is it really that much more difficult to maintain that existing VB6 app than if it were converted to C#?  Seems like it shouldn’t be, right?  It definitely is easier to maintain a nicely built C# application, but there is a compelling reason why those applications eventually get rewritten, even though they really might not need to.

Developers working on old crusty applications just are not motivated to do so.  Developers like new shiny technology.  They like to feel like they are learning and expanding their skills, not just maintaining status quo.

So even though rewriting that VB6 application doesn’t really make practical sense… even though your metrics and charts tell you that you shouldn’t do it…  When you do rewrite that application you will find this magical hidden cost savings, because suddenly developers won’t drag their feet trying to fix a bug or add a feature to “that crappy old VB6 app.”  To the customer that application might even look exactly the same, but to the developers working on it, it will be a brand new shiny toy.

Writing clean code is about more than saving money

So you see, in the theoretical there are argument about whether refactoring that code will actually be a positive return on investment or negative.  I’ll argue for the positive on purely practical and chartable theory every time, but when you throw in the human art element, it is no contest.

If your code  base is something that your developers take pride in, you will see huge savings in time, because your developers will be significantly motivated to make it even better.

Clean code begets clean code.

So when I say it is both, I’m not just being non-committal or luke-warm about the topic.  The human element makes that fact that clean code is art important to saving money.

As always, you can subscribe to this RSS feed to follow my posts on Making the Complex Simple.  Feel free to check out ElegantCode.com where I post about the topic of writing elegant code about once a week.  Also, you can follow me on twitter here.

Switch is Just a Fancy If Else

Sorry to rain on your parade.  I know that you just refactored that series of if-else statements into one switch statement and you’re feeling like you did your good deed for the day.

Take a moment to rest on your laurels before I tear your laurels to shreds.

Go ahead, I’ll wait.

Okay, now let’s get down to business.

Saying the same thing a different way

The problem with changing if-else statements into a switch statement is that nothing really has changed besides the dialect in which it was said.

It’s all apples!

apples thumb Switch is Just a Fancy If Else

Now, switch is slightly better than if-else for maintenance purposes, but not much better.  Both constructs (switch and if-else) try to represent some form of data (a mapping) as code.

Now data and code are not clearly delineated.  But, in general we have a concept of what constitutes data and what constitutes code.

Data is some information that does not contain in itself logic.

Code is some information that is primarily instructions and logic.

For this reason, I say that switch statements and if-else statements are both trying to treat data as logic.  They are both code constructs that mix in data elements with logic elements and do not clearly separate the two, although a switch statement does a slightly better job of it.

Consider the most basic form of an if-else and switch statement.  The form in which you directly map one piece of data to another piece of data.

If “A” then return “kittens”, If “B” then return “puppies”…

Switch (data1), case “A”: return “kittens”, case “B”: return “puppies”…

When I rewrite it in this form, the parallel becomes much more clear.   We can also very easily separate logic from data.

Data:

  • ”A”
  • ”B”
  • ”kittens”
  • ”puppies”

Logic:

  • Map something of dataset1 to something of dataset2.
  • Return a mapped value.

The problem with either of these forms, (the switch or the if-else), is that both of them tend to mix logic and data together.

Why is it bad to mix logic and data?

Let me ask you a question.  Which is more likely to change?

Hopefully it is data and not logic.

Consider the Single Responsibility Principle (SRP.)

A module should have one and only one reason to change.  Ideally, we want to separate out logic from data so that we can change the two independently.

Sticking with our contrived example, suppose we wanted to make “A” map to “dinosaurs” instead of “kittens,” or we want to add more mappings.  We should be able to do this in a way that does not interfere with the logic at all or add new logic.

If we have an if-else structure, we will have to add more if-else constructs or change a data value that is inside of the method that is doing the logic.  The same applies for a switch statement.  We are not declaring our data one place and our logic another, they will be side by side, right next to each other in the code.

Suppose we want to change the logic so that instead of returning a value, we assign it to some variable, or reverse it.  Again, we are faced with changing logic in multiple places, in the same module as the data.

Consider the case where we want to read the data from a file or some other data source?  Is that possible with a switch or if-else structure?  Not really, because the data is essentially “hard-coded” into the logic.

What can we do about it?

So, by now hopefully you have gotten the point that your refactoring of the if-else into a switch statement didn’t really solve the basic problem of mixing data and logic together.

But now you have another problem.  You need to separate your data and logic so that they can change independently.  You want to be a good steward of the SRP.  You want to someday be able to read your data from a file or database, so that you won’t even have to recompile your code to change it.

On my next post, I will show you how to take switch statements and refactor them into a better form that separates data from logic.

I will talk about the different “forms” of the switch statement and why to choose a particular method of refactoring over another.

Don’t Chain Failure States in Returns

I had originally started writing this post thinking that I knew that answer to the problem I am about to demonstrate.

What surprised me is how much more clean a try catch solution seems to work than my original solution.

The unclean code

public void Process()
{
    foreach (string line in file)
    {
        string[] columns = line.split(',');
        foreach (string column in columns)
        {
            if (hasBadData(column))
            {
                LogErrorMessage();
                return;
            }

            Print(column);
        }
    }
}

I am using a short example here, so the refactoring might not really jump out to you as necessary, but let’s refactor it to be as clean as possible anyway.  (This comes from a much larger example where the refactoring is a must.)

Refactor 1: Returning booleans

public void Process()
{
    foreach (string line in file)
    {
        if (!ProcessLine(line))
        {
            return;
        }
    }
}

private bool ProcessLine(string line)
{
    string[] columns = line.split(',');
    foreach (string column in columns)
    {
        if (!ProcessColumn(column))
        {
            return false;
        }
    }
    return true;
}

private bool ProcessColumn(string column)
{
    if (hasBadData(column))
    {
        LogErrorMessage();
        return false;
    }

    Print(column);
    return true;
}

We’ve broken each loop into its own method and separated the handling of each line and each column in each line into its own method.  A good refactor that makes things a little more clear.

Notice the bool return types on the new methods and the special handling to return early if a result is false.  Most people when refactoring the first example (myself included) forget that the first example breaks out of both loops with an early return.

If you don’t take this into account, your behavior will be different.  When you refactor, your behavior had better not be different.

This is the crux of the problem.  Returning bools is bad.  It doesn’t make sense that our methods return bool.  What does the bool mean?  We happen to know it means success or failure, but will the next reader?  It seems very out of place, and look at the verbosity to handle the return types.

Refactor 2: Storing error state

This next refactor was what this post was going to be titled.  I was going to suggest in this post that you should not return boolean error states, but instead use the error state as part of your class.  I still think this is a decent solution.

private bool IsFatalError { get; set; };

public void Process()
{
    foreach (string line in file)
    {
        ProcessLine(line);
        if (this.IsFatalError)
        {
            return;
        }
    }
}

private void ProcessLine(string line)
{
    string[] columns = line.split(',');
    foreach (string column in columns)
    {
        ProcessColumn(column);
        if (this.IsFatalError)
        {
            return;
        }
    }
}

private void ProcessColumn(string column)
{
    if (hasBadData(column))
    {
        LogErrorMessage();
        this.IsFatalError = true;
        return;
    }

    Print(column);
}

What we have done here is stored some state information about our object instead of chaining result codes down the method calls.  This is a huge improvement, because our meaning is much more explicit and our object is holding its own state.

It still is quite verbose.  Why do we have to check for a fatal error everywhere?  Seems like we are likely to forget.

Still, much better than returning booleans.  If you are in a class don’t be afraid to manipulate the state of that class internally.  Many times you can eliminate return values and parameters passed into private methods in a class by simply making that data part of the class.

So, the verbosity was still bothering me.  So, I tried this:

Refactor 3: Exception handler flow control

public void Process()
{
    try
    {
        foreach (string line in file)
        {
            ProcessLine(line);
        }
    }
    catch (BadColumnDataException exception)
    {
        LogErrorMessage();
    }
}

private void ProcessLine(string line)
{
    string[] columns = line.split(',');
    foreach (string column in columns)
    {
        ProcessColumn(column);
    }
}

private void ProcessColumn(string column)
{
    if (hasBadData(column))
    {
        throw new BadColumnDataException();
    }

    Print(column);
}

This is surprisingly clean.  Notice that we don’t actually have to put if conditions in the loops to check whether or not we should return early?  Instead, if an exception is thrown, flow will jump immediately to the catch block where we will log the error message.

I’m a little bit uncomfortable to be using exception handling as a flow control mechanism, but given that:

  1. When encountering the error, we are aborting the normal flow of the program completely (breaking out of two loops).
  2. It is so much more clean and clear.

I think this is actually my preferred solution.  I normally would never use exceptions where I could reasonably detect the failure condition, but I may have to rethink that viewpoint.