# There Are Only Two Roles of Code

All code can be classified into two distinct roles; code that does work (algorithms) and code that coordinates work (coordinators).

The real complexity that gets introduced into a code bases is usually directly related to the creation of classes that group together both of these roles under one roof.

I’m guilty of it myself.  I would say that 90% of the code I have written does not nicely divide my classes into algorithms and coordinators.

## Defining things a bit more clearly

Before I dive into why we should be dividing our code into clear algorithmic or coordinating classes, I want to take a moment to better define what I mean by algorithms and coordinators.

Most of us are familiar with common algorithms in Computer Science like a Bubble Sort or a Binary Search, but what we don’t often realize is that all of our code that does something useful contains within it an algorithm.

What I mean by this is that there is a clear distinct set of instructions or steps by which some problem is solved or some work is done.  That set of steps does not require external dependencies, it works solely on data, just like a Bubble Sort does not care what it is sorting.

Take a moment to wrap your head around this.  I had to double check myself a couple of times to make sure this conclusion was right, because it is so profound.

It is profound because it means that all the code we write is essentially just as testable, as provable and potentially as dependency free as a common sorting algorithm if only we can find the way to express it so.

What is left over in our program (if we extract out the algorithms) is just glue.

Think of it like a computer.  Computer electronics have two roles: doing work and binding together the stuff that does the work.  If you take out the CPU, the memory and all the other components that actually do some sort of work, you’ll be left with coordinators.  Wires and busses that bind together the components in the system.

## Why dividing code into algorithms and coordinators is important.

So now that we understand that code could potentially be divided into two broad categories, the next question of course is why?  And can we even do it?

The biggest benefit to pulling algorithmic code into separate classes from any coordinating code is that it allows the algorithmic code to be free of dependencies.  (Practically all dependencies.)

Once you free this algorithmic code of dependencies you’ll find 3 things immediately happen to that code:

1. It becomes easier to unit test
2. It becomes more reusable
3. Its complexity is reduced

A long time ago before mocks were widely used and IoC containers were rarely used, TDD was hard.  It was really hard!

I remember when I was first standing on the street corners proclaiming that all code should be TDD with 100% code coverage.  I was thought pretty crazy at the time, because there really weren’t any mocking frameworks and no IoC containers, so if you wanted to write all your code using TDD approaches, you’d actually have to separate out your algorithms.  You’d have to write classes that had minimal dependencies if you wanted to be able to truly unit test them.

Then things got easier by getting harder.  Many developers started to realize that the reason why TDD was so hard was because in the real world we usually write code that has many dependencies.  The problem with dependencies is that we need a way to create fake versions of them.  The idea of mocking dependencies became so popular that entire architectures were based on the idea and IoC containers were brought forth.

We, as a development community, essentially swept the crumbs of difficult unit testing under the rug.  TDD and unit testing in general became ubiquitous with writing good code, but one of the most important values of TDD was left behind, the forced separation of algorithmic code from coordinating code.

TDD got easier, but only because we found a way to solve the problems of dependencies interfering with our class isolation by making it less painful to mock out and fake the dependencies rather than getting rid of them.

## There is a better way!

We can still fix this problem, but we have to make a concerted effort to do so.  The current path of least resistance is to just use an IoC container and write unit tests full of mocks that break every time you do all but the most trivial refactoring on a piece of code.

Let me show you a pretty simple example, but one that I think clearly illustrates how code can be refactored to remove dependencies and clearly separate out logic.

Take a look at this simplified calculator class:

``` public class Calculator
{
private List&lt;int&gt; history = new List&lt;int&gt;();
private int sessionNumber = 1;
private bool newSession;

public Calculator(IStorageService storageService)
{
this.storageService = storageService;
}

public int Add(int firstNumber, int secondNumber)
{
if(newSession)
{
sessionNumber++;
newSession = false;
}

var result = firstNumber + secondNumber;

return result;
}

public List&lt;int&gt; GetHistory()
{
if (storageService.IsServiceOnline())
return storageService.GetHistorySession(sessionNumber);

return new List&lt;int&gt;();
}

public int Done()
{
if (storageService.IsServiceOnline())
{
foreach(var result in history)
storageService.Store(result, sessionNumber);
}
newSession = true;
return sessionNumber;
}
}
```

This class does simple add calculations and stores the results in a storage service while keeping track of the adding session.

It’s not extremely complicated code, but it is more than just an algorithm.  The Calculator class here is requiring a dependency on a storage service.

But this code can be rewritten to extract out the logic into another calculator class that has no dependencies and a coordinator class that really has no logic.

``` public class Calculator_Mockless
{

public Calculator_Mockless()
{
this.storageService = new StorageService();
this.basicCalculator = new BasicCalculator();
}

public int Add(int firstNumber, int secondNumber)
{
}

public List&lt;int&gt; GetHistory()
{
return storageService.
GetHistorySession(basicCalculator.SessionNumber);
}

public void Done()
{
foreach(var result in basicCalculator.History)
storageService
.Store(result, basicCalculator.SessionNumber);

basicCalculator.Done();
}
}

public class BasicCalculator
{
private bool newSession;

public int SessionNumber { get; private set; }

public IList&lt;int&gt; History { get; private set; }

public BasicCalculator()
{
History = new List&lt;int&gt;();
SessionNumber = 1;
}
public int Add(int firstNumber, int secondNumber)
{
if (newSession)
{
SessionNumber++;
newSession = false;
}

var result = firstNumber + secondNumber;

return result; ;
}

public void Done()
{
newSession = true;
History.Clear();
}
}
```

Now you can see that the BasicCalculator class has no external dependencies and thus can be easily unit tested.  It is also much easier to tell what it is doing because it contains all of the real logic, while the Calculator class has now become just a coordinator, coordinating calls between the two classes.

This is of course a very basic example, but it was not contrived.  What I mean by this is that even though this example is very simple, I didn’t purposely create this code so that I could easily extract out the logic into an algorithm class.

I’ve found that if you focus on eliminating mocks or even just having the mindset that you will not use mocks in your code, you can produce code from the get go that clearly separates algorithm from coordination.

I’m still working on mastering this skill myself, because it is quite difficult to do, but I believe the rewards are very high for those that can do it.  In code where I have been able to separate out algorithm from coordination, I have seen much better designs that were more maintainable and easier to understand.

I’ll be talking about and showing some more ways to do this in my talk at the Warm Crocodile conference next year.

# Wrapping Callbacks

I’ve recently had the problem of trying to display a progress dialog when executing an asynchronous operation and to dismiss that progress dialog when the operation completes.

I wanted to build a way to do this that is generic to my application, so that it would work with any asynchronous operation in order to reduce duplication of writing progress dialog logic everywhere in the code.

## Looking at the problem

So here is some basic pseudo-code of the problem.

```public void DoCall()
{
ShowProgressDialog();
RemoteService.DoAsync(CallBackMethod);
}

public void CallBackMethod(Result result)
{
HideProgressDialog();
// Process result;
}
```

Now the problem with this code is that it would have to be repeated everywhere I want to make an asynchronous call and display a progress dialog.

If I want to do this in my application every time that I make an asynchronous call, I have to put this kind of code in many places in the application.

You can also see a mixing of responsibilities here.  We are handling UI related showing of a progress dialog split between an initial call and the callback.  It just seems a bit dirty to do things this way.

## Breaking it down

So how can we solve this problem?

Go ahead and think of some way that you might be able to solve this.

One of the best ways that I have found to solve any problem like this is to work backwards.

Let us assume that any syntax we can think of is possible, and then only change the ideal syntax if it proves to not be possible.

What do I mean by this?

Simple, let’s come up with the way we want this code to look and we will worry about implementing it later.

It would be nice to be able to execute an asynchronous method and display a progress dialog with a one-liner, like so:

```UIServiceCaller.ExecuteAsync(RemoteService.DoAsync, CallBackMethod);
```

If we could just do this wherever we want to make an asynchronous call and automatically have the progress dialog shown and dismissed, life would be wonderful.

## Solving the problem

In order to solve this problem, we can do something very similar to a decorator pattern.

We can basically just wrap our callback method with a new callback that adds the functionality of dismissing our progress dialog.

So the idea will be that we will do the following steps in our ExecuteAsync method:

1. Show the progress dialog
2. Wrap our callback with one that dismisses our dialog and then executes the callback.
3. Execute our asynchronous operation passing the new wrapped callback as the parameter.

Here is what this looks like in code:

```public static void ExecuteAsync(Action&lt;Action&lt;Result&gt;&gt; asyncMethod, Action callback)
{
ShowProgressDialog();

Action wrappedCallback = (r =&gt;
{
DismissProgressDialog();
callback(r);
}

asyncMethod(wrappedCallback);
}
```

This code actually looks more complicated than it really is.

You will need to understand how Action works though.  If you don’t, check out this post I did explaining Action and Func.

## Understanding the solution

It can be a bit hard to wrap your head around Action of type Action of type Result.

Let me translate this to make it a bit easier.

The first parameter to ExecuteAsync is a method that takes another method as a parameter (the callback,) which takes a Result object as a parameter.

So, the idea here is that we are passing in the method that is going to do the callback as the first parameter to your ExecuteAsync method.

Then we are passing in our actual callback as the 2nd parameter to ExecuteAsync.

The idea is that we are splitting apart the method and the callback it takes, so that we can insert our code to handle the dialog where we need to.

Now we can just call ExecuteAsync and pass it our method and its callback and progress dialog code will automatically be handled for us.

## Building on the idea

We can expand upon this concept in several ways.

One thing we could do is also apply error handling logic to the callback wrapper method.  We could preprocess the result in some way before passing it to the callback.  This could allow us to display an error message or even retry a number of times before executing the callback.

Another thing we could do is to change the signature of the ExecuteAsync method so that it takes a type T instead of Result; this would allow us to handle any kind of return type and method, and the ExecuteAsync method would work with just about any asynchronous method call.

# Making Switch Refactorings Better – Defaultable Dictionary

I’ve written before on the idea of refactoring a switch to a Map or Dictionary.

There is one major problem that I have been running into though.  Switch statements and dictionaries are not functionally equivalent for one major reason…

## Switches allow for default

I kept struggling with this when I would implement a dictionary to replace a switch.  How can I deal with the default case?

There are of course many ways to deal with the default case in a dictionary or map, but I didn’t really like any of the solutions because they either required me to remember to check to see if my entry was in the dictionary before looking it up, or to rely on catching an exception.

Let me give you an example:

```switch(vegetables)
{
case Vegetables.Carrot:
DoCarrotStuff();
break;

case Vegetables.Spinach:
EatIt();
break;

case Vegetables.Peas:
FeedToDog();
break;

default:
Butter();
Salt();
SprinkleCheese();
Eat();
}
```

Converting this to a dictionary we get something like:

```var vegetableActionMap = new Dictionary&lt;Vegetable, Action&gt;
{
{ Vegetable.Carrot, () =&gt; DoCarrotStuff() },
{ Vegetable.Spinach, () =&gt; EatIt() },
{ Vegetable.Peas, () =&gt; FeedToDog() }
}

Action result;
if(!vegetableActionMap.TryGetValue(vegetable, out result)
{
Butter();
Salt();
SprinkleCheese();
Eat();
}
else
result();
```

So clunky, just to handle the default case.

Would be much better do something like this:

```var vegetableActionMap = new Dictionary&lt;Vegetable, Action&gt;
{
{ Vegetable.Carrot, () =&gt; DoCarrotStuff() },
{ Vegetable.Spinach, () =&gt; EatIt() },
{ Vegetable.Peas, () =&gt; FeedToDog() }
}.WithDefaultValue( () =&gt; { Butter(); Salt(); SprinkleCheese(); Eat(); });

vegetableActionMap[vegetable]();

```

Well now you can!

Enter DefaultableDictionary!

Also the first thing I ever put on GitHub!

The idea is pretty simple, I am just creating a decorator for IDictionary.

The DefaultableDictionary has a constructor that takes an IDictionary and a default value.

It then delegates all of the methods to the passed in IDictionary reference.  For the methods that look up a value, it handles returning the default if the key doesn’t exist in the dictionary.

I created an extension method that lets you just put a .WithDefaultValue() on the end of your dictionary declaration in order to auto-magically give you a DefaultableDictionary back with that default value.

## Sleep well my friend

Knowing that you can not create a dictionary that has a default value which is returned instead of throwing an exception if the key passed in is not found.

I have no doubt that in 3rd world countries children are still starving, but in 1st world countries children with VS Express hacking away at iPhone applications using MonoTouch will not have to catch exceptions from dictionaries that do not know how to just return a default value.

So now there is no excuse!  Refactor those switches!

# Back to Basics: Mock Eliminating Patterns

In my previous post I talked about unit testing without mocks.  I gave some examples of how I had done this in some of my real code from PaceMaker.

This time I want to take a look at some of the common patterns we can use to extract parts of our code into dependency-lite or dependency-less classes that we can unit test without needing mocks.

I’ll have to admit, I haven’t really be practicing this approach for a long time, so my list of patterns is probably lacking, but these are some of the ones I kept seeing as I was working through the code.

If you have some ideas for other patterns or know of some already or have better names than what I have chosen, please let me know.

## Pattern 1: Pull out state

In many cases we can have complex objects that contain their own state often split across member variables.

If you find that a class has many methods that check some member variable and do one thing or another based on what value that member variable is set to, you might be able to benefit by extracting all the state changing logic into a separate class and creating events to notify the original class when the state changes.

We can then keep the new class as the single source of state for the class it came from and easily write dependency free (level 2) unit tests for the state class.

From the perspective of the class you are pulling the state out of, we might turn:

```isSpecial = false;
timeToLive = 50;
timeToLove = 1
penguinsHaveAttacked = true;

public ApocalypseScenario CreateNew()
{
if(isSpecial &amp;&amp; timeToLive &lt; timeToLove || penguinsHaveAttacked)
return new ApocalypseScenario(&quot;Penguin Apocalypse, Code Blue&quot;);
}
```

Into:

```private apocalypseStateMachine = new ApocalypseStateMachine();

public ApocalypseScenario CreateNew()
{
if(apocalypseStateMachine.State == States.PENGUIN_SLAUGHTER)
return new ApocalypseScenario(&quot;Penguin Apocalypse, Code Blue&quot;);
}
```

Obviously in this case there would be many ways the states get changed.  I haven’t included those here, but you can imagine how those would become part of the new state class.

## Pattern 2: Method to class

Often I have found that I can’t find a way to extract all of the logic out of a class into one cohesive class, because of multiple interactions inside the class.

When I encounter this problem, I have found that I can usually identify a single large method that is using data from dependencies in the class to perform its logic.

These methods can usually be extracted to be their own new class.  We can pass in the data from the dependencies the method was originally using instead of using the dependencies directly.

Good candidates for this kind of pattern are methods which use multiple dependencies to get data in order to compute a single result.

When applying this pattern we may end up with a very small class, containing only 1 or 2 methods, but we are able to easily unit test this class with state and dependency free unit tests (level 1.)

## Pattern 3: Pull data sources up

It is often the case that we have dependencies in a class that exist only to provide some sort of data to the actual logic in the class.

In these situations it is often possible to pull the source of the data up and out of the methods that use the data and instead pass in just the used data.

Performing this refactoring allows us to have methods in our class that do not use the dependency in the class directly, but instead have the data that dependency class provided passed in.

When we do this, it opens up the ability to write clean and simple unit tests without mocks for those methods in the class or to apply the “Method to class” pattern to extract the method into its own testable class.

As a first step to refactoring my code to make it easily unit testable, I will often scan the methods in the class to find methods that are only using data from dependencies, but not manipulating the state or sending commands to the dependencies.  As I am able to identify these methods, I can apply this pattern to remove the dependencies from the methods.

## Pattern 4: Gather then use

Often we find that we cannot pull a dependency out of a method because that dependency is being manipulated inside of a loop or in the middle of some process.

In cases like these, we can often gather all the data that will be used to manipulate the dependency in some form of a collection and then reiterate through that collection to perform the manipulation on the dependency.

A good way to envision this is to think about two people mailing letters.  In one scenario we might have the first guy stuffing the envelope and the 2nd guy sealing the envelope.

We could carry on this process of 1st guy stuffs, 2nd guy seals for each letter we want to mail, but the process is dependent on both guys being present and working.

If we change it so that the first guy stuffs all the envelopes first and gives that stack of stuffed envelopes to the 2nd guy, they can do their jobs independently.

By applying the same idea to algorithms in our code, we can take a single method and break off the parts that don’t rely on dependencies to test independently or even move to their own classes.

# Back To Basics: Unit Testing Without Mocks

In my last post, I revealed my conclusions regarding what to do instead of misusing IoC containers and interfaces all over your code mostly for the purpose of unit testing.

One of the approaches I suggested was to focus on writing level 1 or level 2 unit tests and occasionally level 3 unit tests with 1 or possibly 2 dependencies at most.

I want to focus on how this is possible, using real code where I can.

## First let’s talk about the approach

When I first talk about writing mainly level 1 or level 2 unit tests, most people assume that I mean to cherry pick the few classes in your code base that already qualify and only write unit tests for those classes.

That is not at all what I am advocating.

Instead, the approach I am suggesting is to find ways to make most of the actual logic in your code become encapsulated into classes that depend mostly on data.

What I mean by this is that our goal should be to refactor or write our code in such a way that logic is grouped into classes that only depend on primitive types and data classes, not other classes that contain logic.

This of course is not fully achievable, because something will have to tie all of these logic containing classes together.  We we need these tie-together classes, but if we can make their job to simply execute commands and tie other classes together, we can feel pretty confident in not unit testing them, and we make our job a whole lot easier.

So to summarize, the basic strategy we are going to employ here is to eliminate the need for mocks by designing our classes to be much smaller, having much tighter roles and responsibilities, and operating on data that is passed in rather than manipulating methods on other classes.

There are many patterns we can use to achieve this goal.  I’ll show you an example, then I’ll try to cover some of the major patterns I have discovered so far.

## A real world example

I recently released a Java based Android application called PaceMaker.  When I had started out building this application, I set out with the high and mighty goal of using Google Guice framework for dependency injection and BDD style unit tests using JMock to mock passed in dependencies.  It wasn’t a horrible approach, I wrote about it here.

What I found with this approach though was that I was spending a large amount of time creating mocks, and I wasn’t getting much benefit from it.  So, I had abandoned writing unit tests for the code all together, and I pulled out the now almost useless Guice.

The past couple of nights, I decided to use this project as a test project to demonstrate some of the ideas I have been talking about and thinking about.

I wanted to take my real code, refactor the logic out of it into small classes with little or no dependencies, and write BDD style unit tests that would be simple and easy to understand.

The big challenge here was trying to find a small enough piece of code to use an as example.  For this example, I am going to use a bit of code that I was using to generate the name of a file that I write to disk for saving the data from a run.

This code was originally inside of a presenter class that handled generating a file name to pass to a serializer class in order to serialize the run.

Here is the original private method that existed in the presenter.

```private String getRunFileName()
{
String completeFileName = StorageManager.getDataStorageDirectory();
Location firstLocation = locationTracker
.getLocations().iterator().next();
Date firstPointTime = new Date(firstLocation.getTime());

SimpleDateFormat dateFormat = new
SimpleDateFormat(&quot;MMddyyyy_HHmmss&quot;);
dateFormat.setTimeZone(TimeZone.getTimeZone(&quot;GMT&quot;));
String fileName = dateFormat.format(firstPointTime) + &quot;.gpx&quot;;

completeFileName = completeFileName + fileName;
return completeFileName;
}
```

This method was a perfect candidate for some easily testable logic that could be put into its own class, but there are a few problems we should notice here.

• We are dependent on StorageManager to get the data storage directory used as the base directory.
• We are dependent on the locationTracker object to get the time of the first location.
• There is some real logic here in the form of a transformation.  (It is important to note that we are dealing with logic, not just commands, because testing execution of commands is not as important as testing logic.)

My approach to this refactor is actually pretty simple.  The first thing we need to do is see the dependencies for what they are.  It looks like our logic is dependent on StorageManager and locationTracker, but in reality the logic is dependent on the string which is the base directory for the file and the time to use for the file name.

We can change this code to reflect that pretty easily.

```private String getRunFileName(String baseDirectory, Calendar time)
{
String completeFileName = baseDirectory;
SimpleDateFormat dateFormat = new
SimpleDateFormat(&quot;MMddyyyy_HHmmss&quot;);
dateFormat.setTimeZone(TimeZone.getTimeZone(&quot;GMT&quot;));
String fileName = dateFormat.format(time.getTime()) + &quot;.gpx&quot;;

completeFileName = completeFileName + fileName;
return completeFileName;
}
```

What we have done here is small, but it is critical.  We have eliminated dependencies that would otherwise have to be mocked to test this logic.  Sure, we will still need to use those dependencies to get the data to pass into this method, but we can leave that code in the presenter class and move this code into its own class.  The class will be small for now, but it will be easily testable with a level 1 unit test (our favorite kind.)

## More examples

I didn’t cherry pick this example from the source code in PaceMaker, but I did cherry pick it for this blog post, because it was one of the shorter examples I could use.

I have several other areas of code in my presenter class in PaceMaker where I used a similar approach to extract out the logic, pull it into its own class with little or no dependencies and write unit tests for.

Here are two other examples:

• Pulled the state logic for starting, stopping, pausing, resuming and calculating length of time paused during a run into its own class.  I elected to add one dependency (the presenter itself as an observer) to the refactored class in order to allow the class to notify the presenter when the state changed.  In C# I would have just used an event to do this, but in Java we use the observer pattern.
• Pulled out the logic that created the GPX data files into a GPXDataModelBuilder which instead of depending on the LocationTracker class depended only on the data from that class.

The result ended up being very clear, easy to write, level 1 and level 2 unit tests with no mocking.  In addition, my class structure is now much more tightly cohesive, with a much tighter and clearer responsibility.  Before my presenter was doing several things, but now many of those things are broken up into very small testable classes.

In my next post, I’ll go into the patterns you can use to create classes that are able to be tested by level 1 and level 2 unit tests.

# Solving Problems, Breaking it Down

Right before the holidays, I said that you had better learn how to solve programming problems.

This time I am going to try and give you some good tools to enable you to get good at solving programming problems.  (Really algorithm type problems specifically.)

## Common mistakes

When most programmers are given a programming problem in an interview, they make several key mistakes.  The most severe of those is the improper allocation of time.

If you have heard the saying “measure twice and cut once,” then you are probably familiar with the idea of spending upfront time to make sure something is done right, rather than diving right in.

The most common mistake I see when conducting interviews or watching someone try to solve a programming problem is they try to start writing code as soon as possible.

You must resist this urge.

You really want to make sure you take enough time to understand the problem completely before attempting to solve it.

Another big mistake is trying to over solve the solution on the first iteration.  Keep it simple, don’t try to get fancy.

## A simple set of steps

I am going to give you a simple set of steps to follow which you can use for any algorithm type programming problem.

1. Read the problem completely twice.
2. Solve the problem manually with 3 sets of sample data.
3. Optimize the manual steps.
4. Write the manual steps as comments or pseudo-code.
5. Replace the comments or pseudo-code with real code.
6. Optimize the real code.

As much as 70% of our time should be spent in steps 1-3.

Let’s look at each step.

## Read the problem completely twice

This is the single most important step.  You may even want to read the problem 3 or 4 times.

You want to make sure you completely understand the problem.  A good test of this is whether or not you can explain the problem to someone else.

I cannot over-emphasize how important this step is!

If you don’t understand the problem, you cannot solve it.  Do not worry about wasting time here, because the better you understand the problem, the easier it will be to solve it.

If you are given any examples along with the problem, make sure you have worked through the examples and understand why the answers are correct for each one.

## Solve the problem manually

I am going to tell you perhaps the biggest secret in programming.

“Nothing can be automated that cannot be done manually!”

Programming is automation plain and simple.  You may have the ability to skip the manual steps and jump directly to code, but there is a manual process which is the foundation of any code you write.

It is very important to solve the problem manually first, so that you know what you are going to automate, otherwise you are just slinging code around.  Which while can be fun, will make you look like an idiot in a programming interview and will probably cause you to sweat profusely.

I recommend that you solve the problem with at least three different inputs to make sure you really understand your solution and that it will work for more than one case.

I often use a Mathematical Induction approach if possible.  Using this approach I might try and solve for 1 first, then for 2, then for n.

Also don’t forget to look for corner cases and edge cases and do any examples for those kind of cases you can think of.

It’s very important that when you solve a problem manually, you recognize what your brain is actually doing to solve the problem.  You may need to write out all the things you are normally storing in your head.  You want to be aware of each step, it is easy to gloss over them.

Let’s look at a very basic example, reversing a string.

If I give you a string “Zebra”, and ask you to reverse it, most people will do the following manual steps.

• Write “Zebra” down.
• Start a new word, and put “a” as the first letter.  (Why –> because it is the last letter, we want to start here)
• Put “r” down as the 2nd letter.  (Why –> because it is the next letter backwards from the last letter we copied)
• Put “b” down as the 3rd letter.  (Why –> same as above)
• Etc

Notice how I write down each little step and why.

## Optimize the manual solution

People often don’t realize how valuable this step is.  It is much easier to rearrange and reconstruct and idea or algorithm in your head than it is in code.

It’s well worth the effort to try and optimize the actual solution or simplify it when it is still in the most easily malleable state.

What you want to do here is figure out if there is another way you can solve the problem easier, or if there are some steps you can cut our or simplify.

Let’s look at our string reversal example and see if we can simplify the steps.

We should be able to immediately recognize that we can use a loop here to reduce the manual steps.  Our duplicate why’s for most of our steps tell us that we are doing the same thing over and over for each step, just with different data.

1. Write “Zebra” down.
2. Start at the last letter in the word and create a new empty word.
3. Append the current letter to the new word
4. If there is a previous letter, make the previous letter the current letter and start back at 3.

Look how close we are getting to code at this point.  You should be tempted to actually write the code for this.  That is good, it tells you that you have solved and simplified the problem well.  Writing code should now become very easy.

Many times you can skip this step if you have a really good handle on the problem or your previous steps already created a detailed enough description of the solution that coding it is already a 1 to 1 translation.

If you are a beginner or struggle with these kinds of problems, I would go ahead and take the time to do this step anyway though.

What we want to do here is capture all the steps we created and now either put them into our editor as comments or write them as psuedo-code that we can translate to real code.

By doing this, we can know exactly what the structure of the code we are going to write is going to look like which makes the job of filling in the actual code later trivial.

Let’s look at some psudeo-code for reversing a string.

// NewWord = “”

// Loop backwards through word to reverse

//   NewWord += CurrentLetter

// Return NewWord

Pretty simple, but the key thing we have done here is outlined the structure of the code we will write to solve the problem.

## Replace comments with real code

This step should be extremely easy at this point.  If you have done all the other steps, this step involves no problem solving at all.

All we do here is take each comment and convert it into a real line of code.

Taking the string reversal, we might end up with something like this.

```String newWord =””
for(int index = oldWord.Length – 1; index &gt;= 0; index—)
newWord += oldWord[index];
return newWord;

```

1 for 1 translation of the comments we created above for real code.

If you struggle here, there are usually two possible reasons:

1. You didn’t break down the problem into small enough steps
2. You don’t know your programming language well enough to do the conversion

If you didn’t break the problem down enough, try going back to the second step and being as meticulous as possible.  Write out each and every single step.  I know it is a pain, but do it, believe me it will be worth the effort.

If you don’t know your programming language well enough to do the translation, you may need to brush up here on some basic constructs.  Any language you expect to be able to solve algorithm type problems in, you should know how to do the following things:

• Create a list
• Sort a list or array
• Create a map or dictionary
• Loop through a list, or dictionary
• Parse strings
• Convert from string to int, int to string, etc

If you don’t know how to do all of these things.  Stop what you are doing now and learn them. It’s not a very long list, and the benefits will be profound.

## Optimize the real code

Sometimes this step isn’t necessary, but it’s worth taking a look at your code and figuring out if you can cut out a few lines or do something simpler.

This is also a good place to make sure all your variables are named with long meaningful names.  I cannot stress enough how important having good names for your variables and methods is for helping the person evaluating your code to understand what you were trying to do.  This is especially important when you make a mistake!

I won’t give an optimization for our trivial example of a string reversal, but a word of advice here is not to get too tricky.  Just try to mainly simplify your code and get rid of duplication.

## A few final tips

If you follow this template for solving algorithm type problem, you should do very well in programming interviews, but the key to doing so is having confidence in this process.

The only way you are going to have confidence in this process is to practice it.  It takes a good amount of faith to believe that spending 70% of your 30 minutes to solve a problem just thinking about the problem and not writing any code is the right approach, so make sure you have that faith when you need it.

I’ve talked about using TopCoder to become a better programmer before, and I still recommend it.  Codility.com is another great site I have recently been introduced to.

There is one important step I did not include in the outline above, because I didn’t want to make the process any more complicated than it needed to be.

Many times you will find that a problem itself involves multiple large steps or is very complicated.  In those instances, you will want to try and find a way to cut the problem directly in half and then following the process above for each half.

This method of tackling a problem is called “divide and conquer" and is quite effective.  A good way to know where to break a problem in half is to think about what part of the problem if already given to you would make solving the rest easy.

# Solving Problems, You Better Learn How

It is astounding how many developers can write and maintain large enterprise systems dealing with all kinds of complex logic, database access, etc and cannot for the life of them solve a moderately difficult programming problem given in an interview in less than 30 minutes.

It is also astounding how many developers that can not write even a single line of code can fail the same problem exactly the same way.

Based on those two statements, your first assumption might be that asking someone to solve a programming problem as part of an interview would be a bad idea, because it isn’t really telling you if they are good and just crumble when asked to do something like this on the spot, or if they can’t program at all.

## Let’s look at some possible numbers to see why this is wrong

In poker you can use a combination of pot odds and the likely holdings of your opponents to deduce in many situations whether or not you should call a bet.

Pot odds is simply a ratio of how much money is in the pot (what you can win), versus how much it will cost you to call the bet.

For example, if a pot contained \$9 and I bet you \$1, you would be getting 10:1 odds on your money for a call (you have to risk an additional dollar to potentially win 10.)

So in that example, if you have a better than 10% chance of winning, you are making a profitable play by calling.

I don’t have the exact numbers (actually I am just guessing at them from experience), but here is a realistic estimate:

• Good programmers – 60% can solve the problems
• Great programmers – 90% can solve the problems
• Don’t know how to program, lied on my resume – 1% can solve the problems by some sort of divination or black magic, or memorizing sequences of symbols which solve the problem.

Given some statistics like this, if you have someone pass this kind of a test, you can calculate the rough probability of which of the three groups they belong to.

• 40% chance they are Good
• 59.5% chance they are Great
• .5% chance they are a black magic practicing witch or warlock
• 99.5% chance they are either good or great

So, even though you may be throwing away some good candidates and some great ones, the chance of you getting someone who doesn’t know how to code at all is almost reduced to nothing.  This is very important!

Why?  Because if you hire someone who doesn’t know how to code at all, it will cost you a huge amount of money, but if you pass over someone who is great, but still end up getting someone else who is good or great it doesn’t really cost you anything except in the rare case that the great programmer failed out of the test and you ended up hiring a good one instead.

Even still, that is a small loss versus hiring someone who can’t write a lick of code.

I’ve talked about how I think having code in an interview is a good thing before, but now the numbers are here to back me up.

More and companies are realizing the truth about this situation and companies like Codility are popping up to offer a service to test candidates for you.

## What this means for you?

You might not get asked to solve a problem or write code in your next interview, but there is a growing chance that you will.

You may think to yourself, ok even if I bomb the code writing part, I will do great on the rest of the interview and be able to convince them that I know how to write code, besides any company that would toss me out of the running for failing some stupid online test is a company I don’t want to work for anyway.

To that I say a company that doesn’t know how to do math is a company that you don’t want to work for. If my numbers are anything close to accurate, the mathematical best choice is to almost always throw out someone who fails the coding test.

The simple solution is to learn how to solve these kinds of problems.  Learn how to write an algorithm under pressure in 30 minutes.  It’s not that hard to practice and the skill will help you in many other areas of your career and probably your life in general.

I’ve written before on TopCoder as a great site to practice these kinds of problems, and if you haven’t checked it out that is a great place to start.

Another option is to get the book Programming Pearls.  It is a bit dated, but has some nice programming problems for you to try to solve.

I’ll follow up this post in the coming weeks with some hints and tips on how to take a programming problem, break it down and finally code it.  Learning how to do this properly can make some of the more difficult algorithm type problems very simple.

So get to it!  Start learning how to solve programming problems.  Reverse strings, sort linked lists, help grandma pick the highest yielding grandsons and a granddaughters to send Christmas cards to!

If you are following my back to basics series, don’t worry, it is still on.  I’ll probably pick back up with it after the holidays.

# Back To Basics: Sorting

Why is sorting so hard?

One of the most common misunderstandings and frustrations I see from developers is around sorting.

Almost every developer has faced needing to sort a list of things in some manner in their development careers.  Many developers end up fumbling through it, looking for an example from the web and then copying that example, not really knowing what they did or why it works.

If you fall into that category, or just want to know a little better how sort works, stay tuned, we are going to make it simple.

## Sorting algorithms?

Nope.

Great for computer science.  You should have a basic idea of how they work, but unless you are writing some low level bit twiddling code on an integrated circuit, you don’t need to know about them.

Let’s think of sorting algorithms as a black box.  You don’t need to know what the algorithm is or how it works, you just need to know that if you give the two things it requires it will sort your list for you.

What two things do all sorting algorithms, regardless of implementation need?

1. A list of things to sort
2. A method to call to tell if object A comes before or after object B, or whether they are the same.

That is all you need to know and give to a sorting algorithm implementation and it will do the rest.

Every modern programming language has at least one sorting algorithm and every single one of them has the exact two same inputs I mentioned above, a list to sort, and a comparison method.

It doesn’t matter if you are using C#, Java, Ruby or some other language, they all implement a sorting algorithm and need those two things.  And that is all they need from you.

The burden is off your shoulders.

## Step 1: Find the sorting method

I’m going to do better than teach you how to sort in each language.  I’m going to show you how to figure out how to sort in a language.

The first step in doing that regardless of the language, is to find the method that sorts things.

A good place to look is the list or collection object itself, since that is the most obvious place to put a sort routine.

When we look up List in C#, sure enough we find Sort on there.

Oh noes!  So confusing!  It has 3 sort methods!  Which one do I call?  Agh!

Don’t panic.  Remember what I said above.  Even though you see three signatures for the sort method, sort methods still only need two things.

Sort(), Sort(Comparison<T>), and Sort(IComparer<T>) still need just a list and a method to compare two of your objects.

We’ll dive into that more in a bit, now let’s look at Java.

If we look up ArrayList in Java, we don’t find a sort method.  Boo.  What were they thinking.  It’s ok though, googling for “Java sort” turns up Arrays.sort and Collections.sort.

Let’s look at Collections.sort.

This time we see two sort methods, sort(List list), and sort(List list, Comparator c).

Again, nothing to be afraid of since we know the two things all sorting algorithms require from us.

## Step 2: Find the two things

So if every sorting method needs the same two things from you, then all you need to do is figure out how to give it those two things.

In the C# and Java instances above, the first thing (the list of things to sort) is fairly obvious.  In C# we saw that the sort method was on the list class itself, so we don’t even need to pass it the list of things to sort, it still needs it, but it knows where to find it.

In the Java instance the first parameter to both sort methods is a list.

See how easy we just made things?  We are already 50% there.

So really the only question left is how to provide the method that compares two of our objects.

When we know what we are looking for, it is much easier to find and understand.

Let’s break down each instance and figure this out.

If we look at C#’s Sort() method, without reading the documentation, we can probably guess two things.

1. The list to sort is the list we are calling the sort method on.
2. The objects in the list provide the comparing method.

The first is obvious.  The second one is a little less obvious, but if you think about it, since the method takes no parameters the only possible thing that could be providing the method for comparison is the objects in the list.

If we look at the documentation for Sort(), we find that is true.  The objects in the list have to implement IComparableIComparable has one method, CompareTo, which takes one object and compares it to the other.  If you implement this interface on your object, you can sort a list of them.  Simple.

For Sort(Comparison<T>), we can easily deduce that what is passed in there must be the method to compare two objects, and sure enough it is.  If we want to use this method we just pass in a method of type Comparison<T> where T is the type of object we want to compare.

We can see that Comparison<T> is just a method that takes two of your objects and returns an int to indicate which is greater or if they are equal (1 = greater, 0 = equal, –1 less).

The final C# Sort(IComparer<T>) just takes an object that implements IComparer<T>.  Sure enough, IComparer<T> has one method that must be implemented, Compare.  That method has the same signature as Comparison<T> from above.

Now, let’s look at Java.  You’re going to find it is pretty much the same thing.

For sort(List list), we can easily see that the list to sort is passed in as a parameter, but once again we have to find where we supply the method to compare two of our objects.

Since it can’t be passed in, it must be… say it with me… on the object.  That is right!

Turns out we just have to have our objects implement Comparable and then they can be sorted.  When we look at the Comparable interface, we see that it just has one method, CompareTo(Object o) and that method just takes another object to compare our object to and returns 1 if ours is greater, 0 if it is the same, or –1 if ours is smaller.

Now let’s look at sort(List list, Comparator c).  In this case, we can see that the method to compare our two objects is passed in, but since we can’t pass methods in Java, Comparator is just an interface that we can implement that provides the comparison method.

If we look at the Comparator interface, we see it has one method we need to implement compare(Object o1, Object o2).  I’m sure you can guess what that method should return by now.

So regardless of the language, when you are looking at how to use a sorting algorithm, find the two things that it needs.  The list is usually obvious, but the method to compare is almost always implemented in one of three ways.

1. An interface the objects being compared implement.
2. A method that is passed in.
3. An object that is passed in that implements a comparison interface.

## Step 3: Implement the method

After you have figured out how to pass your list and your comparison method to the sorting algorithm, you need to actually write the comparison method.

If you learn to write a comparison method you can use it for any sorting algorithm, since all comparison methods do the exact same thing.

Fortunately, this is easy as well, because comparison methods always do the exact same thing.  They take two objects and return whether the first one is larger, smaller, or equal to the second.  That is it.

Don’t worry about sorting.  Remember what I told you about the sorting algorithm black box?  Let that sorting algorithm worry about sorting, all you have to do is decide whether 1 given item comes before a 2nd given item or if they are they same, nothing more.

Can you do that?

Sort people on age?

If person1.Age > person2.Age return 1

Else if person1.Age < person2.Age return –1

Else return 0

Psuedo-code, but pretty simple.

## Wrapping it up

If you can figure out what method you can use to sort in your language, find how to pass in the list of things to sort and provide the comparison method, and implement the comparison method, then you can sort!

Just remember to not get overwhelmed and think logically about the two things that all sorting methods need from you, and you’ll see how simple the whole sorting thing really is.

You can be the guy that everyone asks about sorting instead of the guy copying and pasting some sorting code from the internet that you aren’t really sure how it works.

By the way, in C#, I would recommend just using the LINQ extension method OrderBy.

Very nice to do something like:

people.OrderBy(p => p.Age).OrderBy(p => p.Name);

###### As always, you can subscribe to this RSS feed to follow my posts on Making the Complex Simple.  Feel free to check out ElegantCode.com where I post about the topic of writing elegant code about once a week.  Also, you can follow me on twitter here.

KevDog posted a question in response to my post about refactoring switch statements, Pulling out the Switch: It’s Time for a Whooping. I thought it would be good to go ahead and answer it as a post since it is a pretty interesting real world example of a somewhat difficult switch statement to get rid of.

Here is the original code:

```public static DataFileParser GetParser(DataFile dataFile)
{
switch ((FileType)dataFile.ValidationFormat.FileType)
{
case FileType.PDF:
return new PdfParser(dataFile);
case FileType.Image:
return new ImageParser(dataFile);
return new CsvParser(dataFile, true);
case FileType.Csv:
return new CsvParser(dataFile, false);
default:
throw new NotImplementedException("There is no parser for " +
dataFile.ValidationFormat.FileType.ToString());
}
}
```

## Surprisingly simple solution

Try and say that 3 times fast.

I thought about this a bit and at first was having a hard time coming up with a solution.  Then I typed the code into an editor and realized how easy it is.

The trick here is that it looks like something other than the simple case of data mapping to logic, but it isn’t.

• The logic in this case is the creation of the Parser.
• The data is the file type.

Once you think of it in those terms you can easily solve it using the pattern I mention in my previous post on refactoring switch statements.

## Switch be gone!

```private static Dictionary<FileType, Func<DataFile, DataFileParser>> DataFileTypeToParserCreatorMap =
new Dictionary<FileType, Func<DataFile, DataFileParser>>()
{
{FileType.PDF, file => new PdfParser(file)},
{FileType.Image, file => new ImageParser(file)},
{FileType.CsvWithHeaderRow, file => new CsvParser(file, true)},
{FileType.Csv, file => new CsvParser(file, false)}
};

public static DataFileParser GetParserRefactored(DataFile dataFile)
{
Func<DataFile, DataFileParser> parserCreator;
if(!DataFileTypeToParserCreatorMap.TryGetValue(
dataFile.ValidationFormat.FileType,
out parserCreator))
throw new NotImplementedException("There is no parser for " +
dataFile.ValidationFormat.FileType.ToString());

return parserCreator(dataFile);
}
```

That is the quickest solution that preserves the existing code as much as possible.

## Another solution

With the first solution we pushed the object creation into the map.

If we can make the constructor for all the parsers the same, we can use reflection to dynamically create our instances by looking up the type in the dictionary.

In this example, I assume that we have refactored CsvParser to have a constructor that only takes one parameter and internally sets a value of usesHeader to false, and we have created a CsvWithHeaderParser that inherits from the CsvParser and sets usesHeader to true.

```private static Dictionary<FileType, Type> DataFileTypeToParserTypeMap = new Dictionary<FileType, Type>()
{
{ FileType.PDF, typeof(PdfParser)},
{ FileType.Image, typeof(ImageParser)},
{ FileType.Csv, typeof(CsvParser)}
};

public static DataFileParser GetParser(DataFile dataFile)
{
Type parserType;
if(!DataFileTypeToParserTypeMap.TryGetValue(
dataFile.ValidationFormat.FileType,
out parserType))
throw new NotImplementedException("There is no parser for " +
dataFile.ValidationFormat.FileType.ToString());

return (DataFileParser) Activator.CreateInstance(parserType, dataFile);
}

```

Pretty similar solution.  I prefer the first though for several reasons:

• The refactor is localized, where the second solution has to touch other classes.
• Reflection makes you lose compile time safety.
• You may create a new parser that you want to have more parameters for the constructor.  With the second solution, you will have a hard time doing that.
• The first solution gives you ultimate flexibility in setting up the constructor of the parser.  If you wanted to do 5 steps for a particular parser, you could.

Anyway, next time you try refactoring switch statements that are hard to figure out how to refactor, try to break it into a mapping between data and logic.  There is always a solution.

# Pulling out the Switch: It’s Time for a Whooping

In my previous post I talked about how if-else and switch statements are very similar in that they both ignore the problem of combining data with code.

Today I am going to show you how to refactor switch statements to alleviate that problem.

There are some varied opinions on how to refactor switch statements which I believe derive from trying to treat all switch statements as the same.  I want to look at the kinds of switch statements that exist and why I recommend to refactor each one in a particular way.

## Data to data Switch Statement

The first, most obvious kind of switch statement is one that maps one form of data to another form of data.

```switch (state)
{
case "Florida": return "Tallahassee";
case "Idaho": return "Boise";
case "Arizona": return "Phoenix";
case "South Carolina": return "Columbia";
}
```

This example is clearly mapping one piece of data to another.  The best refactor for this situation is to use a map.  In C# it is a dictionary, in Java a map.

```var stateToCapitolMap = new Dictionary<string, string>()
{
{"Florida", "Tallahassee"},
{"Idaho", "Boise"},
{"Arizona", "Phoenix"},
{"South Carolina", "Columbia"}
};

return stateToCapitolMap[state];
```

I cannot believe how many people argue against this refactoring.  It doesn’t look like much, but we have greatly separated logic from data and increased the maintainability of the code.

Before our refactoring, consider, how you would be able to read all the states and capitols from a file and insert them into the switch statement?  The only possible way would be through code generation.  Clearly this indicates a coupling of code and data.  The switch statement is formatted in such a way that it almost looks like data, but don’t let that fool you, it is code.

Consider the refactored example.  If we want to read the values from a file, it is simple.  So simple, that I’ll even show the code right here.

```var stateToCapitolMap = new Dictionary<string, string>();
{
string[] stateCapitol = line.Split(',');
}
return stateToCapitolMap[state];
```

## Data to action Switch Statement

This kind of switch statement appears different than data to data, but it is actually very similar.  In this case we are mapping some data to a direct action that should be performed given that data.

Often this form of logic can be disguised by multiple actions happening in a case statement.

```switch (move)
{
case "Up": MoveUp();
break;
case "Down": MoveDown();
break;
case "Left": MoveLeft();
break;
case "Right": MoveRight();
break;
case "Combo":
MoveUp();
MoveUp();
MoveDown();
MoveDown();
break;
}
```

We can take the same approach here because really this is a form of mapping data to data.  The second data item is essentially the name of a method to call to perform an action.  We can illustrate this intent easily in C#.  (In Java, you will need to wrap the action into a set of classes with a common interface.)

```var moveMap = new Dictionary<string, Action>()
{
{"Up", MoveUp},
{"Down", MoveDown},
{"Left", MoveLeft},
{"Right", MoveRight},
{"Combo", () => { MoveUp(); MoveUp(); MoveDown(); MoveDown(); }}
};

moveMap[move]();
```

What we are essentially doing now is a dynamic look-up of the method to call based on the data.  We could even make this example data driven from a text file that specified how to map a move to a method name or list of method names, but that is far beyond on the scope of this post, and I don’t think I would recommend it unless you have a really good reason.

## Data to multiple actions Switch Statement

If you are familiar with the techniques of refactoring switch statements, you make be shaking your head by now saying, “that guy is wrong, he needs to use a factory.”  Okay, well now we are going to do it.

In the data-to-action refactoring, I opted for the simplest solution that can work, instead of trying to over solve the problem by adding complexity in the form of a factory.

But, what happens when you have multiple switch statements in your code that operate on the same set of data?

```switch (move)
{
case "Up": MoveUp();
break;
case "Down": MoveDown();
break;
case "Left": MoveLeft();
break;
case "Right": MoveRight();
break;
case "Combo":
MoveUp();
MoveUp();
MoveDown();
MoveDown();
break;
}

// ... somewhere else

switch (move)
{
case "Up":
moveName = "Basic Up Move";
break;
case "Down":
moveName = "Basic Down Move";
break;
case "Left":
moveName = "Basic Left Move";
break;
case "Right":
moveName = "Basic Right Move";
break;
case "Combo":
moveName = "Up Up Down Down Combo Move";
break;
}
```

Sure, we could refactor these both into maps or dictionaries.  But what will happen when we try and add a new move?  We’ll have to remember to add logic in both places or we’ll have a problem.  In the prior example we recognized that data was being mapped to an action, so we represented that as succinctly in code as possible.

In this example, the same data is being mapped to some data that describes a move and actions to perform.  We need some way to house these attributes that belong to the data we are switching on together, and we would like to have this all in one place, so we don’t have to change the code in multiple places.

Our best solution here is to use a factory that gives us the right kind of object that implements the behavior that should be tied to the data we are currently switching on.

```public interface IMove
{
void DoMove();
string GetFriendlyName();
}

public class UpMove : IMove
{
public void DoMove()
{
// Do the move
}

public string GetFriendlyName()
{
return "Basic Up Move";
}
}

public static class MoveFactory
{
private static Dictionary<string, Func<IMove>> moveMap = new Dictionary<string, Func<IMove>>()
{
{"Up", () => { return new UpMove(); }},
{"Down", () => { return new DownMove(); }},
{"Left", () => { return new LeftMove(); }}
// ...
};

public static IMove CreateMoveFromName(string name)
{
return moveMap[name]();
}
}
```

You can see here that we are creating a factory which contains a dictionary which maps a move name to what kind of object to create.  Each move implements a common IMove interface.  (I only show some of the implementation here.)

Now in our code we can replace those switch statements with polymorphic behavior from our object returned from the factory.

```IMove move = MoveFactory.CreateMoveFromName(name);
move.DoMove();
String friendlyName = move.GetFriendlyName();
```

The nice thing about this implementation is that if we try to add a new move the IMove interface will require us to implement all the proper methods.  We make a change in one place and the compiler reminds us what we need to do.

## Don’t jump straight to the factory

You may have heard the argument between using a factory or a dictionary to refactor switch statements before.  What I am trying to show in this blog post is that it depends on your situation.

The simplest solution is a dictionary or map.  Once you have a second place you are mapping the same data, you should move to a factory.  The factory then contains the mapping between a piece of data and a class.

I also wanted to note here that I didn’t use enumerations.  In real code you should.  I avoided them here to prevent adding one more layer of abstraction so that my example would not require as much explanation.