Back To Basics: Sorting

Written By John Sonmez

Why is sorting so hard?

One of the most common misunderstandings and frustrations I see from developers is around sorting.

Almost every developer has faced needing to sort a list of things in some manner in their development careers.  Many developers end up fumbling through it, looking for an example from the web and then copying that example, not really knowing what they did or why it works.

If you fall into that category, or just want to know a little better how sort works, stay tuned, we are going to make it simple.

harry-potter-sorting-hat

Sorting algorithms?

Nope.

Great for computer science.  You should have a basic idea of how they work, but unless you are writing some low level bit twiddling code on an integrated circuit, you don’t need to know about them.

Let’s think of sorting algorithms as a black box.  You don’t need to know what the algorithm is or how it works, you just need to know that if you give the two things it requires it will sort your list for you.

What two things do all sorting algorithms, regardless of implementation need?

  1. A list of things to sort
  2. A method to call to tell if object A comes before or after object B, or whether they are the same.

That is all you need to know and give to a sorting algorithm implementation and it will do the rest.

Every modern programming language has at least one sorting algorithm and every single one of them has the exact two same inputs I mentioned above, a list to sort, and a comparison method.

It doesn’t matter if you are using C#, Java, Ruby or some other language, they all implement a sorting algorithm and need those two things.  And that is all they need from you.

The burden is off your shoulders.

Step 1: Find the sorting method

I’m going to do better than teach you how to sort in each language.  I’m going to show you how to figure out how to sort in a language.

The first step in doing that regardless of the language, is to find the method that sorts things.

A good place to look is the list or collection object itself, since that is the most obvious place to put a sort routine.

When we look up List in C#, sure enough we find Sort on there.

Oh noes!  So confusing!  It has 3 sort methods!  Which one do I call?  Agh!

Don’t panic.  Remember what I said above.  Even though you see three signatures for the sort method, sort methods still only need two things.

Sort(), Sort(Comparison<T>), and Sort(IComparer<T>) still need just a list and a method to compare two of your objects.

We’ll dive into that more in a bit, now let’s look at Java.

If we look up ArrayList in Java, we don’t find a sort method.  Boo.  What were they thinking.  It’s ok though, googling for “Java sort” turns up Arrays.sort and Collections.sort.

Let’s look at Collections.sort.

This time we see two sort methods, sort(List list), and sort(List list, Comparator c).

Again, nothing to be afraid of since we know the two things all sorting algorithms require from us.

Step 2: Find the two things

So if every sorting method needs the same two things from you, then all you need to do is figure out how to give it those two things.

In the C# and Java instances above, the first thing (the list of things to sort) is fairly obvious.  In C# we saw that the sort method was on the list class itself, so we don’t even need to pass it the list of things to sort, it still needs it, but it knows where to find it.

In the Java instance the first parameter to both sort methods is a list.

See how easy we just made things?  We are already 50% there.

So really the only question left is how to provide the method that compares two of our objects.

When we know what we are looking for, it is much easier to find and understand.

Let’s break down each instance and figure this out.

If we look at C#’s Sort() method, without reading the documentation, we can probably guess two things.

  1. The list to sort is the list we are calling the sort method on.
  2. The objects in the list provide the comparing method.

The first is obvious.  The second one is a little less obvious, but if you think about it, since the method takes no parameters the only possible thing that could be providing the method for comparison is the objects in the list.

If we look at the documentation for Sort(), we find that is true.  The objects in the list have to implement IComparableIComparable has one method, CompareTo, which takes one object and compares it to the other.  If you implement this interface on your object, you can sort a list of them.  Simple.

For Sort(Comparison<T>), we can easily deduce that what is passed in there must be the method to compare two objects, and sure enough it is.  If we want to use this method we just pass in a method of type Comparison<T> where T is the type of object we want to compare.

We can see that Comparison<T> is just a method that takes two of your objects and returns an int to indicate which is greater or if they are equal (1 = greater, 0 = equal, –1 less).

The final C# Sort(IComparer<T>) just takes an object that implements IComparer<T>.  Sure enough, IComparer<T> has one method that must be implemented, Compare.  That method has the same signature as Comparison<T> from above.

Now, let’s look at Java.  You’re going to find it is pretty much the same thing.

For sort(List list), we can easily see that the list to sort is passed in as a parameter, but once again we have to find where we supply the method to compare two of our objects.

Since it can’t be passed in, it must be… say it with me… on the object.  That is right!

Turns out we just have to have our objects implement Comparable and then they can be sorted.  When we look at the Comparable interface, we see that it just has one method, CompareTo(Object o) and that method just takes another object to compare our object to and returns 1 if ours is greater, 0 if it is the same, or –1 if ours is smaller.

Now let’s look at sort(List list, Comparator c).  In this case, we can see that the method to compare our two objects is passed in, but since we can’t pass methods in Java, Comparator is just an interface that we can implement that provides the comparison method.

If we look at the Comparator interface, we see it has one method we need to implement compare(Object o1, Object o2).  I’m sure you can guess what that method should return by now.

So regardless of the language, when you are looking at how to use a sorting algorithm, find the two things that it needs.  The list is usually obvious, but the method to compare is almost always implemented in one of three ways.

  1. An interface the objects being compared implement.
  2. A method that is passed in.
  3. An object that is passed in that implements a comparison interface.

Step 3: Implement the method

After you have figured out how to pass your list and your comparison method to the sorting algorithm, you need to actually write the comparison method.

If you learn to write a comparison method you can use it for any sorting algorithm, since all comparison methods do the exact same thing.

Fortunately, this is easy as well, because comparison methods always do the exact same thing.  They take two objects and return whether the first one is larger, smaller, or equal to the second.  That is it.

Don’t worry about sorting.  Remember what I told you about the sorting algorithm black box?  Let that sorting algorithm worry about sorting, all you have to do is decide whether 1 given item comes before a 2nd given item or if they are they same, nothing more.

Can you do that?

Sort people on age?

If person1.Age > person2.Age return 1

Else if person1.Age < person2.Age return –1

Else return 0

Psuedo-code, but pretty simple.

Wrapping it up

If you can figure out what method you can use to sort in your language, find how to pass in the list of things to sort and provide the comparison method, and implement the comparison method, then you can sort!

Just remember to not get overwhelmed and think logically about the two things that all sorting methods need from you, and you’ll see how simple the whole sorting thing really is. 

You can be the guy that everyone asks about sorting instead of the guy copying and pasting some sorting code from the internet that you aren’t really sure how it works.

By the way, in C#, I would recommend just using the LINQ extension method OrderBy.

Very nice to do something like:

people.OrderBy(p => p.Age).OrderBy(p => p.Name);