What Makes Readable Code: Not What You Think

By John Sonmez

You often hear about how important it is to write “readable code.”

Developers have pretty strong opinions about what makes code more readable. The more senior the developer, the stronger the opinion.

But have you ever stopped to think about what really makes code readable?

readable What Makes Readable Code: Not What You Think

The standard answer

You would probably agree that the following things, regardless of programming language, contribute to the readability of code:

  • Good variable, method and class names
  • Variables, classes and methods that have a single purpose
  • Consistent indentation and formatting style
  • Reduction of the nesting level in code

There are many more standard answers and pretty widely held beliefs about what makes code readable and I am not disagreeing with any of these.

(By the way, an excellent resource for this kind of information about “good code” is Robert Martin’s excellent book, Clean Code, or Steve McConnell’s book that all developers should read, Code Complete. *both of these are affiliate links, thanks for your support.)

Instead, I want to point you to a deeper insight about readability…

The vocabulary and experience of the reader

I can look at code and in 2 seconds tell if you it is well written and highly readable or not.  (At least my opinion.)

At the same time, I can take a sample of my best, well written, highly readable code and give it to a novice or beginner programmer, and they don’t spot how it is different from any other code they are looking at.

Even though my code has nice descriptive variable names, short well named methods with few parameters that do one thing and one thing only, and is structured in a way that clearly groups the sections of functionality together, they don’t find it any easier to read than they do code that has had no thought put into its structure whatsoever.

In fact, the complaint I get most often is that my code has too many methods, which makes it hard to follow, and the variable names are too long, which is confusing.

There is a fundamental difference in the way an experienced coder reads code versus how a beginner does

An experienced developer reading code doesn’t pay attention to the vocabulary of the programming language itself.  An experienced developer is more focused on the actual concept being expressed by the code—what the purpose of the code is, not how it is doing it.

A beginner or less experienced developer reads code much differently.

When a less experienced developer reads code, they are trying to understand the actual structure of the code.  A beginner is more focused on the actual vocabulary of the language than what the expression of that language is trying to convey.

To them, a long named variable isn’t descriptive, it’s deceptive, because it is hiding the fact that NumberOfCoins represents an integer value with its long name and personification of the variable, as something more than just an integer.  They’d rather see the variable named X or Number, because its confusing enough to remember what an integer is.

An experienced developer, doesn’t care about integers versus strings and other variable types.  An experienced developer wants to know what the variable represents in the logical context of the method or system, not what type the variable is or how it works.

Example: learning to read

Think about what it is like to learn to read.

When kids are learning to read, they start off by learning the phonetic sounds of letters.

When young kids are reading books for the first time, they start out by sounding out each word.  When they are reading, they are not focusing on the grammar or the thought being conveyed by the writing, so much as they are focusing on the very structure of the words themselves.

Imagine if this blog post was written in the form of an early reader.

Imagine if I constrained my vocabulary and sentence structure to that of a “See Spot Run” book.

Would you find my blog to be highly “readable?”  Probably not, but kindergarteners would probably find it much more digestible.  (Although they would most likely still snub the content.)

sheetmusic What Makes Readable Code: Not What You Think

You’d find the same scenario with experienced musicians, who can read sheet music easily versus beginners who would probably much prefer tablature.

An experienced musician would find sheet music much easier to read and understand than a musical description that said what keys on a piano to press or what strings on a guitar to pluck.

Readability constraints

Just like you are limited to the elegance with which you can express thoughts and ideas using the vocabulary and structure of an early reader book, you are also limited in the same way by both the programming language in which you program in and the context in which you program it.

This is better seen in an example though.  Let’s look at some assembly language.

.model small
.stack 100h

.data
msg     db      'Hello world!$'

.code
start:
        mov     ah, 09h   ; Display the message
        lea     dx, msg
        int     21h
        mov     ax, 4C00h  ; Terminate the executable
        int     21h

end start

This assembly code will print “Hello World!” to the screen in DOS.

With x86 assembly language, the vocabulary and grammar of the language is quite limited.  It isn’t easy to express complex code in the language and make it readable.

There is an upper limit on the readability of x86 assembly language, no matter how good of a programmer you are.

Now let’s look at Hello World in C#.

public class Hello1
{
   public static void Main()
   {
      System.Console.WriteLine("Hello, World!");
   }
}

It’s not a straight across the board comparison, because this version is using .NET framework in addition to the C# language, but for the purposes of this post we’ll consider C# to include the base class libraries as well.

The point though, is that with C#’s much larger vocabulary and more complicated grammar, comes the ability to express more complex ideas in a more succinct and readable way.

Want to know why Ruby got so popular for a while?  Here is Hello World in Ruby.

puts "Hello, world"

That’s it, pretty small.

I’m not a huge fan of Ruby myself, but if you understand the large vocabulary and grammar structure of the Ruby language, you’ll find that you can express things very clearly in the language.

Now, I realize I am not comparing apples to apples here and that Hello World is hardly a good representation of a programming language’s vocabulary or grammar.

My point is, the larger the vocabulary you have, the more succinctly ideas can be expressed, thus making them more readable, BUT only to those who have a mastery of that vocabulary and grammar.

What we can draw from all this?

So, you might be thinking “oh ok, that’s interesting… I’m not sure if I totally agree with you, but I kind of get what you’re saying, so what’s the point?”

Fair question.

There is quite a bit we can draw from understanding how vocabulary and experience affects readability.

First of all, we can target our code for our audience.

We have to think about who is going to be reading our code and what their vocabulary and experience level is.

In C#, it is commonly argued whether or not the conditional operator should be used.

Should we write code like this:

var nextAction = dogIsHungry ? Actions.Feed : Actions.Walk;

Or should we write code like this:

var nextAction = Actions.None
if(dogIsHungry)
{
   nextAction = Actions.Feed
}
else
{
   nextAction = Actions.Walk;
}

I used to be in the camp that said the second way was better, but now I find myself writing the first way more often.  And if someone asks me which is better, my answer will be “it depends.”

The reason why it depends is because if your audience isn’t used to the conditional operator, they’ll probably find code that uses it confusing.  (They’ll have to parse the vocabulary rather than focusing on the story.)  But, if your audience is familiar with the conditional operator, the long version with an if statement, will seem drawn out and like a complete waste of space.

The other piece of information to gather from this observation is the value of having a large vocabulary in a programming language and having a solid understanding of that vocabulary and grammar.

The English language is a large language with a very large vocabulary and a ridiculous number of grammatical rules.  Some people say that it should be easier and have a reduced vocabulary and grammar.

If we made the English language smaller, and reduced the complex rules of grammar to a more much simple structure, we’d make it much easier to learn, but we’d make it harder to convey information.

What we’d gain in reduction of time to mastery, we’d lose in its power of expressiveness.

One language to rule them all?

It’s hard to think of programming languages in the same way, because we typically don’t want to invest in a single programming language and framework with the same fervor as we do a spoken and written language, but as repugnant as it may be, the larger we make programming languages, and the more complex we make their grammars, the more expressive they become and ultimately—for those who achieve mastery of the vocabulary and grammar—the more readable they become. (At least the potential for higher readability is greater.)

Don’t worry though, I’m not advocating the creation of a huge complex programming language that we should all learn… at least not yet.

This type of thing has to evolve with the general knowledge of the population.

What we really need to focus on now is programming languages with small vocabularies that can be easily understood and learned, even though they might not be as expressive as more complicated languages.

Eventually when a larger base of the population understands how to code and programming concepts, I do believe there will be a need for a language as expressive to computers and humans alike, as English and other written languages of the world are.

What do you think?  Should we have more complicated programming languages that take longer to learn and master in order to get the benefit of an increased power of expression, or is it better to keep the language simple and have more complicated and longer code?

If you like this post don’t forget to Follow @jsonmez or subscribe to my RSS feed.