By Jonathan Boccara, May 31, 2017

To Comment or Not to Comment? That is the Coder’s Question

Comments are one of those places where religion meets technology, metaphorically of course. On one side are the firm believers that good code is commented code; on the other stand the devout followers of the chapel of sparse commenting.

To me, the primary characteristic that our code should have is expressiveness. That is to say that a reader of the code should understand the intention of the person who wrote it. It’s essential for the survival of a codebase; I’ve even devoted my blog to this topic.

Since I’m a C++ developer, I’ll use examples in C++ to illustrate my points in this article. But comments exist in virtually all languages, so what follows is by no means restricted to C++ and applies to your language, too.

We can't discuss expressive code without talking about comments at some point. Is code expressiveness an alternative to comments, or do the two satisfy different needs? This is the question I want to address, by showing when comments are useful and complement otherwise good code, and when they should be replaced by a refactoring of the code.

I have condensed a fair amount of data about the controversial topic of comments from:

  • the latest edition of the Paris Software Craftsmanship meetup, where one of the topics was an exchange of experiences on how to document code. (This meetup is a great event, by the way; anyone interested in getting better as a software developer and meeting interesting people would enjoy it.)
  • the reference book Code Complete by Steve McConnell (ranking first in John’s reading list), which actually dedicates 33 pages to the topic of comments (plus other related parts).
  • my own experience and reflection on this topic.

One last thing before we start: I’m not going to try to convert you to one of the two churches of commenting. Some comments add to the quality of code and some damage it. This article will show you which comments are of which sort, so that commenting becomes a powerful asset in your developer’s toolbox.

If I had to sum it up in two sentences…

Here is the rule that synthesizes it all:

“Imagine what you would need to tell someone who is reading your code if you were sitting right next to them. That is what you put in comments.”

And by a strange coincidence, this sentence contains exactly 140 characters. Must mean something, right?

Indeed, consider the following line of code:
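The line itself does not appear here, so as a stand-in, here is a minimal sketch of a comment that merely restates the code. The names `entries`, `processEntries` and `handle` are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for "the actual work": it simply counts the entries.
std::size_t processEntries(const std::vector<int>& entries)
{
    return entries.size();
}

std::size_t handle(const std::vector<int>& entries)
{
    // Check that there are some entries before doing the actual work
    if (!entries.empty())
    {
        return processEntries(entries);
    }
    return 0;
}
```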

Imagine you were telling the person reading your code while you're sitting next to them, “Look, here we check that there are some entries before doing the actual work.” What is this person likely to answer? “Thanks, but I can read!” By chipping in like that, you're just getting in the way and even breaking their focus.

The same goes for comments. You don't want them to repeat what the code says.

On the other hand, if the person opens up a big source file and you have to tell them, “This file deals with such and such aspects of the program,” then you'd be doing them a big favor because it would take them longer to figure this out just by looking at the code. The same goes for comments; they start adding value when they provide information that the code itself doesn’t easily give (even if it’s good code).

Avoid needing explanatory comments

There is another kind of comment: the kind that explains what the code does.

Such comments can carry valuable information for a reader who would otherwise struggle to understand the code. But the piece of code that contains them is usually bad code, because it is so unclear that it needs to be explained.

The advice that is generally given is to write that piece of code differently in order to make it more expressive. There are a lot of ways to make code more expressive (take a look at Fluent C++ to see more about this).

When writing new code, it certainly makes sense to make it tell the story. But I'm not sure that this piece of advice is realistic in all situations.

Imagine that you're working on a bugfix and you stumble on unclear code that you struggle to understand, but finally manage to figure out. Are you going to interrupt your work and change it? It's unlikely. Or even log it for a later refactoring? Are you really going to do this for every such piece of code? It would be a titanic task to do this systematically.

On the other hand, leaving a comment summarizing your findings may be a quick win for everyone. Indeed, leaving a comment on a piece of code that you’ve figured out takes just a moment, whereas refactoring and testing, while a more profound change, takes much more time—which makes it unrealistic to do systematically on a large codebase.

Plus, some code doesn't belong to you. Some explanatory comments say that a piece of code is written this way because, way down in the stack, there is something warped that forces us to warp ourselves as a result. But you may not have access to the culprit code! For these situations, in my opinion, explanatory comments have a reason to exist.

But when you write brand new code, it should be straightforward enough to be self-explanatory.

Now, there are quick wins that can do away with some explanatory comments, like getting rid of magic values. Magic values are constant values that appear in the code without being named. Consider the following commented code:
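The snippet does not appear here, so as a stand-in, here is a hypothetical example in the same spirit. The function `isTooBig`, the parameter `nbRows` and the meaning given to 100 are assumptions made for illustration.

```cpp
#include <cstddef>

bool isTooBig(std::size_t nbRows)
{
    // 100 is the maximum number of rows that can be displayed
    return nbRows > 100;
}
```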

It contains a magic value of 100, which is bad practice, and the comment clumsily tries to remedy it. This can be quickly changed into:
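As a sketch of that change, assuming a hypothetical check on a number of rows, the literal can be given a name. `MaxNbDisplayableRows` and `isTooBig` are illustrative names, not taken from a real codebase.

```cpp
#include <cstddef>

// Hypothetical named constant replacing the bare literal 100.
constexpr std::size_t MaxNbDisplayableRows = 100;

bool isTooBig(std::size_t nbRows)
{
    return nbRows > MaxNbDisplayableRows;
}
```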

thus making the explanatory comment superfluous.

“Comments don't get updated anyway”

This is the strongest argument of the anti-comments chapel. And it is true that nothing forces a maintainer to keep the comments in line with the code. When maintainers don't, the comments drift out of sync with the code and turn into misleading information. And everyone agrees that no comments are better than false comments.

While this is true, there are a few things that can reduce the chances of this happening.

The first one is to comment at the level of the intent, because the intent doesn't change as much as the actual implementation of that intent. (More on this in a moment.)

The second one is to keep the comments as close as possible to the corresponding code. Outdated comments are not the work of programmers with evil intentions; sometimes we just don't pay attention to the comments. Steve McConnell even suggests stamping variable names into comments, so that when searching for the occurrences of a variable, the comment also shows up.
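Here is one possible reading of that advice, as a minimal sketch. The function `totalPrice` and the variable `taxRate` are hypothetical. The comment sits right next to the line it describes and spells out the variable's name, so a search for `taxRate` lands on the comment as well as on the code.

```cpp
#include <vector>

double totalPrice(const std::vector<double>& unitPrices, double taxRate)
{
    double total = 0.0;
    for (double unitPrice : unitPrices)
    {
        total += unitPrice;
    }

    // taxRate is a fraction (0.2 means 20%), not a percentage
    return total * (1.0 + taxRate);
}
```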

Finally, the last tip is to add comments in places that don't change often, typically at the beginning of a file, to describe what this file is about. Even if the implementation in a file can change, the subject this file covers tends to stay stable for a long period of time.
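A file-level comment of this kind could look like the sketch below. The file name, its subject and the function `roundToCents` are all invented for the example. The point is that the header describes what the file covers, which stays true even if the implementation underneath is rewritten.

```cpp
// price_rounding.hpp (hypothetical file)
//
// This file gathers the rounding rules applied to prices before they are
// displayed. It says nothing about how prices are computed or stored.

#include <cmath>

// Rounds a price to the nearest cent.
inline double roundToCents(double price)
{
    return std::round(price * 100.0) / 100.0;
}
```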

Talk at the intention level

One thing that dramatically increases the expressiveness of code is raising levels of abstraction. Essentially, it consists of showing what the code does rather than how it does it; however, there is much more to it and I strongly recommend you learn more about levels of abstraction. To me, it’s really an overarching principle of programming.

And even though you want your code to be at the right abstraction levels, comments can play a role in it, too.

In Chapter 9 of Code Complete, Steve McConnell shows the technique of the Pseudocode Programming Process. This consists of starting by writing what you want the code of your function to do (in English) in comments. When this is done, then you insert the lines of actual code in C++ (or in any language, for that matter), naturally interweaving with the comments. Then you take away some of the comments that are redundant with the code and leave those that explain what the code intends to do.

For this to work, the comments should be at the level of abstraction of the function. In other terms, they need to express what the code intends to do, rather than how it implements it.
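To give an idea of what the result can look like, here is a minimal sketch of a function written this way. The function `median` and its behavior on empty input are my own assumptions and do not come from Code Complete. The comments that survived are the ones that state an intention the code doesn't spell out by itself.

```cpp
#include <algorithm>
#include <vector>

// Returns the median of the values, or 0 if there are none.
double median(std::vector<double> values)
{
    // Without data there is nothing to summarize
    if (values.empty())
    {
        return 0.0;
    }

    // The middle values can only be found once the values are in order
    std::sort(values.begin(), values.end());

    // An even count has no single middle value: take the average of the two central ones
    const auto middle = values.size() / 2;
    if (values.size() % 2 == 0)
    {
        return (values[middle - 1] + values[middle]) / 2.0;
    }
    return values[middle];
}
```

None of these comments mentions `std::sort` or indices: they would remain valid if the implementation changed, for instance to a partial sort.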

There is another level of intent that the code can hardly express: the why. Why was the code implemented this way and not with another design? If you've tried out a design that turned out not to work, this is valuable information for a maintainer of the code (which could be yourself), as it keeps them off the wrong track.

And if someone has actually been on that wrong track, encountered a bug, and made a fix, it may be useful to include a reference to the bug ticket in question in the comments.
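A comment documenting the why could look like the following sketch. The scenario is invented: the `Order` type, the function `sortByPriority` and the story about a first attempt with `std::sort` are assumptions, and the ticket reference is deliberately left as a placeholder.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Order
{
    std::string customer;
    int priority;
};

void sortByPriority(std::vector<Order>& orders)
{
    // Why std::stable_sort and not std::sort: orders with the same priority
    // must keep their arrival order. std::sort was tried first and shuffled
    // them (see ticket <TICKET-ID>, a placeholder for the real reference).
    std::stable_sort(orders.begin(), orders.end(),
                     [](const Order& a, const Order& b) { return a.priority < b.priority; });
}
```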

If you read Make your functions functional, you'll see that global variables break functions by inducing implicit inputs and outputs that the function has access to but doesn't declare in its prototype. A comment next to the prototype that indicates what interaction the function has with the global variable may be a good indication until the function gets fixed.
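As a minimal sketch of such a comment, assuming a hypothetical global `g_currentLocale` and a function `formatGreeting` that depends on it:

```cpp
#include <string>

// Hypothetical global state, for illustration only.
std::string g_currentLocale = "en";

// Implicit input: reads the global g_currentLocale, which the prototype doesn't show.
// Implicit output: none, the global is not modified.
std::string formatGreeting(const std::string& name)
{
    if (g_currentLocale == "fr")
    {
        return "Bonjour " + name;
    }
    return "Hello " + name;
}
```

The comment doesn't fix the design, but it warns the caller that two calls with the same argument may return different results.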

Another intent that is valuable to document in a comment is when (for some good reason) you make the conscious decision to go against what is usually a best practice. If you don't mention anything about it, there is a high probability that someone is going to “fix” it later.

This is illustrated by a bug in the Debian system that had a large impact when somebody removed code that used uninitialized memory, which looked like a mistake. It turned out this code was contributing to the random number generation for cryptographic keys. The result was that the generated keys were very weak and a lot of them had to be generated again. Oops.

So, to comment or not to comment?

Now do you think that commenting on code just comes down to a matter of personal preference or adherence to a belief?

If there is a belief to have, it’s that comments are another way to improve code but, like many other tools, they can be damaging if misused.

Comments are useful when they express what the code can’t, so disclosing the intention level is a good case for comments. This ties in with raising levels of abstraction, for which comments can help. Comments are best placed close to the corresponding code and are particularly robust when they talk about high-level concepts that don’t change often.

And while it’s better to have expressive code than to use comments as a crutch, if worst comes to worst, it’s preferable to leave a comment summarizing your findings on unclear code than to let the next developer scratch their head over what you’ve already figured out.

To sum it up in two sentences:

“Imagine what you would need to tell someone who is reading your code if you were sitting right next to them. That is what you put in comments.”

About the author

Jonathan Boccara

Jonathan has been a C++ software developer for 6 years, working for Murex, a major software vendor in the finance industry. His focus is on C++ and in particular on writing expressive code, with some insights coming from Functional Programming with Haskell as a second language. He blogs regularly about expressive code in C++ on Fluent C++. You can also find him on Twitter @JoBoccara.