Next: Conclusion Up: C++ Programming Style Previous: Indentation and Braces

Documentation

If you obey all the rules above, your program will have come a long way towards being clear and understandable. Nevertheless, many aspects of a program don't come out clearly in the code, and so we must document our code. The documentation usually describes the purpose, role, structure or technique of the code, because the reader often misses the forest for the trees: the code has all the details but none of the outline. As with an impressionist painting, we must step back from the code in order to see it clearly.

Rule 7   Document the program as a whole. What does it do, and what are its inputs and outputs? Say who wrote it, and when.

At the top of your program file, you should have a comment of a paragraph or so that explains what the program does. It needn't be elaborate, but it should get the reader started. Think of it as the title and abstract of a paper. For example:

/*  This program reads in an array of integers, with the number of
  integers specified by the user, up to a maximum of 100.  It then
  sorts those numbers and prints them in order, one per line.

Written by Scott D. Anderson
October 23, 1997
*/
That would be a bare minimum for this first chunk of documentation. Additional documentation would describe the structure of the program, in terms of what data are defined, what functions are called, and so forth.

Notice, by the way, that I didn't have to start every line with //, because I used C-style comment characters, which comment out everything from /* to */. These are very nice for long comments, especially paragraph-style comments where you might want to fill the paragraph when you're done (M-q in Emacs).

Rule 8   Document functions. Each function should be preceded by a brief paragraph explaining what it does, how it works (if necessary) and the meaning of its arguments and return values.

Each function is, essentially, a small program, so just as you'd document the inputs, outputs and purpose of a program, you should document each function. If that paragraph starts getting long and cumbersome, the function is probably doing too much; consider breaking it into pieces.
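
As an illustration of Rule 8, here is a minimal sketch of a documented function. The function name sortAscending, its parameters, and the choice of insertion sort are assumptions made for this example; they are not taken from the sorting program described earlier.

```cpp
#include <cassert>

// Sorts the first `count` elements of `values` into ascending order,
// in place.  It uses insertion sort: each element is moved left past
// any larger elements that precede it.
//
// values:  the array to sort; must hold at least `count` elements
// count:   the number of elements to sort; zero is allowed
// Returns: the number of elements shifted while sorting, which is zero
//          when the input is already in order.
int sortAscending(int values[], int count)
{
  assert(count >= 0);
  int shifts = 0;
  for (int i = 1; i < count; i++) {
    int current = values[i];            // The element being inserted
    int j = i;
    while (j > 0 && values[j-1] > current) {
      values[j] = values[j-1];          // Shift larger element right
      j--;
      shifts++;
    }
    values[j] = current;
  }
  return shifts;
}
```

The paragraph above the function covers what it does, briefly how it works, and the meaning of each argument and of the return value. If that paragraph needed to be much longer, that would be a hint to split the function.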

Rule 9   Document variables and data members. Explain the purpose and use of the data and any non-obvious aspects of its implementation.

We have seen this issue before, when we discussed how to name a variable. Often, the name is not sufficient to explain the variable, in which case the documentation helps the reader. For now, you should probably document all but the generic variables (like i or temp). For example, here are some class definitions:

class Student {
public:
  char name[MaxName+1];         // The student's name
  int number;                   // The student number, used for sorting
  float gpa;                    // The student's grade point average
};

class Link {
public:
  Student *elt;                 // A pointer, to make swapping efficient
  Link *next;                   // Pointer to next list element, or null.
};
You could argue that none of this documentation is particularly necessary, but it can be helpful, particularly for names that are abbreviated, such as "gpa", or possibly ambiguous, such as "number".

Notice, by the way, that the documentation is nicely lined up and separate from the code, making the code easy to read. This formatting is easy to do and worthwhile. In vi, you'd have to tab over the correct number of times; in Emacs, the command M-; will move you to the correct column and insert the comment symbols.

Having documented the program, each function, and all the variables and data members, you might think we're done. Not quite. The documentation we've discussed so far concentrates on the high-level view: the purpose and plan of the code. Sometimes, the implementation is a little tricky, and so we document those tricky parts of the code, explaining what's going on.

Rule 10   Document any code that is not obvious.

Trickiness, of course, is a matter of judgment. What is complex to a beginner is obvious to an experienced programmer. For example, it wasn't that long ago that you appreciated the following comment:

   cur = cur->next;             // Go to next list element
Now, that code is starting to be idiomatic to you--you see it as a single idea, like i++. Indeed, such pieces of code are sometimes called programming idioms or clichés. Soon, you'll be omitting documentation for code like that. But there will still be many things to document:
   (*tail) = new Node;          // Add new node to end of list.
   tail = &((*tail)->next);     // update tail ptr to addr of next in last elt
For now, you should document any code that isn't obvious to you. If you're not sure, err on the side of documenting, because it's good practice, and because something that is obvious while you're writing the code may not be so obvious to a reader, even to yourself after a week's time.
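
To see why the two tail-pointer lines above deserve a comment, it may help to view them in context. The following is only a sketch under stated assumptions: the Node layout and the helper functions append and length are hypothetical, since the text does not show how the list is declared.

```cpp
#include <cassert>

// Hypothetical node type; the text above does not show its definition.
struct Node {
  int value;                    // The data stored in this element
  Node *next;                   // Pointer to next list element, or null.
};

// Appends a node holding `value` to the end of a list and returns the
// new tail.  `tail` must be the address of the null pointer that ends
// the list: the head pointer itself if the list is empty, otherwise
// the `next` member of the last node.
Node **append(Node **tail, int value)
{
  assert(tail != nullptr && *tail == nullptr);
  (*tail) = new Node{value, nullptr};   // Add new node to end of list.
  return &((*tail)->next);              // Addr of next in new last elt
}

// Returns the number of nodes reachable from `head`.
int length(const Node *head)
{
  int count = 0;
  for (const Node *cur = head; cur != nullptr; cur = cur->next)
    count++;
  return count;
}
```

A caller would start with Node *head = nullptr and Node **tail = &head, and then write tail = append(tail, 1) for each new element. Because tail always holds the address of the pointer to overwrite, the empty list needs no special case, which is exactly the non-obvious part worth documenting.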

Rule 11   Use proper spelling and grammar, except when brevity is more important.

Programming style is a way of showing respect and concern for your reader: you're going to go to the extra effort to make your code clear and understandable. Not bothering to write proper English undercuts that message. It makes your reader work harder to understand you. It's more likely to be ambiguous or misleading. It makes you look uneducated, which can make your reader doubt you and your code. Finally, every once in a while, the reader will be some pedant who just becomes irrational when forced to read poor spelling and grammar.

If you think about it, spelling and grammar are the analog of the struggle we go through to write a program that means and does what we want. We have to spell all the keywords, types and identifiers correctly and use the syntax of the language properly to get our program to work. We just need to put the same effort into our documentation; less, actually, because English is both more familiar and less demanding than a programming language.

The only time spelling and grammar rules can be broken is in an end-of-line comment, when abiding by the rules would cause visual clutter because the comment would have to be continued on the next line. Consider the following:

   (*tail) = new Node;          // Add new node to end of list.
   tail = &((*tail)->next);     // Update the tail pointer to be the
                                // address of the ``next'' data member
                                // in the last element of the list
Is this clearer than the original? Probably a little clearer. Yet it took three lines, resulting in either (1) blank lines in our program, which breaks the visual rhythm of the code, or (2) putting the documentation at the end of other lines of code, which would be confusing. Neither of these is worth it. Abbreviate and telegraph as much as is reasonable when using end-of-line comments. Of course, you can't sacrifice clarity: if the documentation isn't clear, it's not worth it. So, if the end-of-line comment just can't be done, go to an in-code block comment, as follows:
   (*tail) = new Node;          // Add new node to end of list.

   // Update the tail pointer to be the address of the ``next'' 
   // data member in the last element of the list
   tail = &((*tail)->next);
I sometimes like to precede such comments with a blank line, so that the reader knows to shift gears when reading the code. However, with modern font coloring in Emacs, the comment will be in a different color and so the reader will easily be able to distinguish code from comments.

Rule 12   Remember that screens and printers have finite width. Stick to an 80-character line.

Back in the olden days, screens and printers weren't bitmapped and there was only one, fixed-size font. Screens were exactly 80 characters wide and 24 lines long, and printers put 80 characters on a line (unless you had a line-printer, which could go to 132 characters) and 66 lines on a page. The 80 comes from the really ancient times, when programs were written in FORTRAN on 80-column punched cards.

Why should you, who are fortunate enough to live in modern times, care about keeping below 80 characters on a line? There are two related reasons. First, you may be able to select a smaller font so that more characters can fit on the screen or the page, but you and others don't really want to read a font that small. Second, there are psychological and physiological studies of human reading that show that the human eye doesn't track so well from line to line (typically in the fast movement from the end of one line to the beginning of the next) when the line is very long. This is not so much an issue with code, since the lines aren't that long, but with documentation it is. Since modern times haven't made eyes any better, keep your line width in documentation to 80 characters, and do the same for your code, too.


James Hale
2001-09-19