Giving variables meaningful names is a widely accepted best practice that is recommended in pretty much every style guide or book on programming that you can find. Well-named variables make code easier to understand and can sometimes even serve as a form of documentation, making comments unnecessary.
At the same time, almost everyone also names their loop variable i rather than something more meaningful, like indexOfLoopOverAllRecords. Clearly, there are some situations in which programmers believe that this best practice can be ignored.
Is that belief justified? Let’s look at what the science says!
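To make the contrast concrete, here is a minimal Python sketch of the same loop written both ways. The function names and the sample data are made up for illustration and do not come from the study.

```python
def sum_with_short_index(records):
    """Sum the records using the conventional single-letter loop index."""
    total = 0
    for i in range(len(records)):
        total += records[i]
    return total


def sum_with_long_index(records):
    """Same loop, but with a fully descriptive index name."""
    total = 0
    for index_of_loop_over_all_records in range(len(records)):
        total += records[index_of_loop_over_all_records]
    return total


records = [3, 5, 8]  # hypothetical sample data
short_result = sum_with_short_index(records)
long_result = sum_with_long_index(records)
```

Both functions behave identically; the only difference is how much the index name says, and how much space it takes up while saying it.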
The study we’re looking at today consists of three parts, which look at the acceptability of single-letter variable names from different viewpoints.
Single-letter names in practice
Different programming languages have different coding conventions and idioms. This also translates into differences in variable naming.
The study found that short variable names are common in many languages. Single-letter variable names are about as common as names of other short lengths.
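i is the most commonly used single-letter variable name in several languages, which fits its conventional role as a loop index. j is also fairly common for the same reason. The frequency of other letters appears to be language-dependent. For instance, the most common letter in Perl is not the one that dominates elsewhere, while C has its own set of frequently used letters (counters, perhaps?).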
Most single-letter variables use lowercase letters. Uppercase letters are primarily used in Perl, where they even outnumber nearly all lowercase letters.
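For readers who want to try this kind of count on their own code, the following is a rough sketch using Python's standard ast module. It is not the tooling used in the study; the function name, the choice to count only distinct assigned names, and the sample snippet are all assumptions made for illustration.

```python
import ast
from collections import Counter


def variable_name_lengths(source: str) -> Counter:
    """Map name length -> number of distinct variables assigned in `source`.

    Only names that are assigned to (including loop targets) are counted;
    function parameters and attribute names are ignored in this sketch.
    """
    tree = ast.parse(source)
    assigned = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }
    return Counter(len(name) for name in assigned)


# Hypothetical sample: two single-letter names (i, n) and one longer name (total).
sample = """
total = 0
for i in range(10):
    n = i * 2
    total += n
"""

lengths = variable_name_lengths(sample)
```

Here lengths[1] gives the number of distinct single-letter variables in the sample. Running the same function over a whole code base would give a crude version of the length distribution discussed above.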
Effect on understandability of code
Just because single-letter variable names are common in practice does not mean that they are a good idea. The second part therefore consists of three experimental procedures, of which two are controlled experiments and one is an opinion survey.
“Can we fix it?”
The first experiment attempts to measure the negative effect of single-letter variable names on the maintainability of code. It involves a coding task, in which experimental subjects are asked to fix a defective piece of code, whose functionality is explained beforehand. Some subjects are given a version of the code with meaningful names for everything, while others work with one of two versions in which some variables have been given a meaningless single-letter name.
The results suggest that there is little difference in correctness and time spent on solutions between the three versions. However, due to the small sample size, these results should be interpreted as failing to show a difference, and not as finding that there is no difference. Moreover, the differences that do appear seem to be largely attributable to other factors, like the age and sex of a subject.
“Why is it so hard?”
The second experiment is even more focussed on the possible adverse effects of single-letter names. It uses a much more “computer sciencey” algorithmic coding problem that is harder to understand. Experimental subjects were given either a version with meaningful names or a version with single-letter names, and asked to 1) explain what the code does and 2) extend it.
Sadly, it turned out that this coding problem was too hard to understand for almost everyone. Few people made it through part 1, regardless of what convention was used for variable names. Again, individual differences between subjects had a more significant impact on results than variable names: if you do not have the required background and skills, good variable names will not save you.
“What do you want?”
In the opinion survey, subjects were shown four “which function do you prefer” questions. Each question included 2 or 3 versions of a function, with varying degrees of single-letter variable naming.
The results show that an overwhelming majority of respondents prefers functions with long, meaningful names, even though the experiments suggest that single-letter variable names do not have a significant effect on the understandability of code.
Expectations for single-letter names
i is commonly used for loop indices and therefore has a clear meaning to programmers. But what about other letters? A survey was used to ask respondents about their associations with all 26 letters of the English alphabet.
As expected, many letters are clearly associated with types that start with that letter: s for string, c for char, and o for object. This is not always the case, though: some letters, including n, are strongly associated with integers, while others, including t, are associated with floating-point numbers. Some letters are associated with both types of numbers.
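A small sketch shows why these associations matter. The two functions below are hypothetical and not taken from the survey; they do exactly the same thing, but the second one swaps the letters so that they clash with the associations described above (s for string, n for integer).

```python
def repeat(s: str, n: int) -> str:
    """Letters match common expectations: s is a string, n is an integer."""
    return s * n


def repeat_confusing(n: str, s: int) -> str:
    """Same behavior, but n now holds a string and s holds an integer."""
    return n * s


expected = repeat("ab", 3)
confusing = repeat_confusing("ab", 3)
```

Both calls return the same string, yet a reader skimming the second function has to stop and check the type annotations, because the letters suggest the opposite of what they hold.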
The number of distinct associated meanings differs between letters. Some letters, such as k (a loop index), have a single, clear meaning. Programmers are less sure what to expect when they see letters like m, for which they have many different associations or no associations at all.
Single-letter (and other short) variable names are common in real-life software projects
Programmers prefer longer variable names over single-letter names, even though the experiments failed to show that single-letter names affect program comprehension
Certain letters of the alphabet create expectations about their type and usage, which suggests that letters should be chosen carefully