These conventions are almost always based on guesswork rather than empirical evidence. Strange, especially since style can have such a large effect on how code is presented to its readers. To be fair, there is some empirical evidence that suggests that you should probably use two or four spaces. However, that conclusion is based on an experiment from 1983, which is old enough now that those results should probably be taken with a grain of salt.
This week’s paper presents a non-exact replication of that 1983 study on indentation.
The researchers recruited 22 participants, who were shown a randomised sequence of four 17-line Java code snippets that deliberately use non-descriptive identifier names to obfuscate their purpose. Each snippet was presented with an indentation depth of zero, two, four, or eight spaces.
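To make the four conditions concrete, here is a minimal sketch that renders one snippet at each indentation depth. Both the snippet and the `reindent` helper are hypothetical illustrations, not material from the study, and the helper assumes the input is indented with spaces only, at a fixed width per nesting level.

```java
import java.util.ArrayList;
import java.util.List;

class IndentationDemo {

    // Re-indents source that uses `fromWidth` spaces per nesting level
    // so that it uses `toWidth` spaces per level instead.
    // Assumption: lines are indented with spaces only (no tabs).
    static String reindent(String source, int fromWidth, int toWidth) {
        List<String> out = new ArrayList<>();
        for (String line : source.split("\n", -1)) {
            int leading = 0;
            while (leading < line.length() && line.charAt(leading) == ' ') {
                leading++;
            }
            int level = leading / fromWidth;
            out.add(" ".repeat(level * toWidth) + line.substring(leading));
        }
        return String.join("\n", out);
    }

    // Hypothetical snippet with non-descriptive names; not taken from the study.
    static final String SNIPPET = String.join("\n",
        "int a = 0;",
        "for (int b = 0; b < 3; b++) {",
        "    if (b % 2 == 0) {",
        "        a += b;",
        "    }",
        "}",
        "System.out.println(a);");

    public static void main(String[] args) {
        // The four indentation depths used in the experiment.
        for (int width : new int[] {0, 2, 4, 8}) {
            System.out.println("// " + width + " spaces per level");
            System.out.println(reindent(SNIPPET, 4, width));
            System.out.println();
        }
    }
}
```

Running `main` prints the same logic four times, which gives a rough feel for what participants saw: only the whitespace changes between conditions, not the code itself.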
The experiment provides data about three things:
- Program comprehension: Participants were asked to determine the output of each snippet in a comprehension task. The researchers measured the correctness and response time, i.e. the time that a participant took to submit their answer.
- Perceived difficulty: To determine the effect of indentation on perceived (reading) difficulty of code, the researchers asked participants to sort the four snippets from easy to difficult. This was done twice: before the start of the actual experiment, using the standard indentation depth of four spaces, and afterwards, using the indentation depths that the participant actually saw during the comprehension tasks.
- Visual effort: A Tobii EyeX tracker was used to measure the visual effort required to read the code. The number of fixations (when someone focusses on a particular point) and saccades (transitions between two fixations) likely show how hard it is to read a piece of code.
There were a few findings that I’d like to mention here, mostly because they’re fun little factoids that do well in PowerPoint slides:
- Code that is indented using eight spaces results in the fastest response times, but also in the lowest number of correct answers.
- Code formatted using four spaces is associated with the highest number of correct answers.
- Overall, participants needed the most time to complete their tasks for code that’s indented using two spaces.
- Code with small levels of indentation is perceived as harder to read.
- Non-indented code generally requires more visual effort than code that is indented using four spaces.
Based on these findings you might conclude that code is best indented using four spaces.
However, the data don’t support that conclusion. Although the researchers found these differences in correctness, response time, and visual effort between the various levels of indentation, none of them is statistically significant.
In other words: indentation is probably simply a matter of style, so just pick a style guide and stick with its recommendations!
- It’s unlikely that indentation level affects program comprehension or visual effort to read code