Chuniversiteit.nl
The Toilet Paper

Why camelCase is better than snake_case

Does it matter whether you use camelCase or snake_case for identifiers? Yes it does! Probably. Maybe.

A camel steps on a snake
You probably didn’t see this one coming, because it was camelflaged.

Programming languages give you a considerable amount of freedom when it comes to identifier naming. Technically, you’re free to name identifiers whatever you want, even if it’s asdf, heIsInMyBehind or r2_do_you_is_fucking. It’s nice to have some consistency though, which is why you usually use either camelCase or snake_case for identifier names that consist of multiple parts.
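
To make the two styles concrete, here is a minimal sketch that converts an identifier from one style to the other. The function names to_snake_case and to_camel_case are made up for this illustration, and the sketch assumes simple identifiers without acronyms or leading underscores.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case (hypothetical helper)."""
    # Put an underscore before each uppercase letter that follows a
    # lowercase letter or digit, then lowercase the whole identifier.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def to_camel_case(name: str) -> str:
    """Convert a snake_case identifier to camelCase (hypothetical helper)."""
    first, *rest = name.split("_")
    # The first part stays lowercase; every later part starts with a capital.
    return first + "".join(word.capitalize() for word in rest)


print(to_snake_case("startTime"))           # start_time
print(to_camel_case("extend_alias_table"))  # extendAliasTable
```

Both versions carry the same words. The only difference is how the word boundary is marked: with a capital letter or with an underscore.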

Many contemporary style guides recommend the use of camelCase over snake_case. However, there’s a psychology study that found that empty spaces, like those formed by underscores, make it easier for humans to recognise word boundaries. This would imply that snake_case is actually superior to camelCasing, right?

Well… it’s not quite that simple.

The researchers conducted a simple experiment that evaluated the speed and accuracy with which subjects could identify the right camelCased or snake_cased identifier, given a regular phrase.

First, a subject is shown a brief two- or three-word phrase (e.g. “start time”) on a screen. The next screen then shows four clouds that move around randomly. Each of the clouds contains an identifier in one of the two styles (camelCased or snake_cased). One cloud contains the correct phrase; the other three contain variants where either the beginning, middle or end of the phrase has been modified slightly.

The table below shows all phrases, which are presented to subjects in a camelCased or snake_cased style:

Type              | Phrase               | Distracter (beginning) | Distracter (middle)  | Distracter (end)
------------------|----------------------|------------------------|----------------------|---------------------
2 words, code     | start time           | smart time             | start mime           | start tom
2 words, code     | full pathname        | fill pathname          | full mathname        | full pathnum
3 words, code     | get next path        | got next path          | get near path        | get next push
3 words, code     | extend alias table   | expand alias table     | extend alist table   | extend alias title
2 words, not code | river bank           | riser bank             | river tank           | river ban
2 words, not code | drive fast           | drove fast             | drive last           | drive fat
3 words, not code | read bedtime story   | raid bedtime story     | read bedsore story   | read bedtime store
3 words, not code | movie theater ticket | mouse theater ticket   | movie thunder ticket | movie theater ticker
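
The way a single trial is put together can be sketched in a few lines of code. This is only an illustration of the setup described above, not the researchers’ actual software: the names render and build_trial are hypothetical, and the moving clouds are reduced to a shuffled list of labels.

```python
import random


def render(phrase: str, style: str) -> str:
    """Render a space-separated phrase as a camelCase or snake_case identifier."""
    words = phrase.lower().split()
    if style == "camel":
        return words[0] + "".join(word.capitalize() for word in words[1:])
    if style == "snake":
        return "_".join(words)
    raise ValueError(f"unknown style: {style}")


def build_trial(phrase, distracters, style, rng=None):
    """Return the four cloud labels in random order, plus the correct answer."""
    if rng is None:
        rng = random.Random()
    answer = render(phrase, style)
    # One cloud holds the correct identifier, the others hold the distracters.
    clouds = [answer] + [render(d, style) for d in distracters]
    rng.shuffle(clouds)
    return clouds, answer


clouds, answer = build_trial(
    "start time",
    ["smart time", "start mime", "start tom"],
    "camel",
    rng=random.Random(0),  # fixed seed so the shuffle is repeatable
)
print(clouds)
print(answer)  # startTime
```

Passing "snake" instead of "camel" produces the same trial with start_time, smart_time, start_mime and start_tom in the clouds.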

Results


The study included 135 subjects, of whom 32% were enrolled in an undergraduate computer science course. Some had received multiple years of computer science training, while others had no training at all.

Subjects without computer science training either preferred snake_case (46%) or had no preference (45%). A more surprising finding was that among the subjects who had received training using camelCasing, more than a third preferred snake_case. However, subjects who had received more training generally preferred camelCasing over snake_casing.

More importantly, the experiment provides insight into the effect of identifier style on correctness and the amount of time that is needed to provide an answer.

Firstly, the identifier style has a significant effect on task correctness. A statistical model shows that someone is 51.5% more likely to select the correct identifier when identifiers are camelCased.

Secondly, the identifier style also has a significant effect on the time someone needs to find the correct identifier. On average, subjects needed more time to find camelCased identifiers than snake_cased identifiers.

Task correctness is not affected by training. Subjects with many years of training did just as well on tasks with a particular identifier style as subjects with few or no years of training.

Time on task, on the other hand, is affected by training. Subjects with more training were able to find the right identifiers more quickly when they were camelCased. Interestingly, training in one style also appears to negatively impact performance for other styles: subjects with more training needed more time to recognise snake_cased identifiers than subjects without any computer science training.

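The overall conclusion of this study is that camelCasing leads to higher accuracy regardless of training. It does take a bit more time to read, but subjects who are trained in camelCasing recognise camelCased identifiers more quickly.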

Summary

  1. CamelCased identifiers are more likely to be read correctly than snake_cased identifiers

  2. CamelCased identifiers take more time to read than snake_cased identifiers

  3. Experienced developers read camelCased identifiers more quickly, but become slower when reading snake_cased identifiers