Do you remember this source code? (2018)

Two programmers with the dreaded wait cursor and beach ball in their heads

Developers occasionally get questions about code that they have written. Such questions are not always easy to answer, especially if that code was written a long time ago. Krüger, Wiemann, Fenske, Saake, and Leich used an online survey to study how developers lose familiarity with “their” source code over time.

Why it matters

Developers are generally better at resolving bugs, adding new features, and estimating costs for code that they’ve worked on before.

But that doesn’t last forever: as developers move on to other parts of a codebase or even completely different projects, they’ll slowly forget the details of their previous work, and become less effective again.

This phenomenon is not completely understood yet.

Ebbinghaus’ forgetting curve is often used in psychology to describe how humans slowly forget information over time. The authors wondered whether this forgetting curve can also be applied to familiarity with source code.
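
To make the idea concrete, here is a minimal sketch of the forgetting curve in its common textbook form, R = e^(-t/s), where R is the retained fraction of knowledge, t is the time elapsed, and s is a "memory strength" parameter. The function name, the choice of days as the time unit, and the strength value of 100 are assumptions made for illustration only. They are not taken from the study, and the authors' exact formulation may differ.

```python
import math


def retention(days_elapsed: float, memory_strength: float) -> float:
    """Return the retained fraction R = exp(-t / s), a value in (0, 1].

    days_elapsed    -- time t since the code was last touched (unit assumed: days)
    memory_strength -- parameter s; larger values mean slower forgetting
    """
    if days_elapsed < 0:
        raise ValueError("days_elapsed must be non-negative")
    if memory_strength <= 0:
        raise ValueError("memory_strength must be positive")
    return math.exp(-days_elapsed / memory_strength)


# Hypothetical memory strength, chosen only to show the shape of the curve.
strength = 100.0
for days in (0, 30, 180, 365):
    print(f"after {days:>3} days: retention ~ {retention(days, strength):.2f}")
```

With a fixed strength the curve only ever goes down over time. Under this model, anything that helps a developer remember code better would show up as a larger s.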

How the study was conducted

An online survey was held among 60 developers who had worked on publicly accessible projects on GitHub.

Each developer was asked to choose a single file they had last worked on in 2016 (the study was originally published in 2018, so for the respondents this would have been one or two years ago), but refrain from checking its contents. The survey’s questions focussed on the chosen file. Some of the more important questions include:

No pilot study was conducted.

What discoveries were made

The authors looked at three factors that might influence how well developers retain knowledge about source code:

The forgetting curve appears to describe knowledge retention fairly well for the 27 developers who made only a single change to their file, but underestimates retention for the 33 developers who made multiple contributions.

The important bits

  1. Contributing repeatedly to a file has a strong positive effect on familiarity with its contents
  2. Being the last developer to modify a large ratio of the code in a file also has a positive effect on familiarity
  3. In theory, developers’ loss of familiarity can be modelled using a forgetting curve. In practice, you probably can’t.