Code reviews are often used as a way to make sure that bad code doesn’t make it into public releases. Not all code reviews are equally effective for that purpose, however, as this study by McIntosh, Kamei, Adams, and Hassan shows.
The following summary is loosely based on the second half of Code reviewing reviewed: recommendations for improving the efficiency and effectiveness of modern code reviews, an essay that I originally wrote for the Open University of the Netherlands.
Why it matters
It is generally believed that conducting code reviews can help catch bugs before they make it to users. Plenty of evidence exists that they can be very effective. Less is known about the effectiveness of modern code reviews however, where reviewers can – but do not necessarily have to – comment on the author’s code. It’s likely though that bugs are overlooked if not all code is properly reviewed or discussed.
How the study was conducted
There are a few important concepts in this study:
Code: here, this specifically refers to new code that is introduced in a newly released version;
Code review coverage: the proportion of code that is associated with a code review;
Code review participation: it’s possible that code for which a review has been requested was approved only by the author, hastily reviewed, or approved by reviewers without any discussion whatsoever. Participation is the proportion of code for which the reviews did not have these characteristics.
Post-release defects: the number of bugs that pop up after a newly released version.
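To make the two review metrics concrete, here is a small sketch of how they could be computed for a set of commits. The data model and field names (review_id, approver, discussion_comments, and so on) are illustrative assumptions, not the study’s actual schema:

```python
# Illustrative sketch: computing code review coverage and participation.
# The commit records and their field names are hypothetical.

def review_coverage(commits):
    """Proportion of commits that are associated with a code review."""
    reviewed = [c for c in commits if c.get("review_id") is not None]
    return len(reviewed) / len(commits)

def review_participation(commits):
    """Proportion of reviewed commits whose review was not self-approved
    and saw at least some discussion."""
    reviewed = [c for c in commits if c.get("review_id") is not None]
    engaged = [
        c for c in reviewed
        if c["approver"] != c["author"]     # not approved by the author
        and c["discussion_comments"] > 0    # not approved without discussion
    ]
    return len(engaged) / len(reviewed)

commits = [
    {"review_id": 1, "author": "a", "approver": "b", "discussion_comments": 4},
    {"review_id": 2, "author": "a", "approver": "a", "discussion_comments": 2},
    {"review_id": None},  # code that was never put up for review
]
print(review_coverage(commits))       # 2 of 3 commits were reviewed
print(review_participation(commits))  # 1 of 2 reviews had real participation
```

A real implementation would also need a notion of “hastily reviewed” (e.g. a minimum review duration), which is omitted here for brevity.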
In order to determine the effect of code review coverage and code review participation on the number of post-release defects, the authors combine data from the version control systems and code review tools of three open source projects: Qt, VTK, and ITK.
The version control system shows which commits were included in each release, while the code review tool stores links between commits and reviews. This makes it possible to determine how well the code for a release was reviewed.
The number of post-release defects can be deduced from the number of bug-fixing commits that follow a release.
Reviews aren’t the only factor that is known to influence the number of bugs. That’s why a number of commonly used software quality metrics (e.g. component size in lines of code (LOC) and cyclomatic complexity) are included as control metrics.
The review, bug, and control metrics are used to construct a model that predicts the number of expected post-release defects based on how thoroughly the code was reviewed.
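As a rough illustration of such a model – not the authors’ actual statistical setup, which is multivariate and includes the control metrics – one could fit a simple least-squares line relating review coverage to post-release defect counts:

```python
# Toy sketch: fitting defects ≈ intercept + slope * coverage by ordinary
# least squares. The data points below are made up for illustration; the
# study fits a richer model that also controls for size, complexity, etc.

def fit_line(xs, ys):
    """Closed-form OLS for a single predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return mean_y - slope * mean_x, slope

coverage = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical per-component coverage
defects = [9, 7, 5, 3, 1]             # hypothetical post-release defect counts

intercept, slope = fit_line(coverage, defects)
print(intercept, slope)  # a negative slope: more coverage, fewer defects
```

Once fitted, such a model predicts the expected number of post-release defects for a component given how thoroughly its code was reviewed.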
What discoveries were made
The results are pretty much exactly what you’d expect: having high degrees of code review coverage and participation helps lower the number of post-release defects.
For code review coverage, the authors found that with a coverage below 29% at least one bug is to be expected, although for one of the projects even a coverage below 60% already results in at least one bug. Of course, full coverage does not guarantee defect-free software: there are other factors (not within the scope of the study) that also affect the number of bugs after a new release.
The findings were much more conclusive for code review participation: components that have a high level of participation tend to have a lower number of bugs after a release. The opposite is also true: components with a low level of participation tend to have a higher number of bugs.
To minimise the number of post-release defects, all code should be
carefully reviewed (quality over quantity!),
discussed by and with reviewers, and
approved by a reviewer (i.e. someone who is not the author)
before it is included in a new release.