Are static analysis violations really fixed? (2019)

Static analysis tools just throw a bunch of reports against the wall and hope that something will stick

Static analysis tools can help developers improve the quality of their code and help managers (sort of) understand how well a system is built. This is why organisations often promote the use of static analysis tools in software projects. But are they really as useful as they’re made out to be?

Why it matters

Static analysis tools can be incorporated into continuous integration workflows to automatically report issues in your code, like refactoring opportunities, potential security vulnerabilities, performance bottlenecks, and code smells.

However, reports alone don’t improve anything: the code only gets better when developers actually act upon them. How often, and when, do developers fix reported issues?
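As a concrete example of such an integration, SonarQube exposes reported issues through its Web API, so a CI job can pull them programmatically. Here's a minimal sketch that builds the query URL (the server URL and project key are placeholders; the `api/issues/search` endpoint and its parameters come from SonarQube's documented Web API):

```python
# Sketch: build a query against SonarQube's Web API to list the
# unresolved issues of a project. Server URL and project key below
# are placeholders, not real endpoints.
from urllib.parse import urlencode


def issues_search_url(server: str, project_key: str, resolved: bool = False) -> str:
    """Build a SonarQube /api/issues/search query URL."""
    params = urlencode({
        "componentKeys": project_key,          # which project to inspect
        "resolved": str(resolved).lower(),     # false -> only open issues
        "ps": 100,                             # page size
    })
    return f"{server}/api/issues/search?{params}"


print(issues_search_url("https://sonar.example.com", "my-project"))
```

A CI step could fetch this URL and fail the build (or just post a comment) when the issue count crosses a threshold.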

How the study was conducted

The study consists of two parts. First, a survey is used to learn how developers perceive the use of static analysis tools. Then, open-source and commercial software repositories from four organisations are mined to understand how, and how often, developers actually make use of SonarQube, a widely adopted static analysis tool.

What discoveries were made

There are some differences between what developers say and what they actually do.

Survey results

The survey results show that developers value the warnings that are generated by static analysis tools, and use them to fix issues whilst they’re implementing bug fixes and new features.

More than 80% of respondents agree with the issues that are identified. However, these issues are rarely seen as dealbreakers: only a few developers would reject pull requests based on reported issues. Furthermore, about half of developers never or rarely postpone releases due to such issues.

Software repository mining

Less than 9% of reported issues are actually fixed.

Turnaround

The median number of days between the first report and the fix is about 19 days (this is pretty fast, I guess?), but almost a third of the issues are only fixed after an entire year. Several factors affect how long issues are left unfixed.
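To make those turnaround figures concrete, this is how the median and the over-a-year share would be computed from per-issue fix times. The durations below are made up, chosen only to mirror the study's numbers:

```python
import statistics

# Synthetic example (made-up durations, NOT the study's data): days each
# issue stayed open before being fixed.
days_to_fix = [2, 5, 10, 14, 19, 120, 366, 400, 500]

median_days = statistics.median(days_to_fix)
over_a_year = sum(1 for d in days_to_fix if d > 365) / len(days_to_fix)

print(median_days)              # 19 -> median turnaround in days
print(round(over_a_year, 2))    # 0.33 -> share fixed after more than a year
```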

Frequency

Most of the resolved issues are related to code smells:

| Severity | Code smell | Vulnerability | Bug | Deprecated | Total  |
|----------|-----------:|--------------:|----:|-----------:|-------:|
| Major    | 19,732     | 496           | 972 | 3,945      | 25,145 |
| Minor    | 7,683      | 53            | 91  | -          | 7,827  |
| Critical | 943        | 883           | 697 | -          | 2,523  |
| Info     | 944        | 6             | -   | -          | 950    |
| Blocker  | 113        | 20            | 396 | -          | 529    |
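Tallying the table's columns confirms the claim: roughly four out of five fixed issues are code smells. The counts below are copied straight from the table, with `-` entries treated as zero:

```python
# Fixed-issue counts per type, one entry per severity row
# (Major, Minor, Critical, Info, Blocker), taken from the table above.
fixed_issues = {
    "Code smell":    [19_732, 7_683, 943, 944, 113],
    "Vulnerability": [496, 53, 883, 6, 20],
    "Bug":           [972, 91, 697, 0, 396],
    "Deprecated":    [3_945, 0, 0, 0, 0],
}

totals = {kind: sum(counts) for kind, counts in fixed_issues.items()}
grand_total = sum(totals.values())

print(totals["Code smell"])                          # 29415 code smells
print(grand_total)                                   # 36974 fixed issues total
print(round(totals["Code smell"] / grand_total, 2))  # 0.8 -> ~80% are smells
```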

Rules

Issues are typically reported by SonarQube’s default rules: users can add custom rules, but only a small subset of these custom rules actually trigger issue reports.

There also appears to be some Pareto thing going on here, as 90% (close enough for me :D) of all fixed issues are triggered by only 20% of the rules. Not everything is Pareto, though: it’s often assumed that 20% of all modules (or files) are responsible for 80% of the issues, but this doesn’t seem to be true here.
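That rule-level Pareto claim is easy to check if you have per-rule fix counts: sort the rules by how many fixes they triggered and see how few are needed to cover 90% of the total. A minimal sketch with made-up counts (not the study's data):

```python
# Number of fixed issues each rule triggered (synthetic data, one entry
# per rule). Find the smallest fraction of rules covering 90% of fixes.
rule_fix_counts = [700, 200, 30, 20, 15, 10, 10, 5, 5, 5]

counts = sorted(rule_fix_counts, reverse=True)  # most-triggered rules first
target = 0.9 * sum(counts)                      # 90% of all fixed issues

covered, rules_needed = 0, 0
for c in counts:
    covered += c
    rules_needed += 1
    if covered >= target:
        break

rule_fraction = rules_needed / len(counts)
print(rule_fraction)  # 0.2 -> 20% of rules cover 90% of fixes here
```

With counts this skewed, two of the ten rules already cover 90% of fixes; with a flatter distribution the fraction climbs well past 20%, which is exactly what the module-level (non-)Pareto result shows.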

Won’t fix

As I already wrote above, many issues are simply never fixed. However, it’s also possible to explicitly mark issues as “Won’t fix” or “False positive”. The results suggest that – at least for one of the organisation’s repositories in this study – less than 7% of reported issues might fall into this category.

The important bits

  1. Only a very small portion of reported issues is actually fixed
  2. Most of the issues that are fixed are code smells and have a major or minor severity rating
  3. Issues are fixed faster in open-source and greenfield projects
  4. 20% of the rules are responsible for more than 80% of all reports