Are static analysis violations really fixed? (2019)

Static analysis tools just throw a bunch of reports against the wall and hope that something will stick

Static analysis tools can help developers improve the quality of their code and help managers (sort of) understand how well a system is built. This is why organisations often promote the use of static analysis tools in software projects. But are they really as useful as they’re made out to be?

Why it matters

Static analysis tools can be incorporated into continuous integration workflows to automatically report issues in your code, like refactoring opportunities, potential security vulnerabilities, performance bottlenecks, and code smells.

However, reports alone don’t improve anything: the code only gets better when developers actually act upon them. How often, and when, do developers fix reported issues?
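As a concrete example of such an integration, SonarQube exposes reported issues through its Web API, so a CI job can pull them programmatically. Here's a minimal sketch that builds the query URL (the server URL and project key are placeholders; the `api/issues/search` endpoint and its parameters come from SonarQube's documented Web API):

```python
# Sketch: build a query against SonarQube's Web API to list the
# unresolved issues of a project. Server URL and project key below
# are placeholders, not real endpoints.
from urllib.parse import urlencode


def issues_search_url(server: str, project_key: str, resolved: bool = False) -> str:
    """Build a SonarQube /api/issues/search query URL."""
    params = urlencode({
        "componentKeys": project_key,          # which project to inspect
        "resolved": str(resolved).lower(),     # false -> only open issues
        "ps": 100,                             # page size
    })
    return f"{server}/api/issues/search?{params}"


print(issues_search_url("https://sonar.example.com", "my-project"))
```

A CI step could fetch this URL and fail the build (or just post a comment) when the issue count crosses a threshold.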

How the study was conducted

The study consists of two parts. First, a survey is used to learn how developers perceive the use of static analysis tools. Then, open-source and commercial software repositories from four organisations are mined to understand how, and how often, developers actually make use of SonarQube, a widely adopted static analysis tool.

What discoveries were made

There are some differences between what developers say and what they actually do.

Survey results

The survey results show that developers value the warnings that are generated by static analysis tools, and use them to fix issues whilst they’re implementing bug fixes and new features.

More than 80% of respondents agree with the issues that are identified. However, these issues are rarely seen as dealbreakers: only a few developers would reject pull requests based on reported issues. Furthermore, about half of developers never or rarely postpone releases due to such issues.

Software repository mining

Less than 9% of reported issues are actually fixed.

Turnaround

The median number of days between the first report and the fix is about 19 days (this is pretty fast, I guess?), but almost a third of the issues are only fixed after an entire year. Several factors affect how long issues are left unfixed.
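To make those turnaround figures concrete, this is how the median and the over-a-year share would be computed from per-issue fix times. The durations below are made up, chosen only to mirror the study's numbers:

```python
import statistics

# Synthetic example (made-up durations, NOT the study's data): days each
# issue stayed open before being fixed.
days_to_fix = [2, 5, 10, 14, 19, 120, 366, 400, 500]

median_days = statistics.median(days_to_fix)
over_a_year = sum(1 for d in days_to_fix if d > 365) / len(days_to_fix)

print(median_days)              # 19 -> median turnaround in days
print(round(over_a_year, 2))    # 0.33 -> share fixed after more than a year
```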

Frequency

Most of the resolved issues are related to code smells:

| Severity | Code smell | Vulnerability | Bug | Deprecated | Total  |
|----------|-----------:|--------------:|----:|-----------:|-------:|
| Major    | 19,732     | 496           | 972 | 3,945      | 25,145 |
| Minor    | 7,683      | 53            | 91  | -          | 7,827  |
| Critical | 943        | 883           | 697 | -          | 2,523  |
| Info     | 944        | 6             | -   | -          | 950    |
| Blocker  | 113        | 20            | 396 | -          | 529    |
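Tallying the table's columns confirms the claim: roughly four out of five fixed issues are code smells. The counts below are copied straight from the table, with `-` entries treated as zero:

```python
# Fixed-issue counts per type, one entry per severity row
# (Major, Minor, Critical, Info, Blocker), taken from the table above.
fixed_issues = {
    "Code smell":    [19_732, 7_683, 943, 944, 113],
    "Vulnerability": [496, 53, 883, 6, 20],
    "Bug":           [972, 91, 697, 0, 396],
    "Deprecated":    [3_945, 0, 0, 0, 0],
}

totals = {kind: sum(counts) for kind, counts in fixed_issues.items()}
grand_total = sum(totals.values())

print(totals["Code smell"])                          # 29415 code smells
print(grand_total)                                   # 36974 fixed issues total
print(round(totals["Code smell"] / grand_total, 2))  # 0.8 -> ~80% are smells
```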

Rules

Issues are typically reported by SonarQube’s default rules: users can add custom rules, but only a small subset of these custom rules actually trigger issue reports.

There also appears to be some Pareto thing going on here, as 90% (close enough for me :D) of all fixed issues are triggered by only 20% of the rules. Not everything is Pareto, though: it’s often assumed that 20% of all modules (or files) are responsible for 80% of the issues, but this doesn’t seem to be true here.
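That rule-level Pareto claim is easy to check if you have per-rule fix counts: sort the rules by how many fixes they triggered and see how few are needed to cover 90% of the total. A minimal sketch with made-up counts (not the study's data):

```python
# Number of fixed issues each rule triggered (synthetic data, one entry
# per rule). Find the smallest fraction of rules covering 90% of fixes.
rule_fix_counts = [700, 200, 30, 20, 15, 10, 10, 5, 5, 5]

counts = sorted(rule_fix_counts, reverse=True)  # most-triggered rules first
target = 0.9 * sum(counts)                      # 90% of all fixed issues

covered, rules_needed = 0, 0
for c in counts:
    covered += c
    rules_needed += 1
    if covered >= target:
        break

rule_fraction = rules_needed / len(counts)
print(rule_fraction)  # 0.2 -> 20% of rules cover 90% of fixes here
```

With counts this skewed, two of the ten rules already cover 90% of fixes; with a flatter distribution the fraction climbs well past 20%, which is exactly what the module-level (non-)Pareto result shows.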

Won’t fix

As I already wrote above, many issues are simply never fixed. However, it’s also possible to explicitly mark issues as “Won’t fix” or “False positive”. The results suggest that – at least for one of the organisation’s repositories in this study – less than 7% of reported issues might fall into this category.

The important bits

  1. Only a very small portion of reported issues is actually fixed
  2. Most of the issues that are fixed are code smells and have a major or minor severity rating
  3. Issues are fixed faster in open-source and greenfield projects
  4. 20% of the rules are responsible for more than 80% of all reports