First make the tests easy, then make the easy tests

Published: 8 Jan 2023
Written by: Chun Fei Lung

The best time to make your software testable was at the start of the project. The second best time is now.

Fan-testic!

Almost every engineer knows that high-quality tests lead to high-quality software products and will tell you how important it is to test your work, but few engineers actually seem to practice what they preach. Half of developers never test their code and most programming sessions end without any test execution.

What gives?

About the article

Title	Testability First!
Year	2019
Author(s)	Mohammad Ghafari (University of Bern), Markus Eggiman (University of Bern), Oscar Nierstrasz (University of Bern)
Venue	International Symposium on Empirical Software Engineering and Measurement (ESEM)

A growing monolith

The authors of this paper conducted a case study at a large logistics company in Switzerland, which runs on a software system that comprises about 14,000 files with around half a million lines of Java and C# code, and has 15,500 recorded issues.

The system is a monolith that originally included some automated unit tests, but test coverage was generally low. As the project grew larger, so did the difficulty of managing it all. In response, the system was restructured in a more modular way, with newer features being added as modules that should be covered by unit tests. Additionally, manual tests would be performed by testers after every development sprint.

The researchers randomly selected 200 bugs from this project that had been resolved, and then studied the associated commits to learn more about the effect of testing practices on software quality. They also conducted interviews with eight senior developers to gain a better understanding of their views on testing.

Observations

Of the 244 components that were affected by the sampled bugs, only 8 were covered by unit tests. Virtually all tested components contained 5 or fewer bugs, whereas as many as a quarter of all untested components contained more than 5 bugs. In other words, tested components appear to be less prone to bugs than untested ones.

There were 140 cases that could be considered hard to test, mainly for three reasons:

A component violates the single responsibility principle;
A component has many dependencies on other parts of the system or external components;
No exact definitions of correct or incorrect behaviour exist.

Developers almost never wrote tests for code that is hard to test.
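To make the second reason more concrete, here is a minimal Python sketch of how passing a dependency in from the outside can make a component easier to test. It is not taken from the paper or from the studied system: the Shipment class, the shipping_cost function, and the pricing rules are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Shipment:
    # Hypothetical domain object, invented for this example.
    weight_kg: float
    express: bool


def shipping_cost(shipment: Shipment, fuel_surcharge: Callable[[], float]) -> float:
    """Compute a price from the shipment and an injected surcharge lookup.

    The surcharge source (a database, a web service, ...) is passed in as a
    function, so a test can supply a fixed value instead of a live dependency.
    """
    base = 5.0 + 1.2 * shipment.weight_kg
    if shipment.express:
        base *= 1.5
    return round(base * (1.0 + fuel_surcharge()), 2)


def test_express_shipment_with_fixed_surcharge():
    # The real dependency is replaced by a stub that always returns 10%.
    cost = shipping_cost(Shipment(weight_kg=10.0, express=True), lambda: 0.10)
    assert abs(cost - 28.05) < 1e-9
```

Had shipping_cost looked up the surcharge itself, the same test would have needed a running database or service, which is exactly the kind of friction that makes developers skip the test.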

There were 16 cases that the researchers did not consider hard to test. In 7 of those cases there were indeed tests that at least partially covered the defective component.

Finally, the researchers applied mutation testing to assess the quality of tests: small changes are deliberately introduced into the code to see whether the existing tests notice them. In general, the tests didn’t do a very good job at detecting such changes. However, tests that were introduced after the restructuring effort tended to fare better.
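The idea behind mutation testing can be illustrated with a small hand-written Python sketch. Real mutation testing tools generate the changed versions ("mutants") automatically. Here a single mutant is written out by hand, and all function names are hypothetical.

```python
def is_oversized(weight_kg: float, limit_kg: float = 30.0) -> bool:
    """Original code: a parcel is oversized when it exceeds the limit."""
    return weight_kg > limit_kg


def is_oversized_mutant(weight_kg: float, limit_kg: float = 30.0) -> bool:
    """Mutant: the comparison operator '>' has been changed to '>='."""
    return weight_kg >= limit_kg


def weak_suite(fn) -> bool:
    """Passes for both versions because it never probes the boundary value."""
    return fn(50.0) is True and fn(10.0) is False


def strong_suite(fn) -> bool:
    """Adds a boundary check, so it can tell the two versions apart."""
    return weak_suite(fn) and fn(30.0) is False


def kills_mutant(suite) -> bool:
    # A mutant is "killed" when the suite passes on the original code
    # but fails on the mutated code.
    return suite(is_oversized) and not suite(is_oversized_mutant)
```

In this sketch kills_mutant(weak_suite) is False and kills_mutant(strong_suite) is True: only the suite that checks the boundary notices the change. A test suite that lets many mutants survive gives little protection against real bugs of the same shape.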

Developers’ views

Interviews with the system’s developers revealed several reasons for not writing tests:

It was considered less important (and perhaps less glamorous) than other tasks, like developing new features;
Unclear requirements and misunderstandings between the customer and developers regularly resulted in “bug reports” that were actually requirement changes. The high frequency of such changes also demotivated developers from writing tests;
The team suffered from high turnover, which made them neglect testing and quality assurance activities.

Test coverage only improved once the team had improved its Scrum process, worked on better communication with the customer, given higher priority to quality assurance, and started monitoring test coverage using SonarQube.

The researchers’ own analysis had already revealed that developers give up writing tests for components that are hard to test.

But the interviews also showed that developers do not follow the best practice of writing new tests that reproduce reported bugs, especially when they believe that it’s easy to reproduce a bug manually. However, an analysis of fixed bugs showed that there were eight pairs of recurring bugs that would have been easier to spot (and fix) if a test had been written the first time.
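As an illustration of that practice, the following Python sketch shows what a test that reproduces a reported bug might look like. The bug, the parse_postal_code function, and the postal code format are invented for this example and do not come from the paper.

```python
import unittest


def parse_postal_code(raw: str) -> str:
    """Normalise a four-digit postal code.

    Hypothetical fix: the earlier version did not strip surrounding
    whitespace, so " 3012 " was rejected as invalid.
    """
    code = raw.strip()
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"invalid postal code: {raw!r}")
    return code


class PostalCodeRegressionTest(unittest.TestCase):
    def test_surrounding_whitespace_is_accepted(self):
        # Reproduces the (hypothetical) bug report; fails on the unfixed code.
        self.assertEqual(parse_postal_code(" 3012 "), "3012")

    def test_malformed_input_is_still_rejected(self):
        # Guards against a fix that is too lenient.
        with self.assertRaises(ValueError):
            parse_postal_code("30a2")
```

The first test is written from the bug report before the fix goes in. Once it passes it stays in the suite, so the same bug cannot quietly return.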

Shockingly, developers also seldom consider the impact (severity) of a bug when deciding whether to write a test, and generally just try to deploy fixes for critical bugs as quickly as possible.

All in all, it seems that developers write tests only when they feel like it, and do not follow guidelines or make use of tools that would help them write better tests.