Studying the impact of adopting continuous integration on the delivery time of pull requests
Continuous integration is a practice that promises more frequent releases, which means that new functionality can be delivered more quickly to end-users. Bernardo, Da Costa, and Kulesza analysed over 160,000 pull requests from 87 GitHub projects and suggest that you may want to take those claims with a grain of salt.
Adopting continuous integration (CI) means that each change to an application is automatically built and verified: the entire development workflow is optimised for merging new features as quickly as possible into a project’s main VCS (version control system) branch. This makes it less likely that (hard to resolve) merge conflicts arise, and makes it possible to release new versions more often.
Studies have already shown that open source projects discover bugs in less time once CI is adopted. CI also appears to change the way commits are made: practices like “commit often” and “commit small” are often adopted together with CI.
None of this is interesting to end-users, however: they just want new functionality as fast as possible. Do features really make their way to end-users in less time when CI is adopted?
The authors searched GitHub for projects that:
- Are among the 3,000 most popular Java, Python, Ruby, PHP, or JavaScript projects;
- Make use of Travis CI;
- Have merged at least 100 pull requests that ended up in releases before and after CI adoption;
- Are not toy projects created by students or hobbyists.
They ended up with 87 projects for further statistical analysis.
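The selection criteria above can be read as a simple filter over candidate projects. The sketch below is only an illustration: the `Project` fields, the `is_eligible` function, and the sample data are hypothetical, and it assumes that "at least 100" applies to each period (before and after CI adoption) separately. How popularity is ranked and how toy projects are recognised is left open here.

```python
from dataclasses import dataclass

LANGUAGES = {"Java", "Python", "Ruby", "PHP", "JavaScript"}


@dataclass
class Project:
    # All fields are hypothetical; the study does not prescribe this structure.
    name: str
    language: str
    popularity_rank: int       # 1 = most popular in whatever ranking is used
    uses_travis_ci: bool
    released_prs_before_ci: int  # merged and released pull requests before CI adoption
    released_prs_after_ci: int   # merged and released pull requests after CI adoption
    is_toy_project: bool


def is_eligible(p: Project, top_n: int = 3000, min_prs: int = 100) -> bool:
    """Apply the four selection criteria to a single candidate project."""
    return (
        p.language in LANGUAGES
        and p.popularity_rank <= top_n
        and p.uses_travis_ci
        # Assumption: the threshold must be met in both periods.
        and p.released_prs_before_ci >= min_prs
        and p.released_prs_after_ci >= min_prs
        and not p.is_toy_project
    )


candidates = [
    Project("alpha", "Python", 120, True, 450, 900, False),
    Project("beta", "Ruby", 2500, True, 80, 300, False),    # too few pull requests before CI
    Project("gamma", "Java", 40, False, 500, 500, False),   # does not use Travis CI
    Project("delta", "Go", 10, True, 500, 500, False),      # language not in the list
]
selected = [p.name for p in candidates if is_eligible(p)]
print(selected)
```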
Many projects deliver merged pull requests more quickly after adoption of continuous integration, but not as many as you might have expected.
Any new change to an existing codebase has to wait twice before it can be delivered to end-users:
- There’s a merge delay between the moment when a pull request is opened and the moment when it’s merged;
- There’s a delivery delay between the moment a pull request is merged and the moment it’s released.
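To make the two delays concrete, here is a minimal sketch that computes them from three timestamps per pull request. The `PullRequest` type and its field names are assumptions made for this example; they are not taken from the study or from any particular API.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class PullRequest(NamedTuple):
    # Hypothetical record; field names are chosen for illustration only.
    opened_at: datetime    # when the pull request was opened
    merged_at: datetime    # when it was merged
    released_at: datetime  # when the release containing it was published


def merge_delay(pr: PullRequest) -> timedelta:
    """Time between opening and merging a pull request."""
    return pr.merged_at - pr.opened_at


def delivery_delay(pr: PullRequest) -> timedelta:
    """Time between merging a pull request and releasing it."""
    return pr.released_at - pr.merged_at


def lifetime(pr: PullRequest) -> timedelta:
    """Total waiting time: merge delay plus delivery delay."""
    return merge_delay(pr) + delivery_delay(pr)


pr = PullRequest(
    opened_at=datetime(2020, 1, 1),
    merged_at=datetime(2020, 1, 4),
    released_at=datetime(2020, 1, 18),
)
print(merge_delay(pr).days, delivery_delay(pr).days, lifetime(pr).days)  # 3 14 17
```

In this made-up example the pull request waits 3 days to be merged and another 14 days to be released, so most of its 17-day lifetime is delivery delay.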
The delivery delay is shorter in only 51% of the projects after adopting CI, which is surprising given that projects often adopt CI in an attempt to shorten these delays.
That’s not the only surprise, however:
- In 73% of the projects, merge delays were shorter before adoption of CI practices;
- Pull requests had a longer overall lifetime (merge delay + delivery delay) in 54% of the projects after adopting CI.
These findings suggest that in many projects pull requests take longer, not shorter, to reach end-users once CI is in use.
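Percentages such as the ones above are shares of projects, not shares of pull requests. The sketch below shows one simplified way to arrive at such a share: compare each project's median delay before and after CI adoption, then count the projects that got faster. Comparing medians is an assumption made for this example and is not necessarily how the authors compared the two periods; the function name and the data are made up.

```python
from statistics import median


def share_faster_after_ci(projects: dict[str, tuple[list[float], list[float]]]) -> float:
    """Fraction of projects whose median delay (in days) dropped after CI adoption.

    `projects` maps a project name to (delays before CI, delays after CI).
    """
    faster = sum(
        1
        for before, after in projects.values()
        if median(after) < median(before)
    )
    return faster / len(projects)


# Made-up delays in days, purely for illustration.
delays = {
    "project-a": ([10.0, 12.0, 30.0], [4.0, 6.0, 8.0]),    # median 12 -> 6: faster
    "project-b": ([5.0, 7.0, 9.0], [8.0, 15.0, 20.0]),     # median 7 -> 15: slower
    "project-c": ([3.0, 3.5, 20.0], [2.0, 3.0, 40.0]),     # median 3.5 -> 3: faster
    "project-d": ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),       # median 2 -> 2: unchanged
}
print(share_faster_after_ci(delays))  # 0.5
```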
It’s interesting to know why projects need more time to deliver changes to end-users after adopting CI.
The authors therefore looked at the number of pull requests that are submitted, merged, and delivered per release, both before and after CI adoption.
It turns out that the number of submitted pull requests greatly increased for 71% of the projects after adopting CI: projects submitted a median of 15.3 pull requests per release before and 42.6 pull requests per release after CI adoption. It makes sense that it would take more time to review, merge, and deliver this many pull requests.
The number of delivered pull requests and pull request contributors per release also increased after CI adoption.
Strangely, there was no increase in the number of releases.
Finally, the authors looked at a number of factors before and after CI adoption to learn how they affect delivery time.
The authors discovered that the merge workload – which is the number of pull requests that are created and waiting to be merged by a core contributor – has the strongest impact on delivery time before adoption of CI practices.
After CI adoption the primary factor becomes the position of a pull request in the pull request “queue”, i.e. pull requests that are merged later within a release cycle have a shorter delivery time.
There’s one factor that’s influential both before and after CI adoption: if pull requests by a contributor were delivered more quickly, then the same is likely to happen to future pull requests by that same contributor.
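Two of these factors, merge workload and queue position, can be derived from pull request timestamps. The sketch below is one possible reading of the descriptions above rather than the authors' exact definitions: merge workload is taken to be the number of other pull requests that were open but not yet merged when a given pull request was opened, and queue position is the rank of a pull request in merge order within one release cycle. The `MergedPR` type and both function names are hypothetical, and only merged pull requests are considered.

```python
from datetime import datetime
from typing import NamedTuple


class MergedPR(NamedTuple):
    # Hypothetical record of a merged pull request.
    number: int
    opened_at: datetime
    merged_at: datetime


def merge_workload(target: MergedPR, all_prs: list[MergedPR]) -> int:
    """Count other pull requests still waiting to be merged when `target` was opened."""
    return sum(
        1
        for pr in all_prs
        if pr.number != target.number
        and pr.opened_at <= target.opened_at < pr.merged_at
    )


def queue_positions(release_prs: list[MergedPR]) -> dict[int, int]:
    """Map each pull request number to its 1-based position in merge order."""
    ordered = sorted(release_prs, key=lambda pr: pr.merged_at)
    return {pr.number: position for position, pr in enumerate(ordered, start=1)}


# Made-up pull requests that all belong to the same release cycle.
prs = [
    MergedPR(1, datetime(2020, 1, 1), datetime(2020, 1, 10)),
    MergedPR(2, datetime(2020, 1, 3), datetime(2020, 1, 5)),
    MergedPR(3, datetime(2020, 1, 6), datetime(2020, 1, 12)),
]
print(merge_workload(prs[2], prs))  # 1: only pull request 1 was still open on 6 January
print(queue_positions(prs))         # {2: 1, 1: 2, 3: 3}
```

Pull request 3 is merged last in this cycle, so it sits closest to the release date, which matches the observation that later-merged pull requests have a shorter delivery time.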
- Only half of open source projects are able to deliver new features more quickly when adopting continuous integration
- Adoption of continuous integration appears to be correlated with a higher number of contributions and contributors