Chuniversiteit.nl
The Toilet Paper

How developers perceive and deal with architecture erosion

Let’s talk about architecture erosion, its causes and consequences, and how developers can identify and control it.

Someone has removed a brick from the Colosseum
Sure, monoliths are hard to maintain, but let’s not get too carried away by the whole microservices hype…

I usually start my paper summaries with one or two paragraphs that describe the problem that the paper aims to solve or that provide some background information about its subject. Not this time. I’m used to reading papers with typos and other small mistakes, but I’m absolutely flabbergasted that a huge one was made in the title and made it all the way to “production”, despite having gone through presumably multiple rounds of review. Fortunately, the paper and the study itself are fine.

Software systems are typically designed with an architecture: a set of distinct modules or components that cooperate with each other. Ideally the architecture evolves over time together with the system itself, as new features are introduced or existing features are modified. In practice this doesn’t always happen, which means that the architecture is eroding.

The concept of architecture erosion has been researched before, but little is known about the current state of practice from the perspective of developers. The authors of this paper therefore reached out to developers in several online communities and conducted surveys and interviews with 10 and 4 participants respectively.

What architecture erosion looks like

According to the surveyed developers, architecture erosion leads to four types of engineering issues:

  1. Structure: the structure of the current architecture differs from the intended architecture, e.g. due to violations of design rules about encapsulation, accumulation of cyclic dependencies, and increased coupling. Other structural issues include dead or overlapping code and obsolete third-party libraries.

  2. Quality: an eroded architecture may not meet the original or current non-functional requirements, like reliability, performance, and user experience.

  3. Maintenance: software projects with an eroded architecture can be harder to understand, debug, and refactor. Growing complexity and technical debt are common in eroded architectures.

  4. Evolution: it’s hard or even impossible to plan the next evolutionary steps without breaking something. Even small changes become extremely expensive.
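
Of the structural symptoms above, cyclic dependencies are among the easiest to check for mechanically. The sketch below is not taken from the paper: it assumes you already have a module-level dependency graph (the module names here are made up) and uses a plain depth-first search to report one cycle, if any exists.

```python
from typing import Dict, List, Optional

# Hypothetical module dependency graph: module -> modules it depends on.
DEPENDENCIES: Dict[str, List[str]] = {
    "ui": ["services"],
    "services": ["repository", "ui"],  # services -> ui closes a cycle
    "repository": ["models"],
    "models": [],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of modules, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: this edge closes a cycle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(DEPENDENCIES))
```

Running a check like this in a build pipeline only flags one symptom; it says nothing about whether the architecture as a whole still matches its intended design.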

Causes and consequences

The surveys revealed 12 types of causes of architectural erosion:

  1. Inappropriate architecture changes that are made during the evolution and maintenance phase are the most common cause of architecture erosion. Examples include changes that introduce anomalies, break architectural rules, or do other unexpected things that undermine the architectural integrity of the system.

  2. Architecture design defects on the other hand often occur in the design phase. When the architecture itself contains major flaws, this will likely cause issues for anyone who tries to build something on top of it.

  3. Lack of management skills is also a common cause of architecture erosion. Examples include assigning incompetent developers to a project, not providing proper training and education for developers, or not having a strategy for architecture evolution.

  4. Accumulation of technical debt happens when developers continuously choose to implement solutions that are easy to implement, but violate architectural principles.

  5. A disconnect between architects and developers occurs when architects do not adequately monitor the implementation process, developers do not participate in the design process, or when there is no architect role at all.

  6. Knowledge vaporisation due to employee turnover can be a major source of issues, especially when architectural knowledge is poorly documented.

  7. Requirements changes can make it hard to keep an architecture “clean”.

  8. Lack of communication among developers: if some developers isolate themselves from others, this may reduce the communication complexity at the cost of increasing the program complexity.

  9. Agile development is often blamed for causing architectural erosion, due to its emphasis on quick iterations and releases.

  10. Increasing complexity of the system itself can degrade the architecture and gradually make it harder to maintain and evolve.

  11. Lack of maintenance, e.g. when maintainers do not constantly refactor code and replace obsolete third-party libraries.

  12. Other (non-technical) causes, like environmental changes, changes in business processes, business pressure, and treating quality concerns as second-class citizens.

The paper reports seven consequences of architecture erosion, including:

  1. An eroded architecture is often hard to understand and maintain.

  2. Run-time quality degradation directly affects users of the system, e.g. due to worsened performance, reliability, and user experience.

  3. Refactoring becomes extremely costly, resulting in a system that’s permanently stuck in an undesirable form.

  4. In the worst case, a system becomes a big ball of mud without any perceivable architecture.

  5. A messy architecture may result in a high turnover rate, as developers who are forced to work in it will quickly choose to leave the project for greener pastures.

  6. Finally, erosion can drastically increase the overall complexity of the project, which may bring it to a grinding halt.

Identifying erosion

Although tools exist that identify architectural issues that can be considered symptoms of architecture erosion, none identify architecture erosion per se. Popular tools include Lattix, NDepend, and Sonargraph. A complete list can be found in the original article.

Aside from tools, there are also a number of practices that developers use to identify architecture erosion:

  • A dependency structure matrix visualises dependency relationships within a system in a square matrix, which can be especially helpful in the maintenance phase.

  • Software composition analysis can show which open-source components are used or may need an update. Examples of such tools are Snyk and Sonatype.

  • Architecture conformance checking helps you automate checks for architectural violations within a system.

  • Architecture monitoring for common issues, like coupling, file size, and smell hotspots, makes it possible to keep track of technical debt.

  • Code reviews are a simple method to identify mistakes in code and violations of design patterns.

  • Computing changes in architectural smell density (e.g. the number of smells per 1,000 lines of code) per version allows you to see the erosion of architecture over time.

  • Architecture visualisation clearly shows the structure and dependencies of the system, as well as places where design rules have been violated.
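
Three of these practices are simple enough to sketch in a few lines: the dependency structure matrix, conformance checking, and smell density. The example below is an illustration under assumptions, not the approach of any tool named above. The layer names, the dependency list, and the rule that a layer may only depend on layers below it are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical layers, ordered from top (index 0) to bottom.
LAYERS: List[str] = ["ui", "services", "repository"]

# Hypothetical dependencies extracted from the code base: (from, to).
DEPENDENCIES: List[Tuple[str, str]] = [
    ("ui", "services"),
    ("services", "repository"),
    ("repository", "ui"),  # a lower layer reaching up
]


def build_dsm(modules: List[str], deps: List[Tuple[str, str]]) -> List[List[int]]:
    """Square matrix where dsm[i][j] == 1 if modules[i] depends on modules[j]."""
    index = {name: i for i, name in enumerate(modules)}
    dsm = [[0] * len(modules) for _ in modules]
    for src, dst in deps:
        dsm[index[src]][index[dst]] = 1
    return dsm


def find_violations(modules: List[str], dsm: List[List[int]]) -> List[Tuple[str, str]]:
    """Assumed rule: a layer may only depend on layers below it.

    With modules ordered top to bottom, every 1 below the diagonal
    (row index greater than column index) breaks that rule.
    """
    return [
        (modules[i], modules[j])
        for i in range(len(modules))
        for j in range(i)
        if dsm[i][j]
    ]


def smell_density(smell_count: int, lines_of_code: int) -> float:
    """Number of architectural smells per 1,000 lines of code."""
    return smell_count * 1000 / lines_of_code


if __name__ == "__main__":
    dsm = build_dsm(LAYERS, DEPENDENCIES)
    for name, row in zip(LAYERS, dsm):
        print(f"{name:<12}{row}")
    print("violations:", find_violations(LAYERS, dsm))
    # Hypothetical smell counts and sizes for two versions of a system.
    for version, smells, loc in [("1.0", 12, 8000), ("1.1", 30, 10000)]:
        print(version, smell_density(smells, loc))
```

Ordering the modules by layer is what makes the matrix readable: allowed dependencies sit above the diagonal, so anything below it stands out immediately. Tracking the density per version, rather than the raw smell count, keeps the numbers comparable as the code base grows.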

Controlling erosion

Finally, the surveys and interviews yielded several methods to control architecture erosion:

  • Architecture assessment throughout the system development life cycle provides continuous insights into shortcomings and other architectural issues. Of course, these assessments also need to be followed up with concrete actions.

  • Periodic maintenance helps to keep a system “clean” and running smoothly.

  • Architecture simplification by reducing the complexity or size of a system makes systems more resilient against architectural erosion. Decomposing a monolithic architecture into multiple smaller microservices can be a good way to do this.

  • Architecture restructuring, as was done for the Mozilla Firefox web browser, involves a full redesign of the architecture. This is of course insanely expensive.

  • Organisation optimisation in the form of more (capable) team members can also be a good way to address architectural erosion.

Summary

  1. Architecture erosion makes it harder to evolve a system and may also have a negative effect on the user experience.

  2. This study identified 12 causes of architectural erosion, many of which are non-technical.

  3. Continuous identification and removal of architectural issues can help prevent architecture erosion.