Uncovering architectural design decisions (2018)

Let’s jump right into this one, because you only get to live once…

Systems are easier to maintain if one understands why their code and architecture look the way they do. Unfortunately, that “why” often isn’t documented. To address this issue, Shahbazian et al. developed RecovAr, a technique that allows partial recovery of design decisions from a project’s issue tracker and version control repository.

Why it matters

An engineer who understands the architectural impact of their changes is less likely to deliver code that introduces regressions or architectural inefficiencies.

That’s only possible, of course, if they know what that architecture looks like and why it looks that way.

Unfortunately, the decisions made during architectural design are rarely well-documented and mostly reside in architects’ and engineers’ heads – at least, until they leave the project and the knowledge is lost forever.

How the study was conducted

Source code nowadays is usually stored in version control repositories, which contain the complete history of changes to the code in the form of commits. These commits often include references to unique identifiers in issue trackers like JIRA or YouTrack.
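To make the link between commits and issues a bit more concrete, here is a minimal sketch of how issue identifiers could be pulled out of commit messages. It is not RecovAr's actual implementation: the regular expression assumes JIRA-style keys such as PROJECT-123, and the commit hashes, messages, and issue numbers are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: issue keys look like JIRA keys, e.g. "HADOOP-1234".
# A pattern this loose also matches strings like "UTF-8", so a real
# tool would have to check candidates against the issue tracker.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def link_commits_to_issues(commits):
    """Map each issue key to the commits whose message mentions it."""
    links = defaultdict(list)
    for sha, message in commits:
        # Use a set so a key mentioned twice in one message counts once.
        for key in sorted(set(ISSUE_KEY.findall(message))):
            links[key].append(sha)
    return dict(links)


# Hypothetical commits; neither the hashes nor the issue numbers are real.
commits = [
    ("a1b2c3", "HADOOP-1234. Fix end-of-line handling in compressor"),
    ("d4e5f6", "Refactor status reporting (HADOOP-1234, HADOOP-5678)"),
    ("0a9b8c", "Update README"),
]

links = link_commits_to_issues(commits)
```

Commits without an issue reference, like the last one, simply end up with no link, which is one reason why a recovery of this kind can only ever be partial.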

The authors propose a technique called RecovAr, which extracts information from these two sources to reconstruct the rationale behind architectural choices. This happens in three stages:

What discoveries were made

RecovAr extracts three types of decisions:

Results

The authors helpfully include some real examples that show what RecovAr is capable of. The table below shows three decisions that can be extracted from Hadoop.

Decision type: Simple
  Issue(s):
    1. Job tracking module only kept track of the jobs executed in the past 24 hours. If an admin checked the history after a day of inactivity, e.g., on Monday, the list would be empty.
  Change(s):
    1. hadoop.mapred component was modified.

Decision type: Compound
  Issue(s):
    1. UTF-8 compressor does not handle end of line correctly.
    2. Sequenced files should support custom compressors.
  Change(s):
    1. CompressionInputStream was added and CompressionCodec was modified.

Decision type: Cross-cutting
  Issue(s):
    1. Random seeks corrupt the InputStream data.
    2. Streaming must send status signals every 10 seconds.
    3. Task status should include timestamp for job transitions.
  Change(s):
    1. hadoop.streaming was modified.
    2. hadoop.metrics component was modified.
    3. hadoop.fs was modified.
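Going only by the examples in the table above, the three decision types seem to differ in how many issues and changes are involved: one issue and one change, several issues and one change, or several issues and several changes. The sketch below encodes that reading. Treat it as an assumption inferred from the table rather than the paper's formal definitions; the Decision class and the classify function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """A recovered decision: the issues that motivated it and the
    architectural changes that resulted from it."""
    issues: tuple
    changes: tuple


def classify(decision):
    # Assumption: the type follows purely from the counts, as the
    # examples in the table suggest.
    if len(decision.changes) > 1:
        return "cross-cutting"
    if len(decision.issues) > 1:
        return "compound"
    return "simple"


compound = Decision(
    issues=(
        "UTF-8 compressor does not handle end of line correctly.",
        "Sequenced files should support custom compressors.",
    ),
    changes=(
        "CompressionInputStream was added and CompressionCodec was modified.",
    ),
)
kind = classify(compound)
```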

Evaluation

The authors evaluated RecovAr’s applicability and accuracy (which, as you may recall, is based on two other concepts: precision and recall) by applying the technique to Hadoop and Struts. Both projects are widely used, open source, and have a long and active development history.
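As a quick refresher, precision is the share of recovered decisions that are actually correct, and recall is the share of actual decisions that were recovered. The sketch below uses the textbook set-based definitions. The decision labels are made up, and the study itself relies on manual ratings rather than a simple set comparison like this one.

```python
def precision_recall(recovered, relevant):
    """Return (precision, recall) for a set of recovered items
    compared against the set of items that are actually relevant."""
    recovered, relevant = set(recovered), set(relevant)
    true_positives = recovered & relevant
    precision = len(true_positives) / len(recovered) if recovered else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four decisions recovered, five actually exist,
# and three of the recovered ones are correct.
recovered = {"D1", "D2", "D3", "D4"}
relevant = {"D1", "D2", "D3", "D5", "D6"}
precision, recall = precision_recall(recovered, relevant)
# precision is 3/4 = 0.75, recall is 3/5 = 0.6
```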

Applicability

On average, only 18% of the issues for Hadoop and 6% of the issues for Struts had architecturally significant effects.

The number of design decisions that can be extracted using RecovAr seems to depend a lot on the technique that’s used to recover the architecture, which makes sense given that RecovAr compares architectures to detect decision consequences.
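To illustrate what comparing architectures could look like, here is a rough sketch that diffs two snapshots. It assumes an architecture is represented as a mapping from component names to the code entities they contain. That representation is an assumption made for this example, and the snapshots are illustrative rather than real recovered Hadoop architectures.

```python
def diff_architecture(old, new):
    """Report which components were added, removed, or modified
    between two architecture snapshots."""
    changes = {}
    for component in sorted(set(old) | set(new)):
        if component not in old:
            changes[component] = "added"
        elif component not in new:
            changes[component] = "removed"
        elif old[component] != new[component]:
            changes[component] = "modified"
    return changes


# Illustrative snapshots: component name -> set of contained entities.
old = {
    "hadoop.mapred": {"JobTracker"},
    "hadoop.fs": {"FileSystem"},
}
new = {
    "hadoop.mapred": {"JobTracker", "JobHistory"},
    "hadoop.fs": {"FileSystem"},
    "hadoop.metrics": {"MetricsContext"},
}

changes = diff_architecture(old, new)
```

A recovery technique that groups entities into components differently would produce a different diff for the same code, which fits the observation that the number of extracted decisions depends on the recovery technique.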

Precision

To determine the precision of RecovAr, two PhD students independently evaluated the identified decisions by manually assigning ratings based on four criteria:

Overall scores are between 0.71 and 0.81, which is pretty good.

Most decisions that were deemed unacceptable originated in newly released major versions. This is because the number of architectural changes between a minor version and the next major version tends to be large, which understandably leads to hard-to-understand results.

Recall

The authors initially found atrocious recall values around 20%, primarily due to two reasons:

Mitigation of these effects led to a rise in recall to 73% on average, which is also pretty okay.

The important bits

  1. Code and architectural changes are typically triggered by a rationale, which is often described in issue trackers
  2. RecovAr is a technique that determines why architectural changes were made by mining code repositories and issue trackers
  3. RecovAr achieves a precision between 0.71 and 0.81 and a recall of around 73% on average