Coalaty: A static analysis tool for comments
As someone with backgrounds in both software development (which is all about processing structured data) and information science (which is all about dealing with unstructured information), I like working on interesting problems at the intersection of these two fields – especially when not working on such a problem means that I can’t obtain my second master’s degree in Software Engineering.
Source code comments are one such interesting problem: from a technical point of view, comments are unstructured pieces of information that are scattered throughout a code base that’s written in a formal language, with precise syntax and semantics.
As a programmer, I thought I already knew a thing or two about comments. I had read my fair share of programming blogs and books on comments. And of course I try to write comments from time to time myself. But I also knew that none of this was enough if I really wanted to know everything there is to know about comments.
And so I spent a few months conducting a scientific literature review on comments in source code, after which I had a reasonably good idea of how comments are used, what they should ideally look like, and what they actually look like in practice.
Equipped with all this newfound knowledge, I set out to build Coalaty, a tool for automated comment analysis and for assessing how understandable comments are. The tool was meant to reconcile the numerous (and sometimes conflicting) guidelines on writing comments and automatically determine the comment quality in a Java software project, not unlike more common efforts to evaluate the understandability of source code.
Coalaty statically analyses a software project in four steps:
1. Parse all source code and comments into abstract syntax trees, which make it possible to explore and reason about the code programmatically.
2. Compute the cyclomatic complexity and other common metrics for each of the classes in the software project. This gives us some idea of the context within which the comments exist.
3. Compute text-related metrics about the placement, linguistic correctness, readability and contents of each comment. For example, does the comment mention identifiers, domain terms, issue trackers, and so on?
4. Use an ensemble of predictive models to combine all computed metrics into a quality rating per comment and package (module).
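To make the third step a bit more concrete, here is a minimal sketch of what a few content-related metrics could look like. It is not Coalaty's actual code: the `comment_metrics` function, the `CommentMetrics` fields and the issue-reference pattern are all assumptions made up for illustration, and a real tool would also need the placement, correctness and readability metrics mentioned above.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical patterns: "#123" or "PROJ-123" style issue references.
# Real projects reference their issue trackers in many other ways.
ISSUE_REF = re.compile(r"(?:#\d+|\b[A-Z][A-Z0-9]+-\d+\b)")
# Word-like tokens that could also be code identifiers.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


@dataclass
class CommentMetrics:
    word_count: int
    mentioned_identifiers: list[str]
    mentions_issue: bool


def comment_metrics(comment: str, identifiers: set[str]) -> CommentMetrics:
    """Compute a few illustrative content metrics for a single comment.

    `identifiers` is assumed to hold the names found in the surrounding
    code, e.g. collected from the syntax tree built in step 1.
    """
    words = WORD.findall(comment)
    mentioned = sorted({word for word in words if word in identifiers})
    return CommentMetrics(
        word_count=len(words),
        mentioned_identifiers=mentioned,
        mentions_issue=bool(ISSUE_REF.search(comment)),
    )


metrics = comment_metrics(
    "Retries fetchUser when the cache is stale, see PROJ-42.",
    identifiers={"fetchUser", "cacheTtl"},
)
print(metrics)
```

Passing the identifiers in from the outside keeps the text analysis separate from the parsing step, which is one plausible way to let the comment metrics take the surrounding code into account.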
The result of executing steps 1 to 3 is a set of metrics that have been computed for each comment, while taking into account the context in which they are written.
These metrics are somewhat comparable to those found in many other static analysis tools for source code that look at variables like function length and adherence to a certain code style, so technically I could’ve declared victory once I had implemented these first three steps.
Although Coalaty’s computed metrics can provide some value on their own, they are hard to interpret correctly and thus basically meaningless to most users. This is why I added the fourth step, in which I tried to convert all those seemingly meaningless numbers into a single score that clearly tells the user how good the comments in their project are.
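As a rough illustration of that fourth step, the sketch below averages the predictions of several models into one score per comment, and then averages the comment scores into a score per package. Everything in it is an assumption: the two stand-in "models" are trivial hand-written functions rather than trained predictors, the metric names are invented, and plain averaging is only one of many ways to combine an ensemble.

```python
from statistics import mean
from typing import Callable, Dict, List

Metrics = Dict[str, float]
Model = Callable[[Metrics], float]


def clamp(value: float) -> float:
    """Keep a score within the range 0.0 to 1.0."""
    return max(0.0, min(1.0, value))


# Hypothetical stand-in models; a real ensemble would use trained predictors.
def length_model(metrics: Metrics) -> float:
    # Gives credit for up to 20 words; longer comments earn nothing extra.
    return clamp(metrics["word_count"] / 20)


def context_model(metrics: Metrics) -> float:
    # Gives credit for mentioning up to two identifiers from nearby code.
    return clamp(metrics["identifier_mentions"] / 2)


def ensemble_score(metrics: Metrics, models: List[Model]) -> float:
    """Average the individual model predictions into one comment rating."""
    return mean(model(metrics) for model in models)


def package_score(comment_scores: List[float]) -> float:
    """Average the comment ratings into one rating for a package."""
    return mean(comment_scores)


models = [length_model, context_model]
score = ensemble_score({"word_count": 10, "identifier_mentions": 1}, models)
print(score)
```

Note that `statistics.mean` raises an error for an empty sequence, so a package without any comments would need separate handling.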
Unfortunately, the accuracy of the resulting scores left much to be desired: the model was very good at identifying comments that were really good or really bad, but everything in between felt more like a wild guess. Even after several arduous rounds of tweaks, the quality of the predictions improved only marginally.
Because the project was already way over budget and over time at this point, I ultimately decided it was better to pull the plug. Coalaty didn’t get to live the life it was supposed to have. At the end of the day it didn’t have to: a negative result is also a result – one that can be rewarded with a nice grade and a master’s degree. And that’s all that really mattered.