Beyond the code itself: How programmers really look at pull requests (2019)
Approval (or rejection) of a pull request is mostly based on its contents. But it’s not just about the code – especially when the changes are proposed by someone you don’t know. This week’s paper examines the signals that developers look at when reviewing pull requests from unfamiliar contributors.
Why it matters
Software development is not just about code – it’s also about people.
The change that’s proposed in a pull request might make sense and adhere to whatever contribution guidelines you have set up, but if it’s made by someone whom you’re not familiar with, you’ll probably do some research on the submitter first: What’s their level of expertise? What have they worked on? Could they have nefarious intentions?
Contemporary Git hosting services, such as GitHub, facilitate such research by providing user profile pages that clearly show users’ online identities (avatar, username, and name), interests, and contributions.
We’ve known for quite a while that reviewers make use of such information, thanks to studies on pull request acceptance. Sadly, those studies often suffer from validity issues, as they are based either on interviews with developers (and humans tend to forget or misremember things pretty quickly) or on analysis of GitHub data.
How the study was conducted
The authors recruited 42 students from an advanced computer science course for a controlled experiment with eye-tracking glasses.
Each participant was asked to review a pull request for a small mock project and indicate the likelihood that they would accept it on a 5-point Likert scale.
The pull request came in two variants: a reasonable version that works just fine, and an unreasonable, buggy one. It was “submitted” by one of three personas with identical experience levels: Abby, Tim, or Pat (whose gender is ambiguous).
Participants were interviewed afterwards about their reviewing behaviour and their attitude towards profiles in technical communities.
What discoveries were made
Eye-tracking data reveal that developers tend to under-report their fixations in interviews: fewer than half of the participants mentioned looking up information about the submitter, even though every single participant had done so.
Overall, participants spent most of their time (about 60%) reviewing the actual code (as they should!), but they also looked at technical (30%) and social (10%) signals.
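As a toy illustration of how such attention shares could be computed from raw eye-tracking data: the area-of-interest labels and fixation durations below are invented for the sketch, not taken from the study.

```python
from collections import defaultdict

# Each fixation: (area_of_interest, duration_ms).
# The labels and numbers are made up for illustration.
fixations = [
    ("code", 420), ("code", 310), ("technical", 150),
    ("social", 90), ("code", 500), ("technical", 120),
]

def attention_share(fixations):
    """Fraction of total fixation time spent on each area of interest."""
    totals = defaultdict(int)
    for aoi, duration in fixations:
        totals[aoi] += duration
    grand_total = sum(totals.values())
    return {aoi: t / grand_total for aoi, t in totals.items()}

shares = attention_share(fixations)
# With the toy data above, "code" dominates, followed by
# "technical" and then "social" signals.
```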
Further analysis of participants’ (self-reported) experience levels and fixation patterns yields some interesting results:
- High-Experienced Thinkers fixate long and often on code fragments. All participants in this group made correct decisions (true accept or true reject);
- High-Experienced Glancers don’t fixate long or often. This suggests that they don’t review very critically: only a few members of this group managed to make correct decisions;
- Low-Experienced Thinkers also fixate long and often on code fragments, but still make mistakes due to their lack of experience;
- Low-Experienced Foragers don’t show clear fixation patterns. Curiously enough, all participants in this group managed to make correct decisions as well.
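The four groups above can be pictured as a 2×2 split on self-reported experience and fixation behaviour. A minimal sketch of that split – the field names and threshold values are assumptions for illustration, not the paper’s actual criteria:

```python
from dataclasses import dataclass

@dataclass
class Participant:
    experience: int         # self-reported experience score (assumed scale)
    fixation_count: int     # number of fixations on code fragments
    mean_fixation_ms: float # average fixation duration

# Arbitrary illustrative cut-offs, not taken from the paper.
EXPERIENCE_CUTOFF = 5
COUNT_CUTOFF = 50
DURATION_CUTOFF = 250.0

def classify(p: Participant) -> str:
    """Place a participant in one of the four reviewer groups."""
    experienced = p.experience >= EXPERIENCE_CUTOFF
    # "Thinkers" fixate both long and often on code fragments.
    intense = (p.fixation_count >= COUNT_CUTOFF
               and p.mean_fixation_ms >= DURATION_CUTOFF)
    if experienced and intense:
        return "High-Experienced Thinker"
    if experienced:
        return "High-Experienced Glancer"
    if intense:
        return "Low-Experienced Thinker"
    return "Low-Experienced Forager"
```

The point of the sketch is only the quadrant structure: experience and fixation intensity vary independently, which is why the study can observe all four combinations.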
There are several things that developers take into consideration when creating and maintaining profiles:
- Trust: GitHub is kind of like Tinder: you seem more trustworthy if your profile looks like it’s owned by a real person, i.e., it shows a photo and a real name and matches the owner’s profiles on other social platforms;
- Anonymity: Unfortunately, there are situations in which it’s safer to remain anonymous: pseudonyms are a useful way to avoid stereotyping and harassment, especially for minorities such as women;
- Blending in: Each community has its own social norms. Content that’s acceptable in one community may be inappropriate in another.