Rethinking thinking aloud: A comparison of three think-aloud protocols (2018)

A usability study is conducted on a website

Think-aloud is one of the most popular methods used to evaluate the usability of websites and other types of systems. Various think-aloud methods exist. Alhadreti and Mayhew compared three of them – concurrent, retrospective, and hybrid think-aloud – and found that one clearly outperforms the other two.

Why it matters

The general idea of think-aloud is that participants verbalise all their thoughts: what they see, what actions they want to perform and why, how they feel about something, and so on. By letting participants do this while they carry out a set of predefined tasks on a system, a usability researcher can discover which issues users might face when using the system and why.

Each of the three think-aloud methods described in the paper has its own advantages and disadvantages.

(These are not the only think-aloud methods. One example of another method is co-discovery, which involves two participants who have to complete tasks by working together. The reasoning behind that method is that it encourages discussion and thinking aloud in a more natural way.)

The study compares the pros and cons of these methods.

How the study was conducted

For each of the think-aloud methods, a usability study was conducted on a university library website, with tasks of medium difficulty and 20 participants per group. Participants’ characteristics were kept similar between groups to minimise the impact of individual differences, while tasks were entirely identical between groups.

(One notable similarity within groups is that all participants were – surprise, surprise – students. None of the participants had any prior exposure to the website under evaluation, though: in case you were wondering, it was a different university’s website.)

Several variables were measured to determine the effectiveness and efficiency of each think-aloud method:

What discoveries were made

Thinking aloud does not seem to affect how well participants are able to complete tasks, how they perceive the system’s usability, or how they feel during the evaluation.

However, when the numbers of identified usability issues are compared, it’s clear that retrospective think-aloud is much less effective than the other two methods. When it’s used, the number of identified issues is significantly lower, presumably because participants have forgotten about some of the issues they encountered by the time they’re asked questions about them.

Many of the overlooked issues are minor and often related to layout, e.g. things that aren’t as clear or consistent as they should be. Unfortunately, retrospective think-aloud has few other redeeming qualities. To make matters worse, it’s also about 60% more expensive than concurrent think-aloud.

This is also where hybrid think-aloud falls flat on its face: it detects slightly more issues than concurrent think-aloud (the difference is basically negligible), but is also much more expensive (about 70%).
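To make the cost-effectiveness argument a bit more concrete, here is a rough sketch of the arithmetic in Python. The function name and the idea of expressing everything relative to concurrent think-aloud are my own framing, not something taken from the paper. The cost multipliers (about 1.6 and 1.7) come from the figures above; the issue ratios are assumptions, and the 0.8 for retrospective think-aloud in particular is a made-up placeholder rather than a result from the study.

```python
def relative_cost_per_issue(cost_multiplier: float, issue_ratio: float) -> float:
    """Cost per identified issue, relative to concurrent think-aloud (= 1.0).

    cost_multiplier: total cost relative to concurrent (1.7 means 70% more).
    issue_ratio: issues found relative to concurrent (1.0 means the same number).
    """
    if cost_multiplier <= 0 or issue_ratio <= 0:
        raise ValueError("both ratios must be positive")
    return cost_multiplier / issue_ratio


# Hybrid: about 70% more expensive, roughly the same number of issues.
# issue_ratio=1.0 is an assumption standing in for "basically negligible".
hybrid = relative_cost_per_issue(cost_multiplier=1.7, issue_ratio=1.0)

# Retrospective: about 60% more expensive. The 0.8 is a hypothetical
# placeholder for "significantly fewer issues", not a figure from the study.
retrospective = relative_cost_per_issue(cost_multiplier=1.6, issue_ratio=0.8)

print(f"hybrid: {hybrid:.2f}x, retrospective: {retrospective:.2f}x")
```

Put differently: under this simple model, a method that costs about 70% more would have to find about 70% more issues just to break even with concurrent think-aloud on cost per issue.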

The important bits

The overall conclusion is pretty clear:

  1. Concurrent think-aloud is the most cost-effective think-aloud method
  2. Retrospective think-aloud is generally best avoided