Search as a news curator: The role of Google in shaping attention to news information (2019)
When you use a search engine to learn more about current events, you often implicitly assume that it has a reasonably complete coverage of all news sources and that it presents its search results fairly and objectively. This study suggests that those assumptions might not be entirely fair.
Why it matters
Many people nowadays find their news using search engines. This can be a good thing, as it theoretically means that people are exposed to news from different sources (more diverse viewpoints!) and are thus better able to make sense of what’s happening in the world.
In practice it depends a lot on how search engines rank news sites in their results, as high-ranking news sites will attract more attention than lower-ranking news sites. Search engines like Google therefore have the power to introduce biases or even trap users in filter bubbles!
This paper sheds some light on how news sites are ranked by Google.
How the study was conducted
Google Trends is a service that can be used to find “stories” that are currently trending, based on newly published news articles and queries on Google Search.
One of the authors used the service to manually identify the top 30 hard news stories every day for the duration of one month. (Hard news is news with a broad societal impact, e.g. stories about government and macroeconomics, as opposed to soft news, e.g. stories about celebrities, sports, and stock results.) For each of these stories the most popular search query was used to scrape the Top Stories box on the first search engine results page for 24 hours.
What discoveries were made
The findings are based on the three stories that are featured in the Top Stories box when viewed anonymously in a browser from North America.
First up is diversity; more specifically, diversity in news sources and diversity in ideological alignment.
When the authors look at overall diversity, they see that the top 20% of news sources account for over 80% of all impressions. The top three are CNN, The New York Times, and The Washington Post. This isn’t very surprising, because these media outlets have enough resources to cover a large variety of stories.
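To make the "top 20% of sources, 80% of impressions" figure concrete, here is a minimal sketch of how such a concentration share could be computed. The function name `top_share` and the toy data are made up for illustration and are not taken from the paper; the authors may have calculated it differently.

```python
from collections import Counter


def top_share(impressions, fraction=0.2):
    """Share of all impressions captured by the top `fraction` of sources.

    `impressions` is a list with one source name per impression.
    """
    counts = sorted(Counter(impressions).values(), reverse=True)
    # Number of sources in the top group (at least one).
    k = max(1, round(len(counts) * fraction))
    return sum(counts[:k]) / sum(counts)


# Toy data (hypothetical): five sources, one of which dominates.
toy = ["a"] * 16 + ["b", "c", "d", "e"]
share = top_share(toy)
print(f"Top 20% of sources receive {share:.0%} of impressions")
```

With real data you would pass in one entry for every time a source appeared in the Top Stories box.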
If you take that into account by looking at the average number of impressions per search query, you’ll find that CNN and The New York Times still top the list. The rest of the list changes drastically. Smaller, national news sources like ABC News and CBS News are displaced by niche outlets like Forbes (business) and Wired (technology). These outlets don’t always appear in results, but when they do appear, they’re always shown first.
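The normalisation described above can be sketched as follows. This assumes that "average number of impressions per search query" means a source's total impressions divided by the number of distinct queries in which it appeared at all; that reading, the function name, and the toy records are my own assumptions, not the paper's definition.

```python
from collections import defaultdict


def avg_impressions_per_query(records):
    """records: iterable of (query, source) pairs, one pair per impression."""
    totals = defaultdict(int)    # total impressions per source
    queries = defaultdict(set)   # distinct queries each source appeared in
    for query, source in records:
        totals[source] += 1
        queries[source].add(query)
    return {s: totals[s] / len(queries[s]) for s in totals}


# Toy data (hypothetical): a broad outlet appears for three queries,
# a niche outlet appears for only one query but is shown more often there.
records = (
    [("q1", "broad")] * 2 + [("q2", "broad")] * 2 + [("q3", "broad")] * 2
    + [("q3", "niche")] * 4
)
print(avg_impressions_per_query(records))
```

In this toy example the broad outlet has more impressions overall (6 against 4), but the niche outlet has the higher average per query, which is the kind of reshuffling the paragraph describes.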
Of course we aren’t just interested in diversity of news outlets: what we really want to know is if users are exposed to a wide range of perspectives and viewpoints.
The authors use ratings data from a study by Bakshy, Messing & Adamic (2015), who calculated the ideological slant of news sources by looking at the alignment of users that shared their content on Facebook.
Ratings data are only available for 187 of 727 news outlet subdomains. (The authors look at subdomains here because subdomains of the same outlet can have different ideological alignments, e.g. money.cnn.com is more conservative than cnn.com.) These 187 subdomains still account for 74% of all impressions, so the ratings cover most of what users actually see. The data shows that 139 of the rated subdomains are liberal or left-leaning, while only 48 are conservative or right-leaning. This suggests that the results are clearly left-leaning!
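The proportions behind these counts are easy to check. The snippet below only uses the numbers quoted above (187, 727, 139 and 48); the variable names are mine.

```python
rated, total = 187, 727          # subdomains with ratings / all subdomains
liberal, conservative = 139, 48  # split of the rated subdomains

# The two groups should add up to the rated subdomains.
assert liberal + conservative == rated

coverage = rated / total
liberal_share = liberal / rated
conservative_share = conservative / rated

print(f"Rated subdomains: {coverage:.1%} of all subdomains")
print(f"Left-leaning:     {liberal_share:.1%} of rated subdomains")
print(f"Right-leaning:    {conservative_share:.1%} of rated subdomains")
```

So only about a quarter of the subdomains have a rating, and roughly three in four of those lean left. Note that these are shares of subdomains, not shares of impressions.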
Does this mean that Google’s algorithms are biased towards left-leaning sources or are there simply more liberal news sources?
To answer this question, the authors compared the ideological distribution of Google’s results with that of the gargantuan GDELT database, which contains news sources from all over the world.
GDELT shows that there are indeed more liberal than conservative news sources, but the proportion of liberal news sources in the Top Stories box is much higher than what you’d expect based on the actual distribution of the ideological alignments of news outlets.
It’s not clear whether this is because Google’s ranking algorithm prefers liberal news sources or because liberal news sources are simply better at search engine optimisation.
The results also tell us how quickly Google’s algorithms are able to incorporate new news stories into their search results and how the age of an article affects its ranking.
Over 83% of articles in the Top Stories box are less than 24 hours old. Newer articles are usually higher-ranking, but it depends on the type of news: breaking news stories tend to be less than one hour old, news stories that are interesting for slightly longer periods (news about events that are still in the future) can be several days old, while high-ranking stories that are weeks old generally provide background or contextual information that doesn’t age.
Curation and attention
Finally, the authors looked at how the position of a story in the Top Stories box affects the number of clicks and amount of exposure to a news source by correlating them with analytics data provided by Chartbeat, a content intelligence platform for publishers.
It appears that stories that manage to get featured in the Top Stories box may receive about 10% more referrals than stories that appear in the organic search results. The position of a story in the box also matters: the middle position provides the largest boost, followed by the left-most and the right-most positions.
The important bits
- Bakshy, E., Messing, S., & Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130–1132.