Chuniversiteit.nl
The Toilet Paper

People are apparently farming citations on ResearchGate

This week’s paper about citation farming is right at the intersection of two of my “favourite” subjects: AI slop and academic fraud.

A farmer tends to his fields, with crops that look like citations
Old MacDonald had a farm, E-I-E et al.

I don’t think much of value is lost when my local baker uses AI slop for their billboards instead of free stock photos, but it becomes much more of an issue when a respected Big Four consultancy like Ernst & Young publishes reports that are full of hallucinations, because companies and governmental organisations often base major decisions on such reports.

Even science is not safe from AI slop. Recently, arXiv announced one-year bans for anyone caught submitting AI slop to its platform. The move is a response to a general increase in fake citations, unedited prompt responses, and nonsensical diagrams in peer-reviewed literature, all of which pollute scientific knowledge and discredit scientific practices.

A bit over a year ago I wrote about the dishonest practice of deliberately publishing in fake journals to boost citation counts, one of the key productivity metrics on which researchers are often judged. This week’s paper shows that some researchers may be trying to similarly inflate their citation counts via paid services that submit LLM-generated papers to non-peer-reviewed platforms such as ResearchGate.

It all started when one of the paper’s authors discovered an article on Google Scholar that falsely listed them as a co-author. A web search revealed that the article had been published on ResearchGate by a supposed author who does not appear to exist outside the platform and likely only exists to provide citation boosts as a (paid) service. A broader search yielded four more fake accounts that had “co-authored” seemingly LLM-generated papers with that first account. Starting from these five accounts, the researchers created a dataset containing more than 22,000 real and fake authors.

Fake papers can be grouped into two categories. Papers published before 2022 have many more authors and are seemingly legitimate, in the sense that they are real papers that have been re-uploaded by fake accounts to claim co-authorship of legitimate research. Since 2022, fake accounts have begun publishing several papers per month. These newer papers are less likely to be published elsewhere and are more obviously fake. They often contain little to no author information, text, formatting, or references, although some appear superficially well-produced with (nonsensical) text, figures, and tables.

To determine who the main beneficiaries of these fake publications are, the researchers first went looking for citation groups, i.e. clusters of papers that have an identical list of references, which are a clear sign that something strange is going on.
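
To make the idea of a citation group a bit more concrete, here is a minimal Python sketch. It is not the researchers’ actual code: the function name `find_citation_groups`, the input format (a dictionary that maps paper IDs to reference IDs), and the choice to ignore the order of references are all my own assumptions.

```python
from collections import defaultdict


def find_citation_groups(papers):
    """Group paper IDs that share exactly the same set of references.

    `papers` is a hypothetical mapping of paper ID -> iterable of reference IDs.
    Only groups that contain at least two papers are returned.
    """
    groups = defaultdict(list)
    for paper_id, references in papers.items():
        # Assumption: two reference lists are "identical" if they contain
        # the same references, regardless of order.
        key = frozenset(references)
        if key:  # papers without any references cannot form a group
            groups[key].append(paper_id)
    return {refs: sorted(ids) for refs, ids in groups.items() if len(ids) > 1}


# Made-up example data
papers = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r3", "r2", "r1"],  # same references as p1, in a different order
    "p3": ["r1", "r4"],
    "p4": [],
}
groups = find_citation_groups(papers)
```

In this toy example only `p1` and `p2` end up in a group. On real data you would probably want to normalise references first, since the same work can be cited in slightly different ways.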

Presumably, at least one of the cited authors intends to benefit from the citations, while the remaining authors are merely there to hide the fraud. It’s likely that whoever is benefiting from these fake papers will be cited often across and within groups, more than one would normally expect.

With these two assumptions, we see that most authors in the dataset have only a few cited papers and appear in few groups. From a short analysis it appears that for one particular author, as many as 81% of their citations across ResearchGate can be marked as suspicious!
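
A percentage like that could be computed along the following lines. This is only a sketch under my own assumptions: I treat a citation as “suspicious” when the citing paper belongs to a citation group, which may well differ from the definition used in the paper, and the names `suspicious_share`, `citations`, and `grouped_papers` are made up for illustration.

```python
from collections import Counter


def suspicious_share(citations, grouped_papers):
    """Return, per cited author, the fraction of citations from grouped papers.

    `citations` is a hypothetical list of (citing_paper_id, cited_author) pairs.
    `grouped_papers` is the set of paper IDs that belong to a citation group.
    """
    total = Counter()
    suspicious = Counter()
    for citing_paper, author in citations:
        total[author] += 1
        # Assumption: a citation is suspicious if it comes from a grouped paper.
        if citing_paper in grouped_papers:
            suspicious[author] += 1
    return {author: suspicious[author] / total[author] for author in total}


# Made-up example data
citations = [
    ("p1", "alice"), ("p2", "alice"), ("p3", "alice"), ("p9", "alice"),
    ("p1", "bob"), ("p7", "bob"), ("p8", "bob"), ("p9", "bob"),
]
shares = suspicious_share(citations, grouped_papers={"p1", "p2", "p3"})
```

Here `alice` receives three of her four citations from grouped papers, while `bob` receives only one of four. An author whose share is much higher than everyone else’s is worth a closer look, although a high share alone does not prove that they paid for anything.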

Summary

  1. Fake papers are being uploaded to ResearchGate, presumably to boost citation counts

  2. This paper presents a method to identify authors who may have used paid services to boost their citation counts