Google Scholar’s Fake Citations
Seeding Google Scholar with fake authors and citations proves not only easy, but profitable
Reluctantly, I’m covering a preprint. It’s not flawless and would clearly be improved by thorough, expert peer review, but its most interesting findings probably will hold up.
Why the confidence?
- Because people are already actively pursuing effective scams that fit the claims the authors make and match the sting they perpetrated.
- Because the problem has been described for more than a decade in peer-reviewed papers — and apparently has gone unaddressed.
Here are the topline results from the recent preprint:
- Citation metrics in Google Scholar can be and are being manipulated
- Preprints, “special issues,” and bulk publishing can be used to plant fake citations in Google Scholar and illegitimately drive up various citation-related metrics
- Even post hoc screening fails to eliminate the problem, as Google Scholar indexes and then caches citations that may appear for only a short while before being taken down
- Google Scholar misses obvious instances of citation manipulation
- Google Scholar is more widely used for faculty evaluation than previously documented
- Effective front-end screening can thwart malfeasance, while ineffective screening lets fakes through and creates lasting problems
The authors detail creating fake author profiles and 20 ChatGPT-generated papers, then using these to seed four services (ResearchGate, arXiv, Authorea, and OSF) to see whether citation metrics in Google Scholar could be manipulated in favor of the fake author.
The fake papers included citations to non-existent papers attributed to the fake author.
In other words, could a fake author become an h-index star via Google Scholar in relatively short order and completely illegitimately?
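To see why this kind of seeding could move the needle so quickly, it helps to recall how the h-index works: an author has an h-index of h when h of their papers each have at least h citations. The sketch below is only an illustration under stated assumptions, not the preprint's method or data. The function name `h_index`, the helper `seeded_profile`, and every number used are hypothetical; the only figure borrowed from the study is the count of 20 planted papers.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def seeded_profile(phantom_papers, planted_papers):
    """Hypothetical scenario: every planted paper cites every phantom paper
    attributed to the fake author, so each phantom paper collects
    `planted_papers` citations."""
    return [planted_papers] * phantom_papers


# Assumed numbers for illustration only (not results from the preprint).
modest_record = [12, 9, 7, 4, 3, 1, 0]          # an invented, ordinary record
small_scam = seeded_profile(phantom_papers=10, planted_papers=20)
larger_scam = seeded_profile(phantom_papers=25, planted_papers=20)

print(h_index(modest_record))
print(h_index(small_scam))
print(h_index(larger_scam))
```

In this toy setup the h-index is capped by whichever is smaller: the number of phantom papers or the number of planted papers citing them. That is why a handful of bulk-generated documents, if they get indexed, could in principle produce a figure that takes a legitimate researcher years to reach.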