How thousands of invisible citations sneak into papers and make for fake metrics

In 2022, Guillaume Cabanac noticed something unusual: a study had attracted more than 100 citations within two months of being published.

Cabanac, a computer scientist at the University of Toulouse in France, initially flagged the study on PubPeer after it was highlighted by the Problematic Paper Screener, which automatically identifies research papers with certain issues. 

The screener flagged this particular paper — which has since been retracted — for containing so-called tortured phrases, strange twists on established terms that were probably introduced by translation software or humans looking to circumvent plagiarism checkers. 

But Cabanac noticed something weird: The study had been cited 107 times according to the ‘Altmetrics donut,’ an indicator of an article’s potential impact, yet it had been downloaded just 62 times. 

What’s more, according to Google Scholar, this paper had been cited only once. “There was a clear discrepancy between the counts on Google Scholar and the counts on Altmetrics/Dimensions,” Cabanac says. That gap is especially significant since “we know that usually Google Scholar overestimates the number of citations,” he adds. 

After a little probing, Cabanac and his sleuthing colleagues figured out where the extra citations were coming from: the metadata files submitted to Crossref, a repository of unique identifiers and metadata for scholarly works, as the group reports in a preprint posted to the arXiv server October 4. Google Scholar doesn’t use metadata files submitted to Crossref; instead it text-mines PDF versions of studies, Cabanac says.

“We believe we found an undocumented way of cheating with citation counts,” Cabanac tells Retraction Watch. “It’s original because it doesn’t require fraudsters to alter the version of record, meaning the PDF or HTML version of the paper.”

The metadata files of the papers in question seem to contain more references than are in the HTML or PDF versions, Cabanac says. According to Cabanac, the references are sneaked at some point into metadata files that are submitted to Crossref and automatically ingested. Because metadata files can be resubmitted any number of times, updated files can also be submitted at any time after an article is published.

These undue extra citations ultimately inflate the Altmetrics score represented by the donut, which depicts how often an article is being cited and mentioned on social media. That’s problematic because these inflated citation scores are ultimately reflected on bibliographic platforms like Dimensions. Citation counts are frequently used to judge researchers and apportion funding, so boosting such indicators could falsely amplify a researcher’s perceived impact.

According to the study, the introduced references seem to be coming largely from journals published by Technoscience Academy, an open access publisher run out of Gujarat, India, and a Crossref member. Technoscience Academy did not reply to a request for comment. 

It isn’t clear who is manipulating the metadata files or whether the issue is due to a technical glitch. But Cabanac says the phenomenon is a result of a lack of gatekeeping. One way of addressing the issue would be to build tools and systems that regularly compare the references within the PDF, HTML and metadata files of the whole scholarly literature, he adds.
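
The kind of comparison Cabanac describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the preprint: it assumes each reference list has already been extracted as a list of DOIs, and the helper names `normalise_doi` and `compare_references`, as well as the example DOIs, are made up for this sketch. References without a DOI would need fuzzier matching that is not shown here.

```python
import re


def normalise_doi(doi: str) -> str:
    """Hypothetical helper: reduce a DOI to a bare, lowercase form so that
    the same reference matches across sources."""
    doi = doi.strip().lower()
    # Drop common prefixes such as "https://doi.org/" or "doi:".
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)


def compare_references(version_of_record, metadata):
    """Compare the references of the published PDF/HTML with those
    deposited in the metadata file (both given as lists of DOIs)."""
    record = {normalise_doi(d) for d in version_of_record}
    deposited = {normalise_doi(d) for d in metadata}
    return {
        # In the metadata but absent from the published paper.
        "sneaked": sorted(deposited - record),
        # In the published paper but missing from the metadata.
        "lost": sorted(record - deposited),
    }


# Made-up DOIs for illustration only.
pdf_refs = ["10.1000/aaa", "https://doi.org/10.1000/BBB", "doi:10.1000/ccc"]
metadata_refs = ["10.1000/aaa", "10.1000/bbb", "10.1000/zzz"]

result = compare_references(pdf_refs, metadata_refs)
print(result)
```

A non-empty "sneaked" list for a paper would be a signal worth a closer look rather than proof of manipulation, since extraction errors on either side can produce the same mismatch.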

Cabanac says if it becomes clear a publisher’s output includes cooked references, its Crossref membership should be scrutinized. Being the signatory on the agreement with Crossref, “the publisher is responsible for their actions,” Cabanac says. “They can run an audit in their own premises to see who the malevolent person is.”

“It looks really dodgy,” says Ginny Hendricks, director of member and community outreach at Crossref, who notes the case is the first time her organization has heard of sneaked references. “It definitely seems like a side effect of the community’s obsession with citation as a metric [and] a measure of impact or importance, which is unfortunate.”

She adds that Crossref will look into the issue, noting that the organization rarely revokes membership for cause. The only member it has excluded for cause in the past is Omics International, Hendricks says: “They were causing harm to the whole community.”

Hendricks says Crossref has so far not considered introducing extensive gatekeeping but she encourages third parties to use Crossref’s open data to develop systems to do just that. “We’re not the people that decide scientific legitimacy,” she says.

The study analyzed the content of three journals published by Technoscience Academy, each of which has minted more than 1,000 digital object identifiers at Crossref. It found that around 9% of references included in metadata files of the papers published by these three journals (5,978 references out of a total of 65,836) benefitted just two researchers who had co-authored the studies being cited.

One of the researchers in question is J. Nageswara Rao of the Vignan’s Institute of Information Technology in Visakhapatnam, India, who benefitted from 3,103 extra citations, the study found. 

Retraction Watch contacted Rao for a comment but has yet to hear back. The retraction notice for the paper Cabanac found reads: 

This article has been retracted by Hindawi following an investigation undertaken by the publisher [1]. This investigation has uncovered evidence of one or more of the following indicators of systematic manipulation of the publication process: 

(1) Discrepancies in scope 

(2) Discrepancies in the description of the research reported 

(3) Discrepancies between the availability of data and the research described 

(4) Inappropriate citations 

(5) Incoherent, meaningless and/or irrelevant content included in the article 

(6) Peer-review manipulation

The second author that the study namechecks as benefiting from the sneaked references is Bhavesh Kataria of the LDRP Institute of Technology and Research in Gandhinagar, India, who has benefitted from 1,564 extra citations, according to the study. Retraction Watch could not find contact details for Kataria. 

Three journals also profited from the sneaked citations, the study found. The International Journal of Scientific Research in Science, Engineering and Technology gained an extra 826 citations, followed by the International Journal of Advanced Science and Technology and the Turkish Journal of Physiotherapy and Rehabilitation, with 537 and 428, respectively.

In addition to sneaked references, the study also reports on instances of ‘lost references,’ which are references that appear in the HTML/PDF but not in Crossref metadata files. “Users of the Crossref metadata (e.g., Dimensions) disregard some references because they are not in their database or because they failed to properly textmine the text of the references provided in the metadata,” Cabanac says. The study found that to be the case for 56% (36,939 out of 65,836) of references in HTML versions of papers.
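
As a quick check on the proportions reported from the study, the two shares can be recomputed from the raw counts. The snippet below is just that arithmetic; the function name `share` and the variable names are made up for this sketch.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    return round(100 * part / total, 1)


TOTAL_REFERENCES = 65_836  # total reported in the study

# References benefitting the two researchers named in the study.
sneaked_share = share(5_978, TOTAL_REFERENCES)
# References reported as 'lost'.
lost_share = share(36_939, TOTAL_REFERENCES)

print(f"Benefitting two researchers: {sneaked_share}%")  # 9.1%
print(f"Lost references: {lost_share}%")                 # 56.1%
```

Both results agree with the rounded figures of around 9% and 56% given in the article.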

Editor’s note: Last month, Crossref acquired the Retraction Watch database. The deal does not involve the Retraction Watch blog, which remains independent.
