Hindawi reveals process for retracting more than 8,000 paper mill articles

Over the past year, amid announcements of thousands of retractions, journal closures and a major index delisting several titles, executives at the troubled publisher Hindawi have at various times mentioned a “new retraction process” for investigating and pulling papers “at scale.”  The publisher has declined to provide details – until now. 

So far in 2023, Hindawi has retracted over 8,000 articles – more than we’ve ever seen in a single year from all publishers combined. And Hindawi is not done cleaning up from paper mills’ infiltration of its special issues, according to a new report from its parent company, Wiley. 

Reckoning with Hindawi’s paper mill problem has cost Wiley, which bought the open-access publisher in 2021, an estimated $35-40 million in lost revenue in the current fiscal year, Matthew Kissner, Wiley’s interim president and CEO, said on the company’s most recent earnings call. Wiley will stop using the “Hindawi” name next year, Kissner told investors. 

The publisher has  issued a whitepaper, “Tackling publication manipulation at scale: Hindawi’s journey and lessons for academic publishing,” which explains “what happened at Hindawi” and the process the company developed to investigate and retract thousands of articles from special issues.  

According to the whitepaper, Hindawi’s research integrity staff identified “suspicious patterns” in multiple special issues around the same time as “independent researchers with an interest in research integrity, also began noticing indicators of large-scale systematic manipulation.” 

In September 2022, the publisher announced an initial batch of 500 retractions. Then, the whitepaper states: 

These initial investigations coupled with the findings of independent researchers pointed to the infiltration of Hindawi [special issues] SIs at an even greater scale than first anticipated, making it clear that thousands of manuscripts would need to be investigated. 

The publisher halted publication of special issues for a few months beginning in October 2022, and reassessed all special issue manuscripts with “a comprehensive checklist of specific hallmarks of papermill papers.” Hindawi also “carried out a thorough review and strengthening of our existing checks on editors, authors and reviewers,” according to the whitepaper. To investigate thousands of already-published papers, it developed “a new protocol aimed at detecting manipulation patterns and retracting papers rapidly and at scale.” 

Hindawi essentially created a checklist of criteria for scoring articles, and all that met a certain threshold were retracted. The whitepaper lists several “indicators of manipulation” Hindawi has been using as evidence to retract articles: 

  1. Discrepancies in scope.
  2. Discrepancies in the description of the research reported.
  3. Discrepancies between the availability of data and the research described.
  4. Inappropriate citations.
  5. Incoherent, meaningless, and/or irrelevant content included in the article.
  6. Compromised or manipulated peer-review.
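
As a rough illustration of threshold-based scoring, the sketch below counts how many of the six indicators were flagged for a paper and compares that count with a cutoff. The whitepaper, as described here, does not give the actual weights or threshold, so `THRESHOLD`, the equal weighting of indicators, and the function name are assumptions made purely for illustration.

```python
# Illustrative only: the real weights and cutoff are not public.
INDICATORS = (
    "scope_discrepancy",
    "research_description_discrepancy",
    "data_availability_discrepancy",
    "inappropriate_citations",
    "incoherent_content",
    "compromised_peer_review",
)

THRESHOLD = 2  # hypothetical cutoff, not Hindawi's actual value


def meets_retraction_threshold(flags, threshold=THRESHOLD):
    """Return True if the number of distinct flagged indicators reaches the cutoff."""
    flagged = set(flags)
    unknown = flagged - set(INDICATORS)
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    return len(flagged) >= threshold
```

Under these assumed settings, a paper flagged for both inappropriate citations and incoherent content would meet the cutoff, while a paper flagged for a single indicator would not.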

The publisher explained: 

The decision to retract, given this evidence, is based on the rationale that the publication process has been undermined and we can no longer vouch for the integrity of the article. We have intentionally limited the specific details of what is under investigation in the retraction notice, in part because it is essential that the intelligence shared with bad actors is restricted. Moreover, to proceed expeditiously and communicate with thousands of authors, standardized wording of retraction notices was essential.

Hindawi outsourced the work of assessing individual papers to outside vendors. At least two people, trained by the publisher’s staff, evaluated each paper according to Hindawi’s criteria and filled out a questionnaire in a “bespoke software application.” In addition: 

computational tools were used to provide additional supporting evidence about fabricated content, plagiarized peer-review reports, and suspect peer review turn-around times. We also built on the valuable work done by independent research integrity sleuths, by collecting, evaluating, and categorizing comments provided on PubPeer as part of Smut Clyde’s list.

Besides retracting thousands of papers, Hindawi has used the results of its investigation to ban “several hundred” guest editors of special issues from future editorial roles and publishing articles. The publisher has also instituted “much more stringent checks” on proposals for special issues and guest editors, as well as “much greater scrutiny of peer review.” 

As well as continuing retractions, Hindawi plans to issue expressions of concern for entire special issues “to alert readers that they should take additional care interpreting all papers” when the publisher suspects more articles in a particular issue “are likely to be problematic” but has not yet investigated or found proof of issues. 

Multiple sleuths Retraction Watch asked to comment on the whitepaper complimented Hindawi for taking action and sharing information, but also questioned whether the publisher had really learned its lesson, and pointed out further work to do. 

“I think it’s a good thing they’re documenting what they’re doing and putting out information about it,” Adam Day, developer of the Papermill Alarm, said. “I hope it’s something other publishers take a lead from and feel encouraged to deal with their own problems openly as well, because this is affecting a huge number of publishers.” 

However, Day noted that he had still seen papermill articles in Hindawi journals as of this spring, after the publisher resumed special issues. “I think they’ve definitely made a big impact on the problem, but I don’t know that they’ve completely solved it yet,” he said. 

Jennifer Byrne, leader of the Publication and Research Integrity in Medical Research group at the University of Sydney in Australia, said that Hindawi’s practice of intentionally limiting the information in retraction notices should be balanced with the need for transparency. “It would be helpful for future retraction research scholars if at least some specific information about the reasons for retraction were disclosed,” she said. 

Byrne also said verifying the identities of guest editors for special issues is not enough to ensure quality, as “many guest editors could have superficially convincing credentials, ‘supported’ by publications from paper mills.” The difficulty will grow “as publications from paper mills become increasingly sophisticated and difficult to detect,” she predicted. 

Rather, the topics submitted with proposals for special issues “deserve more stringent scrutiny,” she said: 

Catch-all, duplicative special issue topics invite paper mill submissions, particularly in fields where genuine research remains very difficult, expensive and/or slow.

Cyril Labbé, who with Guillaume Cabanac and Alexander Magazinov developed the Problematic Paper Screener, praised the whitepaper’s acknowledgement of sleuths who “are working daily, most often pro-bono, to fix the work that has not been done properly by publisher.” 

Cabanac added to the whitepaper’s list of recommendations for other publishers: “fund sleuths and credit them.” 

Dorothy Bishop, who has examined paper mill activity in Hindawi special issues in depth, said she was “very pleased to see the publisher has at last grappled with the need for retractions ‘at scale.’” Banning editors and authors associated with paper mills is “a great step in the right direction, especially if they can share information about banned individuals with other publishers,” she said. 

But Bishop critiqued the framing of the document: 

The report presents Hindawi as a victim of an “academic culture of ‘publish or perish’, which has incentivized unethical behaviour”. What it omits is the influence of the commercial publisher culture of greed, which aims for massive growth in the number of published papers, with associated growth in profits. 

By her calculations, Hindawi would have brought in millions of dollars from article processing charges for now-retracted papers: “The authors of these articles don’t get their money back.”

Like Retraction Watch? You can make a tax-deductible contribution to support our work, subscribe to our free daily digest or paid weekly update, follow us on Twitter, like us on Facebook, or add us to your RSS reader. If you find a retraction that’s not in The Retraction Watch Database, you can let us know here. For comments or feedback, email us at team@retractionwatch.com.

How thousands of invisible citations sneak into papers and make for fake metrics

In 2022, Guillaume Cabanac noticed something unusual: a study had attracted more than 100 citations within two months of being published. 

Cabanac, a computer scientist at the University of Toulouse in France, initially flagged the study on PubPeer after it was highlighted by the Problematic Paper Screener, which automatically identifies research papers with certain issues. 

The screener flagged this particular paper — which has since been retracted — for containing so-called tortured phrases, strange twists on established terms that were probably introduced by translation software or humans looking to circumvent plagiarism checkers. 

But Cabanac noticed something weird: The study had been cited 107 times according to the ‘Altmetrics donut,’ an indicator of an article’s potential impact, yet it had been downloaded just 62 times. 

What’s more, according to Google Scholar, this paper had been cited only once. “There was a clear discrepancy between the counts on Google Scholar and the counts on Altmetrics/Dimensions,” Cabanac says. That gap is especially significant since “we know that usually Google Scholar overestimates the number of citations,” he adds. 

After a little probing, Cabanac and his sleuthing colleagues figured out where the extra citations were coming from: the metadata files submitted to Crossref, a repository for unique identifiers for scholarly metadata, as the group reports in a preprint posted to the arXiv server October 4. Google Scholar doesn’t use metadata files submitted to Crossref; instead it text-mines PDF versions of studies, Cabanac says.

“We believe we found an undocumented way of cheating with citation counts,” Cabanac tells Retraction Watch. “It’s original because it doesn’t require fraudsters to alter the version of record, meaning the PDF or HTML version of the paper.”

The metadata files of the papers in question seem to contain more references than are in the HTML or PDF versions, Cabanac says. According to Cabanac, the references are sneaked at some point into the metadata files that are submitted to Crossref and automatically ingested. Since metadata files can be resubmitted as many times as one likes, updated metadata files can also be submitted anytime after an article is published. 

These extra undue citations ultimately inflate the Altmetrics score represented by the donut, which depicts how often an article is being cited and mentioned on social media. That’s problematic because these inflated citation scores are ultimately reflected on bibliographic platforms like Dimensions. Citation counts are frequently used as a way to judge researchers and apportion funding, so boosting such indicators could falsely amplify a researcher’s perceived impact.

According to the study, the introduced references seem to be coming largely from journals published by Technoscience Academy, an open access publisher run out of Gujarat, India, and a Crossref member. Technoscience Academy did not reply to a request for comment. 

It isn’t clear who is manipulating the metadata files or whether the issue is due to a technical glitch. But Cabanac says the phenomenon is a result of a lack of gatekeeping. One way of addressing the issue would be to build tools and systems that regularly compare the references within the PDF, HTML and metadata files of the whole scholarly literature, he adds. 
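
A minimal sketch of the kind of comparison Cabanac describes, assuming the references of a paper have already been extracted as lists of DOIs from the version of record (PDF or HTML) and from the deposited metadata. The function name, the DOI-only matching and the example DOIs are assumptions for illustration, not the method used in the preprint.

```python
def compare_references(version_of_record, metadata):
    """Compare DOIs cited in the published paper with DOIs in the deposited metadata.

    Both arguments are iterables of DOI strings. DOIs are compared
    case-insensitively after trimming surrounding whitespace.
    """
    published = {doi.strip().lower() for doi in version_of_record}
    deposited = {doi.strip().lower() for doi in metadata}
    return {
        # References that appear only in the metadata: candidates for "sneaked" references.
        "only_in_metadata": sorted(deposited - published),
        # References that appear only in the PDF/HTML version.
        "only_in_version_of_record": sorted(published - deposited),
    }


# Made-up DOIs, used only to show the shape of the result.
result = compare_references(
    ["10.1234/A", "10.1234/b"],
    ["10.1234/a", "10.1234/c"],
)
```

Here `result["only_in_metadata"]` holds the one DOI that the metadata lists but the paper itself does not cite. A real tool would also need to match references that lack DOIs, which this sketch does not attempt.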

Cabanac says if it becomes clear a publisher’s output includes cooked references, its Crossref membership should be scrutinized. Being the signatory on the agreement with Crossref, “the publisher is responsible for their actions,” Cabanac says. “They can run an audit in their own premises to see who the malevolent person is.”

“It looks really dodgy,” says Ginny Hendricks, director of member and community outreach at Crossref, who notes the case is the first time her organization has heard of sneaked references. “It definitely seems like a side effect of the community’s obsession with citation as a metric [and] a measure of impact or importance, which is unfortunate.”

She adds that Crossref will look into the issue, noting that the organization rarely revokes membership for cause. The only member it has excluded for cause in the past is Omics International, Hendricks says: “They were causing harm to the whole community.”

Hendricks says Crossref has so far not considered introducing extensive gatekeeping but she encourages third parties to use Crossref’s open data to develop systems to do just that. “We’re not the people that decide scientific legitimacy,” she says.

The study analyzed the content of three journals published by Technoscience Academy, each of which has minted more than 1,000 digital object identifiers at Crossref. It found that around 9% of references included in metadata files of the papers published by these three journals (5,978 references out of a total of 65,836) benefitted just two researchers who had co-authored the studies being cited. 
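
The "around 9%" figure can be checked directly from the two counts reported by the study. The helper name below is an illustrative assumption.

```python
def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100


# Counts reported by the study: references benefitting two researchers,
# out of all references in the metadata of the three journals.
sneaked_share = share_percent(5_978, 65_836)
print(f"{sneaked_share:.1f}%")  # prints 9.1%
```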

One of the researchers in question is J. Nageswara Rao of the Vignan’s Institute of Information Technology in Visakhapatnam, India, who benefitted from 3,103 extra citations, the study found. 

Retraction Watch contacted Rao for a comment but has yet to hear back. The retraction notice for the paper Cabanac found reads: 

This article has been retracted by Hindawi following an investigation undertaken by the publisher [1]. This investigation has uncovered evidence of one or more of the following indicators of systematic manipulation of the publication process: 

(1) Discrepancies in scope 

(2) Discrepancies in the description of the research reported 

(3) Discrepancies between the availability of data and the research described 

(4) Inappropriate citations 

(5) Incoherent, meaningless and/or irrelevant content included in the article 

(6) Peer-review manipulation

The second author that the study namechecks as benefiting from the sneaked references is Bhavesh Kataria of the LDRP Institute of Technology and Research in Gandhinagar, India, who has benefitted from 1,564 extra citations, according to the study. Retraction Watch could not find contact details for Kataria. 

Three journals also profited from the sneaked citations, the study found. The International Journal of Scientific Research in Science, Engineering and Technology gained an extra 826 citations followed by the International Journal of Advanced Science and Technology and the Turkish Journal of Physiotherapy and Rehabilitation with 537 and 428, respectively. 

In addition to sneaked references, the study also reports on instances of ‘lost references,’ which are references that are in the HTML/PDF but not in Crossref metadata files. “Users of the Crossref metadata (e.g., Dimensions) disregard some references because they are not in their database or because they failed to properly textmine the text of the references provided in the metadata,” Cabanac says. The study found that to be the case for 56% (36,939 out of 65,836) of references in HTML versions of papers. 

Editor’s note: Last month, Crossref acquired the Retraction Watch database. The deal does not involve the Retraction Watch blog, which remains independent.


Author demands a refund after his paper is retracted for plagiarism

The author of a 2021 paper in a computer science journal has lost the article because he purportedly stole the text from the thesis of a student in Pakistan – a charge he denies. According to the editors of Computational Intelligence and Neuroscience, a Hindawi title, Marwan Ali Albahar, of Umm Al Qura University College …

‘Beggers’ can’t be choosers as another meta-analysis is retracted

A group of researchers in China may be asking for a refund after, they claim, they got bad advice from a course in writing meta-analyses that led to a retraction for plagiarism and other problems. They may not be alone. We’re aware of at least nine articles with similar issues that have been retracted so …

Journal expresses a great deal of concern over deceased author’s work

A gastroenterology journal has issued an extensive expression of concern about a 2013 paper by Yoshihiro Sato, a Japanese endocrinologist who has posthumously been climbing the Retraction Watch leaderboard. (He’s now ranked number three, ahead of Diederik Stapel.) To call the statement an “expression of concern” is like calling Charles M. Schulz a talented cartoonist, …

No delight for Turkish surgeon in authorship dispute over case study

A surgeon in Turkey has won a court case in which he argued that he deserved to be named in a list of authors from his institution who’d published a paper. But even that doesn’t appear to have satisfied the aggrieved medic, as you’ll see. The article, “Late onset traumatic diaphragmatic herniation leading to intestinal …

Chemistry researcher who studies oil wells is up to seven retractions

A chemistry researcher in India is up to seven retractions and one correction for problematic images and other issues. The researcher, Mahendra Yadav, was the first author on an article titled “Corrosion inhibition of tubing steel during acidization of oil and gas wells,” which appeared in 2013 in the Journal of Petroleum Engineering (JPE). Yadav, …

Caught stealing a manuscript, author blames a dead colleague

As William Faulkner wrote in Requiem for a Nun, “The past is never dead. It’s not even past.” Farzad Kiani learned that lesson the hard way. Kiani, of Istanbul Sabahattin Zaim University, was the “author” of a 2018 review article in Wireless Communications and Mobile Computing titled “A survey on management frameworks and open challenges …

Publisher retracts two papers, will correct five more for lab with high “level of disorganization”

A lab at the University of Malaya has lost two papers and will have to correct five more, just from one publisher, over poor lab practices. One of the retracted papers tested the effects of a plant on liver damage; its notice says the paper contains overlap with another paper from the …

Gluten-free turkeys? Paper on dangers of wheat-based diet in birds retracted

The journal Scientifica has retracted a 2016 paper on gut disease in turkeys for a rafter of sins including plagiarism and authors plucked out of thin air. The article, “Role of wheat based diet on the pathology of necrotic enteritis in turkeys,” was purportedly written by a team from Pakistan and France. But it turns …