Could AI Disrupt Peer Review?

Publishers’ policies lag technological advances

[Illustration: a stack of glowing green papers, with a single red paper being pulled out. Getty Images]

Spending time poring over manuscripts to offer thoughtful and incisive critique as a peer reviewer is one of academia’s most thankless jobs. Peer review is often the final line of defense between new research and the general public and is aimed at ensuring the accuracy, novelty, and significance of new findings.

This crucial role is voluntary, unpaid, and often underappreciated by academic publishers and institutions. As with other tedious jobs, that raises a question: Can publishers trust AI to handle peer review instead, and more important, should they? A number of researchers say no, and they are growing concerned that AI could threaten the integrity of the review process by reinforcing bias and introducing misinformation.

Vasiliki Mollaki, a bioethicist and geneticist at the International Hellenic University in Greece, addressed this issue in the journal Research Ethics on 9 January in an article pointedly titled “Death of a Reviewer or Death of Peer Review Integrity?”

In her paper, Mollaki reviewed the AI policies of top academic publishers—including Elsevier and Wiley—to determine whether they were preparing to address the potential use of AI in peer review. While several journals have developed policies around AI used by authors to write manuscripts, such policies for peer review were almost nonexistent.

“If [AI] is mentioned, it’s on the basis that there might be confidential data or even personal data that should not be shared with tools [because] they don’t know how this data can be used,” Mollaki says. “The basis is not on ethical grounds.”

Without concrete policies that lay out guidance on transparency, or penalties for using AI in peer review, Mollaki worries that the integrity of the peer review process, and the good-faith trust it depends on, could collapse. Never mind that whether AI is even capable yet of providing effective peer review is itself up for debate.


James Zou, an assistant professor of biomedical data science at Stanford University, is the senior author of a preprint paper posted on arXiv in late 2023 that evaluated how AI feedback on research papers compares with that of human reviewers. The work found that the points raised by AI reviewers overlapped with those of human reviewers at a rate comparable to the overlap between two human reviewers, and that more than 80 percent of researchers found the AI’s feedback more helpful than that of human reviewers.

“This is especially helpful for authors working on early drafts of manuscripts,” Zou says. “Instead of waiting for weeks to get feedback from mentors or experts, they can get immediate feedback from the LLM.”

Yet work published that same year in The Lancet Infectious Diseases by Tjibbe Donker, an infectious disease epidemiologist at Freiburg University Hospital, in Germany, found that AI struggled to generate personalized feedback and even created false citations to support its reviews.

“Current AI tools are very bad at suggesting specific authors, journals, or papers, and often start hallucinating because their training data is not aimed at forming these connections,” Donker says.

Despite his reservations, Donker is not necessarily in favor of barring all AI tools from peer review. Instead, he says, using these tools selectively to assist human reviewers could be beneficial, such as by summarizing a paper’s main points so that reviewers can assess its novelty independent of the author’s writing style. AI could also play a role in consolidating human reviewers’ letters into a single decision letter for authors.

To ensure that reviewers use AI tools in a minimally invasive way, Mollaki says it will be important for journals to write AI review policies that go beyond issues of privacy and focus on disclosure and transparency.

“[Journals] should be as clear as possible about what is not permitted,” Mollaki says. “[How] the tools have been used should be disclosed and even the prompts that were used.”

For reviewers who break these policies, Mollaki favors a penalty that excludes them from future participation in peer review. Donker, however, says those repercussions may need to be a little more nuanced. Reacting too strongly to the use of AI in peer review could ironically have the same impact as letting it run wild.

“Peer reviewing is done voluntarily, unpaid, without much of a reward for the reviewer,” Donker says. “Most scientists would be quite happy to be excluded from this process, while journals end up with even fewer reviewers to choose from.”
