Taking a stand on AI Accountability: ChatGPT co-authorship deemed inappropriate

By Olivia Haslam

- Last updated on GMT

GettyImages - AI / Laurence Dutton
GettyImages - AI / Laurence Dutton

Related tags AI Artificial intelligence Machine learning Technology ChatGPT Research

ChatGPT has been rejected as a co-author in scientific papers by The American College of Rheumatology (ACR) journal editors due to concerns over lack of accountability.

The journal editors and Committee on Journal Publications have stated in a recent editorial, that it is a necessary decision that the large language model (LLM) ChatGPT​ cannot be granted co-authorship, based on the International Committee of Medical Journal Editors (ICMJE) authorship criteria​.


The generative pre-trained transformer (GPT) is the next generation of AI-powered chatbots that is capable of constructing full sentences on topics and synthesising information from multiple sources with great nuance. 

ChatGPT has been put to the test in previous clinical scenarios​, and the results have been notably passable.

In a recent experiment conducted by authors from ACR, chatGPT was asked to create a patient-facing educational brochure on medications for gout.

It was given criteria for creating the material, such as reading level, length of brochure, focus, and items that needed mentioning. The authors reported: “Within seconds, it had produced a relatively accurate brochure that required editing, but was a good draft.”

The successes of LLM interfaces are already being noted, evidenced by AI’s inclusion as a co-author​ on previous scientific papers. 

According to the authors of the ACR editorial, some editors have also wondered whether LLM AI tools could be used in the peer review process. 

They note that ACR journals already use AI tools to check for plagiarism and image authenticity, and utilise AI to find appropriate peer reviewers. 

However, they mention that they have not yet put AI tools to use as actual “peer reviewers,” and while they do not anticipate substituting human peer reviewers with LLM AI tools, they will consider future practice, stating: “We will monitor if such tools can be a useful adjunct.”


One main issue appears not to be the ability to produce information, but who is liable for it. 

The authors from ACR note: “At this stage, no one doubts that these tools can generate useful text that might accurately synthesise previously collected or original data.

“But, authorship raises other questions about accountability. If the methods that LLM AI tools use to generate text are not transparent (they probably will never be), then who is accountable?”

When asked about the accountability issues when it comes to using ChatGPT as a source of information, the LLM itself responds: “It is important to use ChatGPT as a source of information with caution and to complement its responses with human expertise and critical thinking.

“As an AI language model, ChatGPT does not have personal accountability in the same way that a human expert would. It is not responsible for the accuracy of its responses or the impact of its advice.”


The authors also note that a potential issue is that LLM AI tools are trained on existing literature that may be inaccurate and biased. They state: “We have concerns that unintended biases may be magnified through these tools, often in ways that are not apparent.”

ChatGPT can generate biased information in a few ways. ChatGPT's training data​ is based on a large corpus of text, which means that any biases in that text can be learned and reinforced by the model.

If the training data contains biased language or viewpoints, ChatGPT may replicate those biases in its responses.

A statement​ on OpenAI’s site reads: “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.

“Fixing this issue is challenging, as during Reinforcement Learning (RL) training, there’s currently no source of truth.”

RL involved training an initial model using supervised fine-tuning, in which human AI trainers provided conversations in which they played both sides - the user and an AI assistant. 

The trainers are given access to model-written suggestions to help them compose their responses, and then this new dialogue data set is combined with the InstructGPT dataset, which is transformed into a dialogue format.

The OpenAI statement on ChatGPTs limitations reads: “While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behaviour.”

Journal: Arthritis & Rheumatology and ACR Open Rheumatology.


“ChatGPT, et al…Artificial Intelligence, Authorship, and Medical Publishing”

Authors: Daniel H. Solomon, Kelli D. Allen, Patricia Katz, Amr H. Sawalha, Ed Yelin

Follow us


View more