Navigating health care systems as a patient can be daunting at the best of times, whether you’re interpreting jargon-filled diagnoses or determining which specialists to see next. Similarly, doctors often have grueling schedules that make it difficult to offer personalized attention to all their patients. These issues are only exacerbated in areas with limited physicians and medical infrastructure.
Bringing AI into the doctor’s office to alleviate these problems is a dream that researchers have been working toward since IBM’s Watson made its debut over a decade ago, but progress toward these goals has been slow. Now, large language models (LLMs), including ChatGPT, could reinvigorate those ambitions.
Researchers at Google DeepMind have proposed a new AI model called AMIE (Articulate Medical Intelligence Explorer) in a preprint posted 11 January on arXiv. The model could take in information from patients and provide clear explanations of medical conditions during a wellness-visit consultation.
Vivek Natarajan is an AI researcher at Google and lead author on the recent paper. He says that while AMIE isn’t designed to replace human physicians, he does believe a similar AI could play a role in assisting both physicians and patients.
“There may be scenarios when people might benefit from interacting with systems like AMIE as part or in addition to their clinical journeys,” Natarajan says. “These include understanding symptoms and conditions better, including simplifying explanations in local vernaculars…and acting as a valuable second opinion.”
Thomas Thesen is an associate professor of medical education at Dartmouth’s Geisel School of Medicine who created the AI Patient Actor app to help train medical students on diverse patient scenarios. While he thinks AI will play an increasingly large role in health care, he doesn’t believe it will replace the expertise of human physicians.
“What I see coming in the next decade is AI increasingly supporting doctors by streamlining their work and contributing to certain limited diagnostic processes,” Thesen says. “However, the expert judgment of a trained doctor will remain crucial for final diagnosis and treatment plans.”
To bring AMIE up to speed without sending it through medical school, Natarajan and colleagues started by training the AI on real-world medical texts, including the transcripts of nearly 100,000 real physician-patient dialogues, 65 clinician-written summaries of intensive care unit medical notes, and thousands of medical reasoning questions taken from the United States Medical Licensing Examination.
Yet, these data alone were not enough to set AMIE up for success, Natarajan says, particularly because the data tend to be noisy and capture only a small subset of potential medical scenarios. To fill in these gaps, the team also used a simulated diagnostic environment that allowed AMIE to learn from its own mistakes through two different “self-play” loops.
“The environment included two self-play loops—an ‘inner’ self-play loop, where AMIE leveraged in-context critic feedback to refine its behavior on simulated conversations with an AI patient simulator, and an ‘outer’ self-play loop where the set of refined simulated dialogues were incorporated into subsequent fine-tuning iterations,” Natarajan says. “The resulting new version of AMIE could then participate in the inner loop again, creating a virtuous continuous learning cycle.”
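For readers who think in code, the two nested loops Natarajan describes can be sketched roughly as follows. This is a toy illustration under loud assumptions, not AMIE's actual training code: `ToyAgent`, `critic`, `inner_loop`, `outer_loop`, and the numeric `skill` value are all hypothetical stand-ins, and a real system would use an LLM to generate dialogues and a real fine-tuning step.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for a dialogue model; `skill` is a made-up scalar."""
    skill: float = 0.0
    training_set: list = field(default_factory=list)

    def respond(self, scenario: str, feedback: str = "") -> str:
        # A real system would generate text with an LLM; here we only tag the draft.
        suffix = " (revised)" if feedback else ""
        return f"dialogue about {scenario}{suffix}"

    def fine_tune(self, dialogues: list) -> "ToyAgent":
        # Pretend fine-tuning: each refined dialogue nudges the toy skill upward.
        return ToyAgent(
            skill=self.skill + 0.1 * len(dialogues),
            training_set=self.training_set + dialogues,
        )


def critic(dialogue: str) -> str:
    """Hypothetical in-context critic: returns feedback, or '' when satisfied."""
    return "" if dialogue.endswith("(revised)") else "ask more follow-up questions"


def inner_loop(agent: ToyAgent, scenario: str, max_rounds: int = 3) -> str:
    """Inner loop: refine one simulated dialogue using critic feedback."""
    dialogue = agent.respond(scenario)
    for _ in range(max_rounds):
        feedback = critic(dialogue)
        if not feedback:
            break
        dialogue = agent.respond(scenario, feedback)
    return dialogue


def outer_loop(agent: ToyAgent, scenarios: list, iterations: int = 2) -> ToyAgent:
    """Outer loop: fold refined dialogues into fine-tuning, then repeat."""
    for _ in range(iterations):
        refined = [inner_loop(agent, s) for s in scenarios]
        agent = agent.fine_tune(refined)
    return agent


final = outer_loop(ToyAgent(), ["chest pain", "persistent cough"])
print(len(final.training_set), round(final.skill, 1))
```

The point of the sketch is only the control flow: each pass through the outer loop returns a new agent, which then re-enters the inner loop, mirroring the cycle described in the quote.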
While Natarajan stresses that there is no substitute for real human experience in medicine, this training model does give AMIE a leg up in some ways over human physicians. For example, a human physician may see only 10,000 patients in their career, but AMIE could “see” that many patients in just a couple of training cycles.
“This, in turn, potentially provides a pathway for medical AI towards superhuman diagnostic performance,” Natarajan says.
To see how well AMIE stacked up against human physicians, Natarajan and colleagues compared it with 20 human primary-care providers in a blinded, randomized study in which both consulted with patient actors located in Canada, India, and the United Kingdom. The 149 consultations were conducted via live text chat and evaluated by both the patient actors and human specialists.
The consults were assessed on several factors, including perceived empathy, openness and honesty, diagnostic accuracy, and management planning. Both the patient actors and specialists determined that AMIE provided “greater diagnostic accuracy and superior performance” compared to its human counterparts. However, these results are not necessarily as black and white as they sound.
For one thing, these consults were completed using the type of live, text-based chats that are typically used to communicate with LLMs. This format is very different from the face-to-face interaction that human physicians are used to, potentially giving AMIE an advantage. The team also found that AMIE tended to write significantly longer responses than human physicians, which they believe patients could interpret as a sign of more time and effort, and thus of greater thoughtfulness and empathy.
Going forward, Natarajan says he and colleagues are interested in expanding AMIE’s capabilities to include multimodal sources, such as video chats. The team will also look at problems of equity, fairness, and adversarial testing to better prepare AMIE for the real world.
As for the human physicians anticipating AMIE’s arrival, Thesen says it’s important that they proactively prepare for how this technology could change medicine.
“Medical schools have a responsibility to incorporate AI literacy into their curriculum,” Thesen says. “This includes understanding the ethical implications to ensure that as AI becomes more integrated into clinical practice, future doctors can use it responsibly and protect their patients’ well-being.”