Doctors Still Struggle to Make the Most of Computer-Aided Diagnosis

Language barriers and human interfaces slow adoption of diagnostic-aid tech

Photo: iStockphoto

As a timer counted down, a team of physicians from St. Michael’s Medical Center in Newark, N.J., conferred on a medical diagnosis question. Then another. And another. With each question, the stakes at Doctor’s Dilemma, an annual competition held in May in Washington, D.C., grew higher. By the end, the team had wrestled with 45 conditions, symptoms, or treatments. They defeated 50 teams to win the 2016 Osler Cup.

The stakes are even higher for real-life diagnoses, where doctors always face time pressure. That is why researchers have tried since the 1960s to supplement doctors’ memory and decision-making skills with computer-based diagnostic aids. In 2012, for example, IBM pitted a version of its Jeopardy!-winning artificial intelligence, Watson, against questions from Doctor’s Dilemma. But Big Blue’s brainiac couldn’t replicate the overwhelming success it had against human Jeopardy! players.

The trouble is, computerized diagnosis aids do not yet measure up to the performance of human doctors, according to several recent studies. Nor can makers of such software seem to agree on a single benchmark by which to measure performance. Using reports on such software in the peer-reviewed literature, one team of researchers found wide performance variations across different diseases, as well as different usage patterns among doctors. For example, younger doctors are likelier to spend time putting more patient data into a tool and likelier to benefit from the aid. Two presentations at the 6–8 November Diagnostic Error in Medicine Conference in Hollywood, Calif., confronted the issue of how to realistically incorporate technological aids into doctor training and hectic diagnosis routines.

Another issue is figuring out how to compare different software aids. “If you look at, for example, the big progress that has occurred in speech recognition or in image classification, it's really been brought about by having really good benchmark data sets and really like having actual competitions,” says computer scientist Ole Winther at the Technical University of Denmark in Lyngby. “We don't have the same in the medical domain.”

While IBM did publish a report in 2013 on its Watson-vs-Doctor’s Dilemma test, Winther says that he has been unable to obtain the subset of questions IBM used, so he was unable to directly compare it to a diagnostic aid he and colleagues built, called FindZebra. Last year, his team estimated that both FindZebra and Watson list the correct diagnosis among their top 10 results about 60 percent of the time, which is in line with what a Spanish team reported earlier this year.
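The "top 10" figure above is a hit rate: the share of cases in which the correct diagnosis appears anywhere in a tool's first ten suggestions. As a rough illustration only, here is a minimal Python sketch of that calculation. The function name `top_k_hit_rate` and the placeholder case data are hypothetical, and nothing here reflects how FindZebra, Watson, or the cited studies actually computed their scores.

```python
from typing import Sequence


def top_k_hit_rate(
    ranked_lists: Sequence[Sequence[str]],
    correct: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of cases whose correct diagnosis is among the first k suggestions.

    ranked_lists[i] is the tool's ranked output for case i (best guess first);
    correct[i] is the confirmed diagnosis for that case.
    """
    if len(ranked_lists) != len(correct):
        raise ValueError("need exactly one ranked list per case")
    if not correct:
        raise ValueError("need at least one case")
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, correct) if truth in ranked[:k]
    )
    return hits / len(correct)


# Hypothetical placeholder data: five cases with made-up diagnosis labels.
ranked = [
    ["dx_a", "dx_b", "dx_c"],
    ["dx_d", "dx_e"],
    ["dx_f"],
    ["dx_g", "dx_h"],
    ["dx_i", "dx_j", "dx_k"],
]
truth = ["dx_b", "dx_x", "dx_f", "dx_h", "dx_z"]

print(top_k_hit_rate(ranked, truth, k=10))  # 0.6 (three of five cases are hits)
print(top_k_hit_rate(ranked, truth, k=1))   # 0.2 (only one first guess is correct)
```

A score like this is only comparable between tools when both are run on the same set of cases, which is why Winther's lack of access to IBM's question subset matters.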

Despite the lack of a unified benchmark for computer-aided diagnostics, individual doctors, family members of misdiagnosed patients, and academic and clinical groups have built and are marketing such aids. Clients include private health insurance companies and research hospitals around the world, among them a pair of medical facilities in North Carolina and Japan that have reported some success diagnosing patients with Watson. Yet, at a recent IBM Research event in Zurich, one of IBM’s clients, Jens-Peter Neumann of the Rhön-Klinikum hospital network in Germany, said that it is too early to estimate the potential cost savings of his team’s Watson collaboration.

In February 2016, the Rhön-Klinikum network began pilot-testing Watson against the ultimate challenge for any diagnostic aid: rare diseases. The 7,000 or so known rare diseases affect perhaps 7 percent of Europe’s population, according to Munich Re, an insurance and risk management firm. As genomic screening grows more sophisticated, the firm predicts the discovery of more than 1,000 additional diseases by 2020. “Memorizing them all is just not going to happen,” says computer scientist and physician Tobias Mueller of the University Clinic Marburg in Germany, who is involved in the Rhön-Klinikum pilot.

Instead, the team is converting the natural-language medical histories of the 522 patients in the pilot into a structured format that Watson can use, a time-consuming process that combines human and computer efforts. Watson can then compare these structured histories to the medical literature and suggest ranked diagnoses.

One issue, Mueller says, has been consistently processing medical literature in both German and English. So far, the team has opted to use a combination of medical taxonomies, such as MedDRA and ICD-10, to describe symptoms and diagnoses. He also notes that sometimes the knowledge sources fed into Watson contradict each other. In other words, computerized diagnosis aids are struggling with some of the same problems humans do when sharing and comparing information. “However, this reflects the diversity of the knowledge base of Watson and is no different than having a room full of doctors with different backgrounds and different opinions. It's more a strength, than a weakness,” Mueller says.

Despite the struggles, Winther says computer-aided diagnosis will ultimately mature: “A lot of patients spend years and years juggling between [general practitioners] and the wrong specialists. That’s still a challenge where there’s room for these kinds of tools.”

This post was updated on 15 November 2016 to clarify the timing and aims of the Rhön-Klinikum pilot study.
