Andrew Ng X-Rays the AI Hype

AI pioneer says machine learning may work on test sets, but that’s a long way from real-world use

03 May 2021

2017 photo of computer scientist Andrew Ng at his office in Palo Alto, Calif.

Photo: Eric Risberg/AP

“Those of us in machine learning are really good at doing well on a test set,” says machine learning pioneer Andrew Ng, “but unfortunately deploying a system takes more than doing well on a test set.”

Speaking via Zoom in a Q&A session hosted by DeepLearning.AI and Stanford HAI, Ng was responding to a question about why machine learning models that are trained to make medical decisions, and that perform at nearly the same level as human experts, are not in clinical use. Ng brought up the case in which Stanford researchers were able to quickly develop an algorithm to diagnose pneumonia from chest x-rays, one that, when tested, did better than human radiologists. (Ng, who co-founded Google Brain and Coursera, is currently a professor at Stanford University.)

Turning a research paper into something useful in a clinical setting poses its own challenges, he indicated.

“It turns out,” Ng said, “that when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions.”

But, he said, “It turns out [that when] you take that same model, that same AI system, to an older hospital down the street, with an older machine, and the technician uses a slightly different imaging protocol, that data drifts to cause the performance of [the] AI system to degrade significantly. In contrast, any human radiologist can walk down the street to the older hospital and do just fine.

“So even though at a moment in time, on a specific data set, we can show this works, the clinical reality is that these models still need a lot of work to reach production.”
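
The effect Ng describes can be illustrated with a toy example. The sketch below is not the Stanford model; it is a minimal illustration that assumes each scan is reduced to a single made-up numeric feature, and that the second hospital's older machine adds a constant offset to that feature. The names `make_scans`, `fit_threshold`, and `offset` are hypothetical. A threshold fitted on data from one site does well on fresh data from the same site and noticeably worse on the shifted data.

```python
import random

def make_scans(rng, n, offset=0.0):
    """Return (feature, label) pairs; label 1 means the condition is present."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumption: healthy scans centre on 0.0 and abnormal scans on 2.0,
        # and a different machine or protocol shifts every reading by `offset`.
        feature = rng.gauss(2.0 * label + offset, 1.0)
        data.append((feature, label))
    return data

def fit_threshold(train):
    """Place the decision threshold midway between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(threshold, data):
    """Fraction of scans where 'feature above threshold' matches the label."""
    correct = sum((x > threshold) == (y == 1) for x, y in data)
    return correct / len(data)

rng = random.Random(0)
threshold = fit_threshold(make_scans(rng, 2000))           # train at the first site
acc_same_site = accuracy(threshold, make_scans(rng, 2000))  # test at the same site
acc_other_site = accuracy(threshold, make_scans(rng, 2000, offset=1.5))  # shifted site

print(f"same-site accuracy:  {acc_same_site:.2f}")
print(f"other-site accuracy: {acc_other_site:.2f}")
```

Nothing about the underlying condition changes between the two test sets in this sketch; only the measurement does, which is enough to push many healthy scans over the learned threshold.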

This gap between research and practice is not unique to medicine, Ng pointed out, but exists throughout the machine learning world.

“All of AI, not just healthcare, has a proof-of-concept-to-production gap,” he said. “The full cycle of a machine learning project is not just modeling. It is finding the right data, deploying it, monitoring it, feeding data back [into the model], showing safety, doing all the things that need to be done [for a model] to be deployed. [That goes] beyond doing well on the test set, which fortunately or unfortunately is what we in machine learning are great at.”
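
The monitoring step in that cycle can be sketched in a few lines. This is one simple check among many, not a method Ng described: it assumes a single numeric input feature is logged, keeps a reference sample from the data the model was validated on, and flags an incoming batch whose mean sits too many standard errors away from the reference mean. The function names and the `z_limit` value are hypothetical, and the check ignores the (small) uncertainty in the reference mean itself.

```python
import math
import random
import statistics

def drift_z_score(reference, batch):
    """z-score of the batch mean relative to the reference mean."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    # Standard error of a mean computed over a batch of this size.
    standard_error = ref_std / math.sqrt(len(batch))
    return (statistics.fmean(batch) - ref_mean) / standard_error

def has_drifted(reference, batch, z_limit=4.0):
    """Flag the batch when its mean is more than z_limit standard errors away."""
    return abs(drift_z_score(reference, batch)) > z_limit

rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]        # validation-time inputs
same_site_batch = [rng.gauss(0.0, 1.0) for _ in range(200)]   # same distribution
other_site_batch = [rng.gauss(1.5, 1.0) for _ in range(200)]  # shifted inputs

print("same-site batch drifted: ", has_drifted(reference, same_site_batch))
print("other-site batch drifted:", has_drifted(reference, other_site_batch))
```

A check like this needs no labels, so it can run as soon as data arrives from a new site, before anyone knows whether the model's predictions there are right.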
