Andrew Ng X-Rays the AI Hype

AI pioneer says machine learning may work on test sets, but that’s a long way from real-world use

03 May 2021

2017 photo of computer scientist Andrew Ng at his office in Palo Alto, Calif.

Photo: Eric Risberg/AP

“Those of us in machine learning are really good at doing well on a test set,” says machine learning pioneer Andrew Ng, “but unfortunately deploying a system takes more than doing well on a test set.”

Speaking via Zoom in a Q&A session hosted by DeepLearning.AI and Stanford HAI, Ng was responding to a question about why machine learning models that are trained to make medical decisions, and that perform at nearly the same level as human experts, are not in clinical use. Ng brought up the case in which Stanford researchers were able to quickly develop an algorithm to diagnose pneumonia from chest x-rays, one that, when tested, did better than human radiologists. (Ng, who co-founded Google Brain and Coursera, is currently a professor at Stanford University.)

There are challenges in making a research paper into something useful in a clinical setting, he indicated.

“It turns out,” Ng said, “that when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions.”

But, he said, “It turns out [that when] you take that same model, that same AI system, to an older hospital down the street, with an older machine, and the technician uses a slightly different imaging protocol, that data drifts to cause the performance of [the] AI system to degrade significantly. In contrast, any human radiologist can walk down the street to the older hospital and do just fine.

“So even though at a moment in time, on a specific data set, we can show this works, the clinical reality is that these models still need a lot of work to reach production.”
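The effect Ng describes can be reproduced in miniature. The sketch below is purely illustrative and is not the Stanford researchers' method: it assumes each scan boils down to a single numeric reading, fits a simple threshold on synthetic data from one hypothetical hospital, then scores that same threshold on a second hospital whose scanner adds a constant offset to every reading. Every name and number in it is an assumption made for the example.

```python
import random


def sample_scans(n, offset, rng):
    """Return (reading, label) pairs. `offset` is a made-up stand-in for
    scanner- or protocol-specific shift in the measurements."""
    scans = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = condition present (synthetic)
        reading = rng.gauss(2.0 * label + offset, 1.0)
        scans.append((reading, label))
    return scans


def fit_threshold(scans):
    """Place the decision threshold midway between the two class means."""
    pos = [x for x, y in scans if y == 1]
    neg = [x for x, y in scans if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(scans, threshold):
    """Fraction of scans where 'reading above threshold' matches the label."""
    correct = sum((x > threshold) == (y == 1) for x, y in scans)
    return correct / len(scans)


rng = random.Random(0)  # fixed seed so the run is repeatable

train_a = sample_scans(2000, 0.0, rng)  # hospital A, training split
test_a = sample_scans(2000, 0.0, rng)   # hospital A, held-out test set
test_b = sample_scans(2000, 1.5, rng)   # hospital B, shifted readings

threshold = fit_threshold(train_a)
acc_same = accuracy(test_a, threshold)
acc_shifted = accuracy(test_b, threshold)

print(f"same-hospital test accuracy:  {acc_same:.2f}")
print(f"other-hospital test accuracy: {acc_shifted:.2f}")
```

The held-out score at hospital A looks respectable, while the identical threshold does noticeably worse at hospital B, even though nothing about the underlying condition changed. Real imaging models and real shifts are far more complex, but the pattern is the one Ng points to: a test set drawn from the training distribution says little about performance elsewhere.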

This gap between research and practice is not unique to medicine, Ng pointed out, but exists throughout the machine learning world.

“All of AI, not just healthcare, has a proof-of-concept-to-production gap,” he says. “The full cycle of a machine learning project is not just modeling. It is finding the right data, deploying it, monitoring it, feeding data back [into the model], showing safety, doing all the things that need to be done [for a model] to be deployed. [That goes] beyond doing well on the test set, which fortunately or unfortunately is what we in machine learning are great at.”
