Andrew Ng X-Rays the AI Hype

AI pioneer says machine learning may work on test sets, but that’s a long way from real world use

2017 photo of computer scientist Andrew Ng at his office in Palo Alto, Calif.
Photo: Eric Risberg/AP

“Those of us in machine learning are really good at doing well on a test set,” says machine learning pioneer Andrew Ng, “but unfortunately deploying a system takes more than doing well on a test set.”

Speaking via Zoom in a Q&A session hosted by DeepLearning.AI and Stanford HAI, Ng was responding to a question about why machine learning models that are trained to make medical decisions, and that perform at nearly the same level as human experts, are not in clinical use. Ng brought up a case in which Stanford researchers were able to quickly develop an algorithm to diagnose pneumonia from chest X-rays, one that, when tested, did better than human radiologists. (Ng, who co-founded Google Brain and Coursera, is currently a professor at Stanford University.)

Turning a research paper into something useful in a clinical setting poses real challenges, he indicated.

“It turns out,” Ng said, “that when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions.”

But, he said, “It turns out [that when] you take that same model, that same AI system, to an older hospital down the street, with an older machine, and the technician uses a slightly different imaging protocol, that data drifts to cause the performance of [the] AI system to degrade significantly. In contrast, any human radiologist can walk down the street to the older hospital and do just fine.

“So even though at a moment in time, on a specific data set, we can show this works, the clinical reality is that these models still need a lot of work to reach production.”

This gap between research and practice is not unique to medicine, Ng pointed out, but exists throughout the machine learning world.

“All of AI, not just healthcare, has a proof-of-concept-to-production gap,” he says. “The full cycle of a machine learning project is not just modeling. It is finding the right data, deploying it, monitoring it, feeding data back [into the model], showing safety, doing all the things that need to be done [for a model] to be deployed. [That goes] beyond doing well on the test set, which fortunately or unfortunately is what we in machine learning are great at.”
