AlphaGo became the first household AI name by teaching itself to play the ancient Chinese game Go and then beating the world’s best human player. Self-driving cars use AI systems to learn to park or merge into traffic by practicing the maneuvers over and over until they get it right.
It’s clear that AI programs are good at training themselves to win, maximize, or perfect. But what if success means striking a balance?
In cancer treatments, doctors endeavor to dose patients with enough drugs to kill as many tumor cells as possible but as few patient cells as possible. In other words, they balance shrinking a tumor with minimizing side effects.
“We said, ‘Wait. This sounds like a machine-learning search problem and optimization issue,’ ” says Pratik Shah, an MIT Media Lab principal investigator. “We thought we could do something to understand the process better.”
Today at the 2018 Machine Learning for Healthcare conference at Stanford University, Shah and researcher Gregory Yauney will present a self-learning artificial intelligence model that undertakes that balancing act. Trained with real patient data, their AI model predicts individual dosing regimens that shrink tumors while minimizing side effects for a deadly form of brain cancer.
Currently, doctors typically choose how to dose a patient using protocols based on animal studies and past clinical trials, some dating from the 1950s or earlier. Shah figured there was room for improvement. He and Yauney decided to focus on glioblastoma, a type of brain cancer, because it is treated aggressively with large doses of chemotherapy and radiation therapy to shrink the tumors as quickly as possible. That also means the treatment can make patients very, very sick.
“Our goal was to reduce toxicity and dosing for patients who unfortunately have this disease,” says Shah.
It wasn’t an easy goal to achieve. First, they approached it as a classification problem that needed a supervised neural network. That didn’t work. “The search data is too large, and even humans don’t know the right answer,” says Shah. Next, they thought maybe it was an automation problem: The system just needed to be told what to do. It wasn’t. That system simply threw drugs at the problem, suggesting more doses than seemed appropriate.
Ultimately, they turned to reinforcement learning (RL), a technique in which an AI agent learns through trial and error to favor certain behaviors in order to maximize a reward. That's how Google DeepMind's programs win at so many games, for example.
But in this case, the researchers tweaked the RL model, creating what they call an “unorthodox approach.” Instead of pointing the AI agent toward a single goal, like winning a game of Go or parking a car, they used action-derived rewards to program it to strike a balance between shrinking a tumor as much as possible and reducing the size and number of doses.
The model was trained with clinical trial data from 50 glioblastoma patients. The AI conducted about 20,000 trial-and-error test runs on simulated versions of those patients. Whenever the model initiated a dose, it was rewarded if that dose shrank the overall tumor size. But if the model chose to administer the full set of doses, it was penalized, encouraging the model to choose fewer, smaller doses when possible.
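The idea of rewarding tumor shrinkage while penalizing dosing can be sketched in a few lines of code. The toy simulator, dose levels, penalty weight, and tabular Q-learning below are all illustrative assumptions, not the researchers' actual model, which was trained on clinical-trial-derived patient simulations:

```python
import random

random.seed(0)

# Hypothetical toy dynamics: a dose shrinks the tumor, with slight
# regrowth each step. All constants here are invented for illustration.
DOSES = [0.0, 0.5, 1.0]   # fraction of a full chemotherapy dose
PENALTY = 0.2             # weight on dosing burden (the "side effect" cost)

def step(tumor, dose):
    """Apply one dosing decision; reward = shrinkage minus dose penalty."""
    new_tumor = max(0.0, tumor * (1.0 - 0.5 * dose) + 0.02)
    reward = (tumor - new_tumor) - PENALTY * dose
    return new_tumor, reward

def bucket(tumor):
    """Discretize normalized tumor size into 11 states for a Q-table."""
    return min(int(tumor * 10), 10)

Q = {(s, a): 0.0 for s in range(11) for a in range(len(DOSES))}
alpha, gamma, eps = 0.1, 0.9, 0.2

for episode in range(20000):      # mirrors the ~20,000 simulated runs
    tumor = 1.0                   # normalized initial tumor size
    for t in range(10):           # ten dosing decisions per simulated patient
        s = bucket(tumor)
        if random.random() < eps:                       # explore
            a = random.randrange(len(DOSES))
        else:                                           # exploit
            a = max(range(len(DOSES)), key=lambda x: Q[(s, x)])
        tumor, r = step(tumor, DOSES[a])
        best_next = max(Q[(bucket(tumor), x)] for x in range(len(DOSES)))
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Greedy policy: dose index chosen at each discretized tumor size.
policy = {s: max(range(len(DOSES)), key=lambda x: Q[(s, x)])
          for s in range(11)}
```

Because the penalty is subtracted from every dosing action, the learned policy tends to dose aggressively only when the tumor is large and to withhold the drug once it is small, which is the balancing behavior the reward structure is meant to encourage.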
The researchers then tested the model on 50 new simulated patients. Compared to treatment regimens currently used by doctors, the regimens designed by the self-taught AI achieved significant tumor reduction while lowering the frequency and size of drug doses.
The technique has yet to be tested in real patients. The researchers are currently in discussions with regulatory agencies and academic hospitals about selecting clinical trial sites to try out the AI. Shah imagines doctors and patients might use it as a recommendation for possible dosing options, though final decisions for the foreseeable future would remain in the doctor’s hands.