Machine Learning Tool Can Spot Mutations in Tumors

New method of identifying unique genetic changes in tumors could lead to more precise cancer treatments


Cancerous tumors tend to have a life of their own. They grow and evolve constantly, and so does their DNA. Exactly how tumor DNA changes is important information because it influences doctors’ treatment decisions.

Technologies already exist to perform this kind of complicated analysis. The DNA of a tumor sample is sequenced, and then a combination of computational tools and human experts analyzes the data to determine what kinds of genetic changes, or mutations, are occurring.

But none of these existing tools is completely accurate, according to a report published today in Science Translational Medicine. In an effort to remedy that, the authors of the report say they have developed a new method involving machine learning that automates the tumor DNA diagnostic process.

“It’s underappreciated how difficult it is to identify the true mutations in clinical tumor specimens,” says Samuel Angiuoli, coauthor of the report and chief information officer at Personal Genome Diagnostics in Baltimore. “Our machine learning approach improves the accuracy of that identification,” compared with existing techniques, he says. 

With that information—the type, number, and location of mutations in a tumor—doctors can choose a therapy that is specific to the type of tumor. Some of those therapies already exist on the market. One drug, called vemurafenib, specifically treats skin cancer cells that have a mutation in a gene called BRAF. And many other mutation-specific therapies are in development.

Of course, these therapies are more likely to work if the mutations in the tumors can be correctly identified. That’s not as straightforward as it sounds. The sheer size of sequencing data makes it easy to miss small genetic changes. Plus, there’s a significant amount of noise in that data. Lab prep methods and the sequencing machines themselves can introduce artifacts that look like genetic alterations. And there are decoy DNA mutations that can be present in a cell but are not important for tumor identification. These false positives are tricky to filter out.

Computerized tools help, but teams of human reviewers are often needed to ensure the results are of high quality. That puts these cancer diagnostic tools in centralized labs, and far away from patients. “Our goal is to develop a kit, including software, that can run anywhere in the world without the need for expert review,” Angiuoli says.

His company’s new tool, dubbed Cerebro, automates the job using an ensemble of algorithms called random forest classifiers. This traditional machine-learning technique works by evaluating a large set of decision trees to generate a confidence score for each candidate mutation—a way to judge whether a variant in the tumor DNA is a true positive.   
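To make the idea of a forest-derived confidence score concrete, here is a minimal sketch in plain Python. It is not Cerebro's implementation: the three features (variant allele fraction, mean base quality, strand-bias score), the synthetic data, and every function name are assumptions made up for illustration, and each "tree" is reduced to a one-split decision stump. The confidence score is simply the fraction of trees that vote "true mutation."

```python
import random

# Hypothetical features per candidate mutation (illustrative only, not Cerebro's):
# [variant allele fraction, mean base quality, strand-bias score]

def make_candidates(n, rng):
    """Generate synthetic (features, label) pairs; label 1 means a true mutation."""
    data = []
    for _ in range(n):
        is_true = rng.random() < 0.5
        if is_true:
            x = [rng.gauss(0.30, 0.08), rng.gauss(35, 3), rng.gauss(0.10, 0.05)]
        else:
            x = [rng.gauss(0.05, 0.03), rng.gauss(25, 4), rng.gauss(0.50, 0.15)]
        data.append((x, int(is_true)))
    return data

def fit_stump(sample, feature):
    """Find the threshold and direction on one feature with the fewest training errors."""
    best = None
    for threshold in sorted({x[feature] for x, _ in sample}):
        for sign in (1, -1):
            errors = sum(
                (1 if sign * (x[feature] - threshold) > 0 else 0) != y
                for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, sign)
    return best[1:]  # (feature, threshold, sign)

def fit_forest(data, n_trees, rng):
    """Train each stump on a bootstrap sample using one randomly chosen feature."""
    n_features = len(data[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # sample with replacement
        feature = rng.randrange(n_features)         # random feature per tree
        forest.append(fit_stump(sample, feature))
    return forest

def confidence(forest, x):
    """Fraction of trees voting that candidate x is a true mutation."""
    votes = sum(1 for f, t, s in forest if s * (x[f] - t) > 0)
    return votes / len(forest)

rng = random.Random(0)
forest = fit_forest(make_candidates(200, rng), n_trees=25, rng=rng)

likely_true = [0.35, 36.0, 0.08]      # high allele fraction, clean reads
likely_artifact = [0.03, 22.0, 0.60]  # low allele fraction, strand-biased
score_true = confidence(forest, likely_true)
score_artifact = confidence(forest, likely_artifact)
print(score_true, score_artifact)
```

A production classifier would use deeper trees and far more features and training examples; the point here is only that averaging many simple, randomized trees turns a set of yes-or-no votes into a graded score that can be thresholded.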

Angiuoli and his team trained Cerebro using millions of real-world and in silico mutations. They then pitted Cerebro head-to-head against several existing cancer mutation identification methods and found that the machine learning technique was more accurate in almost every circumstance. 

“The improvements [to mutation identification software] matter and have clinical implications,” says Angiuoli. That’s particularly true as more DNA-specific cancer therapies continue to become available on the market, he says.

Angiuoli’s company, called PGDx for short, spun out of Johns Hopkins University in 2010 with the aim of developing proprietary algorithms to identify alterations in cancer genomes. The company plans to take its products to the U.S. Food and Drug Administration (FDA) in the hope of receiving market approval. 
