Machine Learning Could Predict Outbreaks by Identifying Dangerous Rodents

Photo: Holly Vuong
This northern flying squirrel may look cute, but computer models predict it to be a disease vector.

Outbreaks of zoonotic diseases follow a depressing pattern: Somewhere, people come in contact with an animal bearing a disease that’s compatible with human biology. These infected people return to their communities and spread the illness. Once public health authorities get wind of the outbreak, an all-out scramble begins to determine where the disease came from and how it’s spreading.

A team of computer modelers hopes its research will put an end to that reactive model. The researchers want to predict outbreaks and help prevent crises like the Ebola epidemic that ravaged West Africa in the past year.

Disease ecologist Barbara Han and her colleagues have taken a big data approach to identify animals that serve as disease reservoirs—animals that harbor viruses or bacteria that can be transmitted to humans. In a study published last week, the scientists fed information about all 2277 known rodent species into their computer models, and used machine-learning algorithms to identify more than 50 species that may be disease reservoirs.

The scientists drew information from several sources, including the massive PanTHERIA database, which lists what’s known of mammalian species’ physiology, behavior, range, social structure, and so forth. PanTHERIA is a “painstaking collection” of data from thousands of individual field studies, Han tells IEEE Spectrum.

Although the PanTHERIA compilation will probably never be comprehensive, the researchers’ computer models could work with the incomplete data set. “This was one strength of this approach,” says Han. “If you tried to wait until you know everything, you’d be paralyzed.” The researchers also used public health databases of species that harbor zoonotic diseases.

The researchers first used information about known rodent disease reservoirs to train their computer model. “What the algorithm is doing is picking out the key features that repeatedly show themselves to be predictors of a species being a disease reservoir,” Han explains.

Once the model had created that “profile” of a disease-bearing rodent, the researchers tested its ability to distinguish between reservoir and non-reservoir species. The model’s 90 percent accuracy rate gave them confidence to try it with rodent species whose reservoir status was unknown. By identifying more than 50 new species as highly likely to carry zoonotic diseases, the researchers have provided hypotheses that field researchers can test. “Our predictions are a jumping off point,” Han says. A couple of voles and a grasshopper mouse are at the top of the suspect list. 
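The train, test, and predict workflow described above can be illustrated with a toy sketch. It is not the researchers' algorithm, which the article does not specify. A one-rule classifier (a "decision stump") stands in for it, and every trait name, value, and species label below is invented for illustration. Missing measurements are marked with None and simply skipped, a crude nod to working with incomplete data.

```python
# All trait names and values are invented for illustration only.
# None marks a trait that no field study has measured.
FEATURES = ["max_lifespan_months", "litters_per_year"]

# Species with known status: (traits, is_known_reservoir)
known = [
    ({"max_lifespan_months": 18, "litters_per_year": 5.0}, True),
    ({"max_lifespan_months": 24, "litters_per_year": None}, True),
    ({"max_lifespan_months": None, "litters_per_year": 6.0}, True),
    ({"max_lifespan_months": 96, "litters_per_year": 1.0}, False),
    ({"max_lifespan_months": 120, "litters_per_year": 1.5}, False),
    ({"max_lifespan_months": 84, "litters_per_year": None}, False),
]

# Known species held back from training, used only to score the rule.
held_out = [
    ({"max_lifespan_months": 20, "litters_per_year": 4.0}, True),
    ({"max_lifespan_months": 110, "litters_per_year": 1.0}, False),
    ({"max_lifespan_months": 30, "litters_per_year": 3.0}, True),
    ({"max_lifespan_months": 90, "litters_per_year": None}, False),
]

# Species whose reservoir status is unknown (hypothetical labels).
unknown = {
    "species_a": {"max_lifespan_months": 15, "litters_per_year": 5.5},
    "species_b": {"max_lifespan_months": 100, "litters_per_year": 1.0},
    "species_c": {"max_lifespan_months": None, "litters_per_year": None},
}


def fit_stump(rows, features):
    """Return the (feature, threshold, low_means_reservoir) rule that
    classifies the most training rows correctly, ignoring missing values."""
    best = None  # (accuracy, feature, threshold, low_means_reservoir)
    for feature in features:
        measured = [(t[feature], label) for t, label in rows
                    if t[feature] is not None]
        for threshold in sorted({v for v, _ in measured}):
            for low_means_reservoir in (True, False):
                hits = sum(((v <= threshold) == low_means_reservoir) == label
                           for v, label in measured)
                accuracy = hits / len(measured)
                if best is None or accuracy > best[0]:
                    best = (accuracy, feature, threshold, low_means_reservoir)
    return best[1:]


def predict(model, traits):
    """Apply the rule; return None when the needed trait is unmeasured."""
    feature, threshold, low_means_reservoir = model
    value = traits.get(feature)
    if value is None:
        return None
    return (value <= threshold) == low_means_reservoir


def accuracy(model, rows):
    """Fraction of rows classified correctly, skipping rows with no call."""
    scored = [(predict(model, t), label) for t, label in rows]
    scored = [(p, label) for p, label in scored if p is not None]
    return sum(p == label for p, label in scored) / len(scored)


model = fit_stump(known, FEATURES)             # 1. train on known species
held_out_accuracy = accuracy(model, held_out)  # 2. test on held-back species
flagged = [name for name, traits in unknown.items()
           if predict(model, traits)]          # 3. flag unknown species

print("rule:", model)
print("held-out accuracy:", held_out_accuracy)
print("candidates for field testing:", flagged)
```

As in the article, the flagged names are hypotheses for field researchers to check, not confirmed reservoirs. A real analysis would use far more species and traits, and a model that combines many such rules.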

Interestingly, the model didn’t pick out species that were closely related to each other. Instead, the most predictive feature was having what Han calls “a fast-paced strategy for life.” These are short-lived animals that reach sexual maturity quickly, and then reproduce prolifically.

This jibes with previous scientific findings about these fast-paced animal species and their immune systems. Such species may “allocate their resources to reproduction,” Han says, and have an immune system that isn’t very effective, allowing them to harbor diseases. Those diseases may not be of much concern to the species, as they’re “more concerned with knocking out as many babies as possible,” Han explains.  

It seems that Han’s computer models have given humanity a good tip: Beware of animals that live fast and die young. 
