Researchers have developed a reliable early warning system for dengue fever outbreaks in Lahore, the capital city of the province of Punjab in Pakistan. Based on statistical analysis of dengue-related phone calls to a public health hotline, the system can track the incidence of symptoms down to the neighborhood and give local government officials a three-week heads-up on potential outbreaks.
The forecasting system was developed by researchers at New York University and the Information Technology University in Lahore. Researchers there worked with local Pakistani government officials to record and analyze over 300,000 calls to the hotline from 2012 to 2015. During that time, the system accurately predicted dozens of spikes in dengue cases. The results are described today in Science Advances.
“This is a lovely example of implementation science,” says Ann Kurth, an epidemiologist and dean of Yale University’s School of Nursing in Orange, Connecticut, who was not involved with the project. “Here’s a scientific team working closely with public health officials in Punjab to leverage its infrastructure and call system.”
An inexpensive forecasting system with neighborhood-level predictions is particularly helpful in developing countries. Their governments may not have the resources to collect and analyze disease incidence in real time, nor the funds to control mosquitoes everywhere all the time. With intra-city-level data, public health officials can concentrate their efforts on the blocks or neighborhoods where the danger is greatest.
The hotline was set up by the Punjab provincial government in 2011, after a dengue outbreak killed more than 350 people and affected more than 21,000 in the province. The hotline lets residents report their symptoms and locations, flag standing water, and ask questions about the availability of beds in treatment centers. It was advertised on television and at local meetings.
University researchers analyzed the information from the calls to get a sense of where in the city symptoms were occurring. “This was essentially crowd-sourced data,” says Kurth. “They didn’t have to pay anyone for it.”
The researchers then compared the information from the calls to the actual number of dengue cases reported by hospitals and the addresses of those patients. They found that incidence of dengue-related calls mirrored the number of actual cases in a neighborhood.
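One simple way to check whether call volume mirrors case counts is a correlation coefficient. The sketch below is illustrative only: the article does not say which statistic the researchers used, the Pearson coefficient is an assumption, and the weekly numbers are made up rather than taken from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly counts for one neighborhood (illustrative, not study data).
weekly_calls = [12, 15, 30, 55, 80, 60, 25]
weekly_cases = [3, 4, 9, 18, 27, 19, 8]

r = pearson(weekly_calls, weekly_cases)
print(f"call/case correlation: {r:.2f}")
```

A value close to 1 would mean that weeks with more dengue-related calls also tended to have more hospital-reported cases in that neighborhood.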
They then combined the call data with meteorological information and the number of hotline advertisements in a neighborhood. With those variables, the researchers were able to develop a system based on a learning algorithm called a random forest model to predict dengue outbreaks up to three weeks in advance.
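To forecast three weeks ahead, a model needs training examples that pair one week's inputs with the case count three weeks later. The sketch below shows one way such rows could be assembled. The field names, the choice of rainfall and temperature as the weather variables, and all the numbers are assumptions for illustration; the article does not describe the researchers' actual feature set in this detail.

```python
from dataclasses import dataclass

LEAD_WEEKS = 3  # forecast horizon mentioned in the article

@dataclass
class WeekRecord:
    calls: int          # dengue-related hotline calls that week
    rainfall_mm: float  # hypothetical weather variable
    temp_c: float       # hypothetical weather variable
    ads: int            # hotline advertisements (a proxy for awareness)
    cases: int          # hospital-reported dengue cases that week

def build_training_rows(weeks, lead=LEAD_WEEKS):
    """Pair each week's features with the case count `lead` weeks later."""
    rows = []
    for t in range(len(weeks) - lead):
        w = weeks[t]
        features = [w.calls, w.rainfall_mm, w.temp_c, w.ads]
        target = weeks[t + lead].cases
        rows.append((features, target))
    return rows

# Made-up history for one neighborhood (not study data).
history = [
    WeekRecord(10, 0.0, 33.0, 1, 2),
    WeekRecord(14, 12.5, 31.0, 1, 3),
    WeekRecord(25, 40.0, 29.5, 3, 4),
    WeekRecord(48, 22.0, 29.0, 3, 9),
    WeekRecord(70, 5.0, 30.0, 2, 17),
    WeekRecord(52, 0.0, 30.5, 2, 26),
    WeekRecord(30, 0.0, 31.0, 1, 18),
    WeekRecord(18, 2.0, 32.0, 1, 9),
]

rows = build_training_rows(history)
for features, target in rows:
    print(features, "-> cases three weeks later:", target)
```

The last three weeks of history produce no rows, because their outcomes three weeks on are not yet known.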
“Random forest allows you to create a mixture of models that essentially takes the average or median case,” says Lakshminarayanan Subramanian, a computer scientist at NYU who worked on the system. “You don’t want one particular outlier data point to affect your algorithms too much,” and the random forest model addresses that, he says.
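The averaging idea Subramanian describes can be sketched with a much simpler stand-in: many one-split regressors, each fit to a random resample of the data, whose predictions are then combined by mean or median. This is bootstrap aggregation on a single input, not a full random forest (which grows deeper trees and also samples features), and the data points, function names, and parameters here are hypothetical.

```python
import random
from statistics import mean, median

def fit_stump(points):
    """Fit a one-split regressor: predict the mean target on each side of a threshold."""
    xs = sorted({x for x, _ in points})
    overall = mean(y for _, y in points)
    if len(xs) < 2:
        return lambda x: overall  # a constant feature cannot be split
    best = None
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in points if x <= thr]
        right = [y for x, y in points if x > thr]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    _, thr, left_mean, right_mean = best
    return lambda x: left_mean if x <= thr else right_mean

def fit_bagged(points, n_models=200, seed=0):
    """Fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(points) for _ in points])
            for _ in range(n_models)]

def predict(models, x, combine=mean):
    """Combine the individual predictions with `combine` (mean or median)."""
    return combine(m(x) for m in models)

# Made-up (weekly calls, cases three weeks later) pairs; the last is an outlier.
points = [(10, 2), (12, 3), (15, 3), (40, 12), (45, 14), (50, 15), (14, 60)]

models = fit_bagged(points)
print("mean of ensemble at 12 calls:  ", predict(models, 12))
print("median of ensemble at 12 calls:", predict(models, 12, combine=median))
```

Because each resample includes the outlier a different number of times, and sometimes not at all, no single odd point dominates every model, and the combined prediction is steadier than any one model's.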
The model can be continually trained with new data so that it adapts to shifts in calling patterns. The three-week warning gives officials enough time to treat the at-risk area with pesticides and remove standing water to try to curb the outbreak.
The NYU-Lahore group is not the first to try a hotline or crowdsourcing-based forecasting system. Researchers at Google in 2008 claimed they could predict flu outbreaks by culling data from Google searches for flu-related information. The project, called Google Flu Trends, failed spectacularly, missing the peak of the 2013 flu season in the United States by 140 percent.
After the Google Flu fail, “people realized that you can’t look at a single variable [like Google searches or call volume] and think it’s going to have enough predictive value,” says Kurth. The NYU-Pakistan model succeeded in part because it factored in other variables such as weather and hotline awareness, in addition to the call data, she says. “The specificity of the modeling is how it distinguishes itself from some of the other attempts,” she says.
Another key to their success was the researchers’ direct link to government officials in Lahore, who facilitated the study. Subramanian’s key academic collaborator in Pakistan, Umar Saif, is also head of the Punjab Information Technology Board, which oversees all the IT activities for Lahore and the province of Punjab.
“We’re in a unique position where our collaborator is both the one running the system and is an academic,” Subramanian says. Still, the hotline-based forecasting system could be adapted to other regions, he says.
Fancier systems involving epidemiological data, mathematical modeling, and geographical mapping have been used globally to track outbreaks such as SARS, Ebola, influenza, and dengue. The models can potentially tell us how bad an epidemic might get, where it might spread, the impact different public health decisions might have, or where we should place a treatment center.
But high-quality data to train those models isn’t always available in developing countries. Conventional disease surveillance techniques in Pakistan often involve paper-based reporting systems that take weeks to compile and are prone to error, says Nabeel Abdur Rehman, a PhD student at NYU and one of the researchers involved. “Once a government has all the data from all the hospitals and is in a position to make a decision, the time is already passed. They are already two months late,” he says.
A next step for the researchers is to determine what effect their system had on curbing dengue in Lahore. After the outbreak affecting 21,000 people in 2011, cases fell to 257 in 2012, numbered 1,600 in 2013, and were down to about a hundred in both 2014 and 2015. “At a high level, analysis of the containment activities seem to indicate that the hotline is having a huge impact,” says Subramanian. “We’re currently working towards scientifically [verifying] this impact.”