Web-scouring algorithms are aiding the surveillance of a deadly vaping-related lung disease. An online tool called HealthMap first spotted the disease on 25 July, according to its curators. That’s nearly a month before U.S. federal officials announced an investigation into the e-cigarette–related illness.
Since then, HealthMap’s case counts have lined up closely with those of the U.S. Centers for Disease Control and Prevention (CDC). In its most recent update, which was based on data collected through 8 October, the agency reported 1,299 confirmed and probable cases of the lung illness; HealthMap counted 1,305 up to the same date.
The accuracy of HealthMap suggests that such web-based tools are a viable addition to traditional surveillance methods. “We see it not as a replacement [to traditional warning systems], but as a supplement,” says Yulin Hswen, a research fellow at Boston Children’s Hospital, Harvard Medical School. “It gives you a more comprehensive picture of everything that’s going on, and in real time,” she says.
HealthMap works by combing online news reports, social media and announcements from local officials, looking for natural-language mentions of infectious diseases and other public health problems. Machine learning algorithms make sense of the information, and the system plots it on an interactive map. HealthMap researchers manually curate the data, removing duplicate reports.
To further improve reporting on the state of vapers’ lungs, Hswen and her colleagues will launch a free app called YouVape this Thursday. E-cigarette users can answer questions on the app to find out whether their symptoms match those of the deadly respiratory disease. Their answers will provide Hswen and her colleagues with a larger cache of data on vaping use and potential disease.
“We’re using participatory citizen science surveillance,” says Hswen. “We’ve been collaborating with the CDC on the type of questions they would want to ask.” Users’ responses will be run through “clinically evaluated evidence-based algorithms to identify potential links between vaping and lung injury,” she says.
Hswen says she hopes the data will also help determine what exactly is causing the disease. “We’ll ask questions about what they’re vaping, how they’re vaping, the frequency they’re vaping—questions that will help tease out causality,” Hswen says.
Researchers have not been able to identify a single vaping ingredient that is causing the lung illness. Some patients with the lung illness have reported vaping exclusively THC-containing products, while others have reported vaping exclusively nicotine-containing products, according to the CDC. The number of confirmed and probable cases of the malady ballooned from 380 on 11 September to nearly 1,300 as of 8 October.
HealthMap and other digital surveillance tools such as ProMED and GPHIN have proven useful as early warning systems in previous epidemics too. During polio outbreaks in 2013 and 2014, digital reports preceded official reports from the World Health Organization (WHO) by an average of two weeks. And during the Ebola outbreak of 2014, HealthMap spotted a news report describing a “mystery hemorrhagic fever” in Guinea that turned out to be the disease. That was nine days before the WHO sounded the alarm.
But other online systems haven’t worked so well. Google Flu Trends provided estimates of influenza activity by aggregating Google search queries, but consistently overestimated flu prevalence. Google no longer publishes flu estimates. “The issue was noise,” says Hswen. With Google Flu Trends “there were a lot of false positives because of the streams of media coming in that weren’t actual cases.”
HealthMap’s data on the vaping-related lung illness is not yet publicly available on its interactive map, as the team is working on a dedicated site for vaping.