Data Mining for Flu Symptoms

Last Tuesday, Google launched a free Internet service called Flu Trends, which it claims can let you know whether "the number of influenza cases is increasing in areas around the U.S., earlier than many existing methods," according to a report in the Wall Street Journal.

How does Google do it? According to its website, "We have found a close relationship between how many people search for flu-related topics and how many people actually have flu symptoms. Of course, not every person who searches for 'flu' is actually sick, but a pattern emerges when all the flu-related search queries from each state and region are added together. We compared our query counts with data from a surveillance system managed by the U.S. Centers for Disease Control and Prevention (CDC) and found that some search queries tend to be popular exactly when flu season is happening. By counting how often we see these search queries, we can estimate how much flu is circulating in various regions of the United States."
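The idea in that description, comparing query counts against surveillance data and then using the query counts as an estimator, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's actual model: the function names, the simple straight-line fit, and the weekly numbers are all made up for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def estimate(query_share, slope, intercept):
    """Estimate flu activity from a new week's query share."""
    return slope * query_share + intercept


if __name__ == "__main__":
    # Hypothetical weekly values, invented for illustration only:
    # share of searches that are flu-related (%), and the share of
    # doctor visits reported as flu-like illness (%).
    query_share = [0.10, 0.12, 0.21, 0.33, 0.50, 0.41, 0.26]
    reported_illness = [1.1, 1.2, 2.0, 3.4, 5.1, 4.0, 2.7]

    r = pearson(query_share, reported_illness)
    slope, intercept = fit_line(query_share, reported_illness)
    print(f"correlation: {r:.3f}")
    print(f"estimate for a 0.30% query share: "
          f"{estimate(0.30, slope, intercept):.2f}%")
```

The appeal of such an approach is timing: search counts are available almost immediately, while surveillance reports arrive with a delay, so a fitted relationship lets the former stand in for the latter until the official numbers catch up.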

Google built Flu Trends with guidance from the CDC.

This is not the first time the Internet has been trolled for data about the status of a disease, according to the WSJ story: "In 2003, the Canadian government and other organizations used versions of these data collection and health sites to detect early signs of the SARS virus in China."

In addition, "At Harvard Medical School's Children's Hospital Boston, a site called HealthMap crawls through 24,000 Web sites looking for disease related terms. Results appear on a world map, which has colored markers for dengue fever, avian flu and other diseases."
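The core step of that kind of crawler, scanning page text for disease-related terms, might look something like the sketch below. It is not HealthMap's code: the term list and function name are hypothetical, and fetching the pages and placing markers on a map are left out entirely.

```python
import re
from collections import Counter

# Hypothetical watch list; a real system would track far more terms.
DISEASE_TERMS = ("dengue fever", "avian flu", "sars")


def count_disease_terms(text, terms=DISEASE_TERMS):
    """Count whole-word, case-insensitive mentions of each term in text."""
    lowered = text.lower()
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[term] = hits
    return counts


if __name__ == "__main__":
    page = "Officials confirmed avian flu in two provinces. Avian flu was last seen there in 2006."
    print(count_disease_terms(page))
```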

It would be interesting to compare the Google flu tracker results with the disease surveillance information data mined from the DoD AHLTA electronic health record system.


Risk Factor

Robert Charette
Spotsylvania, Va.
Willie D. Jones
New York City