Six Creative Ways to Solve Biomedicine's Big Data Problem

Biomedical research generates an obscene amount of data. Many of the sensors, robots and other technologies IEEE Spectrum regularly profiles spew out terabytes to petabytes of data—and that’s only a sliver of the volume of health information stored in databases around the world.

Now, three funding agencies are trying to spur the development of tools and platforms to improve researchers’ ability to find, access and use that data. Yesterday at the 7th Health Datapalooza conference in Washington, D.C., the National Institutes of Health, the U.K.-based Wellcome Trust, and the Howard Hughes Medical Institute announced six finalists for the first-ever Open Science Prize, a global science competition for prototype tools and platforms to tame biomedicine’s big data behemoth.

Part of the problem with developing these kinds of tools is that no one is sure who should be responsible for them. “Data is generated globally, but it’s essentially managed and funded nationally,” says Philip Bourne, associate director for data science at the NIH. In an effort to transcend international borders and fund data science in a new way, Bourne and colleagues at Wellcome and HHMI hatched a plan for the prize.

The competition launched last October and drew entries from 96 teams spanning 45 countries. Each team was required to have one member based in the U.S. and at least one other in another country.

Yesterday, an expert panel announced the six finalists, each of which will receive $80,000 to spend over the next nine months developing a prototype. “We tried to pick things that had real promise, and where a small amount of money and a bit of publicity would really help,” says Bourne.

Here we take a peek at the six finalists, but be sure to stay tuned later this year to help pick a winner: In December, each team will demonstrate their prototype at a showcase, and the public will be invited to vote for their favorites. The winner will receive a grand prize of $230,000 to turn their idea into reality.

Without further ado, let’s meet the finalists:

BrainBox – The amount of brain imaging data available on the Internet is, well, mind-boggling. And compared to other types of data, neuroimaging data requires a substantial amount of human effort, such as curating and editing images. BrainBox is an online laboratory designed to give researchers easy access to brain imaging data (notably without downloading it) and to enable distributed collaboration so everyone can share in the effort.

NeuroArch – While efforts to map the entire human brain continue, a nearer-term goal is to map a smaller brain, such as that of a fruit fly, which shares more than 70 percent of the genes involved in human brain disorders. The Fruit Fly Brain Observatory project would develop an open graph database platform called NeuroArch to store and process information about the fly brain, including the location, shape, and connectivity of every neuron. With all that data in one place, it might be possible to generate a simulated fly brain and see what happens when it is altered via genetics or drugs.

MyGene2 – Rare diseases aren’t as rare as you think. More than 6,000 known rare diseases affect an estimated 25 million people in the U.S. today. Yet more than half of families who undergo genetic testing fail to get a diagnosis for a suspected rare disease. A website named MyGene2 provides a place for families and clinicians to share health and genetic information on rare diseases as a way to promote the diagnosis and discovery of new rare conditions and the genes that cause them.

Nextstrain – To intervene and stop an epidemic, scientists need to get their hands on genomic data from viral pathogens as soon as possible. The Nextstrain project pools genetic data from research groups around the world to visualize the spread of a virus in near real time. For example, check out their graphic of the current evolution of the Zika virus.

OpenAQ – According to the World Health Organization, air pollution exposure is responsible for one in eight global deaths, yet air quality data has traditionally been stored on obscure websites that are difficult to access and have inconsistent formats. The OpenAQ platform prototype aggregates and standardizes publicly available, real-time air quality data. It has already collected and shared 9.7 million air quality measurements from 500+ locations in 13 countries.

OpenTrialsFDA – When the U.S. Food and Drug Administration approves a drug, the agency publicly releases a package of information about that drug, often including previously unpublished clinical trials. Though this information is quite valuable, it is notoriously difficult to access, aggregate and search. OpenTrialsFDA is an effort to build a user-friendly web interface to enable anyone to access the data, plus APIs to allow third-party platforms to tap into and search the data.
