World's Largest Dataset on Human Genetic Variation Goes Public

National Institutes of Health finds one way to manage big data for biomedical science



The entire contents of the National Institutes of Health's 1000 Genomes Project, all 200 terabytes of it, will be made freely available to the public, the agency announced today. The project is touted as the world's largest set of data on human genetic variation. Amazon's cloud computing unit, Amazon Web Services, will store the database.
The project aims to provide a foundation for investigating how human genetic variation contributes to health and disease. Making the whole thing available for free means more scientists can use the data and, hopefully, conclusions about the relationship between genotype and such diseases as cancer and diabetes will be drawn at an accelerated rate. 
The project was initiated in 2008 and is based on the genomes of more than 2600 people from 26 populations around the world. Results from sequencing the DNA of 1700 of those people will be released in the cloud now. The remaining 900 samples will be sequenced this year.
The NIH's initiative is part of a larger movement to manage the deluge of "big data" in science, a task that has become a scientific discipline in itself. Such data sets have become so massive that few researchers have the computing power to use them. The NIH has calculated that the 1000 Genomes Project is the equivalent of 16 million file cabinets filled with text, or more than 30 000 standard DVDs.
Making the data available in the cloud is a good deal for scientists and their institutions, who won't have to take on the costs of acquiring more bandwidth, data storage and analytical computing capacity just to access the data. "This means researchers and labs of all sizes and budgets have access to the complete 1,000 Genomes Project data and can immediately start analyzing and crunching the data without the investment it would normally require in hardware, facilities and personnel," says Deepak Singh, a principal product manager at Amazon Web Services. "Researchers can focus on advancing science, not obtaining the resources required for their research."
It may also end up being a good deal for Amazon Web Services (A.W.S.). Manipulating this much information requires a lot of computing power, and A.W.S. will be charging for additional resources that can be used to further process or analyze the data, reports the New York Times.
The White House, for its part, sees the 1000 Genomes Project in the cloud as one example of the kind of solutions it is proposing through its Big Data Research and Development Initiative. The Office of Science and Technology Policy announced today that more than $200 million will be doled out to six federal agencies in an effort to make the most of the mountains of data being created for scientific discovery, environmental and biomedical research, education, and national security.





