Bodies are fragile and prone to the eventual failure known as death. But the DNA that encodes all the instructions for creating and operating those perishable bodies? That stuff sticks around. Thanks to 7000-year-old DNA from a tooth found in a Spanish cave, for example, we know that the “caveman” who died there had blue eyes and was probably lactose intolerant.
Now Microsoft is making an investment in DNA data storage, a cutting-edge technique that takes advantage of genetic material’s durability and efficiency. The company has taken a chunk of data that would normally be stored in a file on a hard drive, and has translated it into the genetic code of As, Cs, Gs, and Ts that represent the chemical building blocks of DNA. Then it asked synthetic biology startup Twist Bioscience to manufacture 10 million DNA strands with those sequences of letters.
“They give us the DNA sequence, we make the DNA from scratch,” Twist CEO Emily Leproust says in a phone interview. Once the data has been transformed into invisible molecules at the bottom of a test tube, Twist will send it back to Microsoft for testing. Microsoft can then experiment with using the DNA for long-term data storage (there are ways to simulate the passing of millennia) and with reading the data back out of those test tubes.
Interestingly, Twist doesn’t know what data it’s encoding. “We don’t have the decoder key, so I have no idea what it is,” says Leproust. Microsoft hasn’t divulged that information either.
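To make the translation step concrete, here is a minimal sketch of how bytes could be turned into a string of As, Cs, Gs, and Ts and back again. The two-bits-per-base mapping and the function names `encode` and `decode` are assumptions chosen for illustration; Microsoft has not disclosed its scheme, and practical schemes typically add error correction and addressing that this sketch leaves out.

```python
# Hypothetical mapping of 2 bits to one DNA base.
# This is an illustrative assumption, not the encoding Microsoft or Twist used.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

In this toy version the mapping table plays the role of the decoder key: a manufacturer who sees only the letter sequence, and does not know how letters map back to bits, cannot read the underlying file.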
Twist, a San Francisco-based biotech startup, uses machines of its own devising to mass-produce DNA. Its primary customers are researchers creating novel bits of DNA to stick into microbes. The DNA gives the microbes useful new properties such as, say, the natural ability to produce an industrial chemical typically manufactured from petroleum. But synthesizing DNA for data storage would open up an entirely new market.
The idea of storing data in DNA has intrigued researchers for decades, and got a serious push forward in 2012 when Harvard geneticist George Church encoded an entire book in DNA. As a storage medium, genetic material is not only durable, but also incredibly compact: A single gram of DNA can store almost a zettabyte (one trillion gigabytes) of digital data.
As our Information Age society generates ever-increasing amounts of data, researchers are worrying about where to put it, and how long it can be stored using existing technologies. DNA is one idea in the running for stashing away data that doesn’t need to be accessed often. Its read-out mechanism is DNA sequencing, which has become cheap and easy in the past two decades (the price of sequencing an entire human genome has dropped from about $100 million in the year 2001 to about $1000 today).
Researchers are still studying the error rates involved in encoding information into DNA and decoding it again, and costs still have to come down quite a bit before DNA will be considered a real alternative to existing technologies. How much? “We have to lower the cost by about 10,000 times from where it is today,” says Leproust. But that’s no deal-breaker, she says; costs in electronics have dropped by factors of millions.
As for Microsoft, the company seems willing to give it a go. “The initial test phase with Twist demonstrated that we could encode and recover 100 percent of the digital data from synthetic DNA,” says Doug Carmean of Microsoft Research in a press release. “We’re still years away from a commercially viable product, but our early tests with Twist demonstrate that in the future we’ll be able to substantially increase the density and durability of data storage.”
Eliza Strickland is a senior editor at IEEE Spectrum, where she covers AI, biomedical engineering, and other topics. She holds a master’s degree in journalism from Columbia University.