The first time geneticist Jef Boeke designed a synthetic chromosome, he wrote and edited its DNA sequence in a Microsoft Word document.
His goal was to create a slightly altered version of yeast chromosome 9, one of the shortest of the 16 chromosomes that make up the organism’s genome and contain all the operating instructions for life. He started with the short chromosome’s right arm, but even this task was daunting. Its DNA code consisted of 90,000 “letters,” the molecules referred to as A, C, G, and T that are arranged in a particular sequence to encode biological function.
Painstakingly, Boeke went through the code, making changes that he thought would be scientifically interesting or that would make the chromosome more stable. This misery drove him to seek help from Sarah Richardson, a student in his neighbor Joel Bader’s lab, who wrote scripts to automate some of the most tedious steps. Those scripts were the embryonic form of what was to become the genome design software called BioStudio.
Once Boeke finished his design, the synthetic chromosome was constructed by taking short snippets of manufactured DNA and stringing them together. Then Boeke’s team checked the design by taking a normal yeast cell, swapping out its natural chromosome 9, and looking to see if it would keep functioning with a manmade chromosome inside. Nobody knew if it would work.
It did. The results were published in Nature in 2011, and the quest to build synthetic critters from scratch took a big step forward. Boeke’s team prepared to design the other 15 chromosomes to make a completely synthetic yeast—and the world’s first completely synthetic complex organism.
But the manual approach wasn’t scalable. The chromosome 9 project had involved 90,000 letters, a length denoted as 90 kb. The overall yeast genome was 12 million letters long, or 12 Mb. “It was obvious right away that we needed something much more heavyweight,” Boeke says.
The results of their solution are now on display in the journal Science, which today published seven papers from the Synthetic Yeast 2.0 project. One of those papers describes their breakthrough enabling technology, the custom-built software program BioStudio.
Boeke, who leads the yeast project and serves as director of NYU’s Institute for Systems Genetics, oversaw the genome design. The papers published today describe that design process using BioStudio and also report on the completion of five new chromosomes by collaborators from around the world.
BioStudio allowed Boeke’s team to take the normal yeast genome and make the deletions, insertions, and changes they wanted, making genetic tinkering as easy as cut and paste. The program also includes a version control feature akin to Word’s track changes, recording each edit of the genome so it can easily be reversed if it’s later found to be detrimental to the yeast’s survival.
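To make the track-changes idea concrete, here is a minimal sketch in Python of a sequence that records every edit so it can be reversed. It is an illustration only, not BioStudio's actual code; the class and method names (TrackedSequence, replace, undo) are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Edit:
    """One recorded change: `removed` was swapped for `inserted` at `position`."""
    position: int
    removed: str
    inserted: str


@dataclass
class TrackedSequence:
    """A DNA sequence that logs each edit (hypothetical, for illustration)."""
    sequence: str
    history: list = field(default_factory=list)

    def replace(self, position: int, length: int, inserted: str) -> None:
        """Replace `length` letters at `position` and log what was removed."""
        removed = self.sequence[position:position + length]
        self.sequence = (
            self.sequence[:position] + inserted + self.sequence[position + length:]
        )
        self.history.append(Edit(position, removed, inserted))

    def undo(self) -> None:
        """Reverse the most recent edit using the logged letters."""
        edit = self.history.pop()
        end = edit.position + len(edit.inserted)
        self.sequence = (
            self.sequence[:edit.position] + edit.removed + self.sequence[end:]
        )


chromosome = TrackedSequence("ATGGCCTAA")
chromosome.replace(3, 3, "")      # a deletion: remove "GCC"
chromosome.replace(3, 0, "TTT")   # an insertion: add "TTT"
edited = chromosome.sequence
chromosome.undo()
chromosome.undo()
restored = chromosome.sequence
```

Because each log entry stores both the removed and the inserted letters, any edit can be rolled back without keeping a full copy of the genome for every version.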
Nothing like BioStudio existed when Boeke asked Richardson to help him out. Then a PhD candidate at Johns Hopkins University (and now chief scientific officer at the synthetic biology startup MicroByre), Richardson says existing software focused on displaying long genome sequences and allowing researchers to annotate them as they laboriously figured out the purpose of various strings of DNA. When she asked around about adding an editing function to let researchers change those intricate sequences, she got shocked responses. “You would have thought I’d suggested abandoning a toddler at the mall,” she says.
Richardson worked with Boeke to create genome editing software that was wrapped in a user-friendly web interface called GBrowse. For a while, Boeke was the software’s only user, and he provided Richardson with plenty of frank feedback. “I’d say, it’s way too slow, it’s killing me!” he remembers. They achieved one big speed-up when they realized that every edit, even the insertion of just a few letters, was causing a cascade of updates throughout the entire genome. By localizing the update, Boeke says, the editing process got about 15 times faster.
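The article does not say how the update was localized. One plausible reading, sketched below in Python purely as an assumption, is that an insertion only needs to shift the coordinates of annotations downstream of it on the same chromosome, leaving everything else untouched. The Feature class and shift_after_insertion function are hypothetical names, and coordinates are assumed to be half-open (start included, end excluded).

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """An annotated stretch of DNA (hypothetical record for this sketch)."""
    name: str
    chromosome: str
    start: int
    end: int


def shift_after_insertion(features, chromosome, position, length):
    """Shift only the features affected by an insertion on one chromosome.

    Returns the number of annotations that had to be touched.
    """
    touched = 0
    for feature in features:
        if feature.chromosome != chromosome or feature.end <= position:
            continue  # other chromosomes and upstream features stay as they are
        if feature.start >= position:
            feature.start += length  # feature lies wholly downstream
        feature.end += length        # downstream or spanning: the end moves
        touched += 1
    return touched


features = [
    Feature("geneA", "chr9", 100, 400),
    Feature("geneB", "chr9", 900, 1500),
    Feature("geneC", "chr5", 200, 800),
]
touched = shift_after_insertion(features, "chr9", 500, 6)
```

In this toy case a six-letter insertion touches one annotation out of three; on a 12-Mb genome, skipping the unaffected records is where the savings would come from.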
Once BioStudio was fully up and running, Boeke’s team designed the full genome of what they call Sc2.0, referencing the scientific name for brewer’s yeast, Saccharomyces cerevisiae. Overall, their Sc2.0 genome design is 8 percent shorter than the original yeast genome, and it includes 1.1 Mb of changes, or roughly a million altered letters.
After finalizing this initial design, they asked collaborators around the world to take on the project of building specific chromosomes. They knew the design would continue to morph, as some of their initial changes would prove infeasible. But they also knew that all edits made by their collaborators would be captured in track changes.
The original edits came from a long list, Boeke says. “We spent something like eight months debating what changes to put on the list,” he says. “It’s fundamentally an arbitrary list of genetic changes we thought would be interesting.” But the team had to be careful not to push it too far: “We knew that with every change we made, we’d increase the risk that we’d kill the yeast,” he says.
BioStudio enabled the designers to make some major edits easily, explains Leslie Mitchell, a postdoc researcher in Boeke’s lab who took the lead on much of the genome design. With single keystrokes, she could make changes that would affect all the DNA in a chromosome. Some of these system-wide changes removed repetitive segments of DNA or took out pieces called transposons that make genomes more prone to mutation. Another added “watermarks” that would show up when the synthetic DNA was added to a normal yeast cell, making it obvious which parts of the cell were human-made.
After such broad-scale edits were done, Mitchell says, the designers could go in and look at each chromosome’s sequence in detail, making expert decisions about where they wanted to make further changes. Overall, she estimates, it took about an hour to edit 100 kb of DNA, so the 500-kb chromosome 5 took about 5 hours to design.
Richardson, the coder, remembers that the researchers had one more big ask for BioStudio, which had to do with DNA assembly. While synthetic biology companies now make it easy to order custom strings of manufactured DNA, those strings are typically fairly short. For the synthetic yeast project, the researchers would order strings of DNA that were only about 70 letters, or base pairs, long. When those strings arrived in the lab, the researchers first assembled them into “building blocks” of about 750 bp, then put those building blocks together into 2-4 kb “minichunks,” then constructed 10 kb “chunks,” and finally built 30-60 kb “megachunks.”
But there are genetic constraints on how strings of DNA can be assembled. The researchers wanted BioStudio to take any long DNA sequence and make it “modular,” chopping it up into pieces that could be ordered from the DNA-makers and then patched together in that series of assembly steps. “They wanted to be able to push a button when they were done with their edits, and have the genome slot itself into an assembly pattern,” Richardson remembers. “That was the craziest thing they asked for.”
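As a rough picture of what that button has to do, the Python sketch below partitions a region into nested pieces using the size tiers quoted in this article. It is a simplification under stated assumptions: it cuts purely by length and ignores the genetic constraints on where pieces can be joined, which is the hard part. The function names and the TIERS table are invented for illustration and are not BioStudio's.

```python
import math

# Upper size limits per tier, in base pairs, from the figures quoted in the article.
TIERS = [
    ("megachunk", 60_000),
    ("chunk", 10_000),
    ("minichunk", 4_000),
    ("building block", 750),
]


def split_evenly(start, end, max_size):
    """Split [start, end) into the fewest near-equal pieces of at most max_size."""
    length = end - start
    count = math.ceil(length / max_size)
    bounds = [start + (length * i) // count for i in range(count + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def plan_assembly(start, end, tiers=TIERS):
    """Recursively partition a region into nested assembly pieces."""
    if not tiers:
        return []
    (name, max_size), rest = tiers[0], tiers[1:]
    return [
        {"tier": name, "start": s, "end": e, "parts": plan_assembly(s, e, rest)}
        for s, e in split_evenly(start, end, max_size)
    ]


def building_blocks(plan):
    """Flatten a plan down to its innermost pieces."""
    for piece in plan:
        if piece["parts"]:
            yield from building_blocks(piece["parts"])
        else:
            yield piece


# A 90-kb region, the size of the chromosome 9 arm described earlier.
plan = plan_assembly(0, 90_000)
blocks = list(building_blocks(plan))
```

A real design tool would also have to choose cut points that satisfy the assembly chemistry, so the even split here should be read as a placeholder for that decision.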
Synthetic biology lends itself to engineering’s classic “design-build-test” cycle. For Sc2.0, megachunks of the designer chromosomes were built and inserted into normal yeast cells to test whether they interfered with the cells’ life functions. If a yeast cell died or displayed abnormal behavior, the researchers embarked on a debugging process.
In one type of debugging, they would make many yeast colonies with many different combinations of synthetic megachunks and watch to see which colonies failed, then look for the common denominator in those failures. Mitchell, who led the work on designing and debugging chromosome 6, explains that there were different sorts of bugs. The most interesting were those that arose from genome changes they’d made that they expected to be harmless—because those bugs taught the researchers something about yeast biology.
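The common-denominator search can be pictured as set arithmetic. The Python sketch below is an illustration, not the team's actual analysis: it assumes each colony is recorded as the set of synthetic megachunks it carries plus a healthy-or-not flag, and it assumes a single megachunk is to blame. The labels A through D and the function name are made up.

```python
def suspect_megachunks(colonies):
    """Return megachunks present in every failed colony and in no healthy one."""
    failed = [chunks for chunks, healthy in colonies if not healthy]
    passed = [chunks for chunks, healthy in colonies if healthy]
    if not failed:
        return set()
    in_all_failures = set.intersection(*failed)
    seen_in_healthy = set().union(*passed)
    return in_all_failures - seen_in_healthy


# Each entry: (synthetic megachunks in the colony, did the colony stay healthy?)
colonies = [
    ({"A", "B"}, True),
    ({"A", "C"}, False),
    ({"B", "C", "D"}, False),
    ({"A", "B", "D"}, True),
]
suspects = suspect_megachunks(colonies)
```

Here only megachunk C appears in both failures and in neither healthy colony. A bug caused by two megachunks interacting would slip past this simple filter, since each could appear on its own in a healthy colony.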
Boeke says that so far, the team has found a bug in their genome design roughly every 300 kb. “But there may be more, we may not have found them all yet!” he says. With most of the synthetic yeast chromosomes still under construction, he’s still expecting surprises. “It’s like when you release code and wait for the user feedback,” he says.
The synthetic yeast project is on track to complete all 16 chromosomes by the end of 2017. Then the team will turn to the task of putting all the chromosomes into a single cell, and seeing if it still functions as a yeast cell should. That process may yield still more bugs, Mitchell says. “It might be that individual changes on two chromosomes are well tolerated, but they don’t work when you put them together,” she says. “We may potentially have to track bugs across chromosomes.”
While BioStudio has been invaluable for the synthetic yeast project, the researchers aren’t sure whether it will be useful for other synthetic biology projects. “If you want to make the kinds of changes we made for yeast, it’s very straightforward,” says Mitchell, “but for other types of changes you’d have to write the code.” The software is open source, she notes, so interested parties could build on it.
Whether it’s BioStudio or another program, the fast-growing field of synthetic biology will need software to help geneticists explore this new design frontier: the design of life itself.
Some synthetic biology startups are trying to adapt simple organisms like yeast to make them produce useful products, such as biofuels, vaccines, or even perfume. Other researchers are more interested in constructing whole critters from scratch, in hopes of gaining new insights into the mechanics of life in the process.
The first completely synthetic genome was bacterial, constructed at the J. Craig Venter Institute in 2010; its single chromosome measured 1 Mb in length. From that start, the 12-Mb yeast genome marks a big step up. And Boeke is part of a group that has proposed to scale up considerably from the single-celled yeast. Last June they called for the creation of a synthetic human genome as part of a massive project to develop DNA assembly technology; they published an article in the journal Science that suggested a $100 million investment to get the project off the ground.
The human genome clocks in at 3 billion letters, or 3 Gb. To tackle that project, genome designers and coders may have to get together for a Synthetic Bio Hackathon.