The year is 2007, and I'm sitting at home, drinking a cup of tea, and observing a galaxy millions of light-years away. No, I don't have a fortune in cutting-edge astronomical instruments, just a personal computer and a reasonably fast Internet connection.
Scattered across the screen is a handful of images, each showing that same galaxy but at a different wavelength. The visible-light image, a five-year-old photo from the twin Keck telescopes on Mauna Kea in Hawaii, shows the classic galactic pinwheel, spiral arms twisting out from a dense, starry center. In the infrared image, captured just a few seconds ago by a mountaintop telescope in Arizona, the galaxy looks more like a series of concentric rings, the telltale signs of dust-filled regions where stars are born. A radio image from a space-based telescope also shows a bright ring, but in this case it signifies the energy thrown off by countless exploding stars. Seen in the X-ray portion of the spectrum, the galaxy's rings are completely lost, replaced by a bright central core--probably a black hole.
As I superimpose the different images, I spot something peculiar: a faint, curved wisp of infrared gas next to a bright X-ray star. Zooming in, I realize that the shock wave from a supernova explosion has smashed into a gas cloud and triggered the formation of a batch of baby stars. My fingers tremble as I dash off a message to order up a new set of images....
It's all a dream now, unfortunately. When I pore over data on my computer nowadays, even at work, I see the same information I've been chewing over for weeks or months. No instant access to new data, no effortless comparing of multiple views of the universe. Though all those other images may exist in the public domain, they're stored away in vast databases at research institutions around the world, locked up in computers that speak different languages, use different data-storage formats, and even identify the same celestial bodies by different names. Getting those images takes days or weeks of fiddling and analysis--no astronomer can pull all those streams of data together in an easy way.
Soon, though, we'll be able to. An international collaboration of astronomers and computer scientists is now piecing together the means to connect all those dispersed stores of data--many trillions of bytes' worth, collected over the last several decades by hundreds of ground-based and orbiting observatories and held in thousands of archives. Their efforts will create, in effect, the world's biggest and best telescope. Known as the Virtual Observatory, or VO, it will allow astronomers, as well as students and the general public, to easily locate and download research data over the Internet. The VO will also serve as a grid computing network, giving researchers, regardless of location or resources, the equivalent of a supercomputer on their desktops, for comparing billion-record archives or running large-scale simulations [for more on cosmological simulations, see "Computing the Cosmos" in this issue].
The VO will transform how we view the universe. With our eyes, we can see only a tiny fraction of the light that makes up the night sky. But astronomical objects shine in every portion of the electromagnetic spectrum--optical, infrared, radio, X-ray, gamma ray, and more [see box, "The Spectrum of Astronomy"]. Each band of light reveals distinct physical processes. For example, infrared radiates from the cold gas and dust clouds around forming stars, while X-rays are generated by matter cooling in the fireball of a supernova. Only by fusing together these different clues can we get deep insights into the underlying processes driving our universe [see photos, "The All-Seeing Eye"].
Ultimately, the Virtual Observatory will alter the course of discovery. Astronomers will no longer be confined to working with one or two types of instruments, and they'll be freed from the tedious searching and gathering together of data that accompany current efforts. By allowing rapid comparisons of enormous quantities of disparate data, the VO will make it possible to get comprehensive views of large-scale processes at work in the universe, shedding light on some of the most fundamental questions: How did the universe evolve? When did the stars first form? How many different kinds of galaxies are there? By giving researchers the means to comb quickly through enormous databases of images and catalogs and then compare the results, the observatory will also let them pinpoint rare events, such as the sudden, brief gamma-ray burst that occurs when certain stars die. Computer scientists are likewise betting that they can apply the cutting-edge technologies developed for the VO to other undertakings--from drug discovery to aerospace design--that require moving and manipulating huge amounts of data.
The VO encompasses a patchwork of projects organized under the International Virtual Observatory Alliance. The alliance includes more than 200 astronomers and computer scientists in at least 13 countries. In the United States, the VO effort is led by astronomers Alex Szalay at Johns Hopkins University in Baltimore and Roy Williams at the California Institute of Technology in Pasadena, assisted by computer scientist Jim Gray at the Microsoft Bay Area Research Center in San Francisco. The Europeans are led by Peter Quinn at the European Southern Observatory in Garching, Germany, and Françoise Genova at the Stellar Data Center in Strasbourg, France.
Through the VO's various working groups, the scientists are hammering out standards to make the archives interoperable, outlining the necessary IT infrastructure, and defining the VO's scientific goals. Compared with advanced astronomical instruments, which can cost several hundred million dollars to build and launch, the VO is operating on a shoestring: about US $30 million over five years.
The main pieces of a working global system are expected to be in place within two years. An early demonstration offered a tantalizing glimpse of what's possible: in 2002, astronomer Bruce Berriman and colleagues at NASA's Infrared Processing and Analysis Center, based at Caltech, used a VO matching algorithm to compare tens of millions of entries from two of the larger data archives, the visible-light Sloan Digital Sky Survey and the infrared Two Micron All Sky Survey. By looking for objects that are bright in the infrared but invisible in the optical, the hallmark of the elusive brown dwarf star, they quickly narrowed down the list to several hundred thousand objects and then to just a handful. When the search was finished, they'd identified a new brown dwarf.
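To make the idea of that search concrete, here is a minimal sketch in Python of a catalog cross-match that keeps infrared sources with no optical counterpart. It is not the matching algorithm Berriman's team used, which the article does not describe; the function names, the record layout, the 2-arcsecond match radius, and the magnitude cutoff are all assumptions chosen for illustration. A real search over tens of millions of entries would also need a spatial index rather than this brute-force loop.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions
    (haversine formula; inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def infrared_only_sources(infrared, optical, radius_arcsec=2.0, max_ir_mag=15.0):
    """Return ids of infrared sources that are bright enough and have no
    optical source within the match radius.

    The record layout (dicts with 'id', 'ra', 'dec', 'mag') and both
    default thresholds are hypothetical, chosen only for this sketch.
    """
    radius_deg = radius_arcsec / 3600.0
    candidates = []
    for src in infrared:
        # Larger magnitude means fainter, so skip sources past the cutoff.
        if src["mag"] > max_ir_mag:
            continue
        has_optical_match = any(
            angular_separation_deg(src["ra"], src["dec"], opt["ra"], opt["dec"])
            <= radius_deg
            for opt in optical
        )
        if not has_optical_match:
            candidates.append(src["id"])
    return candidates


# Made-up sample records, not real survey data.
infrared_catalog = [
    {"id": "IR-1", "ra": 10.0, "dec": 20.0, "mag": 14.0},  # has an optical match
    {"id": "IR-2", "ra": 10.5, "dec": 20.5, "mag": 14.5},  # infrared only
    {"id": "IR-3", "ra": 11.0, "dec": 21.0, "mag": 16.5},  # too faint
]
optical_catalog = [
    {"id": "OPT-1", "ra": 10.0001, "dec": 20.0001},
]

print(infrared_only_sources(infrared_catalog, optical_catalog))
```

The surviving list is only a set of candidates. As in the search described above, most of the work lies in whittling it down further before anything can be called a brown dwarf.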
Szalay, whose specialty is galaxy formation and who helped design the database architecture for the Sloan Digital Sky Survey, points to the VO as evidence of a broader trend now reshaping science. "Traditionally, science was entirely phenomenological and descriptive," he says. "Now, the quantity of scientific data is so enormous that dealing with data is a whole new discipline in itself--this is happening in every branch of science. You need to combine information management, computer science, new statistical approaches, and your own domain-specific expertise, whether that's astronomy, or genomics, or oceanography, or business."