It was the evening of 26 August 2005, and Hurricane Katrina was barreling toward the Gulf Coast of the United States. Weather models were predicting that the center of the huge and devastating hurricane would slam directly into New Orleans in two and a half days. But New Orleans officials, perhaps recalling false warnings in the past, didn’t order a mandatory evacuation until the morning of 28 August—too late to do much good. The prediction was just 50 kilometers and a few hours off target. It is now painfully clear that an evacuation order ought to have come a lot sooner.
It was an all-too-rare example of a forecasting bull’s-eye. Just a month later, the two- to three-day forecast of Hurricane Rita’s path showed the storm hitting Houston; hundreds of thousands of people evacuated, or at least tried to, but it missed the city entirely. An accurate forecast of Rita’s path would have prevented an enormous amount of disruption and even death—a bus accident during the evacuation killed 23 people.
Help is on the way. Within the next 10 years, the two- to three-day hurricane forecast will be as accurate as the generally spot-on 24-hour forecast is today. Even more important, from the viewpoint of emergency officials, the four- to six-day forecast will give solid and reliable information to base evacuation orders on. Fatalities and false alarms will be avoided. Billions of dollars will be saved every year, in all probability. And it’s not just hurricane prediction that will get better; forecasts for rainfall, heat waves, blizzards, and even day-to-day temperatures will all get much better.
Three things are going to foment this revolution in forecasting accuracy: supercomputers, satellites, and advances in the scientific understanding of how weather evolves. Supercomputer processing power, for example, is projected to increase 16-fold during the next decade, from today’s 2 trillion floating-point mathematical operations per second to a speed approaching 32 trillion flops a second.
By then, nine additional advanced weather-specific satellites will likely join the fleet orbiting Earth, providing the first direct measurements of winds and the structure of clouds. And all that data and computer power will be used to better effect as a result of research already under way on the details of how storms gather force.
In a sense, the improvements will continue a trend that has been going on for half a century. Today’s five-day weather forecast, for example, is just as accurate as the three-day forecast was in 1976 and the 36-hour forecast in 1955. And the three-day forecast position of hurricanes is now as accurate as the two-day forecasts were 25 years ago. That means today, 72 hours before a hurricane strikes land, meteorologists can predict its landing point within 100 km, about a 1-hour drive on an expressway.
The use of computers, sensors, and science to predict weather dates back to the 1950s. This numerical weather prediction is an endeavor that the U.S. National Academy of Sciences called one of the most significant scientific, technical, and political accomplishments of the 20th century.
Around the world, about a dozen major and countless smaller facilities perform numerical weather forecasting. Among the largest within their regions are the Centro de Previsão de Tempo e Estudos Climáticos, in São Paulo, Brazil; the European Centre for Medium-Range Weather Forecasts, in Reading, England; the Japan Meteorological Agency, in Tokyo; the National Centers for Environmental Prediction, in Camp Springs, Md.; and the South African Weather Service, in Pretoria. Forecasts start with observations: temperature, humidity, pressure, wind speed and direction, and cloud properties—how much water does the cloud contain, is it liquid or ice, how big are the drops or ice crystals?
Collecting the data is not a trivial task. Observers gather it from weather stations, from balloons, from aircraft, from ships at sea, and from more than a dozen satellites that circle the globe transmitting images of clouds and measurements of the radiation emitted by the clouds and the atmosphere. Forecasters use specialized software to derive data about temperature, winds, and moisture from the radiation measurements.
As wide-ranging as it is, the weather data pool today is a patchwork collection coming from different sources, taken at different times, with different accuracies, random errors and biases, and a variety of gaps. That’s because most data collection tools provide information only on wind, temperature, or some other variable, rather than a full picture of the weather at a particular point.
To create a three-dimensional model of the world's weather, the computers require in principle that all the individual pieces of data for the 1 billion or so points on the globe refer to the same instant in time. In practice, of course, they never do.
The means by which meteorologists bring all the data together and put it into a form that can be used by the models is called assimilation. It’s an iterative process: scientists use previous weather predictions as a first guess of atmospheric conditions, comparing them with actual data to spot potential problems. They also exploit the relationships among different variables to account for parameters that aren’t measured—for example, they use the horizontal distribution of temperature to calculate how fast the wind is blowing.
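In its simplest scalar form, that first-guess-and-correct idea can be sketched in a few lines. The following is a toy version of the general approach (a single variable at a single point), not any forecast center's operational assimilation code:

```python
def analyze(background, obs, var_b, var_o):
    """Blend a first-guess (background) value with an observation,
    weighting each by its error variance. The analysis leans toward
    the observation when the background is the more uncertain of the two."""
    weight = var_b / (var_b + var_o)
    return background + weight * (obs - background)

# The previous forecast says 20.0 degrees C (error variance 1.0); a sensor
# reads 22.0 degrees C (error variance 1.0). Equal trust -> split the difference.
print(analyze(20.0, 22.0, 1.0, 1.0))  # 21.0
```

Operational assimilation does this simultaneously for millions of interrelated variables, but the weighting principle is the same.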
Forecasters use supercomputers to plot the data on a 3-D map of the earth’s atmosphere. This resulting grid of data sets the initial state of a weather model. The computer systems apply basic physics equations to determine how different parameters will change during a period of days and affect the weather in a particular place. The equations include Newton’s second law of motion, which states that a force acts on a mass and produces an acceleration; the first law of thermodynamics, which relates temperature changes in a mass to heat added or removed from that mass; and the law of conservation of mass, which precludes mass in the atmosphere being either created or destroyed. Another is the equation of state, which relates the pressure, temperature, and density of a fluid.
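The last of those, the equation of state, is the easiest to show concretely: for dry air it reduces to the ideal gas law, p = ρRT. A minimal sketch (the gas constant for dry air is standard; the sample numbers are the familiar sea-level standard atmosphere):

```python
R_DRY = 287.05  # specific gas constant for dry air, J/(kg*K)

def air_density(pressure_pa, temp_k):
    """Equation of state for dry air, p = rho * R * T, solved for density."""
    return pressure_pa / (R_DRY * temp_k)

# Sea-level standard atmosphere: 101325 Pa at 15 degrees C (288.15 K).
print(round(air_density(101325.0, 288.15), 3))  # 1.225 (kg per cubic meter)
```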
Repeated over and over on the barrages of data flowing in from global sensors, the equations take myriad fluctuations in data and compute effects. An increase or decrease in the temperature of a section of the atmosphere leads to changes in barometric pressure—which, in turn, change the direction and speed of the winds, and so on. Typically, the modeling computers run the set of equations 5000 times to project weather conditions one day ahead; it takes about 10 minutes on a 2-teraflop computer.
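Those figures are roughly self-consistent, as a back-of-envelope check shows (assuming the 15-second step quoted above; the "about 5000" in the text is an order-of-magnitude figure):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400
TIME_STEP_S = 15                # typical model time step cited in the text

# Number of times the equations must be stepped to advance one forecast day.
steps_per_forecast_day = SECONDS_PER_DAY // TIME_STEP_S
print(steps_per_forecast_day)   # 5760 -- on the order of the ~5000 runs cited
```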
Global and regional weather models feed off each other. The regional models take the output of global ones and fine-tune it to predict local weather. Forecasters combine the results of both global and regional models to create the weather predictions you see on television and in newspapers. The global model gives the basic picture, and the regional models fill in local details. For example, in the Katrina forecast, the global model predicted the storm track and provided the regional model with the information needed for it to give the specifics about the storm’s intensity and evolution.
The process is not as straightforward as it might seem, because the atmosphere is a nonlinear system—that is, it can behave chaotically and therefore be impossible to model with complete accuracy. In a nonlinear system, a minor alteration of initial conditions is typically magnified into an enormous change. The most nonlinear weather phenomena, and therefore the most difficult to forecast, are low-pressure systems, fronts, and thunderstorms—the weather events that people are most concerned with.
Whether your picnic is ruined by a tumultuous summer thunderstorm or is undisturbed by a fat cumulus cloud drifting overhead can be determined by a factor as minor as a 2 °C temperature difference or a 5 percent difference in relative humidity 12 hours before you spread out your blanket. If that thunderstorm forms, it can change the local winds, atmospheric pressure, and temperature for such a short time and on such a small scale that its effect is not picked up by sensors and therefore not included in the data that forecasters input to set a model’s initial conditions. Because of nonlinearity, the imperfection created by that missing data grows unpredictably over time and makes subsequent forecasts wrong.
That is one example of the so-called butterfly effect, the hypothetical idea that, under the right circumstances, small changes such as the flapping of a butterfly’s wings can eventually cause a huge weather disturbance thousands of kilometers away.
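Edward Lorenz, who popularized the idea, demonstrated it with a three-variable convection model. The sketch below (a forward-Euler integration with Lorenz's classic parameter values; purely illustrative) shows how a run whose initial state differs from another's by one part in a million ends up far away from it:

```python
def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz (1963) convection model."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def max_divergence(perturbation, steps=3000):
    """Largest gap in x between a reference run and a copy whose initial
    state is nudged by `perturbation` -- the butterfly's wing flap."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + perturbation, 1.0, 1.0)
    worst = 0.0
    for _ in range(steps):
        a, b = lorenz_step(a), lorenz_step(b)
        worst = max(worst, abs(a[0] - b[0]))
    return worst

# A one-part-in-a-million nudge eventually opens an order-one gap.
print(max_divergence(1e-6))
```

An identical initial state, by contrast, never diverges at all; the growth comes entirely from the nonlinearity amplifying the tiny difference.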
Weather models, like everything else in the digital world, are discrete. They compute key meteorological parameters such as temperature and relative humidity—the amount of moisture in the air as a fraction of the amount it could hold at that temperature—at regularly spaced points on a 3-D grid enveloping the Earth. The distance between the points on that global grid defines the resolution of the model. And just as in today’s digital cameras, the higher the resolution of a model, the clearer the picture of future weather it produces.
Today’s global models typically use a 35-km horizontal resolution; the resolution of regional models can be as small as 5 km. Three such regional models cover the continental United States.
Today’s resolution, however, is often not good enough. For example, even 5 km is too coarse to accurately simulate a typical thunderstorm. Sometimes, too, the models miss more serious weather conditions, including tornadoes, squalls, floods, or the location of the line separating rain and snow in a major winter storm.
There are countless examples of disasters exacerbated by inaccurate predictions. In one case, in October 2003, the U.S. Global Forecast System (GFS) Model, on which most U.S. forecasts are based, predicted that in 48 hours about 2.5 centimeters of rain would fall off the coast of the state of Washington, as well as an insignificant amount further inland. A storm did indeed appear, on 20 October, but it dumped more than 30 cm of water inland, making 21 October the wettest day in Seattle history and causing devastating floods in seven counties. The model’s limited resolution was the likely culprit in this forecast bust.
Basically, the model had missed conditions off the coast of Asia, including several tropical storms and a developing typhoon. In their early stages, these storms were just too small to register on the model’s 35-km resolution. They kicked off a wavelike disturbance in the atmosphere that carried tropical moisture toward the North American coast. In effect, it was an airborne tsunami. When this wave hit the already developing storm, the residents of Washington had a major flood.
Clearly, spacing the data points more closely together on the grid would improve forecast accuracy. But halving the distance between grid points in all three dimensions increases the number of points by a factor of eight—two in each spatial dimension. That increase in spatial density also requires halving the time step, because the ratio of the time step to the grid spacing has to stay within certain bounds; otherwise, numerical errors pile up and render the forecast useless. Today the time step is typically about 15 seconds.
Therefore, doubling the resolution of the model would increase the computing requirements by a factor of 16. If you tried to look a week ahead with a global model and one of today’s computers, it would take most of a day to execute the whole program.
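The scaling is simple enough to work out explicitly (a sketch; the starting point count and step count here are arbitrary illustrative numbers):

```python
def refine(points, time_steps, doublings=1):
    """Halve a 3-D model's grid spacing `doublings` times.
    Points grow 8x per doubling (2x in each of three spatial dimensions),
    and the time step must halve as well, doubling the number of steps:
    16x the total work per doubling of resolution."""
    factor = 2 ** doublings
    return points * factor ** 3, time_steps * factor

# One doubling of resolution on an illustrative starting grid.
pts, steps = refine(points=1_000_000, time_steps=5760)
print(pts, steps)  # 8000000 11520 -> 16x the arithmetic overall
```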
That’s too much time, because the computers used today to do weather simulations have other duties. For example, in the United States, the same computers used to run the global models also tackle data assimilation and compute the regional models, as well as long-range climate and ocean models, which are growing in importance. And the computer runs the entire set of models again from new data every 12 hours for global models and as often as every three hours for regional predictions.
So today’s models are a compromise between resolution and realistic run time. For example, the U.S. GFS Model has a 35-km horizontal resolution and a 60-level vertical resolution. To look ahead 10 days, it takes a little more than an hour and a half on a 308-processor IBM Power4 supercomputer, a system with an average processing capacity of about 2 teraflops.
Of course, Moore’s Law, the anticipated periodic doubling of transistor density and therefore processing power of ICs, suggests that in about 10 years meteorologists will be able to increase by a factor of eight the resolution of numerical weather models and still expect to churn through them in a reasonable amount of time.
Researchers are already testing such higher-resolution models as simulations. These simulations are currently impractical for operational forecasting, as they take longer than real time to produce results. In some cases, the simulations cover the entire globe, but most of them concentrate on worrisome events that could take shape in specific regions—for example, hurricanes or severe weather during the spring and early summer in the continental United States. The simulations have demonstrated that higher-resolution models can improve predictions of such difficult weather features as hurricane intensity and heavy rainfall.
Today, forecasters can’t predict storm intensity more than a few hours in advance with any confidence. With high-resolution modeling, researchers conjecture that forecasters will be able to accurately estimate hurricane intensity several days in advance.
In the absence of sufficiently powerful computers, forecasters have to rely on clever techniques to minimize the potential of errors being magnified by the chaotic nonlinearity of weather. One of their key strategies is to run groups of models, called ensembles, rather than single models.
To produce a forecast ensemble, a numerical weather prediction system repeatedly runs the same weather model but changes the initial state slightly each time by adding small increments to the original atmospheric measurements. A temperature reading of 21 °C at a certain grid point, for instance, might be changed by a 10th of a degree. A relative humidity at a different grid point might be increased by half a percent, and so on.
A range of reasonable forecasts emerges. As forecasters execute the programs repeatedly, one member of this forecast ensemble appears most frequently, and meteorologists regard that as the most likely forecast. They don’t discard the other members but rather use them to calculate the probability that a particular forecast will come to pass—for example, a 60 percent chance of rain or a 70 percent chance of a hurricane’s striking a particular section of coastline.
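A toy sketch of the idea, using a stand-in chaotic map rather than a real weather model (all the numbers here are illustrative; only the structure of the procedure mirrors real ensemble forecasting):

```python
import random

def toy_model(x, steps=50):
    """Stand-in nonlinear 'model' (the chaotic logistic map).
    Purely illustrative; not a real forecast model."""
    for _ in range(steps):
        x = 3.9 * x * (1.0 - x)
    return x

def ensemble_probability(x0, members=200, spread=0.001, threshold=0.5, seed=1):
    """Run the model once per member, each from a slightly perturbed initial
    state, and report the fraction of members ending above `threshold` --
    the analog of 'a 60 percent chance of rain'."""
    rng = random.Random(seed)
    hits = sum(
        toy_model(x0 + rng.uniform(-spread, spread)) > threshold
        for _ in range(members)
    )
    return hits / members

print(f"chance of 'rain': {ensemble_probability(0.2):.0%}")
```

Because the toy map is chaotic, the tiny spread in initial states fans out into a broad range of outcomes, and the member count above the threshold becomes a probability.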
Ensembles were key to the confidence researchers had in the numerical forecasts of Hurricane Katrina’s path. On 25 August, as the storm passed over Florida, the predictions generated by different members were all over the place—atmospheric conditions magnified small changes dramatically. For instance, the winds in the area as Katrina approached Florida were so divergent that a minor difference in the predicted location at that point led to major differences in predictions of the storm’s path further out.
Forecasters therefore were unable to say with any real confidence where Katrina was heading. But by 26 August, the ensemble predictions began to coalesce into a smaller range of potential forecasts. By 27 August, one path clearly had a very high degree of probability, and forecasters could confidently predict the hit near New Orleans, which occurred on 29 August. Accurately communicating this changing degree of uncertainty to decision makers and the public is a critical challenge for forecasters.
Ensembles, of course, add to the processing burden. But the advantage of the ensemble approach is that the burden can be easily shared among many computers—even ones separated by continents or oceans. And thanks to international cooperation in ensemble modeling, we may be able to use larger ensembles long before 10- or 20-teraflop computers become available.
Modeling organizations in different countries are already sharing their research as well as results of complete model runs. A forecaster in the United States is likely to check his results against predictions about the same weather event computed by his colleagues in the United Kingdom, for example.
But collaboration by several countries in simultaneously producing a single ensemble forecast is not yet possible. That kind of collaboration would be a major step forward: it would allow individual facilities to each run fewer versions of higher-resolution models, resulting in greater accuracy.
In hopes of making collaboration on individual ensembles routine, the world’s major forecast centers are starting to work together. Already the United States and Canada are sharing their ensemble members and combining them to produce a single forecast. Such collaboration requires that forecasters standardize independently developed models by putting the predictions from the different models onto a common grid and then calibrating the system to remove the biases inherent in each model. Some models, for example, predict consistently warmer or colder temperatures than others.
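The bias-removal step can be sketched in a few lines. This is a deliberately simplified scalar version (operational calibration works per grid point, per variable, and per season, among other refinements):

```python
def mean_bias(forecasts, verifications):
    """Estimate one model's systematic bias from past
    forecast/observation pairs at common grid points."""
    return sum(f - v for f, v in zip(forecasts, verifications)) / len(forecasts)

def calibrate(new_forecast, bias):
    """Remove the estimated bias from a fresh forecast."""
    return [t - bias for t in new_forecast]

# A model that has run 1.0 degree warm on average gets nudged down.
past_fc = [21.0, 15.0, 18.0]
past_obs = [20.0, 14.0, 17.0]
b = mean_bias(past_fc, past_obs)
print(calibrate([25.0, 10.0], b))  # [24.0, 9.0]
```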
Work is under way to combine the ensemble models of the world’s major global forecast centers. In January 2005, the 187 nations of the World Meteorological Organization (WMO), based in Geneva, under the auspices of the United Nations, launched a 10-year effort to accelerate the current rate of improvement in forecasts and reduce worldwide fatalities from weather events by half. The project is called The Observing System Research and Predictability Experiment (Thorpex). Its aims include researching the benefits of producing a single international forecast by combining the ensembles of all the various forecast systems.
Also poised to make a huge leap are the number and quality of platforms and sensors that feed data to the models. Even with today’s global satellite observations, the data collection process has gaping holes. Most existing weather satellites measure only visible and infrared radiation. And none can measure even a single variable all over the globe at one time. Instead, data stream in continuously to the assimilation system software, with information from different locations coming from different satellites.
Some of the satellite sensors, including those that detect visible and infrared radiation, can’t see below the clouds and therefore can’t make measurements when cloud cover is present. Newer sensors, like those that use microwave radar, can penetrate cloud cover. And modelers increasingly try to incorporate other types of satellite measurements, such as data on aerosol and ozone concentrations or on the character of the ocean’s surface.
They’re also exploring detailed and direct measurements of the wind. These data could greatly improve accuracy in forecasting conditions in which nonlinearity is likely to have a huge effect, such as in predictions of hurricanes and intense winter storms. Today direct measurements of the wind are infrequent.
The WMO projects that the amount of weather-related data provided by satellite systems will increase by a factor greater than 10 000 during the next decade, thanks to the launch of new satellites in the next few years. This remote-sensing revolution will vastly improve our ability to characterize the atmosphere and the earth’s surface.
To help do that job, six satellites were launched simultaneously in April as a joint venture between the United States and Taiwan. The United States calls the project COSMIC (for Constellation Observing System for Meteorology, Ionosphere, and Climate). In Taiwan the moniker is Formosat-3. COSMIC/Formosat-3 is designed to track temperature in the upper atmosphere up to 55 km and to give detailed 3-D information on the distribution of temperature and water vapor in the troposphere—crucial for predicting precipitation. The satellites use a technique called radio occultation: they intercept signals from Global Positioning System satellites after the signals pass through the atmosphere close to the horizon, calculate the delay in the signals' expected arrival, and relate that delay to the bending of the signals' path, which depends on atmospheric conditions.
NASA added two more satellites, CloudSat and CALIPSO, to the weather-data effort in April. CloudSat uses microwave radar to map the vertical structure of clouds. The strength of the radar signal that returns to the satellite is related to the amount of water in a cloud. CALIPSO (for Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) is a joint venture with the French space agency, CNES. CALIPSO uses light detection and ranging, or lidar, a technology that sends out laser pulses in the infrared and visible spectra and senses the backscattered radiation. CALIPSO's lidar studies aerosols—tiny particles suspended in the atmosphere, such as dust, smoke, and pollutants.
CloudSat and CALIPSO are helping scientists understand how clouds form, evolve, and affect weather and the climate. Researchers expect the information to let them improve forecasts of events involving heavy rainfall, an area where progress has been much slower than for other types of forecasts. Improving the prediction of heavy rainfall by one day in the next 10 years would be a huge breakthrough. That is, four-day forecasts in 2016 would be as accurate as three-day forecasts are today.
Next year, the Paris-based European Space Agency, in a project led by researchers in the United Kingdom, plans to launch Aeolus, the first satellite designed specifically to collect data on wind. Aeolus also will use lidar, in this case to produce wind profiles at different altitudes. Aeolus is expected to track the length of time it takes the laser signal to bounce back from the aerosols and use that information to determine the altitude of the particles. It also will track the change in the frequency of the light, called the Doppler shift, to determine the wind speed and direction.
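The wind-speed arithmetic behind Doppler lidar is straightforward: the round trip of the light doubles the shift, so the line-of-sight speed is the shift times the wavelength over two. The sketch below assumes the 355-nanometer ultraviolet wavelength commonly cited for Aeolus's lidar; the sample shift is made up for illustration:

```python
def radial_wind_speed(doppler_shift_hz, wavelength_m=355e-9):
    """Line-of-sight wind speed from the Doppler shift of backscattered
    lidar light. The round trip doubles the shift, so v = shift * lambda / 2."""
    return doppler_shift_hz * wavelength_m / 2.0

# Working backward: a 10 m/s wind along the beam shifts 355-nm light
# by about 56 MHz, which the function recovers as 10 m/s.
shift = 2 * 10.0 / 355e-9
print(radial_wind_speed(shift))  # about 10.0
```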
These are all research satellites, intended to improve the science of weather prediction. New and better operational satellites, those used for current forecasting, are also contributing to accuracy. The United Kingdom began replacing its two Earth-observing geostationary satellites in 2002; Japan replaced its single geostationary weather satellite last year. The two U.S. geostationary satellites are due to be replaced starting in 2012. The United States also plans to launch a set of three polar orbiting satellites in 2008 to replace its current civilian and defense polar satellites; the European Space Agency plans to launch Europe’s first polar orbiting meteorological satellite this summer. And in the same way that the burden of ensemble modeling will someday be shared around the world, 44 countries are developing plans to share their collected data as part of the Global Earth Observation System of Systems, or GEOSS.
Although the new satellites can help fill important gaps in our global observing system, the information flowing from them is likely to overwhelm our ability to process it. We must fine-tune data assimilation software to select those items most important to a forecast and then use the system to determine the initial conditions for the model.
Researchers are discovering that parts of the global weather model are more susceptible to the butterfly effect than others, and they expect to use that knowledge to avoid forecasting errors. It turns out that the extent to which an initial data error multiplies as a model runs depends on where in the world, or where in a particular weather pattern, an erroneous approximation occurs.
Decades of experience in running weather models has taught us that, for example, using incomplete data from the region just off the coast of Asia, particularly in the winter, can cause dramatic mistakes, like the floods in Seattle described earlier. By contrast, data gathered from high-pressure systems sitting over an area of fair weather adds little to forecast accuracy. Summer weather in the Midwestern United States, for example, can be clear and stable for weeks.
So researchers are considering doing something that would have been considered heretical until recently: making selective use of satellite observations. They would gather and assimilate huge volumes of satellite data in certain times and places, like the coast of Asia in winter, where uncertainty in the initial conditions tends to degrade the forecast dramatically. Conversely, they’d thin the satellite data flow in areas like the Midwest in summer, where the potential for error growth is smaller.
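A toy sketch of such selective thinning follows; the sensitivity scores, threshold, and thinning rate are all made up for illustration, and real systems weigh many more factors:

```python
def thin_observations(observations, sensitivity, keep_every=5, threshold=0.5):
    """Selective use of satellite data (illustrative only): keep every
    observation where forecast sensitivity to initial-condition errors is
    high, but only every `keep_every`-th observation where it is low."""
    kept = []
    for i, (obs, sens) in enumerate(zip(observations, sensitivity)):
        if sens >= threshold or i % keep_every == 0:
            kept.append(obs)
    return kept

obs = list(range(10))                  # stand-ins for satellite soundings
sens = [0.9, 0.1, 0.8, 0.1, 0.1,       # high scores: "coast of Asia in winter"
        0.1, 0.1, 0.6, 0.1, 0.1]       # low scores: "Midwest in summer"
print(thin_observations(obs, sens))    # [0, 2, 5, 7]
```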
Meteorologists have begun to further attack uncertainty by taking additional observations in sensitive regions using weather balloons, remotely piloted aircraft, commercial aircraft, and stratospheric balloons, which fly higher and longer than standard weather balloons. The U.S. National Oceanic and Atmospheric Administration and the U.S. Air Force Reserve Hurricane Hunters have collaborated on such a targeting strategy to reduce errors in the prediction of hurricane landfall, flying aircraft near and even through hurricanes and releasing half-kilogram instrument packages, called dropsondes, that transmit temperature, wind, humidity, and pressure data as they fall through the atmosphere.
Taiwan recently began using such dropsondes for typhoon surveillance. And for the current hurricane season in the North Atlantic, a collaboration of CNES and the U.S. National Center for Atmospheric Research, in Boulder, Colo., deployed a fleet of dropsonde-carrying stratospheric balloons from Africa.
Thanks to all these advances—in data collection and assimilation, in computer processing power, and in our understanding of how weather evolves—we will have the capability within a decade to dramatically change weather prediction, perhaps even making accurate forecasts twice as far in advance as we do today for many disastrous weather events. Historically, global weather prediction accuracy has improved by one day every 10 to 15 years. But researchers are optimistic, based on early experimental results, that the changes under way will accelerate that rate of improvement dramatically, allowing accurate hurricane position forecasts five to six days in advance.
What will that mean? Fast forward to August 2016. Hurricane Karl is due to make landfall somewhere on the east coast of the United States sometime on the 29th of the month. On the 23rd, forecasters pinpoint the region most likely to be affected by the hurricane’s path—Fort Lauderdale, Fla., on the Atlantic Coast. Emergency officials begin planning an evacuation. Buses are ordered. Emergency shelters outside the potential path of the hurricane are identified and stocked with food, water, cots, and blankets. People in the potential landfall area contact friends and relatives for emergency housing and do what they can to protect their valuables.
On the 25th, forecasters narrow down Karl’s landfall to within 50 km. They estimate the intensity, size, and structure of the storm and predict the height of the waves and the storm surge. Karl is a narrow and intense storm, so the governor gives an order to evacuate 100 km of coastline, as opposed to the 300 km typically evacuated today, just to be safe. Aware of several years of accurate forecasts preceding this one, residents are quick to pack their cars or make reservations on the evacuation buses. Emergency workers fly in to build barriers around low-lying areas and protect valuable public property.
One day before the predicted hurricane strike, Fort Lauderdale and its coastal neighbors are ghost towns, except for a few TV crews setting up remote cameras and getting ready to leave.
On 29 August, Karl hits as predicted. Thanks to the efforts to protect key low-lying areas, damage is reduced by hundreds of millions of dollars as people have more time to move belongings to higher ground and get boats, automobiles, and expensive industrial equipment out of the path of the storm. And, much more important, thanks to the orderly evacuation, no one is killed.
About the Authors
Robert Gall is director of the Developmental Testbed Center at the U.S. National Center for Atmospheric Research (NCAR), in Boulder, Colo.
David Parsons is a senior scientist at NCAR and co-lead for North American activities under the World Meteorological Organization's Observing System Research and Predictability Experiment (Thorpex).