PSC staff and resources make possible an unprecedented experiment in “ensemble” forecasting of severe storms

The night of May 4, 2007 won’t fade soon from the memory of people in Greensburg, Kansas. An extremely powerful tornado took only a few minutes to flatten almost every above-ground structure in this southwest Kansas town, claiming 10 lives. Catastrophic as it was, the loss would have been worse but for a very strongly worded warning from the National Weather Service office in Dodge City, about a half-hour in advance of the funnel’s arrival, that residents credit with allowing most people to find safe shelter.

What if the warning had come a half-day in advance? Thunderstorms are difficult to predict, and “supercells,” the high-energy vortex systems that spin off tornadoes, are notoriously difficult. Nevertheless, if it were possible six or eight hours or more in advance to say when, where and with how much force a severe storm would strike, millions of dollars annually, if not billions, and countless lives would be saved.

Ming Xue, director of CAPS, and Steven Weiss, science and operations officer of the NOAA Storm Prediction Center in Norman

Scientists at the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma, Norman know that it is possible to dramatically improve severe-storm forecasting, and their ground-breaking work over the past 15 years — often in partnership with PSC — has convinced many skeptics. They took a further large step this spring, collaborating with NOAA (National Oceanic and Atmospheric Administration), PSC and others in an unprecedented experiment.

Running from April 15 to June 8, the 2007 NOAA Hazardous Weather Testbed (HWT) Spring Experiment had the ambitious goal of testing “ensemble” forecasting: multiple runs of a forecast model that make it possible to quantify the uncertainty in the overall forecast. CAPS relied on PSC resources to run a 10-member ensemble (10 runs each day of a high-resolution model with four-kilometer grid spacing) over a domain that extended from the Rocky Mountains to the East Coast, two-thirds of the continental USA. In addition, CAPS and PSC ran a single higher-resolution run (two-kilometer spacing) over the same domain. The HWT experiment marked the first time ensembles were used at the storm scale, and it also marked their first use in real time in a simulated operational forecast environment.

To make this happen, PSC brought to bear BigBen, its Cray XT3 (a lead system of the TeraGrid), and helped to bring about a dedicated Pittsburgh-to-Oklahoma high-bandwidth link, contributed by Cisco Systems, Inc. Most critically, PSC staff pitched in with know-how. They optimized performance of the forecast model, automated the daily forecast runs, coordinated the high-bandwidth link and, meeting the challenge of an enormous amount of computing and data handling, produced forecasts each day for the eight weeks of the experiment with no serious problems.

“The experiment,” says Ming Xue, director of CAPS, “was enormously successful. Scientists who took part were extremely impressed by what PSC was able to pull off. Most other places would struggle to do a single forecast run per day, and we did 10 ensemble runs plus a high-resolution run.”

“Ensembles have been used extensively in larger-scale models,” says Steven Weiss, science and operations officer of the NOAA Storm Prediction Center (SPC) in Norman. “But they have never before been used at the scale of storms. This was unique — both in terms of the forecast methodology and the enormous amount of computing. The technological logistics to make this happen were nothing short of amazing.”

Setting the Stage

Scientific Collaborators in Storm Forecasting

Kelvin Droegemeier, director of LEAD and former director of CAPS. “We’ve been doing this with Pittsburgh since the mid-90s, and there’s no question that had we not been developing this relationship over time, we wouldn’t be where we’re at today.

“PSC staff have been scientific collaborators in the deepest sense. They work hand-in-hand, not just to get our codes to run, but with networking and data transfer, how the code is structured on the machine. You have to build up trust, and people at Pittsburgh from Wendy Huntoon [director of networking], David O’Neal [PSC scientist], Ralph and Michael [PSC directors], they all get a kick out of doing this work with weather.”

In 1989, when CAPS started its work, the prevailing view about storm-scale weather forecasting was skepticism; thunderstorms were thought to be inherently chaotic and unpredictable. Much has changed. CAPS developed innovative techniques to gather atmospheric data from Doppler radar, and they developed a forecast model to use this data at the scale of thunderstorms. In the mid-90s, via a series of spring collaborations with PSC, CAPS proved the feasibility of forecasting thunderstorms.

In 2005, in a large-scale collaborative experiment with NOAA, again with major support from PSC, CAPS took another leap forward by showing that it was possible, in some circumstances, to predict the details of a thunderstorm as much as a full day in advance. “We provided dramatic new evidence that the predictability of organized deep convection is, in some cases, an order of magnitude longer — up to 24 hours — than suggested by prevailing theories of atmospheric predictability,” says former CAPS director Kelvin Droegemeier.

An opportunity to build on this prior work, the 2007 HWT Experiment involved more than 60 researchers and forecasters from government agencies, universities and the private sector. Along with PSC, CAPS and SPC, other collaborators were the NOAA National Severe Storms Laboratory in Norman; the NOAA National Centers for Environmental Prediction Environmental Modeling Center; the National Center for Atmospheric Research; LEAD (Linked Environments for Atmospheric Discovery), an NSF Large Information Technology Research grant program and TeraGrid Science Gateway; and the National Center for Supercomputing Applications (NCSA) in Illinois, a lead TeraGrid resource provider.

To implement the CAPS daily forecast runs using the WRF (Weather Research and Forecasting) model on the XT3, PSC provided technological and staff assistance at several levels:

  • PSC networking staff coordinated with OneNet, a regional network of the State of Oklahoma, with National Lambda Rail (NLR), a network initiative of U.S. universities, and with Cisco Systems, which contributed a dedicated “lambda” (a 10-gigabit-per-second optical-network link) for up to a 12-month period.
  • PSC implemented and began testing the lambda at its end in January, using existing equipment in the Pittsburgh metro and local-area network. NLR provided the backbone, and OneNet provided the link from Tulsa to Norman, Oklahoma.
  • This dedicated link — from the Cray XT3 to OneNet in Tulsa to a supercomputer at the University of Oklahoma (which ingested and post-processed the data) — made possible the transfer of 2.6 terabytes of data per forecast day.
  • PSC staff optimized the latest version of the WRF model to run on the Cray XT3, gaining a threefold speedup in input/output (I/O) of the WRF code, substantially improving overall performance.
  • PSC also optimized the I/O for post-processing routines used to visualize and analyze the forecast output, achieving a 100-fold speedup.
  • PSC modified the reservation and job-processing logic of its job-scheduling software to automatically schedule the WRF runs and related post-processing, 760 separate jobs each day, demonstrating the TeraGrid’s ability to use the Cray XT3, a very large “capability” resource, on a scheduled, real-time basis.
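For a rough sense of scale, the sketch below estimates how long the daily 2.6 terabytes would take to cross a 10-gigabit-per-second link. It assumes decimal units (1 terabyte = 10^12 bytes); the function name and the efficiency values are hypothetical illustrations, not measurements from the experiment.

```python
def transfer_seconds(data_bytes: float, link_bits_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Time to move data_bytes over a link running at a given fraction of line rate."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return data_bytes * 8 / (link_bits_per_s * efficiency)


DAILY_BYTES = 2.6e12   # 2.6 terabytes per forecast day (from the article)
LINK_RATE = 10e9       # 10 gigabits per second (from the article)

# Efficiency values are assumptions chosen only to bracket the estimate.
for eff in (1.0, 0.5):
    minutes = transfer_seconds(DAILY_BYTES, LINK_RATE, eff) / 60
    print(f"efficiency {eff:.0%}: {minutes:.1f} minutes")
```

At full line rate the daily volume would need about 35 minutes of transfer time, and about 69 minutes at half that rate, which suggests why a dedicated link mattered for a forecast that had to arrive on schedule.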

Another Leap Forward

The daily runs of the 10-member WRF ensemble each used 66 processors of PSC’s Cray XT3 for 6.5 to 9.5 hours each day. The single high-resolution WRF forecast used 600 XT3 processors for about nine hours daily, including data dumps at five-minute intervals via the dedicated network to a system in Norman, where CAPS post-processed the data for visualization and analysis.
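The daily computing budget implied by these figures can be tallied in a few lines. This is a back-of-the-envelope sketch: it treats "about nine hours" as exactly nine, and it assumes each ensemble member held its 66 processors for the full run; the function and constant names are hypothetical.

```python
ENSEMBLE_MEMBERS = 10
PROCS_PER_MEMBER = 66
MEMBER_HOURS_RANGE = (6.5, 9.5)   # shortest and longest daily member runs
HIRES_PROCS = 600
HIRES_HOURS = 9.0                 # "about nine hours", taken as exact here


def daily_processor_hours(member_hours: float) -> float:
    """Processor-hours per day for the ensemble plus the high-resolution run."""
    ensemble = ENSEMBLE_MEMBERS * PROCS_PER_MEMBER * member_hours
    hires = HIRES_PROCS * HIRES_HOURS
    return ensemble + hires


low, high = (daily_processor_hours(h) for h in MEMBER_HOURS_RANGE)
print(f"ensemble processors in total: {ENSEMBLE_MEMBERS * PROCS_PER_MEMBER}")
print(f"daily processor-hours: {low:.0f} to {high:.0f}")
```

Under these assumptions the ensemble occupies 660 processors in total, and the whole daily workload comes to roughly 9,700 to 11,700 processor-hours.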

Ensemble Predictions: Spaghetti Plots & Postage Stamps

Forecast Graphics

This plot [left] from the forecast for noon central standard time (18 universal time) on May 24, 2007 (from a model run on May 23) shows probability of radar reflectivity (proportional to intensity of precipitation) derived from the 10-member ensemble forecast 21 hours in advance compared to actual observed radar reflectivity on May 24 at noon CST [right].

The “spaghetti plot” [top] for May 24 at noon CST represents the distribution of the radar-reflectivity forecast among the 10 ensemble runs, each represented by a different color. The corresponding “postage stamp” display shows each of the 10 ensemble forecasts. The differences among them give a quantifiable range of the uncertainty associated with model errors and limitations on storm-scale predictability. By contrast, the traditional single-member (deterministic) forecast produces only one prediction, good or bad.
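The article does not say how the probability field was derived. One common approach, sketched here as an assumption, is to take the fraction of ensemble members whose forecast exceeds a threshold at each grid point, with the standard deviation across members as a simple measure of spread. The 40 dBZ threshold, the five toy members and the function names are all hypothetical.

```python
from statistics import pstdev


def exceedance_probability(members, threshold):
    """Fraction of members at or above threshold at each grid point."""
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return [[sum(m[i][j] >= threshold for m in members) / n
             for j in range(cols)]
            for i in range(rows)]


def ensemble_spread(members):
    """Standard deviation across members at each grid point."""
    rows, cols = len(members[0]), len(members[0][0])
    return [[pstdev([m[i][j] for m in members]) for j in range(cols)]
            for i in range(rows)]


# Toy reflectivity forecasts (dBZ) on a 2x2 grid; values are made up.
members = [
    [[45.0, 10.0], [30.0, 42.0]],
    [[50.0, 5.0], [41.0, 38.0]],
    [[38.0, 0.0], [44.0, 47.0]],
    [[42.0, 12.0], [20.0, 41.0]],
    [[47.0, 8.0], [43.0, 35.0]],
]

prob = exceedance_probability(members, threshold=40.0)
spread = ensemble_spread(members)
print(prob)   # [[0.8, 0.0], [0.6, 0.6]]
```

A single deterministic run would report only "storm" or "no storm" at each point; the ensemble turns agreement among members into a probability and disagreement into a spread.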

It was an unprecedented marshaling of computing and network capability. “To do this amount of processing with the ability to transfer these huge volumes of data from Pittsburgh to Norman on a reliable, daily, timely basis was a technological success that we’re still appreciating,” says Weiss. “It hasn’t been done before. There were people who thought ‘How are they going to do that?’”

Over its eight weeks, the HWT experiment produced massive amounts of data, and the analysis is ongoing. “Preliminary findings,” says Xue, “show successful predictions of the overall pattern and evolution of many of the convective-scale features, sometimes out to the second day. The ensemble shows good ability in capturing storm-scale uncertainties.”

David O’Neal, PSC scientist. A paper by Xue, Droegemeier, Weiss and others reporting preliminary findings of the HWT experiment acknowledged that O’Neal “provided ever present technical and logistic support that was essential to the success of the forecast experiment.”

The case of a large-scale north-south storm front that developed on May 23 and continued for several days across the Great Plains exemplifies the ability of the storm-scale runs to produce a forecast that holds up for the storm system as a whole as much as 33 hours in advance. “The rather successful prediction of the overall pattern and evolution of this squall-line case,” says Xue, “is very encouraging, especially considering that we are examining convective-scale predictions for up to 33 hours.”

The HWT experiment also used capabilities developed by LEAD (Linked Environments for Atmospheric Discovery), a TeraGrid Science Gateway, to test “on-demand” forecasts. Triggered automatically from daily SPC forecasts indicating regions within the overall HWT domain where storms were likely to develop, these forecasts used NCSA’s Ensemble Broker software and its 16-teraflop Tungsten system to run at two-kilometer resolution within the regions of high storm likelihood. More than 1,000 of these on-demand forecasts ran successfully.

“This experiment was an enormous leap forward,” says Droegemeier, who directs LEAD and as former CAPS director has led several other storm-forecast experiments over the past decade.