Manufacturing Process and Design
Turn, Turn, Turn: Turbine Simulation at Westinghouse Science and Technology Center
Jobs that would take three months before now run in under 12 hours.

Electrical power generation is a multi-billion dollar global business, with developing countries creating a growing demand for the 21st century. For companies like Westinghouse, the key to gaining an edge in this fiercely competitive market is more efficient turbines. Regardless of the energy source (hydroelectric, coal-fired or nuclear), turbines are the bulwark of modern power plants. These huge, jet-engine-like machines do the heavy-duty work of converting raw energy into megawatts of electricity, and even slight improvements in turbine efficiency translate into significant reductions in the cost of generating power.

Can high-performance computing help design more efficient turbines? That question confronted senior scientist Paul Cizmas of the Westinghouse Science and Technology Center in 1996. "We needed more realistic simulations and faster turnaround," says Cizmas. "We wanted to solve the problems involved with real turbine configurations, not just simplified versions."

Existing software was sequential, based on a single-processor computing paradigm. It simulated fluid flow interacting with the rotating and stationary blades inside the turbine on a blade-by-blade basis — calculating the aerodynamics one blade at a time. For more accurate results and faster turnaround, Cizmas realized, it made sense to parallelize the software. This would allow Westinghouse to take advantage of systems like Pittsburgh Supercomputing Center's 512-processor CRAY T3E, which puts many processors to work simultaneously on the same job.

"This was a change of paradigm," says Cizmas, "a big step. We believe we're the only company in the United States to have tackled this problem, and we're ahead of the game because we have a close relationship with PSC." After attending a June 1996 Pittsburgh Supercomputing Center parallel-processing workshop, Cizmas discussed the problem with PSC senior consultant Ravi Subramanya. In January 1997, they began collaborating and four months later had working parallel code that was more accurate and much faster than its sequential predecessor. "Jobs that would have taken three months before," says Cizmas, "now run in under 12 hours."

Part of this impressive gain in performance is due to "superlinear" speedup — on test cases, 10 processors together run 15 times faster than one by itself. This seemingly impossible result occurs because parallelizing the turbine aerodynamics, assigning a processor to each turbine blade, improves how data is handled in memory, an advantage that becomes more and more significant with larger simulations. Cizmas has begun tackling these larger problems, involving hundreds of turbine blades, and he expects these computations to provide new understanding of how shape and arrangement of the blades affect aerodynamics, knowledge that will improve the efficiency of Westinghouse turbines.

From Windmills to Power Plants

There's nothing new about turbines. The idea — fluid flow propelling a series of airfoils or blades on a rotating shaft — is at least as old as windmills. As early as 70 B.C., Romans used the waterwheel, an ancestor of water turbines, to grind grain, and Hero of Alexandria, a Greek inventor/mathematician, designed a precursor of the steam turbine in the first century A.D.

Turbines came to the fore in a new context in the 1880s. Generating electricity required rotational speeds beyond what reciprocating engines could produce. The technological innovation that met this need, invented by Sir Charles Parsons in 1884, was the steam turbine. There have been many improvements, but the basic idea has changed little in 100 years: rotating blades convert the energy of high-pressure steam or combustion gases into rapid circular motion.

Visualization of flow speed through rotors.
This visualization, with color corresponding to Mach number, shows the flow speeding up through the first stator passageway, slowing down through the rotors, where the passageway diverges, then speeding up even more through the second stage of differently shaped stators.

Modern turbines consist of alternating rows of stationary blades (stators) and rotating blades (rotors), with each row of stators directing flow into the rotors that follow it. The narrowing passageway between stators works like a nozzle to increase fluid velocity as it strikes the rotors. Each stator-rotor series is called a stage, and operational turbines have many stages, with hundreds of blades, to generate the constant 3,600 rpm rotation that is standard for most power plants.

Democratic Computing: One Processor, One Blade

Before Cizmas and Subramanya went to work, Westinghouse's ability to simulate rotor-stator interaction was limited to simplified cases. "Sequential codes are too slow and expensive," says Cizmas, "to provide useful input to the design process." A test case involving one-and-a-half stages, an eight-blade configuration, would take three months of computing on a single processor of a CRAY C90. This made it unrealistic even to consider simulating actual production turbines, which often involve three or more stages with 150 or more blades.

To parallelize this software, Subramanya chose to assign each turbine blade to a separate processor, thereby avoiding a major problem with the sequential version. "The sequential code ran slowly to begin with," says Subramanya, "because whenever the calculation shifted from one blade to another, all the data in the processor cache became invalid. It loaded an almost new dataset into memory each time it moved to the next blade."

This difference, especially significant because of the time-dependent nature of the computation, accounts for the superlinear speedup. A set of mesh-like computational grids for each blade keeps track of fluid properties at that blade, and this data updates with each advance in time. Although one processor doesn't have enough associated memory to hold the grid data for 10 blades, it can handle the grids for one blade, greatly reducing the need to swap data back and forth from disk to memory as the computation advances in time.
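The memory effect described above can be sketched with a toy cost model. This is only an illustration under stated assumptions, not the Westinghouse code: the function names, the grid size, the memory size, and the per-megabyte reload cost are all hypothetical values chosen to make the effect visible.

```python
# Toy cost model for superlinear speedup.
# All names and numbers here are illustrative assumptions, not measurements.

def step_time(blades_on_proc, grid_mb_per_blade, memory_mb,
              compute_s_per_blade=1.0, reload_s_per_mb=0.004):
    """Estimated seconds for one time step on a single processor."""
    compute = blades_on_proc * compute_s_per_blade
    working_set = blades_on_proc * grid_mb_per_blade
    if working_set <= memory_mb:
        # Every blade's grid stays resident, so there is no reload cost.
        return compute
    # Otherwise each blade's grid is reloaded when the calculation reaches it.
    reload = blades_on_proc * grid_mb_per_blade * reload_s_per_mb
    return compute + reload


def speedup(n_blades, n_procs, grid_mb_per_blade, memory_mb):
    """Ratio of one-processor step time to n_procs-processor step time."""
    blades_each = -(-n_blades // n_procs)  # ceiling division
    t_serial = step_time(n_blades, grid_mb_per_blade, memory_mb)
    t_parallel = step_time(blades_each, grid_mb_per_blade, memory_mb)
    return t_serial / t_parallel


# Ten blades, 100 MB of grid data each, 128 MB of memory per processor.
tight_memory = speedup(10, 10, 100, 128)
# Same problem on a hypothetical processor large enough to hold all ten grids.
ample_memory = speedup(10, 10, 100, 2000)
```

With these assumed numbers, `tight_memory` comes out at about 14 on 10 processors, while `ample_memory` is exactly 10. In this model the extra gain comes entirely from no longer reloading grid data, which is the same kind of effect the article credits for the superlinear result.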

For a test case on an SGI Challenge, the parallel code runs 15.6 times faster on 10 processors than on one. The CRAY T3E delivers similar speedups and, Subramanya notes, will perform markedly better than the SGI Challenge for larger simulations that require more processors than the SGI machine has available.
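As a quick check of what these figures imply, the arithmetic can be written out. The speedup and processor count are from the article; the variable names are ours, and treating "three months" as 90 days is an assumption made only to get a round number.

```python
# Figures quoted in the article.
speedup_sgi = 15.6      # measured speedup on the SGI Challenge test case
n_procs = 10            # processors used for that test case

# Parallel efficiency = speedup / processors. A value above 1.0 means
# the speedup is superlinear.
efficiency = speedup_sgi / n_procs

# Turnaround: "three months" versus "under 12 hours".
# Assumption: three months is taken as 90 days.
hours_before = 90 * 24
hours_after = 12
turnaround_ratio = hours_before / hours_after

print(f"efficiency: {efficiency:.2f}")
print(f"turnaround improved by at least {turnaround_ratio:.0f}x")
```

An efficiency of 1.56 means each of the 10 processors does, in effect, more than one processor's worth of the original work rate. The turnaround ratio of roughly 180 compares runs on different machines, so it reflects both the parallelization and the hardware.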

The parallel software is not only faster, it's also more accurate — due to modifications built into the new code. "In the serial code," says Subramanya, "you're using an outdated boundary condition part of the time. This is not physically accurate, but you were forced to do it, because you had to compute sequentially. In parallel, we solve everything at the same time, so we get more realistic results."
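The difference between the two update orders can be illustrated with a deliberately simple toy, which is only a loose analogy. The article does not describe the actual boundary treatment, so the ring of "blades", the single number standing in for each blade's state, and the averaging rule are all hypothetical.

```python
# Toy comparison of a blade-by-blade sweep with a simultaneous update.
# The averaging rule is a stand-in, not the real aerodynamic solver.

def advance_sequential(state):
    """Update blades one at a time, in place (serial-style sweep)."""
    new = list(state)
    n = len(new)
    for i in range(n):
        left = new[(i - 1) % n]   # for i > 0, already at the new time level
        right = new[(i + 1) % n]  # for i < n - 1, still at the old time level
        new[i] = 0.5 * new[i] + 0.25 * (left + right)
    return new


def advance_simultaneous(state):
    """Update every blade from the same old time level (parallel-style step)."""
    n = len(state)
    return [
        0.5 * state[i] + 0.25 * (state[(i - 1) % n] + state[(i + 1) % n])
        for i in range(n)
    ]


initial = [1.0, 0.0, 0.0, 0.0]   # a disturbance at blade 0 only
swept = advance_sequential(initial)
together = advance_simultaneous(initial)
```

In this toy, blades 1 and 3 sit symmetrically on either side of the disturbed blade, so they should receive the same update. The simultaneous step gives them equal values, while the sweep does not, because during the sweep each blade sees one neighbor at the new time level and the other at the old one.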

On test cases, the new code gives good agreement with data from experimental studies. "We hope to reduce the turbine experimental investigations to a minimum," says Cizmas, who believes that simulations with the new software will save money by scaling down the experimental work.

This visualization represents simulation of a turbine with three stages of stators alternating with rotors. Color corresponds to entropy, which shows the wake shedding from each stage and interacting with the next stage.

Simulations & Design

Larger simulations now underway on the T3E, involving actual turbine blade configurations rather than test cases, should improve turbine design in several ways. Experiments have shown that turbine efficiency varies according to the radial spacing between blades. "A wake sheds from the blade's trailing edge and interacts with successive rows of blades," says Subramanya. "This interaction can be constructive or destructive. If it's destructive, then changing the relative position of a blade can improve efficiency."

Accurate simulations also predict hot spots. "Turbines operate at very high temperatures," says Cizmas, "and the flow patterns create hot gas cells that increase blade temperature in certain areas." Pinpointing these hot spots helps with the design of cooling conduits and blade coatings. Simulations can also predict unsteady forces on the blades, adds Cizmas, that cause "flutter" leading to mechanical failure.

This visualization represents simulation of a one-and-a-half stage turbine. Color corresponds to temperature. Such simulations pinpoint likely blade hot spots, which helps in the design of cooling conduits and blade coatings.

In current and future work, Cizmas expects to extend the new software from 2D to 3D simulations, and to use it to optimize the shapes of turbine blades, a factor that also — along with relative positioning — can affect efficiency. "We are no longer limited," says Cizmas, "in the number of blades and rows we can simulate. Time is no longer a problem. Our only restriction is the number of processors. From my point of view, what I'm waiting for now is parallel systems with more processors."

Researchers: Paul Cizmas, Westinghouse Science and Technology Center; Ravi Subramanya, PSC.
Hardware: CRAY T3E
Software: User-developed code.
Related Material on the Web:
PaRSI: Parallel Rotor Stator Interaction
Westinghouse Science and Technology Center
Projects in Scientific Computing, PSC's annual research report.

© Pittsburgh Supercomputing Center (PSC)
Revised: July 15, 1998