FOR IMMEDIATE RELEASE
November 12, 1998

CONTACT:
Michael Schneider
Pittsburgh Supercomputing Center
412-268-4960
email@example.com
Orlando, Fla., Supercomputing '98 -- Using the Transatlantic Metacomputing Testbed, in place since June 1997 between the High Performance Computing Center at Stuttgart University (HLRS) and the Pittsburgh Supercomputing Center (PSC), researchers have run an application on two coupled 512-processor CRAY T3Es with performance equivalent to a single 1,024-processor machine. This is a significant advance in the emerging technology of metacomputing, say scientists at the two centers.
This testbed, the first successful prototype for transatlantic metacomputing (linking separate supercomputers to work together on the same job), uses high-performance research networks in the United States, Canada and Germany to couple a 512-processor T3E at PSC with another at HLRS.
The researchers achieved a peak data transfer rate of 10 megabits per second (Mbps) for a molecular dynamics application that simulates granular particles. This is five times faster than the same application achieved a year ago at Supercomputing '97, when it used the Pittsburgh-Stuttgart testbed to simulate 1.75 billion particles of granular material, the largest such simulation ever done. This year, even the sustained transfer rate (four Mbps) is double the peak rate a year ago, say the researchers. "Seven years ago, systems couldn't transfer data internally among processors this fast," said Alfred Geiger, head of HLRS. "Now we're doing it across the ocean."
While improved bandwidth is part of the milestone speedup, the biggest part, say the researchers, is better communications processing and especially "latency hiding." Latency is the time processors need to "shake hands" (open communications with each other), and by careful algorithm engineering to overlap communications with computation, the researchers made it unnoticeable for certain problems. For the granular particles simulation, communications overhead was only 5 percent, says Matthias Mueller, a scientist at the University of Stuttgart's Institute for Computer Applications: "This is essentially the same overhead we'd have with a 512-processor T3E."
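The idea of overlapping communications with computation can be illustrated with a minimal sketch. It is not the researchers' code: the function names are hypothetical, the wide-area transfer and the local computation are both stood in for by timed delays, and a background thread plays the role of a non-blocking send and receive.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def exchange_boundary(boundary, latency_s):
    """Hypothetical stand-in for a slow transfer to the remote machine."""
    time.sleep(latency_s)          # simulated network latency
    return list(boundary)          # pretend the remote side echoed the data


def compute_interior(cells, compute_s):
    """Hypothetical stand-in for work that needs no remote data."""
    time.sleep(compute_s)          # simulated computation time
    return sum(c * c for c in cells)


def step_blocking(cells, boundary, latency_s, compute_s):
    """Wait for the transfer first, then compute: the delays add up."""
    remote = exchange_boundary(boundary, latency_s)
    return compute_interior(cells, compute_s) + sum(remote)


def step_overlapped(cells, boundary, latency_s, compute_s):
    """Start the transfer, compute while it is in flight, then combine."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange_boundary, boundary, latency_s)
        interior = compute_interior(cells, compute_s)
        return interior + sum(pending.result())


if __name__ == "__main__":
    for step in (step_blocking, step_overlapped):
        start = time.perf_counter()
        value = step([1, 2, 3], [4, 5], 0.05, 0.05)
        print(step.__name__, value, round(time.perf_counter() - start, 3))
```

Both versions return the same value. In this toy setting the overlapped step takes roughly the longer of the two delays rather than their sum, which is the sense in which latency is "hidden."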
These methods have broad applications, says Sergiu Sanielevici, PSC manager of parallel applications. "There's a hierarchy between fast-access memory, such as a processor cache, that has limited storage, and other slower, higher-storage memory. This kind of optimization will benefit many other projects, such as clusters, that must deal with the situation that not all memory is created equal."
The metacomputing application relied on PACX-MPI (PArallel Computer eXtension), a library of communications routines developed by an HLRS team led by Michael Resch, which distinguishes between internal and external communication. To reduce the latency of contacting processors on another machine, two processors on each 512-processor T3E handle the external communication. "This can be extended," said Resch, "from two T3Es to any number of heterogeneous systems."
A similar success was achieved by the HLRS-PSC testbed with another application, a Navier-Stokes solver called URANUS (Upwind Relaxation Algorithm for Nonequilibrium Flows of Stuttgart University), which simulates reentry of a space vehicle in a wide altitude-velocity range. During SC '98, a simulation of the European space vehicle HERMES ran with communications overhead under 10 percent.
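The distinction between internal and external communication can be sketched as a routing rule. This is an illustration only, not the PACX-MPI interface: the names are hypothetical, and it assumes that local processors 0 and 1 on each machine are the two reserved for external traffic, one for outgoing and one for incoming messages.

```python
PROCS_PER_MACHINE = 512   # from the release: two 512-processor T3Es
OUT_RELAY_LOCAL = 0       # assumption: local processor that sends abroad
IN_RELAY_LOCAL = 1        # assumption: local processor that receives


def machine_of(rank):
    """Index of the machine that hosts a global processor number."""
    return rank // PROCS_PER_MACHINE


def route(src, dst):
    """Return the list of processors a message passes through."""
    if machine_of(src) == machine_of(dst):
        return [src, dst]                      # internal: direct delivery
    out_relay = machine_of(src) * PROCS_PER_MACHINE + OUT_RELAY_LOCAL
    in_relay = machine_of(dst) * PROCS_PER_MACHINE + IN_RELAY_LOCAL
    path = []
    for hop in (src, out_relay, in_relay, dst):
        if not path or path[-1] != hop:        # skip repeated hops
            path.append(hop)
    return path


if __name__ == "__main__":
    print(route(5, 9))      # both processors on the first machine
    print(route(5, 600))    # crosses from the first machine to the second
```

Because only the relay processors touch the wide-area link, the cost of opening a transatlantic connection is paid by two processors per machine instead of by every processor that has something to send.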
Granular materials are ubiquitous in nature, industrial processing and everyday life, and the simulation technology tested at SC '98 has many applications. Examples include crack propagation in concrete and storage and shipment of foodstuffs such as grains, flour and sugar. Since early in this century, researchers have studied these materials to improve industrial processes, but fundamental questions remain. Scientists still lack a detailed theoretical understanding, for instance, of how pipes carrying granular materials become clogged or why grains of different size separate and gather in band-like patterns.
"Large-scale computation is the only way to deepen our insight," said Mueller. "One key problem is to understand the intermittent nature of the complex force network that keeps granulate packings stable. Breakdowns under load can give rise to silo failure, coffee powder spilling on the kitchen floor and earthquakes."
The Pittsburgh Supercomputing Center is a joint effort of Carnegie Mellon University and the University of Pittsburgh together with Westinghouse Electric Company. It was established in 1986 and is supported by several federal agencies, the Commonwealth of Pennsylvania and private industry.
The High-Performance Computing Center Stuttgart offers supercomputing services to academic users in Germany. It operates supercomputers together with debis Systemhaus, Porsche and the University of Karlsruhe in the framework of the company hww (High Performance Computing for Science and Industry).