# PSC Live! at SC05


## Tuesday, November 15

### 10 am - 11 am

The ability to quickly do short, exploratory runs of PPM simulations of
turbulence will have a significant impact on scientific productivity. The
Cray XT3 machine at Pittsburgh Supercomputing Center has the potential to
compute a relatively small problem, on a grid of just 512³ cells, fast.
Previous systems are only able to achieve their best performance on
extremely large problems. LCSE demonstrates a prototype computational
steering, visualization and data analysis system that will be able to
produce volume-rendered images from this data at a rate of a few frames per
second. In this demonstration, visualization data will be streamed from the
Cray XT3 in Pittsburgh directly to the show floor, and the visualization and
the computation will be interactively steered from the show floor via the
TeraGrid backbone network.

### 11 am - 12 pm

SPICE aims to understand the vital process of translocation of biomolecules across protein pores by computing the free energy profile of the translocating biomolecule. Without significant advances at the algorithmic, computing and analysis levels, progress on problems of such size and complexity will remain beyond the scope of computational science for the foreseeable future. Grid computing provides the required new computing paradigm and facilitates the adoption of new algorithmic and analytical approaches. SPICE uses sophisticated grid infrastructure to couple distributed high-performance simulations, visualization, and the instruments used in the analysis within a single framework. We describe how we utilize the resources of a federated trans-Atlantic Grid to enhance our understanding of the translocation phenomenon in ways that have not been possible until now.

### 12 pm - 1 pm

BigBen, the Pittsburgh Supercomputing Center’s Cray XT3, entered production as an NSF TeraGrid resource on October 1. Users are now running a wide variety of applications, including end-to-end earthquake simulation and electronic-structure modeling in materials science that achieves more than 8 teraflops. In this talk, we will survey applications running on PSC’s Cray XT3, emphasizing both scientific results and performance.

### 1:30 pm - 2:15 pm

Optimization of systems governed by large-scale PDE simulations presents manifold challenges and opportunities. We begin with illustrations of some driving applications involving contaminant transport, cardiac mechanics, artificial heart design, and design of linear accelerators. We then focus on a challenging PDE-constrained optimization problem arising in inverse earthquake modeling: that of estimating the earth model in earthquake simulations from observations of ground motion from past earthquakes. We discuss such compounding issues as extreme large scale, ill-posedness, discontinuous solutions, challenges in constructing preconditioners for the reduced Hessian, nonlinear convergence difficulties, and multiple local minima, and techniques for addressing them. We end with a discussion of some outstanding challenges and the conclusion that earthquake inversion to frequencies of engineering interest remains a major challenge for petascale computing.

### 2:30 pm - 3:15 pm

In contrast to traditional terascale simulations that have known, fixed data inputs, dynamic data-driven (DDD) applications are characterized by unknown data and informed by dynamic observations. DDD simulations give rise to inverse problems of determining unknown data from sparse observations. The main difficulty is that the optimality system is a boundary value problem in 4D space-time, even though the forward simulation is an initial value problem. We construct special-purpose parallel multigrid algorithms that exploit the spectral structure of the inverse operator. Experiments on problems of localizing airborne contaminant release from sparse observations in a regional atmospheric transport model demonstrate that a 17-million-parameter inversion can be carried out at a cost of just 18 forward simulations with high parallel efficiency. On 1,024 AlphaServer EV68 processors, the turnaround time is just 29 minutes. Moreover, inverse problems with 135 million parameters, corresponding to 139 billion total space-time unknowns, are solved in less than 5 hours on the same number of processors. These results suggest that ultra-high-resolution data-driven inversion can be carried out sufficiently rapidly for simulation-based real-time hazard assessment.

### 3:30 pm - 4:15 pm

Traditional high-performance parallel scientific computing adopts an offline approach in which files are used as the interface between simulation components such as meshing, partitioning, solving and visualizing. Unfortunately, such an approach results in time-consuming file transfers, disk I/O and data format conversions that consume large amounts of network, storage, and computing resources while contributing nothing to applications. We propose an end-to-end approach to parallel supercomputing. The key idea is to replace the cumbersome file interface with a scalable, parallel, runtime data structure, on top of which all simulation components are constructed in a tightly coupled way. We have implemented our new methodology within an octree-based finite element simulation system named Hercules. The only input to Hercules is a set of material property descriptions of a problem domain; the only outputs are small JPEG-formatted images generated during the simulation at every visualization time step. There is absolutely no other intermediary file I/O. Performance evaluation of Hercules on up to 2,048 processors on the Cray XT3 system and the AlphaServer system at Pittsburgh Supercomputing Center has shown good isogranular scalability and fixed-size scalability.
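For readers unfamiliar with the two scalability measures named above, the following is a minimal sketch of how they are commonly computed. It is not taken from Hercules: the function names and all timings are hypothetical and serve only to illustrate the definitions. Fixed-size (strong) scaling keeps the total problem constant as processors are added, while isogranular (weak) scaling keeps the work per processor constant.

```python
def fixed_size_efficiency(t_base, p_base, t_p, p):
    """Fixed-size (strong) scaling efficiency.

    The total problem size is constant. Ideal behavior is a run time
    that drops in proportion to the increase in processor count.
    """
    speedup = t_base / t_p   # measured speedup relative to the baseline run
    ideal = p / p_base       # speedup a perfectly scalable code would show
    return speedup / ideal


def isogranular_efficiency(t_base, t_p):
    """Isogranular (weak) scaling efficiency.

    The work per processor is constant, so the problem grows with the
    processor count. Ideal behavior is an unchanged run time.
    """
    return t_base / t_p


# Hypothetical timings in seconds, for illustration only (not measured data).
fixed_size_runs = {128: 800.0, 256: 410.0, 512: 215.0}
isogranular_runs = {128: 100.0, 256: 104.0, 512: 110.0}

base_p = 128
for p, t in fixed_size_runs.items():
    eff = fixed_size_efficiency(fixed_size_runs[base_p], base_p, t, p)
    print(f"fixed-size   p={p:4d}  efficiency={eff:.2f}")

for p, t in isogranular_runs.items():
    eff = isogranular_efficiency(isogranular_runs[base_p], t)
    print(f"isogranular  p={p:4d}  efficiency={eff:.2f}")
```

An efficiency close to 1.0 in either measure indicates that adding processors is paying off almost ideally.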

## Wednesday, November 16

### 10 am - 11 am

Cardiovascular disease accounts for almost fifty percent of deaths in the western world. The formation of arterial disease, such as atherosclerotic plaques, is strongly related to blood flow patterns, and is observed to occur preferentially in regions of separated and recirculating flow such as vessel branches and bifurcations. In this talk we will present, for the first time, simulations of blood flow in the entire human arterial tree through detailed three-dimensional computations at a number of arterial bifurcations, coupled by the wave-like nature of pulse information traveling from the heart to the arteries, which is modeled by a reduced set of one-dimensional equations. We employ MPICH-G2 and conduct geographically distributed, coupled cross-site simulations at major TeraGrid sites in the US and on high-end systems in the UK. Flow visualizations on all arteries simultaneously will also be demonstrated using TeraGrid resources.

### 11 am - 12 pm

Vortex knots and links are the most evident coherent structures of fluid turbulence. Like the elementary particles of high-energy physics, there are a wide variety of such structures; more exotic ones are found at higher energies, and they have a characteristic lifetime after which they decay into other such structures. An improved understanding of the creation, behavior, and interaction of vortical structures would elucidate fundamental questions in theoretical fluid dynamics, such as the reconnection problem and the detachment of “hairpin” vortices from boundaries in the intermittency approach to turbulence. It holds the potential for improved understanding in applied fluid dynamics, including the dynamics of meteorological structures such as tornadoes and hurricanes. It may also lead to improvements in computational fluid dynamics, such as naturally adaptive vortex algorithms that avoid vortex tangles and requisite “vortex surgery.” High-resolution direct numerical simulations of the dynamics of vortical structures at high Reynolds number are, however, among the most intensive scientific computations attempted today.

We describe the VORTONICS package for locating and tracking vortex cores, and addressing other fundamental problems of topological fluid dynamics. The package includes modules for generating, evolving, and identifying vortex cores and routines for Fourier-resizing and remapping the computational lattice. It is parallelized using MPI and geographically distributed using MPICH-G2. It also includes a methodology for computational steering, checkpointing, rewinding, and dynamic adjustment of resolution. We describe the geographical distribution of the tasks performed by this code, as well as the domain decomposition of the computational lattice itself on the TeraGrid and other distributed HPC resources.

### 1 pm - 2 pm

Data transfer across high-performance networks usually means being offered a menu of fast, easy, and secure, and being able to pick only two. HPN-SSH changes that by giving even novice users access to high-speed, cryptographically secure, and easy-to-use data transfers. Speeds of more than 600 Mb/s have been documented. Additionally, HPN-SSH includes a new port forwarding utility that automatically and transparently creates secure communication tunnels for any sort of connection in any environment. This allows for ubiquitous security for mobile users.

### 2 pm - 3 pm

A large-scale and long-running simulation of Separation by Implantation of Oxygen for creating a semiconductor structure was gridified and executed using both GridRPC and MPI. The experiment shows that the new programming approach enables applications to be (1) flexible: allowing dynamic resource allocation and migration; (2) robust: detecting errors and recovering from faults automatically during long runs; and (3) efficient: managing thousands of CPUs distributed on an intercontinental scale.

### 3 pm - 4 pm

We describe an integrated combination of bioinformatics, molecular dynamics, and quantum mechanics calculations on the Aldehyde Dehydrogenase enzyme family that provides a comprehensive characterization of that enzyme family and illuminates many open questions and incompletely understood experimental observations about it. A comprehensive bioinformatics analysis of the Aldehyde Dehydrogenase enzyme superfamily was used to identify the sequence and structural elements that are essential to the functioning of all of the enzymes in the superfamily, as well as those sequence and structural elements critical to individual families. These essential and critical elements are described and integrated with structural and experimental information, which serves as the basis for mixed molecular and quantum mechanical calculations that define the geometry of substrate binding and elucidate the steps in the catalytic mechanism. These calculations show a unique catalytic mechanism, which will be described. They also indicate that distinct genetic diseases involving different mutations in different Aldehyde Dehydrogenases apparently have the same mechanistic basis.

### 4 pm - 5 pm

The ability to quickly do short, exploratory runs of PPM simulations of
turbulence will have a significant impact on scientific productivity. The
Cray XT3 machine at Pittsburgh Supercomputing Center has the potential to
compute a relatively small problem, on a grid of just 512³ cells, fast.
Previous systems are only able to achieve their best performance on
extremely large problems. LCSE demonstrates a prototype computational
steering, visualization and data analysis system that will be able to
produce volume-rendered images from this data at a rate of a few frames per
second. In this demonstration, visualization data will be streamed from the
Cray XT3 in Pittsburgh directly to the show floor, and the visualization and
the computation will be interactively steered from the show floor via the
TeraGrid backbone network.

### 5 pm - 6 pm

Simulation is often called the “third pillar of science,” along with theory and experimentation. Simulation of the human body would enable a virtual experimental setup that would have applications in biology and medicine. While a full simulation of the human body is far from possible today, individual models exist of many of the organs within the body. One class of problems that arise in such simulations is the modeling of fluid flow within an organ, often when that fluid contains immersed elastic structures such as muscle, membrane, or other tissue. The computational cost of modeling the fluid dynamics even within a single organ is very high, requiring the use of today’s fastest parallel machines.

In this talk I will describe a scalable parallel algorithm for the immersed boundary method. The method, due to Peskin and McQueen, has been used to simulate blood flow in the heart, blood clotting, the motion of bacteria and sperm, embryo growth, and the response of the cochlea to sound waves. Our parallel implementation uses a novel programming language called Titanium, which is a high performance extension of Java. I will describe the Titanium language and compiler as well as our computational framework for the immersed boundary method, which is designed to be extensible and is publicly available along with the Titanium compiler. I will also talk about some of the remaining open problems in Computer Science and Applied Mathematics motivated by this application domain.

This work was done in collaboration with Ed Givelberg, Armando Solar, Charles Peskin and Dave McQueen.

## Thursday, November 17

### 10 am - 11 am

The dynamics of self-gravitating systems play a significant role over a wide range of length scales in astrophysics. On the largest scales, connecting the fluctuations in the Cosmic Microwave Background to the galaxies we observe today requires following gravitational collapse over more than 10 orders of magnitude in density. Furthermore, the physics of gas dynamics and star formation must be accurately modeled in order to fully make the connection between “mass” and “light”. At the other end of the size spectrum, protoplanetary systems must be followed for millions of years to understand the formation history of planets.

I will describe how simulations on the PSC supercomputers have contributed to our understanding of planet formation and galaxy formation. I will also discuss some of the computational challenges we still face in these disciplines and strategies for tackling them.

### 11 am - 12 pm

The ability to quickly do short, exploratory runs of PPM simulations of
turbulence will have a significant impact on scientific productivity. The
Cray XT3 machine at Pittsburgh Supercomputing Center has the potential to
compute a relatively small problem, on a grid of just 512³ cells, fast.
Previous systems are only able to achieve their best performance on
extremely large problems. LCSE demonstrates a prototype computational
steering, visualization and data analysis system that will be able to
produce volume-rendered images from this data at a rate of a few frames per
second. In this demonstration, visualization data will be streamed from the
Cray XT3 in Pittsburgh directly to the show floor, and the visualization and
the computation will be interactively steered from the show floor via the
TeraGrid backbone network.

### 12 pm - 1 pm

Over the last two to three decades there has been significant progress in first principles methods for calculating the properties of materials at the quantum mechanical level. These methods have largely been based on the local density approximation (LDA) to density functional theory (DFT). However, nanoscience places new demands on these first principles methods because of the large numbers of atoms, ranging from thousands to millions, present in even the simplest of nano-structured materials. Fortunately, recent advances in the locally self-consistent multiple scattering (LSMS) method are making the direct quantum mechanical simulation of nano-structured materials possible. The LSMS method is an order-N approach to first principles electronic structure calculation. It is highly scalable on massively parallel supercomputers and is best suited for performing large unit cell simulations to study the electronic and magnetic properties of materials with complex structure. In this presentation, I will give a brief introduction to the LSMS method and will show that this method, aided by state-of-the-art teraflop computing technology, effectively accomplishes the first step towards understanding the electronic and magnetic structure of nanoparticles with dimensions of up to 10 nanometers.

I will demonstrate, as an example, the electronic and magnetic structure calculated for an iron nanoparticle embedded in an iron aluminide crystal matrix. I will also explain to what extent future petaflop computing systems may enable the realistic quantum mechanical simulation of real nano-structured materials.

### 1 pm - 2 pm

We performed large-scale simulations of Aldehyde Dehydrogenase (ALDH) chemistry on the Cray XT3 using a hybrid Quantum Mechanical/Molecular Mechanical (QM/MM) potential energy function. The simulations used a software package called DYNAMO and ran on more than 900 processors. This enabled us to examine a large area of critical atomic interactions that govern a crucial step in ALDH catalysis. Genetic mutations that cause changes in these interactions result in two known metabolic diseases, Sjogren-Larsson Syndrome and Type II Hyperprolinemia.