One-Two Punch

A software innovation combined with advances in supercomputing hardware has opened a door to better understanding of the fundamental molecules of life. That's the news from pharmaceutical chemist Peter Kollman and his colleague Tom Cheatham.

In computations that simulate the structure of DNA and its dance-like oscillations inside living cells, Kollman and Cheatham used a software innovation called particle-mesh Ewald (PME). Developed by Tom Darden of the National Institute of Environmental Health Sciences and initially tested at Pittsburgh, PME is an efficient, accurate method to account for the electrical attractions and repulsions between atoms that aren't bonded to each other in a large biomolecule. "This method makes a huge difference," says Kollman. "It provides stability and it leads to dramatic structural improvement over prior methods."

Working closely with Darden and Cheatham, PSC biomedical scientist Mike Crowley implemented PME on the CRAY T3D at Pittsburgh Supercomputing Center, bringing unprecedented computing capability to bear on large biomolecule research. Applying this one-two punch - PME and the T3D - Kollman and colleagues have simulated DNA, RNA and proteins with results that herald new possibilities for computational biology.

"Using the T3D," says Cheatham, "is the difference between getting results in less than a week versus months on a workstation. With PME, this opens a new level of what we can investigate in nucleic acids. We can look at larger protein-DNA complexes, because we now have a way to properly represent the DNA. There's a host of mechanisms in the body where these are important, including gene regulation and DNA repair." Such simulations, which among other things can lead the way to finding new drug therapies for genetic diseases, are now underway in Kollman's research group.

Problem Solved: Electrostatics and Cutoffs

Simulating a large protein or DNA molecule surrounded by water often involves so many atoms (10,000 or more) that before PME it cost too much in computing time to calculate all the forces acting between atoms not bonded to each other. These push-pull interactions, which arise due to positive and negative charges, are called electrostatics. The same basic effect accounts for static electricity. The standard approach has been to assume that these forces, which decrease with distance, don't make much difference when two atoms are farther apart than a certain "cutoff radius" - usually around 10 angstroms.
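The cost argument above can be made concrete with a small sketch. It is a toy illustration only: energies are in arbitrary units, there are no periodic boundaries, and bonded pairs are not excluded as they would be in a real force field. The function names count_pairs and coulomb_energy are hypothetical and are not taken from AMBER or PME.

```python
import math


def count_pairs(n_atoms):
    """Number of distinct atom pairs an all-pairs sum must visit."""
    return n_atoms * (n_atoms - 1) // 2


def coulomb_energy(charges, positions, cutoff=None):
    """Sum q_i * q_j / r over distinct pairs (arbitrary units).

    If cutoff is given, pairs farther apart than cutoff are skipped,
    which is the approximation described in the text.
    """
    energy = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            if cutoff is not None and r > cutoff:
                continue  # treated as zero under the cutoff assumption
            energy += charges[i] * charges[j] / r
    return energy


# Three charges on a line (distances in angstroms): two close, one far away.
charges = [1.0, -1.0, 1.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

full = coulomb_energy(charges, positions)
truncated = coulomb_energy(charges, positions, cutoff=10.0)
print(count_pairs(10_000), full, truncated)
```

For 10,000 atoms the all-pairs sum visits nearly 50 million pairs at every time step. The cutoff version skips the two long-range pairs in the example, so the distant charge contributes nothing. Each neglected term is small, but in a highly charged molecule such terms are numerous and do not cancel reliably.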

This assumption made it possible to get worthwhile results, but problems tended to arise as researchers extended their simulations, especially with highly charged molecules such as DNA. Often, it's desirable to track biomolecular motions over a period of nanoseconds (billionths of a second), yet in many cases the molecule would unravel, losing its structural integrity, before the simulation could get that far. Starting in 1993, in a series of computations using PSC's CRAY C90, Darden zeroed in on electrostatics as the source of this problem and devised PME as a way to fix it.

A New Wave in Biomolecular Modeling

Darden's central insight was to see how Ewald summation, a method for summing up the charges in a crystal structure, could be combined with a very fast computational method, fast Fourier transforms (FFTs). In early runs on the C90, Darden's innovative synthesis demonstrated remarkable promise: high accuracy - because it computes all the electrostatic interactions - with only a modest increase in computing cost over a 10-angstrom cutoff.
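The idea behind Ewald summation can be sketched with the standard identity 1/r = erfc(beta*r)/r + erf(beta*r)/r. The first term dies off quickly with distance, so it can safely be truncated at a cutoff. The second term is smooth, which is what makes it suitable for evaluation on a mesh with FFTs. The snippet below only demonstrates this split. It is not the PME algorithm itself, and the value beta = 0.3 per angstrom is an assumption chosen for illustration, not a parameter reported in the article.

```python
import math


def ewald_split(r, beta):
    """Split the Coulomb kernel 1/r into short- and long-range parts.

    short: erfc(beta*r)/r, decays rapidly and can be summed with a cutoff.
    long:  erf(beta*r)/r, smooth and suited to a reciprocal-space (FFT) sum.
    """
    short = math.erfc(beta * r) / r
    long_range = math.erf(beta * r) / r
    return short, long_range


beta = 0.3  # illustrative splitting parameter, in 1/angstrom (assumed)
for r in (1.0, 5.0, 10.0):
    short, long_range = ewald_split(r, beta)
    # The two parts always add back up to the full 1/r interaction.
    print(r, short, long_range, short + long_range, 1.0 / r)
```

With this choice of beta, the short-range part at 10 angstroms is a tiny fraction of the full 1/r term, so cutting it off there discards almost nothing, while the long-range part still carries the interactions that a plain cutoff throws away.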

The next step was the CRAY T3D. The starting point was a version of AMBER (widely used "molecular dynamics" software) adapted for parallel computing by Jim Vincent and Ken Merz at Penn State and refined by the developers of AMBER. Crowley attacked the task of parallelizing PME and overcame a major obstacle by creating an efficient, parallel three-dimensional FFT routine. With Crowley, Cheatham and Darden working as a team, the pieces came together, and in fall 1995, Cheatham and Kollman ran T3D simulations of DNA and RNA that showed stunning improvement in accuracy and stability over using a cutoff. "Without PME, these molecules just crumpled up in a few hundred picoseconds (trillionths of a second)," says Kollman. "With PME, they remain stable throughout the simulation."
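A three-dimensional FFT lends itself to parallel computing because it can be built from many independent one-dimensional FFTs, one pass along each axis of the grid. The sketch below shows that decomposition in plain, serial Python. It is not Crowley's T3D routine: the names fft1d and fft3d are hypothetical, grid sizes must be powers of two, and no work is actually distributed across processors.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft3d(grid):
    """3D FFT of grid[x][y][z] as three passes of independent 1D FFTs."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    g = [[[complex(v) for v in row] for row in plane] for plane in grid]
    # Pass 1: transform every line along z. Each line is independent,
    # so in a parallel code different processors could take different lines.
    for i in range(nx):
        for j in range(ny):
            g[i][j] = fft1d(g[i][j])
    # Pass 2: transform every line along y.
    for i in range(nx):
        for k in range(nz):
            line = fft1d([g[i][j][k] for j in range(ny)])
            for j in range(ny):
                g[i][j][k] = line[j]
    # Pass 3: transform every line along x.
    for j in range(ny):
        for k in range(nz):
            line = fft1d([g[i][j][k] for i in range(nx)])
            for i in range(nx):
                g[i][j][k] = line[i]
    return g


# A uniform 4 x 4 x 4 grid: all of the signal ends up in the zero-frequency term.
ones = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
spectrum = fft3d(ones)
print(abs(spectrum[0][0][0]), abs(spectrum[1][2][3]))
```

On a distributed-memory machine the hard part is not the 1D transforms but the data movement between passes, since the lines along y and x are spread across processors. The sketch leaves that step out entirely.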

Figure: Converged "average" DNA structures obtained from PME simulations starting with A-form and B-form DNA.

Another study looked at two different forms of DNA. In experiments, DNA occurs in several variant structures depending on its cellular environment, the two most common being A form (high salt, low humidity environment) and B form (low salt, high humidity). Starting from both an A-form and a B-form structure, Cheatham ran simulations that converged to the same structure, a close cousin of the B form, within half a nanosecond - exactly what should happen according to theory. "They converged dead on," says Cheatham, "which shows that this method allows you to sample between different conformations and get to the lowest free-energy form - the most representative structure for a given sequence."

Cheatham is exhilarated by the possibilities for new research opened by PME and scalable parallel computing. Likewise, Kollman has a full slate of problems he can barely wait to attack. Some of these studies are already underway, addressing sequence-specific effects: detailed structural variations that depend on the particular base-pairs in a DNA strand. This new level of realism and detail will lead researchers toward solving such mysteries as protein-DNA recognition, a mechanism that is involved in myriad biological processes and that depends on a protein's intriguing ability to recognize a specific base-pair sequence along the extremely long helical strands of DNA in our cells. "This is a new wave," says Cheatham, "an exciting time to be in this field."

The Spine of Hydration

In recent computations, Cheatham and Kollman simulated a DNA sequence composed of the same base-pair (A-T) repeated ten times. This image shows the average structure of this sequence overlapped with two related snapshots from the final nanosecond of a two-nanosecond simulation. Clearly visible, twisting from lower left to upper right, is DNA's "spine of hydration" in the so-called "minor groove" that runs between the double-helical ridges. Density contours for water show the most probable (red) and slightly less probable (yellow) positions for water molecules. Only since the implementation of particle-mesh Ewald on the CRAY T3D, say the researchers, has it been feasible to carry out simulations like this that reproduce sequence-specific DNA structure and dynamics.

Researchers: Peter A. Kollman & Thomas E. Cheatham, University of California-San Francisco.
Hardware: CRAY T3D
Software: PME
Keywords: protein, DNA, RNA, particle-mesh Ewald, PME, nucleic acids, genetic disease, electrostatics, biomolecular modeling, fast Fourier transforms, FFT, AMBER, spine of hydration, protein DNA recognition, base-pair, minor groove.

