Their results provide the most comprehensive picture yet of protein folding.

Protein Folding

Anyone who's ever twisted, twirled and tied a strand of shoe-lace licorice knows it can be twisted, twirled and tied only so many times before becoming a tight-knit glob. In biology, the structural equivalent of that glob is the protein, and the twisting process is called protein folding.

Whether serving as enzymes or fundamental units of tissue, proteins are key to successful cellular processes. The progression that takes a loose collection of linked molecular chains and turns it into a snugly packed, ordered bundle is an area of keen interest, because only after folding occurs can a protein carry out its assigned task.

Using the Pittsburgh Supercomputing Center's C90 and T3D, Charles Brooks, Erik Boczko and William Young are helping unravel the mystery of how these strands of amino acids ravel in the first place. Besides aiding future drug design and providing road maps for custom-designed enzymes, deconstructing folding offers the chance to better understand crucial biological processes.

Filling in the Blanks

Previous experiments have shown that certain proteins, if unfolded and then allowed to refold, resume their original shape. But folding occurs so rapidly -- a protein will assume millions of different shapes in milliseconds before settling on its final shape, the native state -- that only the unfolded and native states have been observed experimentally. For this incomplete photo essay, supercomputing is providing the missing snapshots. "Experiments provide the real information about the system," says Brooks, "and modeling deepens understanding of what the experiments are telling us."

Using the C90, Brooks and Boczko performed the equivalent of submerging an intact protein in water to examine its folding process. Where past efforts calculated movements for a piece of protein with no surrounding medium, Brooks and Boczko accounted for an entire protein surrounded by water. Their study focused on protein A, which resides on the outer wall of bacterial cells. Their results, published in Science, support the energy funnel theory, which explains folding as successive restrictions on a protein's potential to change shape. They also provide the most comprehensive picture yet of protein folding.

The project demanded thousands of multi-processor C90 hours during a year-and-a-half period. "That was one of the largest problems ever tackled in computational biophysics," says Brooks, "well in excess of 3,000 hours. Erik's work represents a tour de force calculation never before attempted and possible only because of the resources made available by PSC. We wouldn't have done it otherwise."

The leucine zipper parallel coiled-coil dimer with "zipped" leucines (yellow).

Zip-Locked Proteins

In work currently underway, Brooks and Young are using the T3D to study the leucine zipper coiled-coil dimer, a piece of ubiquitous protein architecture. The structure consists of two identical helical strands of protein wrapped around one another and linked by a succession of leucine molecules that "zip" the two units together. Coiled coils can form within the same protein or connect separate proteins and are found in muscle, hair, skin, blood-clotting components and DNA. Though the leucine links keep the helices from touching, the effect of this coupling is akin to two entwined pieces of rotini.

Brooks and Young want to learn whether the helices form before, during or after this molecular marriage. Experiments suggest the coils meet as strands and form helices while wrapping around each other to create the coiled coil. Using supercomputing to heat and hence unfold the protein, the researchers worked backward, capturing snapshots along the way, to examine the progression.

Nucleus of Folding
Initiation of folding (top) occurs at nascent helices, represented as purple ribbon and orange cord. As folding proceeds (bottom), the coiled-coil dimer starts to form. The orange cord indicates the still unstructured portion of the chain.

The job totaled about 40,000 T3D processing hours with the molecular dynamics package CHARMM. With load-balancing algorithms developed by Young, the parallel version of CHARMM scales well, with only a slight performance drop from the communications overhead of using more processors. Going from 64 to 128 processors for this project resulted in an 85 percent speedup, enabling the researchers to meet a self-imposed deadline. T3D parallelism, says Young, was the key to getting results: "We got efficient use of large numbers of processing elements, and that makes a big difference."
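
To put the scaling figure in perspective, the short sketch below converts the reported speedup into a parallel efficiency. It rests on an assumption the article does not spell out: that an "85 percent speedup" means the 128-processor run went 1.85 times as fast as the 64-processor run. The function names are illustrative only and are not part of CHARMM.

```python
def relative_speedup(percent_gain: float) -> float:
    """Convert a 'percent speedup' into a speed ratio (assumed reading: 85 -> 1.85)."""
    return 1.0 + percent_gain / 100.0


def parallel_efficiency(speedup: float, procs_before: int, procs_after: int) -> float:
    """Fraction of the ideal speedup achieved when the processor count grows."""
    ideal = procs_after / procs_before  # doubling processors ideally doubles speed
    return speedup / ideal


# Figures quoted in the article: 64 -> 128 processors, 85 percent speedup.
speedup = relative_speedup(85.0)
efficiency = parallel_efficiency(speedup, 64, 128)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.1%}")
```

Under that reading, doubling the processor count delivered about 92.5 percent of the ideal twofold gain, which fits the description of only a slight drop from communications overhead.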

The computations confirm and expand upon the experimental results, providing more information about the forces involved in protein folding. "Simulations show that these two helical strands are all stretched out, except for one or two turns," says Brooks. "This small amount of helical structure, which falls below the detection limits of lab tools, is the nucleus for folding." By showing that folding depends on the helical interface, the simulations underscore the interplay between experiment and computation, and they put the ball back in the experimentalists' court. If experiments confirm the computational finding, says Brooks, "we will have learned much more about how this protein assembles."



Researchers: Charles L. Brooks III, Scripps Research Institute.
Erik Boczko, Georgia Institute of Technology.
William Young, Pittsburgh Supercomputing Center.
Hardware: C90 and T3D
Software: CHARMM
Keywords: protein, enzymes, antibodies, oxygen, protein-folding, amino acid, sequence, three-dimensional structure, molecular dynamics (MD), DNA structure, apomyoglobin, myoglobin, nuclear magnetic resonance (NMR), CHARMM.

