Quantum computations combined with a chemist’s know-how offer the prospect of enzymes designed to produce useful hydrocarbons called “terpenes” in one fast step, the way nature does it.

Parsley, sage, rosemary and thyme — four spices from an old song. Although the anonymous lyricist from hundreds of years ago wasn’t an organic chemist, it happens that each of these spices is related to a large class of hydrocarbon compounds called terpenes. The essential oils of violets and roses, peppermint, pine trees (hence “turpentine”), eucalyptus, oranges, and also frankincense and myrrh — all of them contain terpenes that are among thousands found in nature, many entering our lives as pungent fragrances and flavors.

Dean Tantillo, University of California, Davis

“Most terpenes are plant-derived and have interesting, pleasant flavors or smells,” says University of California, Davis chemist Dean Tantillo, “but they have activities all over the map. Some have the ability to fight cancer.”

The terpene taxol, a widely used anti-cancer drug, effective because it inhibits cell division, exemplifies Tantillo’s interest in these compounds. Taxol comes from the bark of the Pacific Yew tree, and when chemists first isolated it in 1967, they faced huge obstacles in making it available as a pharmaceutical. Early efforts required 1,200 kilograms of bark to produce 10 grams of taxol. Estimates by the mid-1980s were that it would take 360,000 yew trees a year to meet the anticipated need for ovarian cancer alone, an amount impossible to sustain. Creative chemists eventually sidestepped this problem by developing ways to make taxol from available small molecules or a precursor chemical in yew-tree needles. The latter approach exploits nature’s ability to produce taxol’s complex core architecture, but the former involves a long sequence of chemical steps, and for most complex terpenes, chemists have been forced to follow this difficult approach.

In nature, explains Tantillo, the core of a complicated terpene is assembled by an enzyme catalyst in one step. “If you could harness the enzyme to do it for you, synthesizing a complex terpene would be much more efficient.” Tantillo’s research focuses on understanding these reactions, with the aim of unraveling a central mystery: Nature uses hundreds of different enzymes to produce hundreds of different terpenes, but all from only a handful of starting compounds. How does a given enzyme make an abundance of only one product when hundreds are possible?

“It’s rare that biology uses the same compound to make hundreds of different complex things so efficiently,” says Tantillo. “If we can understand why one enzyme is producing one terpene and a different enzyme is producing a different terpene, we should be able to apply that knowledge to rationally redesign the enzymes. The goal would be to take a readily available enzyme that makes one complicated organic molecule that perhaps has an interesting structure but no useful biological activity, understand how that enzyme works, and change it in ways that will cause it to produce a different molecule that is highly valued.”

A comparison between the transition state in a terpene-forming reaction and a stable molecule meant to mimic it. The blue parts of the surfaces are the most positively charged.

With demanding quantum computations on Pople, PSC’s SGI Altix system, Tantillo has explored the detailed, atomic-level chemistry of terpenes. His recent papers from this work — Nature Chemistry (August 2009) and JACS (March 2010) — report unanticipated findings that alter the standard view of terpene-forming reactions and point toward new understanding of the mysterious chemistry at play in producing terpenes in nature.

Pencil, Paper & Supercomputer

Terpenes are derived biosynthetically from units of isoprene, a biological building block that consists of five carbons (four in a row and one on the side) and eight hydrogen atoms, C5H8. The basic formula of a terpene is a multiple of that unit, (C5H8)n, where n is the number of linked isoprene units.

Much of Tantillo’s recent work has focused on “sesquiterpenes,” a large family of terpenes formed from three isoprenes, 15 carbon and 24 hydrogen atoms, rearranged through reactions into complex configurations of carbon rings, fused together or linked by straight hydrocarbon chains. (Tantillo has also examined taxadiene, a diterpene, 20 carbons and 32 hydrogens, that’s a precursor to taxol.) The starting compound for this work is farnesyl diphosphate (FPP), a straight chain of three isoprenes ending with a diphosphate group, from which nature constructs hundreds of different sesquiterpenes.
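The isoprene arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function name `terpene_formula` is made up for this example, and the formula describes the hydrocarbon skeleton alone, not any oxygen-bearing groups added later.

```python
def terpene_formula(n_isoprene_units):
    """Return the molecular formula (C5H8)n for n linked isoprene units."""
    if n_isoprene_units < 1:
        raise ValueError("need at least one isoprene unit")
    carbons = 5 * n_isoprene_units    # five carbons per isoprene unit
    hydrogens = 8 * n_isoprene_units  # eight hydrogens per isoprene unit
    return f"C{carbons}H{hydrogens}"


# Three units give a sesquiterpene; four give a diterpene such as taxadiene.
for name, n in [("isoprene unit", 1), ("sesquiterpene", 3), ("diterpene", 4)]:
    print(name, terpene_formula(n))
```

With three units the function gives the 15 carbons and 24 hydrogens of a sesquiterpene, and with four units the 20 carbons and 32 hydrogens of a diterpene.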

For his computational modeling of sesquiterpene reaction pathways, Tantillo relies on GAUSSIAN03, software developed originally by Carnegie Mellon University chemist John Pople, who won the 1998 Nobel Prize in Chemistry for this work — and who, coincidentally, is the namesake of PSC’s supercomputing system that Tantillo’s team uses, usually eight processors at a time. “Pople has been the best computer at PSC for us,” says Tantillo, “because we’ve always been able to access it without a long wait. Also, it has many processors [384 dual-core], and although we don’t use all that many at a time, we sometimes are running tens of simultaneous calculations with eight processors each, so we need access to a lot of computing power.”

Reaction Pathway
Ball-and-stick representations of a transition state and two possible terpenes formed from it at a “fork in the road.” The green “sticks” in the transition state correspond to bonds being made or broken.

Within the framework of GAUSSIAN03, Tantillo relies on density-functional theory (B3LYP and other methods), a much-used approach that makes it possible to resolve with reasonable accuracy quantum calculations that would otherwise be unsolvable on a useful timescale. Tantillo and his co-workers first write the structures of FPP and the desired sesquiterpene end-product on paper, and then make educated guesses at the intermediate structures and transition states that connect them along the reaction pathway. Then, essentially, they build models of these structures in the computer and turn to GAUSSIAN03 to do the quantum heavy lifting.

“You have to be a good chemist to do this kind of work — knowledgeable and logical, but also open-minded,” says Tantillo. “You start with a carefully chosen guess at each structure — based on your experience and trained intuition. Then you use the quantum mechanics calculations to refine that. Sometimes, though, the results push you in unexpected directions.”

For each proposed structure, the software accounts for, among other things, the attraction between electrons and atomic nuclei and the repulsive forces between electrons and between the nuclei of the molecule. After what can be hundreds of iterations, it arrives at a low-energy structure and reports the geometry and energy to the researcher.
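The idea of iterating from a guessed structure toward a low-energy one can be illustrated with a deliberately tiny model. The sketch below is not what GAUSSIAN03 does internally. It assumes a single bond described by a textbook Morse potential with made-up parameters in arbitrary units, and it uses plain gradient descent in place of a quantum calculation. All names in it are hypothetical.

```python
import math

# Toy parameters in arbitrary units (assumptions, not values from the article).
WELL_DEPTH = 1.0     # D: depth of the energy well
WIDTH = 1.0          # a: controls how narrow the well is
R_EQUILIBRIUM = 1.0  # r0: bond length at the energy minimum


def morse_energy(r):
    """Morse potential energy of a single bond of length r."""
    return WELL_DEPTH * (1.0 - math.exp(-WIDTH * (r - R_EQUILIBRIUM))) ** 2


def morse_gradient(r):
    """Derivative of the Morse energy with respect to r."""
    decay = math.exp(-WIDTH * (r - R_EQUILIBRIUM))
    return 2.0 * WELL_DEPTH * WIDTH * (1.0 - decay) * decay


def optimize_bond_length(r_guess, step=0.1, tolerance=1e-8, max_iterations=10_000):
    """Step downhill from a guessed bond length until the slope is nearly zero."""
    r = r_guess
    for iteration in range(1, max_iterations + 1):
        gradient = morse_gradient(r)
        if abs(gradient) < tolerance:
            return r, morse_energy(r), iteration
        r -= step * gradient  # move against the slope, toward lower energy
    raise RuntimeError("optimization did not converge")


r_final, e_final, n_steps = optimize_bond_length(r_guess=1.5)
print(f"bond length {r_final:.6f}, energy {e_final:.2e}, after {n_steps} iterations")
```

A real calculation optimizes the positions of every atom at once and recomputes the electronic energy at each step, which is why a single structure can take hours or days rather than a fraction of a second.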

“In general,” says Tantillo, “to find a transition state structure can take from hours to weeks, usually on the order of days. So in a five-step pathway, multiply that by five. From start to finish, a project that leads to one of our papers usually takes months.”

Changes in Surprising Places

Ultimately, an intrinsic reaction coordinate (IRC) plot of the energy versus molecular structure for the entire pathway from reactant to product is constructed. The IRC plot maps out the topography of the energy surface, like a mountain range with transition states at different peaks, and the reactant, intermediates, and product molecules in valleys — minimum energy regions — down the mountainsides.
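The mountain-range picture can be made concrete with a short sketch that reads an energy profile and labels its valleys and peaks. The energies below are invented for illustration and are not data from Tantillo's calculations. The function `classify_profile` is a hypothetical helper, and a real IRC calculation traces the path on a full quantum energy surface rather than reading a list of numbers.

```python
def classify_profile(energies):
    """Return (minima, maxima) as index lists for points along a reaction path.

    Minima stand in for the reactant, intermediates, and product.
    Maxima stand in for transition states.
    """
    minima, maxima = [], []
    last = len(energies) - 1
    for i, energy in enumerate(energies):
        # Treat the ends of the path as bounded by infinitely high walls,
        # so an endpoint can count as a valley but never as a peak.
        left = energies[i - 1] if i > 0 else float("inf")
        right = energies[i + 1] if i < last else float("inf")
        if energy < left and energy < right:
            minima.append(i)
        elif energy > left and energy > right:
            maxima.append(i)
    return minima, maxima


# Hypothetical relative energies at evenly spaced points along a pathway.
profile = [0.0, 4.0, 9.0, 5.0, 2.0, 6.0, 3.0, -4.0]
valleys, peaks = classify_profile(profile)
print("valleys (reactant, intermediates, product):", valleys)
print("peaks (transition states):", peaks)
```

In this made-up profile there are two peaks separated by one interior valley, which corresponds to a two-step pathway with a single intermediate.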


The team’s calculations of FPP reacting to form various sesquiterpenes often showed that the reaction path avoided expected intermediate steps. “The IRC calculations are super important for us,” says Tantillo. “One of the things we discovered is that when you write out a reasonable mechanistic proposal it might have, say, six steps, but we often find that two or three of the steps that any self-respecting chemist would write down actually merge into one step. That was unexpected.”

This unexpected finding led them to some careful double-checking. By using the IRC procedure to follow a pathway down from a transition state to the intermediate directly connected to it, they were able to show convincingly that they had not missed intermediate molecules. They were also surprised to find, however, that changes to molecular structure that in most reactions would take place simultaneously were, for the sesquiterpene reactions, spread out along the reaction path between a transition state and the intermediates connected to it.

“Let’s say you have three events combining into one,” says Tantillo. “As the transition state is approached, you might see the geometry of the molecule changing in a way that corresponds to only one of those events, like the formation of a new carbon-carbon bond. But as you walk down the surface from the transition state to the next intermediate, you see the geometry changing in different ways that correspond to the other two events. A lot of the interesting stuff that is happening — the bonds that are being made or broken — is often happening not near the intermediate or even near the transition state structure but somewhere in between, and that’s part of the energy surface that’s often not well characterized.”

IRC Plot
The results of an IRC calculation from a terpene reaction in which two events combine into one process. The numbers on each structure are computed distances in Angstroms.

Tantillo believes that this finding might be a key to the main puzzle of terpene chemistry. “The process we have discovered avoids some intermediate structures that would lead to undesirable products,” he explains. The relative lack of intermediates, each of which could act as a fork in the road of the reaction pathway, directs the reaction toward only a few products instead of many. “That’s the biggest general theme that has come out of this research so far.”

Tantillo envisions producing hydrocarbons with carbon skeletons as complex as that in taxol simply and quickly, with just one enzymatic step, as nature does it. “Our attitude is that if we can answer the question ‘Why does one terpene-synthesizing enzyme make only one of the 400 or so possible terpene products?’ then we ought to be able to put that knowledge to use and say ‘OK, well, let’s change these few parts of the enzyme active site, and it ought to produce this different terpene.’ Successfully redesigning an enzyme, a very complicated thing to do, would be the ultimate test of our models.”