More Power to Us

Greenfield, Bridges Models Pinpoint Inefficiencies in Electric Power Storage

Why It’s Important:

If everyone used electricity at a constant rate, generating power would be simple. Generators that supply a fixed level of energy would be fairly cheap to run. But people use electricity at very different rates during the day. Spikes in use, on a hot afternoon for example, have to be met by firing up seldom-used and expensive generators. It’s even more complicated for commercial and industrial customers: They’re charged not just for the total energy they use but also for their peak usage. Recent drops in the cost of batteries have made it possible for those customers to smooth out their need for grid power, potentially saving money for everyone. Michael Fisher, working with faculty advisor Jay Apt at Carnegie Mellon University, set out to understand how different scenarios and assumptions about batteries and energy use would affect the economics for commercial users of “behind the meter” (BTM) batteries, which belong to users or third parties rather than power companies.

“We wanted to investigate how BTM battery systems would be used in a commercial application. Residential users just pay energy charges; but commercial users pay for their peak usage as well as how much energy they use ... These peak charges can be up to 50 percent of an industrial or commercial customer’s bill. Batteries can make sense in mitigating those charges.”
—Michael Fisher, Carnegie Mellon University
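
The billing structure Fisher describes can be sketched in a few lines of Python. This is only an illustration, not the researchers' model: the rates, the one-day load profile and the lossless battery are all assumptions made up for the example, and the function names `bill` and `shave_peaks` are hypothetical.

```python
def bill(load_kw, energy_rate, demand_rate, interval_h=0.25):
    """Energy charge on total kWh plus a demand charge on the highest kW reading.

    load_kw holds one average-power reading per 15-minute interval.
    Both rates are hypothetical inputs, not values from the study.
    """
    energy_kwh = sum(load_kw) * interval_h
    peak_kw = max(load_kw)
    return energy_kwh * energy_rate + peak_kw * demand_rate


def shave_peaks(load_kw, cap_kw):
    """Idealized, lossless battery: clip grid draw at cap_kw and repay the
    clipped energy evenly across the intervals that were below the cap.

    Assumption: the repayment is small enough that recharging intervals
    stay under the cap; this sketch does not check for that.
    """
    clipped = sum(max(kw - cap_kw, 0.0) for kw in load_kw)
    below = [i for i, kw in enumerate(load_kw) if kw < cap_kw]
    repay = clipped / len(below) if below else 0.0
    return [kw + repay if kw < cap_kw else min(kw, cap_kw) for kw in load_kw]


# One made-up day: steady 100 kW with a one-hour spike to 200 kW.
load = [100.0] * 92 + [200.0] * 4
shaved = shave_peaks(load, cap_kw=150.0)
print(bill(load, energy_rate=0.10, demand_rate=15.0))
print(bill(shaved, energy_rate=0.10, demand_rate=15.0))
```

In this toy case the battery leaves the energy charge unchanged and lowers only the demand charge, which is the effect the quote points to. Real batteries lose energy on each charge and discharge cycle, which this sketch ignores.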

How PSC and XSEDE Helped:

The CMU researchers used PSC’s interim Greenfield/DXC system and the new XSEDE-allocated Bridges system to model how a fleet of BTM batteries would behave under different assumptions using meter data from 665 commercial and industrial buildings. With help from XSEDE Extended Collaborative Support Service expert Roberto Gomez of PSC, they moved their model, which they’d built on personal computers using the popular numerical computing software MATLAB, onto the PSC supercomputers. Unusually for a supercomputer, both of these systems run MATLAB directly and so don’t require rewriting the software. The Greenfield run allowed the group to identify the factors most likely to affect the economics of the batteries. In later runs, Bridges’ size allowed the investigators to run computations on many different buildings in parallel, greatly speeding the calculations. The time savings made it possible for the investigators to test many more possible scenarios. The Bridges work showed that most of the wasted energy associated with battery storage (measured via pollutant emissions) stemmed from internal energy losses in the batteries and not the timing of charging and discharging. This points toward what technology improvements may be necessary to make BTM batteries economical in more markets, and the regulatory environment necessary to encourage their development. A paper describing the work is now in the second round of review at an academic journal.

“We forecast what the [electric] load would be for part of the day ... to see what the optimal step would be for the next 15-minute period ... and continued over the whole year. That’s 35,000 steps for each of 665 buildings ... Each step didn’t take very long, but it adds up. It would have taken months to run on a laptop. With Greenfield and Bridges we were able to do it in an hour.”

—Michael Fisher, Carnegie Mellon University
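
The size of that workload, and why it parallelizes well, can be shown with a short Python sketch. It makes several assumptions: the simple threshold rule in `run_building` is a hypothetical stand-in for the forecast-based optimization the researchers ran in MATLAB, the battery is lossless, and the thread pool only illustrates that each building is independent of the others (on a supercomputer each building would go to its own core).

```python
from concurrent.futures import ThreadPoolExecutor

INTERVAL_H = 0.25                  # one decision per 15-minute period
STEPS_PER_YEAR = 4 * 24 * 365      # 35,040, the "35,000 steps" in the quote
N_BUILDINGS = 665
TOTAL_STEPS = STEPS_PER_YEAR * N_BUILDINGS


def run_building(load_kw, cap_kw, capacity_kwh):
    """Step through one building's load and return its peak grid draw in kW.

    Hypothetical rule: discharge when load exceeds cap_kw, recharge when it
    is below, limited by the energy currently stored in the battery.
    """
    soc = capacity_kwh             # state of charge in kWh, starting full
    grid = []
    for kw in load_kw:
        if kw > cap_kw:
            discharge = min(kw - cap_kw, soc / INTERVAL_H)
            soc -= discharge * INTERVAL_H
            grid.append(kw - discharge)
        else:
            charge = min(cap_kw - kw, (capacity_kwh - soc) / INTERVAL_H)
            soc += charge * INTERVAL_H
            grid.append(kw + charge)
    return max(grid)


def run_fleet(loads, cap_kw, capacity_kwh, workers=4):
    """Buildings do not interact, so each one can be handed to its own worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda load: run_building(load, cap_kw, capacity_kwh), loads))


print(TOTAL_STEPS)
```

Because no building's result depends on another's, the total run time shrinks roughly in proportion to the number of workers available, which is what made the larger Bridges system useful here.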

Gut Check

Bridges Reveals Interaction between Diabetes and Digestive-Tract Microbes

Sept. 7, 2016

Why It's Important:

Many people appreciate how serious diabetes is: it can cause blindness, nerve pain and lack of circulation in the limbs that leads to amputation, to say nothing of heart attack or stroke. But patients with diabetes also have a high risk of acid reflux, abdominal pain, nausea, ulcers, Candida infections and diarrhea. "Diabetic gut" poses a mix of digestive problems that causes suffering, disability and great medical expense.

One very important aspect of diabetes is that it affects the population of microbes that live in healthy intestines, the "microbiome." Both the species of microbes and their relative abundances can change, leading to a vicious cycle in which diabetes alters the microbiome, which in turn leads to more changes. Worse, doctors don't really know much about how the microbial population changes in diabetes, or whether these changes can in turn affect the course of the disease elsewhere in the body. Many of the species of gut bacteria involved, helpful and harmful, can't be grown in the laboratory and so have never been identified.

"People have invested a lot of money and effort on the genetic factors that affect diabetes. There have not been so many studies on environmental factors such as the microbiome. We know that the microbial distribution in diabetes and the healthy gut are different. What we'd like to know is the causalitywhat changes are caused by the disease and how changes affect the course of the disease."

Wenxuan Zhong, University of Georgia

How PSC and XSEDE Helped:

Scientists Wenxuan Zhong and Ping Ma of the University of Georgia's Franklin College, with Zhong's graduate student Xin Xing, set out to identify and sequence the DNA of virtually all the microbes in the healthy and diabetes-affected human gut. Working with XSEDE Extended Collaborative Support Service expert Phil Blood of PSC, they used advanced DNA sequencing technology and PSC's Bridges supercomputer to assemble DNA from many microbial species in human fecal samples at once. In this method, called "metagenomic assembly," researchers use massive computation to sort the DNA-sequence fragments into their proper species at the same time that they assemble each species' total DNA sequence, its genome.

The Georgia team has now used Bridges to run their new computer algorithm, assembling all the DNA sequences extracted from samples from 145 people with and without diabetes. They have identified about 2,100 microbial species' genomes in these samples. Two of these species appear to be present in lower numbers in patients with type 2 diabetes than in people without the disease. One of these two microbes is closely related to Roseburia intestinalis, a helpful bacterium known to strengthen immune responses that are weak in people with diabetes. Future goals include using such differences to glean clues to treating and preventing gastrointestinal problems in diabetes.

"I think Bridges has a great capability to help us finish our metagenomic assemblies. Overall it's been a great experiencethe large memory nodes have been really, really helpful for usand the human support is very good. Dr. Blood has been very helpful in optimizing the code."

Xin Xing, University of Georgia

Deeper Dive: How Do You Untangle Thousands of Genomes?

The usual method for assembling genetic sequences from small DNA fragments is to line up the fragments where their sequences overlap, stringing together the whole genome bit by bit. Like fitting puzzle pieces together, it's a memory-intensive process. When the computer looks at a new sequence, it must remember whether earlier fragments contained a matching sequence. The University of Georgia scientists used Bridges' 3-terabyte large memory nodes to carry this task out.
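
The overlap idea can be illustrated with a toy Python sketch. It is not the Georgia team's algorithm: it is a plain greedy merge on made-up reads, and the function names and the `min_len` threshold are assumptions for the example. Production assemblers use graph and index structures rather than this all-pairs comparison, and it is those in-memory structures that make the task so memory hungry.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if none)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))      # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break                           # nothing left to join
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads


# Three made-up fragments that chain together end to end.
print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))
```

Every read is compared against every other read on each pass, so the cost grows quickly with the number of fragments. That is fine for three reads and hopeless for millions, which is why real tools trade computation for large in-memory indexes.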

Because the metagenomic assembly task contains DNA fragments from numerous microbe species, though, they needed other clues as well to avoid assigning a given sequence to the wrong species. With Blood's help, they wrote a computational algorithm to run on Bridges that used clues such as the known sequences from related, previously studied microbes; the amount of DNA from each species versus its portion of the whole population; and other pieces of information, to sort the fragments and assemble each species' genome.

Wild Things

Bridges Connects Evolutionary Biologists with Genomes of Wild Species

July 13, 2016

Why the Sumatran Rhinoceros Is Important

The Sumatran rhinoceros, as first examined and drawn by William Bell in 1793. By William Bell [Public domain], via Wikimedia Commons.
Depressing but true: things don’t look good for the Sumatran rhinoceros. This unique tropical species is all but extinct in the wild. To make matters worse, it isn’t doing well in zoos either. Recently the world zoo community started shipping its rhinos to Indonesia, so that the surviving captive animals can be maintained in a central location, in a climate more suitable to their survival in captivity.

Which is tragic, because the Sumatran rhino almost certainly has a lot to tell us about evolutionary survival and other species’ responses to climate change. Its closest relative is the Ice Age woolly rhinoceros of Eurasia. Almost certainly, the ancestors of the Sumatran rhinoceros started out as a cool-weather grassland species that adapted to a warmer and wetter climate. Herman Mays, an evolutionary biologist at Marshall University in West Virginia, teamed up with Jim Denvir, co-director of the genomics core facility at Marshall’s School of Medicine, to use PSC’s Bridges system to piece together the DNA sequence of this animal before it vanishes, while it may yet offer clues to the species’ survival.

“The whole-genome approach is a whole new world for evolutionary biologists studying wild animals. We can look at functional differences in the entire genome at once. This allows us to look at how species specialized and how they got to be the way they are today.”

—Herman Mays, Marshall University

Why the Narcissus Flycatcher Is Important

Narcissus Flycatcher (Ficedula narcissina) in Osaka, Japan. By Kuribo via Wikimedia Commons
The Narcissus flycatcher is a bird with multiple personalities. Most populations spend summers in high latitudes, as far north as the island of Sakhalin in Russia, then migrate south to South China, Indochina and Borneo for the winter. But there’s another population in the southern Ryukyu Islands of Japan that doesn’t follow these rules. Enjoying the more consistent, warm subtropical climate of these lower latitudes, the Ryukyu population doesn’t migrate.

This unique case—a single species, with both migratory and non-migratory populations—offers biologists insights into the genetic traits that have evolved to make long-distance migration possible. A team led by Herman Mays and Jim Denvir of Marshall University decided to apply advanced sequencing techniques and PSC’s Bridges system to create the first genetic sequence for the Narcissus flycatcher. Their aim is to understand how climate changes and genes interact to create this amazing phenomenon of migration.

“Working with Bridges I got a lot of help from [PSC’s] Phil Blood. He was very helpful in writing the code, as I didn’t have experience working on the [Bridges] environment. He also helped me determine how to do the job, estimate how long it would take and how to monitor its status.”

—Swanthana Rekulapally, Marshall University

How PSC and XSEDE Helped

Swanthana Rekulapally and undergraduate Megan Justice, working in Denvir’s team at the Marshall genomics facility, began with the flycatcher, the smaller of the two genomes. The bird genome has 1 billion DNA bases, while the rhino genome has 3.3 billion.

But the team soon ran into some serious problems with the computing resources they had available. These limitations had to do with the nature of the “brute force” sequence assembly they needed to perform. Most “high-throughput” sequencing technology can only read 200 to 250 base pairs of nucleic acid at a time. When assembling a billion-base genetic sequence, you get a huge jumble of tens of millions of overlapping DNA “reads” that you need to piece together by computer, much as a person would assemble the pieces of a jigsaw puzzle.
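
A rough calculation shows where "tens of millions" comes from. The Python sketch below uses the genome size and read lengths given above; the coverage value (how many times, on average, each base is read) is an assumption chosen for illustration, since the article does not state the sequencing depth the team used.

```python
import math


def reads_needed(genome_bases, read_len, coverage):
    """Number of reads so that each base is covered `coverage` times on average."""
    return math.ceil(coverage * genome_bases / read_len)


FLYCATCHER_BASES = 1_000_000_000   # 1 billion bases, from the article
ASSUMED_COVERAGE = 10              # hypothetical depth, not from the article

for read_len in (200, 250):
    print(read_len, reads_needed(FLYCATCHER_BASES, read_len, ASSUMED_COVERAGE))
```

Even a single pass over the genome takes millions of reads at these lengths, and practical assemblies need each region read many times over so that neighboring fragments overlap.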

For well-studied “model” species, such as humans, lab mice, fruit flies and the like, scientists already know the genetic sequence. A researcher assembling the genome of a new individual or a related species can use the known sequence to guide the assembly, much as we’d use the picture on the cover of a jigsaw puzzle box as a guide. But there aren’t existing assembled genomes for these wild species. There’s no box cover. Instead of a jigsaw puzzle, the task becomes more like the game Concentration. To find the overlaps, the computer needs to remember all the sequences it’s already looked at when it looks at a new DNA fragment. The more it can hold in its memory at once, the faster it can assemble the genome.
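
The "remember what you've already seen" step is essentially a lookup table held in memory. The Python sketch below is a minimal illustration under stated assumptions, not the software the Marshall team ran: it indexes each read by its first `k` bases so that a suffix of a new read can be matched with a dictionary lookup instead of a rescan of every read. The function names and the choice of `k` are hypothetical.

```python
from collections import defaultdict


def build_prefix_index(reads, k):
    """Map the first k bases of each read to the ids of reads starting with them."""
    index = defaultdict(list)
    for rid, read in enumerate(reads):
        if len(read) >= k:
            index[read[:k]].append(rid)
    return index


def find_successors(rid, reads, index, k):
    """Reads whose start matches a suffix of reads[rid] by at least k bases.

    Returns (other_read_id, overlap_length) pairs.
    """
    read = reads[rid]
    hits = []
    for start in range(1, len(read) - k + 1):
        suffix = read[start:]
        for cand in index.get(suffix[:k], []):
            if cand != rid and reads[cand].startswith(suffix):
                hits.append((cand, len(suffix)))
    return hits


reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]      # made-up fragments
index = build_prefix_index(reads, k=4)
print(find_successors(0, reads, index, k=4))
```

The index grows with the number of reads, so for tens of millions of fragments it can only be kept entirely in RAM on a machine with a very large memory. Once it no longer fits, lookups slow down sharply or the job fails outright.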

“Almost everything we do at our core facility has medical applications. We tend to work with human samples or mouse samples in which the genome is much better annotated and known. Studying rhinos and songbirds was a nice departure.”

—Jim Denvir, Marshall University

The supercomputers available to the Marshall scientists had 500 gigabytes of shared memory. This is powerful compared with the 16 gigabytes of RAM you might find on a high-end laptop computer. But it wasn’t enough—the flycatcher assembly kept crashing the system by running out of memory. That’s when Jack Smith, the National Science Foundation XSEDE network’s campus champion at Marshall, suggested they check out the supercomputers available through XSEDE. With help from XSEDE Extended Collaborative Support Service member and genomics expert Phil Blood at PSC, the team reviewed XSEDE’s resources. They decided the 3-terabyte (3,072 gigabyte) large memory nodes of PSC’s new Bridges system were what they needed.

With Blood’s help adapting their software to the Bridges environment, the large memory nodes did the trick: The flycatcher genome assembly, which had crashed other systems, finished in only 6.6 hours. That’s at least five times faster than the failed assembly had been going, and much faster than they expected. Another surprise was that the rhino genome assembly went faster as well, with more than three times as much sequence assembled in 11 hours.

Now that the scientists have their data, they can begin analyzing it to answer a number of critical evolutionary questions: What genes contribute to a species adapting from one environment to another, and how does long-distance migration evolve? What traits contribute to a species being more or less resilient to climate change? Does looking at the entire genome at once give a fuller picture of why some species survive while others don’t? And can we use that knowledge to save more wild species?