Early Success on PSC’s Game-Changing Computational System
May 31, 2016
Scientists have reported progress in fields such as genomics, public health, chemistry, machine learning and more in the first two months of use for PSC’s new supercomputer, Bridges. As of May 26, PSC had allocated time for 245 projects on Bridges, with many more expected.
Rather than just an incremental improvement in performance, Bridges represents a new way of doing business in high-performance computing (HPC). Researchers can adapt its flexible architecture to their specific needs, in effect creating a “custom supercomputer.” Users can select from the system’s features including massive computing power, components optimized for different types of computation and common computational tools not normally found on supercomputers, such as databases and popular Big Data software packages. And all of this comes with an unsurpassed ease of entry for researchers who never before needed HPC. Bridges’ early successes include:
- Bridges’ first users were the infectious disease experts of the National Institutes of Health-funded MIDAS network. In their first Public Health Hackathon at PSC in February, they gave twelve teams from across the U.S. and India the task of using Bridges to visualize data in a way that transformed understanding of an issue in public health. A team from Carnegie Mellon University’s Department of Statistics took first place with their SPEW VIEW tool, which maps the historical spread of diseases in the U.S. Teams from the University of Pittsburgh Department of Biomedical Informatics and PSC’s Public Health Applications Group took second and third place, respectively. Intel® and Dell sponsored the hackathon.
- Wenxuan Zhong and Xin Xing of the University of Georgia used Bridges to assemble 378 billion base pairs of bacterial DNA from the intestines of healthy patients and those with diabetes. Such “metagenome assembly” doesn’t even try to chemically separate the DNA from many microbial species in a sample. Instead, the scientists sequence short DNA fragments of all the species at once, using computation to sort out the different microbes’ sequences as they assemble them. This massive task leveraged Bridges’ Intel® Omni-Path Architecture internal connections (the first such installation in the world), linking 20 computational nodes to finish the calculation in a blistering 16 hours. The team is now using Bridges to test a new statistical method on the sequence data to identify critical differences in gut microbes associated with diabetes.
- Timothy Hele and Eric Fuemmeler of Cornell University used Bridges to calculate the electronic structure of TIPS pentacene, a large organic semiconductor molecule with applications in solar power cells. On Bridges, they were able to run two major computations in one job with the same memory requirements, saving time and improving the productivity of their research. Previously, they hadn’t been able to run those calculations on any other supercomputer available to them.
- The PSC Public Health Applications Group’s own Jay DePasse has used Bridges to model the possible benefits of flu vaccine choice in Washington, D.C., Allegheny County, Pa., and Salt Lake City. DePasse used “agent-based modeling,” in which every person in an area is represented by a realistic virtual human in the simulation. He tested the results of offering both adults and children a choice between the new quadrivalent vaccine, which protects against more strains but is also more expensive, and the earlier trivalent vaccine. His initial results suggest that offering vaccine choice to everyone would be more cost-effective than alternatives such as no choice of vaccine, choice offered to children only and choice offered to adults only.
- James Denvir and Swanthana Rekulapa of Marshall University in West Virginia assembled the genetic sequences of two species, the Narcissus flycatcher and the critically endangered Sumatran rhinoceros. They used a “de novo assembly” method, in which scientists first sequence millions of small fragments of DNA. Then the method uses brute computational force to piece together the fragments’ order via the sequences where they overlap. Using Bridges’ 3-terabyte large memory nodes, the researchers pieced together the 1 billion DNA bases of the bird genome in 6.6 hours, almost five times faster than possible with other available resources. The rhino assembly was also faster than possible elsewhere, with 3 billion bases assembled in 11 hours.
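For readers curious how fragments can be ordered "via the sequences where they overlap," the idea can be sketched in a few lines of Python. This is a toy illustration only, not the software the Marshall team ran: the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, and production assemblers rely on far more scalable graph-based methods and on error correction.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Toy greedy assembly: repeatedly merge the pair with the longest overlap.

    Illustrative assumption: fragments are error-free strings over A/C/G/T.
    Returns the list of sequences left when no overlaps remain.
    """
    # Remove duplicates and fragments wholly contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        # Brute force: compare every ordered pair of fragments.
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


if __name__ == "__main__":
    reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
    print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Comparing every pair of fragments is what makes the brute-force approach so demanding: the work and the memory grow rapidly with the number of fragments, which is why large-memory nodes help at the scale of millions of reads.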
Bridges has also powered two workshops to help scientists learn to use the system for Big Data projects:
- The March 31 “Introduction to Bridges” workshop gave 184 researchers at 15 national sites in the NSF XSEDE supercomputing network a primer on the new system and how to use it.
- In an April 5 “Big Data” workshop, 243 researchers at 15 national sites in the NSF XSEDE supercomputing network made extensive use of Bridges, employing popular Big Data tools such as Hadoop and Spark.
About Bridges: Supported by a $9.65-million National Science Foundation award, Bridges is being delivered in a two-phase process by HPE (Hewlett Packard Enterprise) based on an architecture designed by PSC. The initial installation began operation in March 2016; the complete system will come online this fall. Using software developed at PSC and hardware developed by HPE, Intel® and NVIDIA, Bridges provides flexible performance capabilities, combining large-memory nodes that can analyze huge amounts of data at unprecedented speeds with hundreds of smaller nodes for analyses that can be partitioned. This highly specialized system, consisting of HPE Apollo 2000 scale-out, density-optimized servers; ProLiant DL580 scale-up, four-socket servers; and Integrity Superdome X scale-up servers with 12 TB of memory and 2 to 16 sockets, was made possible through the HPE and Intel® HPC Alliance.