Welcome once again to Pittsburgh Supercomputing Center’s biannual report, featuring the pathbreaking projects being pursued by our users, our staff and our students. We’re very pleased in this issue to cover a broad spectrum of work, including research, technical milestones and ongoing programs to train new generations of high performance computing (HPC)-savvy scientists and engineers.
The work described in the following pages ranges from HPC staples such as cosmology (p. 10) and molecular dynamics simulations (pp. 18 and 24) to fields of study that are new to HPC, such as the humanities (p. 6), public health and financial services (p. 26). Many of these applications fit under an over-arching category, and technical challenge, of HPC: “Big Data.”
While Big Data has a number of definitions, one critical aspect is that researchers are dealing with amounts of data so vast that retrieving specific information in a practical timeframe is difficult, if not impossible, with current infrastructure and technologies. PSC is involved in a number of projects intended to broaden and deepen our capabilities to extract information—more quickly and efficiently—from the ballooning volumes of data being generated in science and in many other human endeavors.
Since our last report, PSC received a $7.6-million National Science Foundation grant to design and build the Data Exacell (DXC, p. 14). A new pilot system developed with select scientific collaborators, DXC will identify optimal methods and hardware for storing extremely large-scale data securely and accessibly, paired with specialized analytics optimized for their study. Analytic resources will include Blacklight, still the largest shared-memory supercomputer available to researchers, and Sherlock, our new graph-analytics supercomputer. (For a list of our supercomputing resources, see “PSC’s Supercomputing Resources.”)
Communication between resources—networking—is a major component of any Big Data system. PSC’s renowned networking group has been producing new software and configuring hardware systems that speed, widen and prioritize network connections to avoid bottlenecks and support the largest data users (p. 16). Our efforts in this field are second to none in the HPC community.