PSC Symposium: Bridges

 

1pm January 27, 2017
300 S Craig St, Pittsburgh PA

Please join us on Friday, January 27, 2017 for a technical symposium on our latest supercomputer, Bridges, funded by a $9.65-million grant from the National Science Foundation.

Bridges offers new computational capabilities to researchers working in diverse, data-intensive fields, such as genomics, the social sciences and humanities. Bridges represents a new way of doing business in high-performance computing. Researchers can adapt its flexible architecture to their specific needs, in effect creating a "custom supercomputer."

Bridges has already seen its first few months of use by the national scientific community. In that short time, users have reported progress in fields such as genomics, public health, chemistry, machine learning and more.

The symposium will begin at 1pm at PSC, 300 S Craig St., Pittsburgh. The program will include an overview of the system along with research presentations from the University of Pittsburgh’s Center for Causal Discovery and Carnegie Mellon’s School of Computer Science and Language Technologies Institute. 

 

Bridges Overview

 

A Converged HPC & Big Data System for Nontraditional and HPC Research

Nick Nystrom, PhD
Sr. Director of Research, PSC

Bridges is a uniquely capable, data- and memory-intensive high-performance computing (HPC) system designed to integrate HPC with Big Data, deliver “HPC-as-a-Service”, and help researchers facing challenges in Big Data to work more intuitively. Funded by the National Science Foundation, Bridges features large memory – 4 compute nodes with 12 TB of RAM each, 42 with 3 TB each, and 800 with 128 GB each – and powerful new Intel® Xeon CPUs and NVIDIA Tesla GPUs for exceptional performance. Bridges also includes database and web servers to support gateways, collaboration, and data management, as well as 10 PB of usable shared, parallel storage, plus local storage on each of its compute nodes. Bridges is the first production deployment of the Intel® Omni-Path Architecture (OPA) Fabric, which interconnects all of its compute, storage, and utility nodes. Prioritizing usability and flexibility, Bridges supports a high degree of interactivity, gateways and tools for gateway-building, and a very flexible user environment. Widely used languages and frameworks such as Hadoop, Spark, Python, R, MATLAB, and Java benefit transparently from Bridges’ large memory and its high-performance OPA fabric. Virtualization and containers enable hosting web services, NoSQL databases, and application-specific environments, enhance reproducibility, and support interoperation with clouds. Access to Bridges is available at no charge to the open research community and by arrangement to industry through PSC’s corporate programs.

 

Research Presentations

 

Graphical Causal Discovery from Big Biomedical Data

Gregory F. Cooper, MD, PhD
Director, Center for Causal Discovery
Professor of Biomedical Informatics, University of Pittsburgh

Science is centrally concerned with the discovery of causal relationships in nature. In the past 25 years there has been tremendous progress in the development of graphical methods for representing and discovering causal relationships from data, including big biomedical data. The Center for Causal Discovery (CCD) is developing and making available state-of-the-art graphical causal discovery software that is capable of analyzing very large biomedical datasets. This talk will provide an overview of the CCD and give a concrete example of the use of its tools to analyze biomedical data.

 

Recurrent Neural Networks for Time Series Prediction

Florian Metze, PhD
Associate Research Professor, School of Computer Science and Language Technologies Institute (LTI), Carnegie Mellon University

Speech recognition can be seen as a sequence-to-sequence translation task, where a sequence of input features (“frames”) is mapped to a sequence of output symbols, namely phonemes or words. In the past, work has focused on improving the frame-level predictions, while the accuracy of the actual output sequence was an indirect effect. Recurrent Neural Networks are beginning to change that, and offer computationally tractable ways of predicting entire sequences under a single optimization criterion.

In this talk, I will give a brief introduction to the theory of Recurrent Neural Networks and present their most common and interesting use cases in speech recognition, natural language processing, multimedia analysis, and other “big-data” applications. I will outline tasks and data sets, and attempt to distill the key factors that contributed to making Deep Learning so successful. I will also look to other domains that involve (time) series modeling and speculate about possible uses of Recurrent Neural Networks there.