Data-driven computational pipelines on Bridges-2

June 26, 2023

2:00 pm – 3:00 pm Eastern time


Join us for this webinar describing how to use data-driven computational pipelines on Bridges-2.

Ivan Cao-Berg, Pittsburgh Supercomputing Center

In the era of big data, computational pipelines have become indispensable for efficiently processing and analyzing vast amounts of data. With the advent of high-performance computing systems like Bridges-2, researchers now have access to unprecedented computing power and resources. However, designing and executing data-driven computational pipelines on such systems can be challenging.
This presentation explores the advantages and some use cases of three popular workflow management systems, Nextflow, Snakemake, and cwltool, all within the context of Bridges-2. These systems provide a streamlined approach to building scalable and reproducible computational pipelines for processing biological data.
Additionally, we will discuss best practices for deploying these systems on Bridges-2, including resource management, job scheduling, and data management strategies. We will also address the challenges encountered when integrating these workflow management systems with Bridges-2’s unique features and constraints, along with potential solutions.
By the end of this presentation, attendees will have a general understanding of Nextflow, Snakemake, and cwltool, and of how these frameworks can empower researchers to build robust and scalable data-driven computational pipelines on Bridges-2.

About Ivan Cao-Berg

Ivan is a research software specialist in the Biomedical Applications Group, tinkering with technology in science-related projects. At the moment, Ivan is involved in several projects, including HuBMAP, The Brain Image Library, and SenNet, and on occasion with the National Center for Multiscale Modeling of Biological Systems.