Data-driven computational pipelines on Bridges-2
June 26, 2023
2:00 pm – 3:00 pm Eastern time
Join us for this webinar describing how to use data-driven computational pipelines on Bridges-2.
Ivan Cao-Berg, Pittsburgh Supercomputing Center
In the era of big data, computational pipelines have become indispensable for efficiently processing and analyzing vast amounts of data. With the advent of high-performance computing systems like Bridges-2, researchers now have access to unprecedented computing power and resources. However, designing and executing data-driven computational pipelines on such systems can be challenging.
This presentation explores the advantages and some use cases of three popular workflow management systems, Nextflow, Snakemake, and cwltool, all within the context of Bridges-2. These systems provide a streamlined approach to building scalable and reproducible computational pipelines for processing biological data.
Additionally, we will discuss best practices for deploying these systems on Bridges-2, including resource management, job scheduling, and data management strategies. We will also address the challenges encountered when integrating these workflow management systems with Bridges-2’s unique features and constraints, along with potential solutions.
By the end of this presentation, attendees will have a general understanding of Nextflow, Snakemake, and cwltool, and of how these frameworks can empower researchers to build robust and scalable data-driven computational pipelines on Bridges-2.