Webinar: Neocortex CS-2 Overview
Presented on Tuesday, March 29, 2022, 2:30 – 3:30 pm (ET), by Dr. Natalia Vassilieva from Cerebras.
This webinar gives an overview of the recent upgrade of Neocortex, an NSF-funded AI supercomputer deployed at PSC, which now features two Cerebras CS-2 systems, to help researchers better understand the benefits of the new servers and the changes to the system.
The webinar recording can be found on the Neocortex portal.
|Table of Contents
|Code of Conduct
|Cerebras Wafer-Scale Engine 2
|Cerebras CS-1 and CS-2: Cluster-scale Performance in a Single System
|The Cerebras Software Platform
|Execution Mode on CS-1 for DNNs
|Execution Modes on CS-2 for DNNs
|Comparing Execution Modes
|CS-2 advantages for Pipelined
|Can fit larger models. How much larger?
|Can fit larger inputs. How much larger?
|Faster training. How much faster?
|CS-2 and Weight Streaming advantages
|Wafer Memory Management
|No layer partitioning
How do we request additional disk storage on the new CS-2 machine? And how can we identify whether the system is a CS-1 or a CS-2?
Neocortex is now CS-2 only. The storage is on the SDFlex front end, as before.
Does CS-2 enable significantly less allocation wait times (due to the availability of more cores etc)?
If the same-sized problem can be decomposed onto more processing elements, it will run faster. The larger system also makes it possible to run models that could not fit before. We cannot predict with any certainty how wait times will change, since usage patterns will change as well.
So the ability to stream weights is due to new software and more cores, not fundamental changes to the hardware?
Yes, that is right, the software stack handles how the model is mapped and the availability of more cores and bandwidth allows us to do this with bigger models.
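The idea behind weight streaming can be sketched in a few lines. The following is a conceptual illustration only, not the Cerebras API: layer weights live in external memory and are brought to the device one layer at a time, so the model that fits is bounded by external memory rather than by on-chip memory. All names, dimensions, and the toy MLP are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dims = [4, 8, 8, 2]                               # toy MLP layer widths
# Weights reside "off chip" in this list, standing in for external memory.
layers = [rng.normal(size=(a, b)) for a, b in zip(dims, dims[1:])]

def forward_weight_streaming(layers, x):
    """Forward pass that keeps only one layer's weights resident at a time."""
    act = x
    for w in layers:                      # stream the next layer's weights in
        act = np.maximum(act @ w, 0.0)    # compute with the resident weights
        # w is released here; only the activations stay resident
    return act

out = forward_weight_streaming(layers, rng.normal(size=(3, 4)))
print(out.shape)  # (3, 2)
```

The point of the sketch is that peak resident weight memory is one layer, not the whole model, which is why more cores and memory bandwidth (rather than a hardware redesign) are what make streaming larger models practical.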
Are the weights/gradients synchronized in the multi-replica setting per batch (i.e all-reduce)?
Yes, that is right.
Not sure if I understand correctly, but for multi-replica, you need to aggregate gradients and update weights iteratively, correct? If so, how often?
In a single-replica setting, updates happen every step (one pass through a batch). In multi-replica, one batch is distributed across all the replicas, and each replica processes its samples sequentially.
This question has been answered live: [43:03]
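The multi-replica aggregation described above can be illustrated numerically. This is a hedged sketch of data-parallel gradient averaging (all-reduce) for a toy linear model, not Cerebras's implementation; all names and the model are assumptions made for illustration.

```python
import numpy as np

def local_gradient(weights, x, y):
    """Gradient of mean squared error for a linear model y ~ x @ weights."""
    pred = x @ weights
    return 2 * x.T @ (pred - y) / len(x)

rng = np.random.default_rng(0)
weights = np.zeros(3)
x = rng.normal(size=(8, 3))               # one global batch of 8 samples
y = x @ np.array([1.0, -2.0, 0.5])        # synthetic targets

n_replicas = 4
x_shards = np.array_split(x, n_replicas)  # batch distributed across replicas
y_shards = np.array_split(y, n_replicas)

# Each replica processes its shard; gradients are then all-reduced (averaged).
grads = [local_gradient(weights, xs, ys) for xs, ys in zip(x_shards, y_shards)]
avg_grad = np.mean(grads, axis=0)

# With equal shard sizes, averaging the sharded gradients equals the gradient
# over the full batch, so multi-replica training matches single-replica
# training step for step.
full_grad = local_gradient(weights, x, y)
assert np.allclose(avg_grad, full_grad)

weights -= 0.1 * avg_grad                 # one synchronized update per step
```

The assertion makes the equivalence concrete: one all-reduce and one weight update per step reproduces exactly what a single replica would compute on the whole batch.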
How many weights does the U-Net have here?
Around 31 million weights.
We mentioned 3D volumes here; are you going to support more operations on these data types (video, dynamic images, etc.)?
This question has been answered live: [45:01]
Is the weight streaming mode available with PyTorch code? Can I just import my model, and ask the CS-2 to run in weight-streaming mode?
This question has been answered live: [45:51]
Why proportional to batch size? You are streaming the data in also, right?
This question has been answered live: [47:25]
How fast can weights stream onto the CS-2 chip from the external memory?
This question has been answered live: [48:25]
Is there a demo codebase and documentation we can get to utilize CS-2s?
This question has been answered live: [49:50]
Is there a way to request that certain types of models (computer vision-related) be included in the releases? I have a specific model in mind that could benefit from weight streaming.
This question has been answered live: [51:28]
Are you considering interfacing CS-2 to a quantum computer for hybrid quantum-classical processing for algorithms like Variational Quantum Eigensolver to find the ground energy state of small molecules?
This question has been answered live: [52:07]
If the model works in pipelined mode, is it likely to work with weight streaming? That way, I could check whether all the operations are supported by the CS compiler.
About the instructor
Dr. Vassilieva is the Director of Product, Machine Learning at Cerebras Systems, an innovative computer systems company dedicated to accelerating deep learning. Natalia’s main interests and expertise are in machine learning, artificial intelligence, analytics, and application-driven software-hardware optimization and co-design. Prior to Cerebras, Dr. Vassilieva was affiliated with Hewlett Packard Labs, where she led the Software and AI group from 2015 to 2019 and served as the head of HP Labs Russia from 2011 to 2015. From 2012 to 2015, Natalia also served as a part-time Associate Professor at St. Petersburg State University and a part-time lecturer at the Computer Science Center, St. Petersburg, Russia. Before joining HP Labs in 2007, Natalia worked as a Software Engineer for different IT companies in Russia from 1999 to 2007. Natalia holds a Ph.D. in Computer Science from St. Petersburg State University.
Neocortex: An Innovative Resource for Accelerating AI and HPC Development for Rapidly Evolving Research
All Campus Champions Community Call Presentation
This presentation gives an overview of Neocortex for the Campus Champions community. Neocortex is an NSF-funded AI supercomputer at PSC. Neocortex targets the acceleration of AI-powered scientific discovery by vastly shortening the time required for deep learning training and fostering greater integration of deep learning with scientific workflows.
This webinar presents the Spring 2023 Call for Proposals and gives a system overview of Neocortex.
This webinar gives an overview of the recent Neocortex system upgrade, which now features two Cerebras CS-2 systems, to help researchers better understand the benefits of the new servers and the changes to the system.