PSC Machine Timeline
In keeping with its mission of providing cost-effective leading-edge capability to the national scientific community, PSC is continuously implementing more powerful and more cost-effective platforms. Here’s a look back at the wide array of computing resources we’ve supported.
Cray X-MP/48, 1986 – 1989
The original machine at Pittsburgh Supercomputing Center, the X-MP could perform up to 840 million arithmetic operations every second. It had eight million words of memory and was connected to sixteen DD-49 disk drives. It was also equipped with a 128-million-word SSD (solid-state storage device). This SSD could transfer data to the main processor 100 times faster than the disks and effectively expanded the X-MP’s memory to 128 million words. The X-MP had four independent processors, each of which had fourteen independent functional units.
Cray Y-MP/832, 1989 – 1993
With three times the power and four times the memory of the X-MP, this machine was the next big thing in supercomputing. The Y-MP retained software compatibility with the X-MP, but extended the address registers from 24 to 32 bits. It used high-density VLSI ECL technology and a newly devised liquid cooling system, and it ran the Cray UNICOS operating system. The Y-MP could be equipped with two, four or eight vector processors, each with two functional units and a clock cycle time of 6 ns (167 MHz), giving a peak performance of 333 megaflops per processor. Main memory comprised 128, 256 or 512 MB of SRAM.
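The per-processor peak quoted above follows directly from the clock rate and the number of results produced per cycle. The short Python sketch below redoes that arithmetic as a rough illustration; it assumes one floating-point result per functional unit per clock, and is not a figure from Cray documentation.

```python
# Back-of-the-envelope check of the Y-MP per-processor peak quoted above.
# Assumes one floating-point result per functional unit per clock cycle;
# purely illustrative, not a Cray specification.

clock_cycle_ns = 6.0                                  # clock cycle time
clock_rate_hz = 1.0 / (clock_cycle_ns * 1e-9)         # ~167 MHz
functional_units = 2                                  # per vector processor
peak_per_cpu = clock_rate_hz * functional_units       # results per second

print(f"clock rate  : {clock_rate_hz / 1e6:.0f} MHz")
print(f"peak, 1 CPU : {peak_per_cpu / 1e6:.0f} Mflop/s")      # ~333
print(f"peak, 8 CPUs: {8 * peak_per_cpu / 1e9:.2f} Gflop/s")
```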
Connection Machine CM-2, 1990 – 1992
The CM-2 was the first major component of PSC’s plan for heterogeneous computing. With 32,000 separate processing units, the CM-2 was a “massively parallel” computer, in a sense the antithesis of the Cray Y-MP with its 8 extremely powerful processors. Each CM-2 processor was less powerful than a personal computer, but for appropriate problems the machine attained supercomputer performance through a team approach: all 32,000 processors could compute simultaneously, each working on an independent segment of the job.
A high-speed link connected the CM-2 and the Y-MP, allowing users to divide tasks between the machines, taking advantage of the strengths of each for appropriate parts of their research problems.
Connection Machine CM-5, 1992 – 1993
The Connection Machine CM-5 was a major step in massively parallel computing, moving beyond its predecessor, the CM-2, in a number of ways. Perhaps the most significant was its ability to operate either as a lock-step, SIMD (single instruction, multiple data) data-parallel machine like the CM-2, in which all processors execute identical instructions, or as a MIMD (multiple instruction, multiple data) machine, in which each processor applies its own set of instructions to its own data.
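The distinction can be pictured with a purely schematic sketch. The Python below is illustrative only (it is not Connection Machine code), with made-up data and worker programs standing in for processors.

```python
# Schematic illustration of the SIMD/MIMD distinction described above.
# Plain Python standing in for 8 "processors"; not Connection Machine code.

data = [3, 1, 4, 1, 5, 9, 2, 6]        # one element per processor

# SIMD / data-parallel: every processor executes the same instruction,
# in lock step, on its own element of the data.
simd_result = [x * x for x in data]

# MIMD: each processor runs its own program on its own data.
programs = [
    lambda x: x * x,      # processor 0 squares
    lambda x: x + 100,    # processor 1 adds an offset
    lambda x: -x,         # processor 2 negates
    lambda x: x % 2,      # processor 3 tests parity
] * 2                     # repeat to cover all 8 elements
mimd_result = [prog(x) for prog, x in zip(programs, data)]

print("SIMD:", simd_result)
print("MIMD:", mimd_result)
```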
The CRAY C90, 1992 – 1999
The PSC’s CRAY C90 (or, more correctly, C916/512), nicknamed Mario, ran UNICOS, based on AT&T UNIX System V, with Berkeley extensions and Cray Research, Inc. enhancements. Compared to the Y-MP, its predecessor at PSC, the C90 processor had a dual vector pipeline and a faster 4.1 ns clock cycle (244 MHz), which together gave three times the performance of the Y-MP processor. The maximum number of processors in a system was also doubled from eight to 16. The C90 series used the same Model E IOS (Input/Output Subsystem) and UNICOS operating system as the earlier Y-MP Model E.
PSC was the first non-government site in the U.S. to receive a Cray C90.
The CRAY T3D, 1993 – 1999
The CRAY T3D system was the first in a series of massively parallel processing (MPP) systems from CRAY Research. PSC’s T3D prototype machine was tightly coupled to CRAY Y-MP and C90 systems through a high speed channel, creating a powerful heterogeneous environment.
AlphaCluster, 1995 – 1998
The PSC’s AlphaCluster was a collection of DEC Alpha workstations that offered a supercomputing-class resource. With its ability to execute both single-threaded and distributed codes, the Cluster complemented the PSC’s Cray C90 vector machine and massively parallel Cray T3D. Applications suited to a single high-performance scalar machine, or to a loosely coupled set of such machines, were good candidates for the AlphaCluster.
The CRAY J90s, 1995 – 2002
The last of PSC’s Cray J90s was decommissioned in July 2002.
The CRAY T3E, 1996 – 2004
PSC’s T3E, nicknamed Jaromir, was the first production T3E shipped from Cray Research, Inc. The T3E was a scalable, massively parallel distributed memory machine using a 3D torus topology interconnection network. The initial 256-processor configuration expanded to 512 processors. Over the span of its service Jaromir provided 25 million CPU hours to more than 3,000 researchers.
The Sequence Analysis Resource, 1999 – 2006
The PSC Sequence Analysis Resource, an AlphaServer 8400 5/300 system, supported much of PSC’s early intensive efforts in bioinformatics.
Lemieux, 2001 – 2006
The Terascale Computing System, also known as Lemieux, comprised 610 Compaq AlphaServer ES45 nodes and two separate front-end nodes. Each computational node was a four-processor SMP with 1-GHz Alpha EV68 processors and 4 Gbytes of memory. A dual-rail Quadrics interconnect linked the nodes.
Lemieux was primarily intended to run applications with very high levels of parallelism or concurrency (512 – 2048 processors).
At the time of its installation in 2001, Lemieux was the most powerful system in the world committed to unclassified research.
Marvels, 2003 – 2008
Jonas, named for Jonas Salk, and Rachel, named for Rachel Carson, were GS1280 AlphaServers from Hewlett-Packard. They had a shared memory architecture and exceptional “memory bandwidth” (the speed at which data moves between main memory and the processor), five to ten times greater than that of comparable systems of the time. Rachel and Jonas were among the first GS1280s to roll out of HP production.
When they arrived at PSC in 2003, each had 32 Gbytes of shared memory and 16 EV7 processors. By 2008, the Rachel and Jonas systems had each grown to be a loosely coupled set of machines. Each machine held 64 processors and 256 Gbytes of shared memory. Jonas was dedicated to biomedical research, while Rachel supported NSF science and engineering.
Cray XT3, 2005 – 2010
Nicknamed Bigben, the Cray XT3 MPP system had 2068 compute nodes linked by a custom-designed interconnect. Each node contained one dual-core 2.6 GHz AMD Opteron processor (model 285). Each core had its own cache, but the two cores on a node shared 2 Gbytes of memory and a network connection. Nineteen dedicated IO nodes were also connected to this network.
Bigben was primarily intended to run applications with very high levels of parallelism or concurrency (512-4136 cores).
Pople, 2008 – 2011
Pople was an SGI Altix 4700 shared-memory NUMA system comprising 192 blades. Each blade held 2 Itanium2 Montvale 9130M dual-core processors, for a total of 768 cores. Each core had a clock rate of 1.66 GHz and could perform 4 floating point operations per clock cycle, bringing the total floating point capability of Pople to 5.1 Tflops.
The four cores on each blade shared 8 Gbytes of local memory. The processors were connected by a NUMAlink interconnect. Through this interconnect, the local memory on each processor was accessible to all the other processors on the system.
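As a quick consistency check of the figures quoted above, the Python sketch below recomputes Pople’s peak and per-core memory from the numbers given in this description; the values are taken from the timeline text, not from SGI documentation.

```python
# Recomputing Pople's peak floating-point figure from the numbers above.
# Values come from the description in this timeline, not from SGI documentation.

blades = 192
cores_per_blade = 4            # two dual-core Itanium2 processors per blade
clock_hz = 1.66e9              # 1.66 GHz
flops_per_cycle = 4            # floating-point operations per clock per core

total_cores = blades * cores_per_blade                        # 768
peak_tflops = total_cores * clock_hz * flops_per_cycle / 1e12

mem_per_blade_gb = 8
print(f"{total_cores} cores, peak ~{peak_tflops:.1f} Tflop/s")            # ~5.1
print(f"{mem_per_blade_gb // cores_per_blade} Gbytes of local memory per core")
```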
Salk, 2008 – 2015
Salk was an SGI Altix 4700 shared-memory NUMA system dedicated to biomedical research. It comprised 36 blades; each blade held 2 Itanium2 Montvale 9130M dual-core processors, for a total of 144 cores.
The four cores on each blade shared 8 Gbytes of local memory. The processors were connected by a NUMAlink interconnect. Through this interconnect the local memory on each processor was accessible to all the other processors on the system.
Warhol, 2009 – 2013
Warhol was an 8-node Hewlett-Packard BladeSystem c3000. Each node had 2 Intel E5440 quad-core 2.83 GHz processors, for a total of 64 cores on the machine. The 8 cores on a node shared 16 Gbytes of memory, and the nodes were interconnected by an InfiniBand communications link.
Warhol was provided as a resource for researchers in Pennsylvania.
Blacklight, 2010 – 2015
Blacklight was an SGI Altix UV 1000 supercomputer designed for memory-limited scientific applications in fields as diverse as biology, chemistry, cosmology, machine learning and economics. Funded by the National Science Foundation (NSF), Blacklight carried out this mission with partitions offering as much as 16 terabytes of coherent shared memory.