Cornell Theory Center Becomes TeraGrid Science Gateway

January 24, 2007 - The Cornell Theory Center (CTC), an interdisciplinary research center at Cornell University focused on providing cyberinfrastructure resources for research and education, today announced its connection to and partnership with the National Science Foundation’s TeraGrid. As a TeraGrid Science Gateway Partner, CTC will initially provide data from the Arecibo Observatory to the national community as an integrated service provider within the TeraGrid facility. CTC’s Science Gateway, through the TeraGrid facility, will enable national and international researchers, students and the general public to use this information through a common Internet interface. Equally important, the integration of the CTC digital assets with the TeraGrid facility will allow users to develop applications that leverage TeraGrid computational systems to analyze data collections at CTC.

Charlie Catlett, Director of the NSF TeraGrid Project, said, “The TeraGrid was established as a cyberinfrastructure foundation to integrate the computational, data, and visualization capabilities available to the scientific community. We anticipate the data collections from the CTC will be in great demand and we are pleased to partner with CTC to develop services and capabilities that will begin to weave the nation’s digital assets into a national data framework, analogous to today’s networking and computational frameworks.”

The Arecibo Observatory, the world’s largest and most sensitive single-dish radio/radar telescope, is operated by the National Astronomy and Ionosphere Center (NAIC) at Cornell under a cooperative agreement with the NSF. Arecibo provides state-of-the-art observing facilities for scientists in radio astronomy, solar system radar astronomy and atmospheric studies. The volume of information gathered in astronomy today is estimated to double roughly every 1.5 years, and this growth in volume is accompanied by a great increase in data complexity. Cornell astronomers, along with consortia of national and international researchers, use the Arecibo telescope to conduct data-intensive surveys that will produce on the order of thousands of terabytes of data. The TeraGrid will provide a single point of entry to this information, allowing it to reach a larger astrophysical community. Arecibo data and refined data products on pulsars and galaxies will be a unique resource for years to come, providing synergistic opportunities with other large-scale surveys that have already been completed and with telescopes of the future, including the Gamma-ray Large Area Space Telescope, to be launched later this year. Access to astronomical data at the CTC will be provided in accordance with virtual observatory methods that are now being developed.

CTC will use a 10 G wave, acquired from the National LambdaRail (NLR), to link into the TeraGrid. Cornell University is one of 14 members of the NLR. External access to the Arecibo dataset requires high-bandwidth connectivity, and CTC’s connection to the TeraGrid will provide this performance. The Arecibo data can be accessed by users via a web portal on the TeraGrid site (

CTC’s eScience unit (eSU) has worked closely with the Arecibo group to develop database structures and procedures. eSU provides the umbrella under which the Arecibo collection lives. It provides a breadth of services to researchers with data-intensive applications – data management, database programming and consulting, data-driven application design and development, data curation, and data mining. eSU conducts leading-edge research in related data management and data-mining topics. “With the TeraGrid we will be able to give the scientific community access to Cornell’s data resources and analysis tools,” comments Johannes Gehrke, Associate Director of CTC.

“The Arecibo data collection and other Cornell data collections soon to be available via the TeraGrid are producing large, diverse datasets. It is important that such data be archived for use by current science teams, but also those scientists who may wish to use them for wholly different, even unanticipated, science applications,” says CTC Acting Director Anthony Ingraffea.

A second data compilation, a combined library and laboratory or “Web Lab” based on historical collections of the Internet Archive and funded by a National Science Foundation cybertools grant to CTC, will be available on the TeraGrid later this year.

The improved architecture of the latest Intel Xeon dual-core processors and the Dell blades, combined with the high-speed InfiniBand interconnect, will result in improved performance and scalability of applications that run on Lonestar.

By the end of September, all current blades will be replaced with new blades containing state-of-the-art, dual-core Intel processors with increased floating point capability, and additional nodes will be integrated via a larger InfiniBand fabric. The resulting cluster will have 1300 nodes and offer much greater performance and memory to enhance research capabilities.