Opening the Flood Gates

Argonne, PSC Staff Shepherd Internet2 Migration, Give XSEDE Network Bandwidth Needed for Big Data Era

Monday, June 24, 2013

Thanks to personnel at Argonne National Laboratory and PSC (chiefly Linda Winkler, senior network engineer, Argonne; Joseph Lappa, principal network design engineer, PSC; and Kathy Benninger, network performance engineer, PSC), the National Science Foundation’s network of supercomputing sites now has the “pipe capacity” it will need to keep pace with the Big Data era.

XSEDE, the National Science Foundation’s U.S.-wide network of high performance computing centers, which includes Argonne and PSC, has migrated its data network to Internet2, a system with vastly higher capacity than the previous carrier’s. XSEDE’s improved network will enable sites to achieve connection rates of up to 100 gigabits per second (100 GE), 10 times faster than currently possible. The architecture of the new system will also enable a number of upgrades that will ease the transfer of data through the system.

As part of the Internet2 migration, Lappa has taken on new responsibilities for the XSEDE network. Newly appointed as XSEDE’s operations networking manager, he will be XSEDE’s main contact with Internet2. In this role, he and his team will monitor the performance of the new network, oversee details of transitioning sites to 100 GE, assist with campus bridging and help Internet2’s programmers and service representatives optimize and tailor the network to the needs of XSEDE and its users.

The approaching bottleneck

In 2006, Senator Ted Stevens made the mistake of referring to the Internet as “a series of tubes.” He instantly became the butt of jokes about a guy who grew up in a time when people communicated via post, in cursive script, trying to make sense of an email world. But to be fair, it isn’t such a bad metaphor.

Information — data — is as critical to our economy and society as fresh drinking water is to our homes. Like the plumbing running through our houses, the Internet transports data through “pipes” that are limited both by their size and by the capacity their “faucets” can deliver.

Users at XSEDE sites employ some of the largest, fastest computers in the world to generate vast volumes of data. Moving those data among researchers, supercomputers and storage sites is no small task. To accomplish that job, XSEDE originally built what was then one of the highest-capacity, most reliable networks in the world.

“Advanced networking is critical … to support the researchers and educators who are making innovative use of our … resources,” says John Towns, XSEDE project director, noting that XSEDE supplies about 8,000 users with 17 supercomputers, data storage and management tools and networking resources.

In the Information Age, though, technology ages quickly. As the XSEDE network and its demands grew, it began to approach the limits of its infrastructure: in particular, a potential bottleneck between XSEDE sites in Denver and Chicago loomed large.

“As far as the technical reasons for migrating to Internet2, it was the ‘speeds and feeds’ problem,” Lappa says. A factory, for example, can perform an operation on a product quickly (speed). But if it can’t then move the next product up the line (feed) fast enough, that speed is wasted. Similarly, the blinding speed of XSEDE’s computing machines was in danger of being made far less relevant by the approaching difficulty of getting data into and out of them.
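To put rough numbers on the speeds-and-feeds problem, the short Python sketch below treats a data transfer as a chain of stages whose overall rate is capped by the slowest one. The 100-terabyte dataset and the 200-gigabit-per-second rates for the computing and storage stages are illustrative assumptions, not XSEDE measurements; only the 10 and 100 gigabit-per-second link rates reflect the figures in this article.

```python
def effective_rate_gbps(stage_rates_gbps):
    """Return the end-to-end rate, which is capped by the slowest stage."""
    return min(stage_rates_gbps.values())


def transfer_hours(size_terabytes, rate_gbps):
    """Hours needed to move a dataset at a sustained rate (decimal units)."""
    bits = size_terabytes * 8e12          # 1 TB = 8e12 bits
    seconds = bits / (rate_gbps * 1e9)    # 1 Gb/s = 1e9 bits per second
    return seconds / 3600


# Hypothetical stage rates in gigabits per second (assumed for illustration).
stages_before = {
    "supercomputer output": 200.0,
    "network link": 10.0,
    "storage input": 200.0,
}
# Same pipeline with only the network link upgraded to 100 GE.
stages_after = dict(stages_before)
stages_after["network link"] = 100.0

for label, stages in (("before", stages_before), ("after", stages_after)):
    rate = effective_rate_gbps(stages)
    hours = transfer_hours(100, rate)     # assumed 100 TB dataset
    print(f"{label}: {rate:.0f} Gb/s end to end, {hours:.1f} hours for 100 TB")
```

Under these assumed figures the network link is the limiting stage in both cases, so a tenfold increase in link rate cuts the transfer from roughly 22 hours to a little over two.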

Unclogging the pipes

Internet2’s 100 GE backbone proved to be the solution to the problem, Benninger says. “With 100 GE, there is a clearer path to allow us to operate.”

While not all the sites will initially have 100 GE connections to the new backbone, she adds, the system will have room to grow to meet the next three years’ needs. Currently, Indiana University and Purdue University share a 100 GE connection, with a number of other sites planning to upgrade over the next several years.

In addition to supplying the leadership for the migration process, PSC also served as one of the first sites on the new network, testing out and helping Internet2 improve and customize the system to serve XSEDE’s needs.

Internet2’s architecture also makes data flow easier to manage through what’s known as “dynamic provisioning capability.” If a particular network path between two sites is congested with large data flows, a network engineer can establish a virtual local area network (VLAN) to route additional data transfers over an alternate path.
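The idea of steering traffic around a congested path can be sketched in a few lines of Python. This is only a toy model, not Internet2's provisioning software: the five sites A through E and the links between them are hypothetical, and the function simply searches for a fewest-hop route that skips any link flagged as congested.

```python
from collections import deque


def find_path(links, src, dst, congested=frozenset()):
    """Breadth-first search for a fewest-hop path that avoids congested links."""
    # Build an undirected adjacency map, leaving out congested links.
    graph = {}
    for a, b in links:
        if frozenset((a, b)) in congested:
            continue
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)

    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no usable path remains


# Hypothetical topology: a short route A-B-D and a longer route A-C-E-D.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]

normal_route = find_path(links, "A", "D")
detour = find_path(links, "A", "D", congested={frozenset(("A", "B"))})
print("normal route:", normal_route)
print("route with A-B congested:", detour)
```

In this sketch the detour is longer in hops than the normal route; on a real network the engineer's choice would also weigh the capacity and current load of each alternate path.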

Future upgrades

In addition to optimizing the network and helping sites connect with the backbone or upgrade to 100 GE, Benninger and Lappa will support efforts by a number of PSC and XSEDE staff to add new functions that take advantage of the higher bandwidth.

  • The XSEDE-wide File System (XWFS) will allow the increasingly large files required by researchers to be moved rapidly between XSEDE sites.
  • Web10G, developed by Chris Rapier, PSC network programmer; Andrew K. Adams, PSC network engineer; and John Estabrook, network programmer at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, will monitor data flowing from servers to the network to help pinpoint sources of slowdown as they occur.
  • VLAN (virtual local area network) provisioning will allow any two XSEDE sites to set up a “virtual network” between them that performs as if it were a direct, hard-wired data connection, avoiding the need to set up potentially complex routing through the network.