Just in Time

Virtual File System Will Save Vast Computer Storage Space by Processing Images on the Fly

July 19, 2016

Researchers analyzing complex multidimensional images may be able to save hundreds of terabytes of disk space, a team from PSC reported at the XSEDE16 supercomputing conference in Miami today. Their “virtual file system” software, now in development, will carry out image processing on the fly for any viewing software, saving vast amounts of data storage by making it unnecessary to maintain multiple copies of processed datasets.

“Let’s say you have 100 terabytes of electron microscopy data,” says Arthur Wetzel, principal computer scientist at PSC and first author of the peer-reviewed paper accompanying the presentation. Users begin to analyze the images as soon as they become available, but as the image processing progresses, improved versions of those images are produced. “Yet there’s something that they want to keep before working on the new images … Pretty soon this 100 terabytes has multiplied by at least eight times. That’s not practical for long-term storage.”

The virtual file system will solve this problem by keeping the raw images unchanged, storing only the data required to reproduce a processed image rather than the entire image, according to coauthor Jennifer Bakal, PSC public health applications programmer. It does this while producing output that can be processed by any application expecting files. The software will in effect trade computational power for storage space, re-generating desired processed images on the fly instead of storing them.
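The trade described above, computation in exchange for storage, can be illustrated with a minimal Python sketch. It is not PSC's software: the class `VirtualImageStore`, the operation names, and the tiny list-based "images" are all hypothetical, chosen only to show raw data kept unchanged while processed views are stored as short recipes and regenerated when read.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of 0-255 pixel values


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in img]


def add_offset(img: Image, amount: int) -> Image:
    """Brighten or darken every pixel, clamped to the 0-255 range."""
    return [[min(255, max(0, p + amount)) for p in row] for row in img]


# Hypothetical registry of operations that a stored recipe may name.
OPERATIONS: Dict[str, Callable[..., Image]] = {
    "flip_horizontal": flip_horizontal,
    "add_offset": add_offset,
}

Step = Tuple[str, dict]  # (operation name, keyword arguments)


@dataclass
class VirtualImageStore:
    """Keeps raw images once; processed views exist only as recipes."""

    raw: Dict[str, Image] = field(default_factory=dict)
    recipes: Dict[str, Tuple[str, List[Step]]] = field(default_factory=dict)

    def add_raw(self, name: str, img: Image) -> None:
        self.raw[name] = img

    def define_view(self, view: str, source: str, steps: List[Step]) -> None:
        # Only the source name and the list of steps are stored,
        # never the processed pixels themselves.
        self.recipes[view] = (source, list(steps))

    def read(self, name: str) -> Image:
        if name in self.raw:
            # Return a copy so callers cannot alter the raw data.
            return [row[:] for row in self.raw[name]]
        source, steps = self.recipes[name]
        img = self.read(source)  # a view may build on another view
        for op_name, kwargs in steps:
            img = OPERATIONS[op_name](img, **kwargs)
        return img


store = VirtualImageStore()
store.add_raw("raw.img", [[0, 10], [250, 20]])
store.define_view(
    "bright_flipped.img",
    "raw.img",
    [("add_offset", {"amount": 10}), ("flip_horizontal", {})],
)
print(store.read("bright_flipped.img"))  # [[20, 10], [30, 255]]
```

In this sketch every read of a processed view repeats the computation, which is the cost paid for not storing a second copy of the image. A real system would also have to present such views as ordinary files to existing applications, which this example does not attempt.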

The work, with coauthor Markus Dittrich, formerly director of PSC’s Biomedical Applications Group and now at BioTeam Inc. of Middleton, Mass., builds on PSC’s image processing effort in the National Library of Medicine’s Visible Human project of the 1990s.

XSEDE16 is the fifth annual conference of the National Science Foundation’s Extreme Science and Engineering Discovery Environment (XSEDE) project. The conference showcases the discoveries, innovations, challenges and achievements of those who use and support XSEDE resources and services, as well as other digital resources and services throughout the world. The theme of XSEDE16 is “Diversity, Big Data & Science at Scale: Enabling the Next-Generation of Science and Technology.”

A Virtual Filesystem for On-demand Processing of Multi-dimensional Datasets