ZEST

Overview

The Zest file system is a patented, highly scalable parallel file system designed for maximum efficiency with write-intensive application workloads such as checkpointing.

The name “Zest” reflects the file system’s preference for writing to the outermost cylinders of its storage disks.

     >> View graph of disk write bandwidth vs. cylinder location

The following techniques are used to maximize scalability:

  • Parity for data protection in case of disk failure is calculated directly at the compute resource generating the data.
  • A non-deterministic data placement strategy is used in I/O server selection based upon each server’s eagerness to receive data.
  • Another non-deterministic data placement strategy is used after an I/O server has been selected. The disks that receive client data are chosen based upon each disk’s eagerness to receive data, provided that no two members of a parity group end up on the same disk.
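The parity and disk-placement techniques above can be illustrated with a minimal sketch. It assumes simple XOR parity over equal-sized blocks and models a disk’s “eagerness” as a numeric weight; neither detail is specified here, and the names `xor_parity` and `place_group`, as well as the disk ids and weights, are hypothetical.

```python
import random
from functools import reduce


def xor_parity(blocks):
    """Compute an XOR parity block over equal-sized data blocks.

    In this sketch the call would run on the compute resource that
    generates the data, before anything is sent to an I/O server.
    """
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be non-empty in number and equal in length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def place_group(n_members, eagerness, rng=random):
    """Pick one distinct disk for each member of a parity group.

    eagerness maps a disk id to a positive weight (an assumed stand-in
    for how ready that disk is to receive data). Selection is weighted
    and random, so placement is non-deterministic, but a disk is removed
    from the pool once chosen so no two members share a disk.
    """
    if n_members > len(eagerness):
        raise ValueError("need at least one disk per parity-group member")
    available = dict(eagerness)
    placement = []
    for _ in range(n_members):
        disks = list(available)
        weights = [available[d] for d in disks]
        chosen = rng.choices(disks, weights=weights, k=1)[0]
        placement.append(chosen)
        del available[chosen]
    return placement


# Three data blocks plus one parity block form a four-member group.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_parity(data)
disks = place_group(len(data) + 1, {"d0": 5.0, "d1": 1.0, "d2": 3.0, "d3": 2.0, "d4": 0.5})
```

With XOR parity, any single lost block can be rebuilt by XOR-ing the surviving members of the group, which is why the sketch refuses to put two members on the same disk. The same weighted selection could plausibly be applied one level up, when choosing an I/O server.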

Acknowledgements/Publications

Deployments

Zest has been outfitted to work with the following PSC machines:

  • Blacklight, a shared memory SGI UV system
  • BigBen, a Cray XT3
  • Pople, an SGI Altix shared memory NUMA system

Future Work

Zest offers no direct read(2) support. Data processed by Zest must currently be staged to a third-party file system (such as Lustre) before it can be accessed in a POSIX read(2) fashion. Adding a metadata server would remove the need for that third-party file system.

The metadata server (MDS) would track which chunks of data reside on which I/O servers, a mapping that has to be recorded because of the non-deterministic data placement strategy that Zest uses to maximize efficiency.
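As a rough illustration of the bookkeeping such a metadata server might do, the sketch below keeps an in-memory index from file name to chunk records. It is not part of Zest: the `ChunkMap` class, its methods, and the file and server names are all hypothetical, and a real MDS would also need persistence and concurrency control.

```python
from bisect import insort
from collections import defaultdict


class ChunkMap:
    """Hypothetical index: file name -> sorted (offset, length, server) records."""

    def __init__(self):
        self._chunks = defaultdict(list)

    def record(self, path, offset, length, server):
        """Note that bytes [offset, offset + length) of path landed on server."""
        insort(self._chunks[path], (offset, length, server))

    def locate(self, path, offset, length):
        """Return the records that overlap the requested read range, in offset order."""
        end = offset + length
        return [
            chunk
            for chunk in self._chunks.get(path, [])
            if chunk[0] < end and chunk[0] + chunk[1] > offset
        ]


# Chunks may arrive in any order and on any server.
chunk_map = ChunkMap()
chunk_map.record("ckpt.0", 1048576, 1048576, "ios3")
chunk_map.record("ckpt.0", 0, 1048576, "ios7")

# A read that straddles the 1 MiB boundary touches both servers.
hits = chunk_map.locate("ckpt.0", 1048570, 20)
```

A read request would be resolved by calling `locate` and then fetching each overlapping range from the server named in its record.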

Similar/Influenced Work

Contact Information

The PSC Advanced Systems group can be reached by e-mail.

Last Updated on Wednesday, 10 April 2013 09:39  
