Data Collections

Bridges hosts both public and private datasets, giving individuals, collaborations and communities rapid access to them with appropriate protections.

Data collections are stored on pylon2, Bridges' persistent file system.  The space they use counts toward the Bridges storage allocation for the grant hosting them.

If you would like to store a large data collection on Bridges, submit the Community Dataset Request form.  


Publicly available datasets

Some data collections are available to anyone with a Bridges account.  They include:

Natural Language Toolkit (NLTK) Data

NLTK comes with many corpora, toy grammars, trained models, etc. A complete list of the available data is posted at:

Available on Bridges at /pylon2/datasets/community/nltk
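
As a minimal sketch of how this shared copy might be used from Python, the snippet below adds the directory to NLTK's data search path so that corpora and models can be loaded without downloading them again. It assumes that NLTK is installed in your Python environment and that the directory follows the standard nltk_data layout; neither is confirmed here. The helper name use_bridges_nltk_data is hypothetical and chosen only for illustration.

```python
# Path to the shared NLTK data on Bridges, as given above.
BRIDGES_NLTK_DATA = "/pylon2/datasets/community/nltk"


def use_bridges_nltk_data(data_dir=BRIDGES_NLTK_DATA):
    """Add data_dir to NLTK's data search path (hypothetical helper).

    Returns True if NLTK could be imported and the path is registered,
    or False if NLTK is not installed in the current environment.
    """
    try:
        import nltk
    except ImportError:
        return False
    # nltk.data.path is the list of directories NLTK searches for data.
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    return True


if __name__ == "__main__":
    if use_bridges_nltk_data():
        print("NLTK will also search", BRIDGES_NLTK_DATA)
    else:
        print("NLTK is not installed in this environment")
```

An alternative that needs no code changes is to set the NLTK_DATA environment variable to the same directory before starting Python; NLTK consults that variable when it builds its search path.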

