MOSAIC - the MOlecular SimulAtion Interchange Conventions

Mosaic is a modular set of data models and file formats for molecular simulation. The two main goals of the Mosaic project are

For more background information, see the Mosaic design criteria and the paper "MOSAIC: A Data Model and File Formats for Molecular Simulations". Readers without a subscription to the Journal of Chemical Information and Modeling can obtain the paper for free (limited to 50 downloads during the first year) through the ACS Articles on Request service (registration required).

Contributing to Mosaic development

If you would like to contribute to the development of Mosaic, please join the mailing list.

Current status

Mosaic has been used successfully in real research projects and publications. Future development of the specification will most likely take the form of extensions. The Python library, however, should be considered alpha-level because details of its interface can still change.

Mosaic overview

Mosaic currently defines five kinds of data items:

The Mosaic data model defines how these data items are expressed in terms of basic data items (integers, floats, text strings) and basic data structures (lists, arrays, trees, ...). The Mosaic file format definitions describe how the data model is represented in files. Conversion between different Mosaic file formats is exact; no information is lost and no information needs to be deduced.
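The idea of an exact conversion can be illustrated with a small, self-contained sketch. It does not use the Mosaic library or the Mosaic file formats: the `ToyItem` class, its fields, and the two toy encodings (JSON and a small XML layout) are assumptions made up for this illustration. The point is only that when a data item is built from integers, floats, and text strings, and each encoding stores all of them, converting between encodings needs no guessing and loses nothing.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyItem:
    """Hypothetical data item made only of basic types (not a Mosaic type)."""
    label: str
    n_atoms: int
    positions: tuple  # tuple of (x, y, z) float triples


def to_json(item):
    # json writes floats using repr(), which round-trips exactly in Python 3.
    return json.dumps({
        "label": item.label,
        "n_atoms": item.n_atoms,
        "positions": [list(p) for p in item.positions],
    })


def from_json(text):
    d = json.loads(text)
    return ToyItem(d["label"], d["n_atoms"],
                   tuple(tuple(p) for p in d["positions"]))


def to_xml(item):
    root = ET.Element("item", label=item.label, n_atoms=str(item.n_atoms))
    for x, y, z in item.positions:
        # repr() keeps every significant digit of each float.
        ET.SubElement(root, "pos", x=repr(x), y=repr(y), z=repr(z))
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    root = ET.fromstring(text)
    positions = tuple(
        (float(p.get("x")), float(p.get("y")), float(p.get("z")))
        for p in root.findall("pos")
    )
    return ToyItem(root.get("label"), int(root.get("n_atoms")), positions)


def json_to_xml(text):
    """Convert between the toy encodings by passing through the data model."""
    return to_xml(from_json(text))


def xml_to_json(text):
    return to_json(from_xml(text))


item = ToyItem("water", 3, ((0.0, 0.0, 0.1),
                            (0.07, 0.0, -0.05),
                            (-0.07, 0.0, -0.05)))

# A round trip through both encodings reproduces the original item.
assert from_json(xml_to_json(json_to_xml(to_json(item)))) == item
```

Both converters go through the in-memory item rather than translating one text format directly into the other, so each encoding only has to agree with the data model, not with every other encoding.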

Currently there are two Mosaic file formats:

The Mosaic Python library

The current Mosaic Python library implements three in-memory representations of the Mosaic data model, as well as I/O for the two supported file formats. It also contains an import module for structures from the Protein Data Bank (PDB). A simple command-line tool can convert between the file formats and write imported PDB structures to Mosaic files.

Related projects

Acknowledgements

Mosaic development was supported by the Agence Nationale de la Recherche under contract N° ANR-2010-COSI-001-01.