Digital Data Infrastructure

The Center is building a digital data infrastructure that meets the requirements of long-term preservation, reproducibility, searching, and sharing.

Long-term preservation: Research data are stored on a per-publication basis using a hierarchical scheme of four folders: one for documentation; one for the raw datasets (generated by a given simulation code or an experimental data-acquisition setup); one for the scripts used to post-process raw simulation data; and one for all the data displayed in the paper's figures or tables.
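The four-folder layout described above can be sketched in a few lines of Python. This is only an illustration: the folder names, the function name `create_publication_tree`, and the publication identifier are hypothetical, since the text describes the role of each folder but not its name.

```python
from pathlib import Path

# Hypothetical folder names: the text specifies four roles
# (documentation, raw datasets, post-processing scripts, figure/table data)
# but not the actual names used by the Center.
PUBLICATION_SUBFOLDERS = (
    "documentation",
    "raw_data",
    "scripts",
    "figures_and_tables",
)


def create_publication_tree(root, publication_id):
    """Create the four subfolders for one publication and return its top-level path."""
    publication_dir = Path(root) / publication_id
    for name in PUBLICATION_SUBFOLDERS:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        (publication_dir / name).mkdir(parents=True, exist_ok=True)
    return publication_dir
```

Calling `create_publication_tree("archive", "paper_001")` would create `archive/paper_001/` with the four subfolders inside it; both arguments are placeholders.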

Reproducibility: To reduce errors and shorten the time required to transfer knowledge among researchers, we perform analysis with reproducible scripts running on a Jupyter notebook server. Notebooks are connected to charts and can be discovered and downloaded on demand using the Data Hub.

Data Sharing and Searching: To make both stored data and Jupyter notebooks publicly accessible to researchers, we created a three-step framework: Data Curation, Metadata Collection, and Data Hub. Data Curation is a web application developed by MICCoM to guide users in creating metadata from their stored data. Data Hub displays the research data associated with each project and enables both coarse searches across any metadata field and granular queries by author, title, collection, keywords, or reference details.
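The difference between a coarse search and a granular query can be illustrated with a small in-memory sketch. It does not reproduce the Data Hub implementation or its API: the `Record` class and the functions `coarse_search` and `field_search` are hypothetical, and the only detail taken from the text is the list of searchable fields (author, title, collection, keywords, reference details).

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Record:
    """Hypothetical metadata record holding the fields named in the text."""
    author: str
    title: str
    collection: str
    keywords: list = field(default_factory=list)
    reference: str = ""


def _as_text(value):
    # Flatten list-valued fields (e.g. keywords) into one searchable string.
    if isinstance(value, (list, tuple)):
        return " ".join(str(v) for v in value)
    return str(value)


def coarse_search(records, term):
    """Return records in which any metadata field contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        r for r in records
        if any(needle in _as_text(v).lower() for v in asdict(r).values())
    ]


def field_search(records, field_name, term):
    """Return records whose given field contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        r for r in records
        if needle in _as_text(getattr(r, field_name)).lower()
    ]
```

With this sketch, `coarse_search(records, "water")` would match a record whose title or keywords mention water, while `field_search(records, "author", "smith")` would restrict the match to the author field alone.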


MICCoM focuses its data activities on validation, data production and collection, the use of public databases, and data analysis tools (the scripts and codes used to analyze data will be provided online).

At present the Center focuses on: