Best Practices

Data Collections Development

Accepted Data 

The DDR accepts engineering datasets generated through simulation, hybrid simulation, experimental, and field research methods regarding the impacts of wind, earthquake, and storm surge hazards, as well as debris management, fire, and blast explosions. We also accept social and behavioral sciences (SBE) data encompassing the study of the human dimensions of hazards and disasters. As the field and the expertise of the community evolve, we have expanded our focus to include datasets related to COVID-19. Data reports, publications of Jupyter notebooks, code, scripts, lectures, and learning materials are also accepted and can be stored in relation to data publications or as standalone products (see Data Models).

Accepted and Recommended File Formats 

Due to the diversity of data and instruments used by our community, there are currently no restrictions on the file formats users can upload to the DDR. However, for long-term preservation and interoperability purposes, we recommend and promote storing and publishing data in open formats, and we follow the Library of Congress Recommended Formats.

In addition, we suggest that users consult the Data Curation Primers, which "detail a specific subject, disciplinary area or curation task and that can be used as a reference to curate research data." Importantly, the primers include curation practices for documenting data types that, while not open or recommended, are well established in the academic fields surrounding natural hazards research, such as MATLAB and Microsoft Excel.

Below is an adaptation of the list of recommended formats for data and documentation by Stanford Libraries. Where a curation primer is available for a format, we include a link to it:

Data Size

Currently, we do not impose restrictions on the volume of data users upload to and publish in the DDR. This is meant to accommodate the vast amount of data researchers in the natural hazards community can generate, especially during the course of large-scale research projects.

However, for data curation and publication purposes, users need to consider how the size of their data affects its reuse. Publishing large amounts of data requires more curation work (organizing and describing) so that other users can understand the structure and contents of the dataset. In addition, downloading very large projects may require the use of Globus. We further discuss data selection and quality considerations in the Data Curation section.
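
As one way to take stock of a dataset before curating it, the following minimal Python sketch tallies the number of files and the total size per file extension in a local folder. It is not a DDR tool: the function names summarize_by_extension and format_size are hypothetical, and the sketch assumes the data sits in a local directory that can be read with the Python standard library.

```python
import os
import sys
from collections import defaultdict
from pathlib import Path


def summarize_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                # Skip entries that cannot be read (e.g. broken symlinks).
                continue
            ext = path.suffix.lower() or "(no extension)"
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


def format_size(num_bytes):
    """Format a byte count using binary multiples (1 KB = 1024 B)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    # Hypothetical usage: python summarize.py /path/to/local/project
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    summary = summarize_by_extension(folder)
    # List the extensions that take up the most space first.
    for ext, (count, size) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{ext:>16}  {count:>8} files  {format_size(size):>12}")
```

A summary like this can help identify which file types dominate a project by volume and which formats may need extra documentation, before deciding what to select for publication.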