Data preservation encompasses diverse activities carried out by all the stakeholders involved in the data lifecycle, from data management planning to data curation, publication, and long-term archiving. Once data is submitted to the DDR, functionalities and guidance are in place to address its long-term preservation.
Data in the Data Depot Repository (DDR) is preserved according to state-of-the-art digital library standards and best practices. DesignSafe is implemented within the reliable, secure, and scalable storage infrastructure at the Texas Advanced Computing Center (TACC), which has over 20 years of experience and innovation in High Performance Computing. TACC and its predecessors have operated a digital data archive continuously since 1986. The archive is currently implemented in the Corral Data Management system and the Ranch tape archive system, with a capacity of approximately half an exabyte. Corral and Ranch hold the data for DesignSafe and hundreds of other data collections and research projects. For details about the digital preservation architecture and procedures for the DDR, go to Data Preservation Best Practices.
Within TACC’s storage infrastructure, a Fedora repository, considered a standard for digital libraries, manages the preservation of the published data. Through its functionalities, Fedora assures the authenticity and integrity of the digital objects, manages versioning, identifies file formats, records preservation events as metadata, maintains RDF metadata in accordance with standard schemas, conducts audits, and maintains the relationships between data and metadata for each published research project and its corresponding datasets. Each published dataset in DesignSafe has a Digital Object Identifier (DOI), whose maintenance we understand as a firm commitment to data persistence.
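As an illustration of what assuring integrity can involve, the following is a minimal sketch of a fixity check: a file's current checksum is compared against the checksum recorded when the file was ingested. It is not Fedora's API or the DDR's actual implementation. The function names `sha256_of` and `check_fixity`, the choice of SHA-256, and the shape of the returned record are assumptions made for this example.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, recorded_digest):
    """Compare a file's current digest with the digest recorded at ingest.

    Hypothetical helper: returns a small record that could be stored as a
    preservation event alongside the object's metadata.
    """
    actual = sha256_of(path)
    return {
        "path": str(path),
        "expected": recorded_digest,
        "actual": actual,
        "ok": actual == recorded_digest,
    }
```

A mismatch (`"ok": False`) would indicate that the stored file no longer matches what was originally deposited, which is the kind of condition a periodic audit is meant to surface.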
The DDR has been operational since 2016 and is currently supported by the NSF from October 1, 2020 through September 30, 2025. During this award period, the DDR will continue to preserve the natural hazards research data published since its inception, and to support preservation of and access to legacy data and accompanying metadata from the Network for Earthquake Engineering Simulation (NEES), a NHERI predecessor, dating from 2005. The legacy data, comprising 33 TB and 5.1 million files, and their metadata were transferred to DesignSafe in 2016 as part of the conditions of the original grant. See NEES data here.
Once the current award period ends, the NSF may release a competitive call and select a different awardee to support natural hazards infrastructure. In that case, the DDR's published data and corresponding metadata will be transferred to the new awardee. Fedora has export capabilities for transferring data and metadata to another repository in a complete and validated fashion.
While the DDR is currently committed to preserving data in the format in which it is submitted, we obtain the necessary authorizations from users to conduct further preservation actions and, if applicable, to transfer the data to other organizations. These permissions are granted through our Data Publication Agreement, which authors acknowledge and can choose to agree to at the end of the publication workflow, before receiving a DOI for their dataset.
Data sustainability is a continuous effort that the DDR carries out together with the other NHERI partners. In the natural hazards space, data is central to new advances, as evidenced by our community's data reuse record and the following initiatives:
In the unlikely event that the NSF and/or the other stakeholders involved in this community decide not to continue the program, we would cease “active” curation activities but continue to preserve the data and access to the DOIs through DesignSafe’s host center at the University of Texas at Austin. TACC management has committed to preserving the data indefinitely, with landing pages on online storage and data files at least on tape, if project funding does not continue.