A Novel Tool for Publishing Social Science, Engineering, and Interdisciplinary Natural Hazards Data
Published on January 13, 2021
Natural Hazards Center researchers Rachel Adams (left) and Jennifer Tobin (right) upload qualitative interview audio files to the secure DesignSafe Data Depot after a day of conducting fieldwork in Anchorage, Alaska. Source: Lori Peek, 2020
A collaborative vision becomes reality in the DesignSafe Cyberinfrastructure
During field research in the natural hazards space, scientists and engineers from different disciplines use sophisticated equipment and diverse methods to collect many types of data, from 35mm photos and lidar imagery to survey responses from community members to biometrics. Although qualitatively distinct, these data need to be managed together to facilitate accurate analysis and curation workflows that culminate in publicly shared data and findings. Throughout the research process, a significant challenge is bridging the complementary perspectives of interdisciplinary teams that work collaboratively and making field research data discoverable to users.
The R&D team was a partnership among research scientists with DesignSafe at the University of Texas at Austin, CONVERGE at the University of Colorado Boulder, and the RAPID facility based at the University of Washington. All are components of the NSF-funded Natural Hazards Engineering Research Infrastructure (NHERI).
NHERI CONVERGE seeks to advance social science, engineering, and interdisciplinary work in natural hazards and is working to further a vision for managing and publishing field research data to enable such collaboration. This vision seeks to bring coherence to interdisciplinary work as it is conducted across geographic space and time: from fieldwork in disaster-affected places, to data analysis, to publication. The vision has taken shape in the form of a novel field research data model for social science, engineering, and interdisciplinary natural hazards research.
For instance, a post-disaster reconnaissance team may consist of social scientists who are assessing resident access to clean water and engineers who are assessing residential homes and the electrical grid. "With a single data model, the team's research can be examined in a much more holistic way," said Lori Peek, principal investigator of the NHERI CONVERGE facility.
ROBUST DATA MODEL
Users can think about the data model as the template that standardizes data organization and description and clarifies how different data and documentation components relate to one another.
DesignSafe's model is robust enough for researchers to publish qualitative and quantitative data as well as data collection protocols, research instruments, and Institutional Review Board (IRB) protocols. In this way, the final publication reveals the structure of data in relation to how the research project was conducted. All data, protocols, and instruments published via DesignSafe Cyberinfrastructure are assigned a permanent Digital Object Identifier (DOI), allowing researchers to share and others to cite their work.
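To make the idea of a data model as a template more concrete, the following is a minimal sketch in Python of how a project's components and their relationships might be represented. It is an illustration only: the class names, field names, and component kinds are hypothetical and are not DesignSafe's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# NOTE: every name below is a hypothetical illustration,
# not DesignSafe's real data model or API.


@dataclass
class Component:
    """One publishable piece of a field research project."""
    title: str
    kind: str  # e.g. "dataset", "collection_protocol", "instrument", "irb_protocol"
    description: str = ""
    doi: Optional[str] = None  # would be assigned only at publication time
    related_to: List[str] = field(default_factory=list)  # titles of related components


@dataclass
class FieldResearchProject:
    """Groups components so their relationships stay visible."""
    title: str
    components: List[Component] = field(default_factory=list)

    def add(self, component: Component) -> None:
        self.components.append(component)

    def related(self, title: str) -> List[Component]:
        """Return the components that declare a relationship to `title`."""
        return [c for c in self.components if title in c.related_to]


project = FieldResearchProject("Example post-disaster reconnaissance")
project.add(Component("Household survey", kind="instrument"))
project.add(Component("Survey responses", kind="dataset",
                      related_to=["Household survey"]))
project.add(Component("Interview guide protocol", kind="collection_protocol",
                      related_to=["Household survey"]))

# Which components are tied to the survey instrument?
linked = [c.title for c in project.related("Household survey")]
print(linked)
```

In this sketch, each component records what it is and which other components it relates to, so a reader of the published project could trace a dataset back to the instrument and protocol used to collect it.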
DesignSafe's novel data model was developed with the expertise of Maria Esteva, data curator for the DesignSafe data repository, and Craig Jansen, who specializes in user interface design. Multiple researchers within the NHERI community contributed their knowledge, field experience, and feedback to the design.
"This new data model will help advance collaboration across disciplines, geographic sites, and hazards within the disaster social science and engineering fields," said Peek.
DESIGNING FOR COLLABORATION
The CONVERGE vision was executed in the DesignSafe Cyberinfrastructure. DesignSafe, the National Science Foundation-supported online platform dedicated to natural hazards research, allows hazards and disaster researchers to securely store, analyze, publish, preserve, and share their data along with associated data collection instruments and research protocols.
Designing and implementing a model to curate and share field research data required interdisciplinary collaboration. Social scientists, engineers, data curators, user-experience designers, and developers brainstormed, drew, discussed, implemented, and tested the field research data publication pipeline in DesignSafe.
Then, in 2019 at the annual Natural Hazards Workshop, over 70 researchers provided feedback on a mock-up of the social science and interdisciplinary data model. That feedback was incorporated before the data model's 2020 release in DesignSafe.
At the 2020 Natural Hazards Researchers Meeting, Maria Esteva and Craig Jansen, along with team members Nathanael Rosenheim and Elaina Sutley, presented a discussion on the complexities involved in the development of the data model.
RAPID expertise. The NHERI RAPID facility at the University of Washington also played an integral role in partnering with DesignSafe and CONVERGE to develop the novel data model. Not only did they provide engineering feedback on the field research portion of the data model, they also ensured that the RAPID Application (RApp) for mobile data collection would seamlessly integrate with DesignSafe.
With the new data model, researchers can curate and publish preliminary field reports, field data, research protocols, and research instruments. This example data model clarifies how engineers and social scientists can work together to structure an interdisciplinary, multi-component project. Source: Craig Jansen, 2020.
Researchers Jennifer Tobin, Elaina Sutley, Mason Mathews, and David Hondula sketch the processes that lead from data collection to publication to help inform the development of the field research data model. Source: Lori Peek, 2018.
FOSTERING AN INTERDISCIPLINARY CULTURE
To introduce the new data publication capabilities to the natural hazards research community, CONVERGE and DesignSafe hosted a series of Publish Your Data! events for social scientists and interdisciplinary researchers in the summer of 2020. Over 40 participants took part, including researchers from academia and the federal government as well as graduate students. Several have already begun publishing their research data and protocols from recent as well as legacy studies using the new data model. Upon completion of the training and data publication process, researchers are designated as CONVERGE Data Ambassadors.
"Data Ambassadors are now sharing their new knowledge with others and helping to shift the culture in the field," Peek said. Ellen Rathje, who is the principal investigator for NHERI DesignSafe, agreed and said she is impressed by the eager participation. "This is really exciting to see the interest from the social science community, and it has helped us to expand our capabilities even further at DesignSafe," she said.
Nathanael Rosenheim, one of the first research team members to publish a data collection instrument using the new data model, described it as intuitive and easy to use. He added, "I can really see the value of data sharing and publication. I hope that other researchers will use the household survey instrument we developed for the Longitudinal Community Resilience Focused Technical Investigation of the Lumberton, North Carolina Flood of 2016." Other motivating examples of recent data publications under the new data model can be found on the CONVERGE website.
LONG-TERM VISION FOR NATURAL HAZARDS RESEARCH
The team encourages all researchers to contribute and publish field research data, instruments, and research protocols on DesignSafe. Those materials that are already published via the Data Depot on DesignSafe have received a permanent DOI and are available for download and reuse by members of the community.
"As the number of published datasets and other materials grows, so too will the opportunities for collaboration and replication across field sites," Peek said. This possibility for expanding interdisciplinary collaboration and encouraging data reuse is central to the vision motivating this effort.
Moving forward, the leaders of DesignSafe, CONVERGE, and RAPID will continue to train and support researchers who wish to securely store and share their data. This work will influence the natural hazards field in two ways: It will help ensure that the results of publicly funded research are made available to the public, and it will enable others to ask new questions with existing data. Ultimately, this ongoing effort is about advancing the state of research and innovation, one dataset at a time.