Usecase2

ADCIRC Datasets

ADCIRC Use Case - Creating an ADCIRC DataSet on DesignSafe

Clint Dawson, University of Texas at Austin
Carlos del-Castillo-Negrete, University of Texas at Austin
Benjamin Pachev, University of Texas at Austin

Overview

The following use case demonstrates how to compile an ADCIRC data-set of hind-casts on DesignSafe. This workflow involves the following steps:

Finding storm-surge events.
Compiling meteorological forcing for storm surge events.
Running ADCIRC hind-casts using meteorological forcing.
Organize and publish data on DesignSafe, obtaining a DOI for your research and for others to cite your data when re-used.

The workflow presented here is a common one performed for compiling ADCIRC data-sets for a variety of purposes, from Uncertainty Quantification to training Surrogate Models. Whatever your application is of ADCIRC data, publishing your dataset on DesignSafe allows you to re-use your own data, and for others to use and cite your data as well.

To see a couple of Example data-sets, and associated published research using the datasets, see the following examples:

Texas FEMA Storms - Synthetic storms for assessing storm surge risk. Used recently in Pachev et. al 2023 to train a surrogate model for ADCIRC for the coast of Texas.
Alaska Storm Surge Events - Major storm surge events for the coast of Alaska. Also used in Pachev et. al 2023 for creating a surrogate model for the coast of Alaska.

An accompanying jupyter notebook for this use case can be found in the ADCIRC folder in Community Data under the name Creating an ADCIRC DataSet.ipynb.

Resources

Jupyter Notebooks

The following Jupyter notebooks are available to facilitate the analysis of each case. They are described in details in this section. You can access and run them directly on DesignSafe by clicking on the "Open in DesignSafe" button.

Scope	Notebook
Create an ADCIRC DataSet	Creating an ADCIRC DataSet.ipynb

DesignSafe Resources

The following DesignSafe resources were leveraged in developing this use case.

Jupyter notebooks on DS Juypterhub.

Background

Citation and Licensing

Please cite Rathje et al. (2017) to acknowledge the use of DesignSafe resources.
This software is distributed under the GNU General Public License.

ADCIRC Overview

For more information on running ADCIRC and documentation, see the following links:

ADCIRC is available as a standalone app accessible via the DesignSafe front-end.

ADCIRC Inputs

An ADCIRC run is controlled by a variety of input files that can vary depending on the type of simulation being run. They all follow the naming convention fort.# where the # determines the type of input/output file. For a full list of input files for ADCIRC see the ADCIRC documentation. At a high level the inputs compose of the following:

Base Mesh input files - Always present for a run. It will be assumed for the purpose of this UseCase that the user starts from a set of mesh input files.
- fort.14 - ADCIRC mesh file, defining the domain and bathymetry.
- fort.15 - ADCIRC control file, containing (most) control parameters for the run. This includes:
  - Solver configurations such as time-step, and duration of simulation.
  - Output configurations, including frequency of output, and nodal locations of output.
  - Tidal forcing - At a minimum, ADCIRC is forced using tidal constituents.
- Additional control files (there are a lot more, just listing the most common here):
  - fort.13 - Nodal attribute file
  - fort.19, 20 - Additional boundary condition files.
Meteorological forcing files - Wind, pressure, ice coverage, and other forcing data for ADCIRC that define a particular storm surge event.
- fort.22 - Met. forcing control file.
- fort.221, fort.222, fort.225, fort.22* - Wind, pressure, ice coverage (respective), and other forcing files.

The focus of this use case is to compile sets of storm surge events, each comprising different sets of forcing files, for a region of interest defined by a set of mesh control files.

PyADCIRC

The following use case uses the pyADCIRC python library to manage ADCIRC input files and get data from the data sources mentioned above. The library can be installed using pip:

$ pip install pyadcirc

The pyadcirc.data contains functions to access two data sources in particular. First is NOAAs tidal gauge data for identifying storm surge. They provide a public API for accessing their data, for which pyADCIRC provides a python function and CLI (command line interface) wrapper around. The tidal signal at areas of interest over our domain will allow us to both identify potential storm surge events, and verify ADCIRC hind-casts with the real observations.

NOAA API CLI provided by the pyadcirc library. The noaa_data executable end point is created whenever pyadcirc is installed as library in an environment, providing a convenient CLI for interacting with the NOAA API that is well documented.

NOAA API CLI provided by the pyadcirc library. The noaa_data executable end point is created whenever pyadcirc is installed as library in an environment, providing a convenient CLI for interacting with the NOAA API that is well documented.

The second data source is NCAR’s CFSv1/v2 data sets for retrieving meteorological forcing files for identify storm surge events. An NCAR account is required for accessing this dataset. Make sure to go to NCAR's website to request an account for their data. You'll need your login information for pulling data from their repositories. Once your account is set-up, you'll want to store your credentials in a json file in the same directory as this notebook, with the name .ncar.json. For example the file may look like:

{"email": "user@gmail.com", "pw": "pass12345"}

Example Notebook: Creating ADCIRC DataSet

The example within this use case comprises of 4 main steps to create a data-set starting from a set of ADCIRC control input files. The notebook can be found at in the ADCIRC Use Case’s folder with the name Creating an ADCIRC DataSet.ipynb . Note that the notebook should be copied to the users ~/MyData directory before being able to use it (these steps are covered in the notebook).

The notebook covers the first two steps of this use case, namely identifying storm surge events and creating base input data sets to run using ADCIRC. We briefly overview the notebook’s results below.

Identifying storm surge events

The first stage of the notebook involves using the NOAA API wrapper provided by pyADCIRC to find storm surge events by looking at tidal gauge data in a region of interest. An example of an identified storm surge event, corresponding to Typhoon Merbok that hit the coast of Alaska in September 2022, is shown below.

Result of identification algorithm for the range of dates containing Typhoon Merbok. The algorithm operates by defining a trigger threshold, along with other heuristics, by which to group distinct groups of storm surge events.

The algorithm presented is run on the storms that see the most frequent storm-surge activity over the coast of Alaska, Nome, Red Dog Dock, and Unalakleet. All events are compiled to give date ranges of storm surge events to produce ADCIRC hind-casts for.

Getting data forcing data

Having identified dates of interest, the notebook then uses the ncar library endpoint to pull meteorological forcing for the identified potential storm surge events. These are then merged with ADCIRC base input files (available at the published data set), to create input runs for an ensemble of ADCIRC simulations, as covered in the use case documentation on running ADCIRC ensembles in DesignSafe.

Organizing Data for publishing

Having a set of simulated ADCIRC hind-casts for one or more events, along with any additional analysis performed on the hind-cast data, the true power of DesignSafe as a platform can be realized by publishing your data. Publishing your data allows you and other researchers to reference its usage with a DOI. For ADCIRC, this is increasingly useful as more Machine Learning models are being built using ADCIRC simulation data.

This section will cover how to organize and publish an ADCIRC hind-cast dataset as created above. Note this dataset presented in this use case is a subset of the Alaska Storm Surge Data set that has been published, so please refrain from re-publishing data.

The steps for publishing ADCIRC data will be as follows

Create a project directory in the DesignSafe data repository.
Organize ADCIRC data and copy to project directory.
Curate data by labeling and associating data appropriately.

While DesignSafe has a whole guide on how to curate and publish data, we note that the brief documentation below gives guidance on how to apply these curation guidelines to the particular case of ADCIRC simulation data.

Setting up Project Directory

First you’ll want to create a new project directory in the DesignSafe data repository.

Creating a new project in DesignSafe’s Data Depot.

Next we want to move ADCIRC inputs/outputs from your Jupyter instance where they were created into this project directory. We note that you must first restart your server if your moving data to a project directory that didn’t exist at the time from your server started, as that project directory won’t be in your ~/projects directory. Furthermore you’ll want to organize your folder structure in the command line before moving it to the project directory. See below for the recommended folder structure and associated data curation labels for publishing ADCIRC datasets.

.
├── Report.pdf -> Label as Report - PDF summarizing DataSet
├── mesh -> Label as Simulation Input (ADCIRC Mesh Type)
│   ├── fort.13
│   ├── fort.14
│   ├── fort.15
│   ├── fort.22
│   ├── fort.24
│   └── fort.25
├── inputs -> Label as Simulation Input (ADCIRC Meteorological Type)
│   ├── event000
│   │   ├── fort.15
│   │   ├── fort.221
│   │   ├── fort.222
│   │   └── fort.225
│   └── event001
│       ├── fort.15
│       ├── fort.221
│       ├── fort.222
│       └── fort.225
└── outputs -> Label as Simulation Output (ADCIRC Output)
    ├── event000
    │   ├── fort.61.nc
    │   ├── ...
    │   ├── maxele.63.nc
    │   ├── maxrs.63.nc
    │   ├── maxvel.63.nc
    │   ├── maxwvel.63.nc
    │   └── minpr.63.nc
    └── event001
        ├── fort.61.nc
        ├── ...
        ├── maxele.63.nc
        ├── maxrs.63.nc
        ├── maxvel.63.nc
        ├── maxwvel.63.nc
        └── minpr.63.nc
└── Analysis ->  Label as Analysis  any notebooks/code/images.
    ├── OverviewNotebook.ipynb - Analysis over all events.
    ├── event000
    │   ├── ExampleNotebook.ipynb - Event specific analysis.
    │   ├── ...

Example data relation diagram for an ADCIRC Simulation DataSet