DesignSafe supports multiple ways of moving data in and out of the Data Depot; which one is best depends on how you will use DesignSafe. The web interface in the DesignSafe portal works well for moving small numbers of modest-size files. If you need to move large volumes of data, large numbers of files (> 25), or whole folders, Globus, Cyberduck, or the command line tools are the recommended way of moving data in and out of DesignSafe.
This document details the various methods you can use to import and export your data from DesignSafe.
While the web interface in the DesignSafe portal is easy to use for moving small numbers of modest-size files, if you need to move large volumes of data, or large numbers of files or directories, the Globus tools are the recommended way of moving data in and out of DesignSafe.
Globus supplies high-speed, reliable, asynchronous transfers to DesignSafe. It is fast for large volumes of data because it uses multiple network sockets simultaneously to transfer data. It is reliable for large numbers of directories and files because it can automatically restart a transfer after a failure, and it notifies you only when the transfers have completed successfully.
Cyberduck is an open source graphical user interface tool to accomplish large data transfers.
Click on the "Open Connection" button in the top right corner of the Cyberduck window to open a connection configuration window. Select the transfer mechanism, and type in the server name "data.tacc.utexas.edu". Add your username and password in the spaces provided. Multi-Factor Authentication is required. If the "more options" area is not shown, click the small triangle or button to expand the window; this will allow you to enter the path to your project area so that when Cyberduck opens the connection you will immediately see your data. Then click the "Connect" button to open your connection.
Once connected, you can navigate through your remote file hierarchy using familiar graphical navigation techniques. You may also drag-and-drop files into and out of the Cyberduck window to transfer files to and from DesignSafe.
Users can easily transfer files and directories to My Data or to specific projects via a web browser. The Data Depot provides a user interface with a familiar desktop metaphor for manipulating files. The UI for a typical "My Data" window is shown below.
DesignSafe lets you connect to your favorite cloud storage provider. We currently support integrating your data from Box.com and Dropbox.com.
Follow the steps below to integrate your Dropbox and/or Box.com accounts into the DesignSafe portal:
Select either Dropbox or Box.com, then grant access to the application by logging in with the respective application's credentials.
Now your account is integrated into DesignSafe's Data Depot. Go to your Data Depot and view the files that are in Box.com or Dropbox in your workspace.
At this point, your data is still in Box.com or Dropbox and is simply viewable. We do not actively sync your data. In order to use your data in DesignSafe, you will need to copy the data from Box or Dropbox into My Data or My Projects.
You are only copying the data to DesignSafe, which means the original data still resides in Box or Dropbox. If you make any changes to those files in My Data or My Projects, the changes will not be replicated to Box or Dropbox. You can copy files back to Box or Dropbox from My Data or My Projects.
Users can take advantage of common command line utilities such as scp, sftp, and rsync to achieve higher performance and to transfer large amounts of data. Web browsers impose some restrictions on transferring large volumes of data, and command line utilities come in handy in those situations. TACC requires Multi-Factor Authentication (MFA) for command line access.
To use command line transfers, you must first have an allocation on Corral (or the relevant storage resource). DesignSafe access alone is not sufficient.
Data transfer from any Linux system can be accomplished using the scp utility to copy data to and from DesignSafe. A file can be copied from your local system to the remote server by using the command:
scp filename email@example.com:/path/to/project/directory
localhost$ scp test.txt firstname.lastname@example.org:/gpfs/corral3/repl/projects/NHERI/shared/username
The above command will transfer your files/folders to your home directory.
Consult the man pages for more information on scp.
localhost$ man scp
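Directories and downloads follow the same pattern as the single-file upload above. The sketch below is illustrative only: the username "jdoe" is hypothetical, the host name is assumed to be the same "data.tacc.utexas.edu" server used in the Cyberduck section, and the remote path reuses the one from the example above. The helper functions (run, upload_dir, download_file) are made up for this sketch. By default it only prints the commands it would run; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical values: replace with your own TACC username and target directory.
REMOTE_USER="jdoe"
REMOTE_HOST="data.tacc.utexas.edu"   # assumed: same server as in the Cyberduck section
REMOTE_DIR="/gpfs/corral3/repl/projects/NHERI/shared/$REMOTE_USER"

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upload a whole directory; the -r option makes scp copy recursively.
upload_dir() {
    run scp -r "$1" "$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR"
}

# Download a single remote file into the current local directory.
download_file() {
    run scp "$REMOTE_USER@$REMOTE_HOST:$REMOTE_DIR/$1" .
}

upload_dir results
download_file test.txt
```

Printing the commands first is a cheap way to check the remote path before starting a long transfer that will prompt for MFA.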
The rsync command is a reliable method for transferring files, especially if you transfer files in stages, because it automatically compares files to make sure that the source and destination copies are the same. It is recommended for users who need checksum verification, since rsync calculates a checksum for each transferred file to make sure that the transfer is complete and accurate.
Below is the usage of the rsync command for transferring a file named "myfile.c" from the current location on your desktop to your project directory in DesignSafe:
localhost$ rsync myfile.c email@example.com:/gpfs/corral3/repl/projects/NHERI/projects/project_id
The above command will transfer your files/folders to your specified project directory.
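If you want to confirm independently that a transferred file arrived intact, you can compare checksums yourself. The following is a minimal sketch, assuming the sha256sum utility from GNU coreutils is available (on macOS, "shasum -a 256" prints the same format). The helper names digest and same_content are hypothetical, and a local copy stands in for the transferred file; in practice you would compute the second checksum on the remote system.

```shell
#!/bin/sh
# Print only the SHA-256 digest of a file (the first field of sha256sum output).
digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed when the two files have identical digests.
same_content() {
    [ "$(digest "$1")" = "$(digest "$2")" ]
}

# Demonstration: a temporary file and a copy standing in for the remote file.
tmpdir=$(mktemp -d)
printf 'sample data\n' > "$tmpdir/myfile.c"
cp "$tmpdir/myfile.c" "$tmpdir/copy.c"

if same_content "$tmpdir/myfile.c" "$tmpdir/copy.c"; then
    echo "checksums match"
else
    echo "checksums differ"
fi
rm -rf "$tmpdir"
```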
An entire directory can be transferred from source to destination by using rsync as well. For directory transfers, the options "-avtr" transfer the files recursively ("-r" option), preserve modification times ("-t" option), and use archive mode ("-a" option) to preserve symbolic links, devices, attributes, permissions, ownerships, etc. Archive mode already implies "-r" and "-t", so listing them separately is harmless but redundant. The "-v" option (verbose) increases the amount of information displayed during any transfer. The following example demonstrates the usage of the "-avtr" options for transferring a directory named "Nheri" from the current location on your desktop to DesignSafe's project area.
localhost$ rsync -avtr Nheri \
    firstname.lastname@example.org:/gpfs/corral3/repl/projects/NHERI/projects/project_id
For more rsync options and command details, run the command "rsync -h" or:
localhost$ man scp
localhost$ man rsync
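Two details of directory transfers are easy to get wrong: whether the source path ends in a slash, and what a transfer will actually copy. The sketch below illustrates both on temporary local directories, so it needs no DesignSafe account; the file name run1.txt and the directory names dest_a and dest_b are made up for the demonstration. It uses the standard "-n" (dry run) option to preview a transfer without copying anything.

```shell
#!/bin/sh
# Local demonstration only: source and destinations are temporary directories.
work=$(mktemp -d)
mkdir -p "$work/Nheri" "$work/dest_a" "$work/dest_b"
echo "data" > "$work/Nheri/run1.txt"
found_a=""
found_b=""

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash: the directory itself is created inside the destination.
    rsync -a "$work/Nheri" "$work/dest_a"
    # Trailing slash: only the contents of the directory are copied.
    rsync -a "$work/Nheri/" "$work/dest_b"
    # Preview what a transfer would do without copying anything (-n is --dry-run).
    rsync -avn "$work/Nheri/" "$work/dest_b"

    found_a=$(cd "$work/dest_a" && find . -type f)
    found_b=$(cd "$work/dest_b" && find . -type f)
fi
rm -rf "$work"

echo "without trailing slash: $found_a"   # ./Nheri/run1.txt
echo "with trailing slash:    $found_b"   # ./run1.txt
```

The same rule applies to remote destinations: the example above, which names "Nheri" without a trailing slash, creates a "Nheri" directory under the project directory.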
** If rsync is used as the mode of data transfer, existing data written to the staging area will be overwritten only if the contents change. Use --ignore-existing to change this behavior. If any other data transfer protocol is used, files or folders with an existing name will be overwritten.
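The difference between the default behavior and --ignore-existing can be seen in a small local experiment. This is a sketch on temporary directories, not a transfer to DesignSafe; the file name data.txt and its contents are made up for the demonstration.

```shell
#!/bin/sh
# Local demonstration only: src and dst are temporary directories.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "version 1" > "$work/src/data.txt"
kept=""
replaced=""

if command -v rsync >/dev/null 2>&1; then
    # Initial copy of the source into the destination.
    rsync -a "$work/src/" "$work/dst/"

    # Change the source file (the new content also has a different size).
    echo "version 2 (longer)" > "$work/src/data.txt"

    # With --ignore-existing, a file already present at the destination is skipped.
    rsync -a --ignore-existing "$work/src/" "$work/dst/"
    kept=$(cat "$work/dst/data.txt")

    # Without it, the changed file replaces the destination copy.
    rsync -a "$work/src/" "$work/dst/"
    replaced=$(cat "$work/dst/data.txt")
fi
rm -rf "$work"

echo "with --ignore-existing: $kept"       # version 1
echo "default behavior:       $replaced"   # version 2 (longer)
```

Use --ignore-existing when the destination copies must never be replaced, and the default when the destination should track changes made at the source.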
Last update: August 15, 2017