Data Transfer Guide

Transferring Files to DesignSafe's Data Depot

 

DesignSafe supports multiple ways of moving data in and out of the Data Depot; which one is best depends on how you will use DesignSafe. The web interface in the DesignSafe portal works well for small numbers of modest-size files. If you need to move large volumes of data, large numbers of files (> 50), or whole folders, the Globus tools are the recommended way of moving data in and out of DesignSafe.

This document details the various methods you can use to import and export your data from DesignSafe.

  • Globus bulk data transfer
  • Drag and drop / file select from the browser
  • Integrating Box.com
  • Integrating Dropbox.com
  • UNIX command-line utilities
  • Cyberduck

 

Globus

See the DesignSafe Globus Data Transfer Guide

While the web interface in the DesignSafe portal is convenient for small numbers of modest-size files, the Globus tools are the recommended way of moving large volumes of data, or large numbers of files or directories, in and out of DesignSafe.

Globus supplies high-speed, reliable, asynchronous transfers to DesignSafe. It is fast for large volumes of data because it uses multiple network sockets simultaneously to transfer data. It is reliable for large numbers of directories and files because it automatically restarts a transfer that fails, and it notifies you only when the transfers have completed successfully.

 

Browser-based File Transfer to Data Depot

Users can easily transfer files and directories to their My Data area or to specific projects through a web browser. Data Depot provides a user interface with a familiar desktop metaphor for manipulating files. The UI for a typical "My Data" window is shown below.

  • Log in to DesignSafe using your account credentials.
  • Click Research Workbench and navigate to Data Depot.
  • Navigate to the directory where you want your data to be uploaded. If no folder exists, create one by adding a new folder or a new project.
  • Click "add", choose file upload or folder upload depending on your data, then click "begin upload".
  • The "upload" button opens a separate form. From the upload window, you may select files in a traditional "file manager" browser by clicking the "choose files" button, or simply drag and drop files from your desktop into the "Drop files here" section. Once the upload is done, you can see your data in the workspace.

Integrating Dropbox and Box.com with DesignSafe's Data Depot

DesignSafe lets you connect to your favorite cloud storage provider. We currently support integrating your data from Box.com and Dropbox.com.

Follow the steps below to integrate your Dropbox and/or Box.com accounts into the DesignSafe portal:

  1. In your DesignSafe account profile, click "Manage account", then "3rd party apps", as shown in the figure. Next, click "manage app settings" under the Box.com tab.
  2. Select either Dropbox or Box.com, then grant access to the application by logging in with the respective application's credentials.

  3. Your account is now integrated into DesignSafe's Data Depot. Go to your Data Depot to view the files that are in Box.com or Dropbox in your workspace.

At this point, your data is still in Box.com or Dropbox and is simply viewable. We do not actively sync your data. To use your data in DesignSafe, you will need to copy it from Box or Dropbox into My Data or My Projects.

Because you are only copying the data to DesignSafe, the original data still resides in Box or Dropbox. Any changes you make to those files in My Data or My Projects will not be replicated to Box or Dropbox. You can copy files back to Box or Dropbox from My Data or My Projects.

Linux Command-Line Transfer Utilities

Users can take advantage of common command-line utilities such as scp, sftp, and rsync to achieve higher performance and to transfer large amounts of data reliably. Web browsers have some restrictions on transferring large volumes of data, and command-line utilities come in handy in those situations. TACC requires Multi-Factor Authentication (MFA) for command-line access.

scp

Data transfer from any Linux system can be accomplished with the scp utility, which copies data to and from DesignSafe. A file can be copied from your local system to the remote server with the command:

scp filename username@data.tacc.utexas.edu:/path/to/project/directory 

localhost$ scp test.txt siva@data.tacc.utexas.edu:/corral-repl/projects/NHERI/projects/523309190170-242ac114-0001
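
The command above copies a single file to DesignSafe. As a minimal sketch of two other common cases, the script below assembles an upload of a whole directory (using the scp "-r" option) and a download from DesignSafe back to the local machine, and prints both commands for review instead of running them. The user name, the project path, and the directory name "results" are placeholders chosen for illustration, not real values.

```shell
#!/bin/sh
# Placeholders (assumptions): substitute your own TACC user name and project path.
REMOTE_USER="username"
REMOTE_HOST="data.tacc.utexas.edu"
REMOTE_DIR="/path/to/project/directory"
REMOTE="${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}"

# Upload a local directory (hypothetical name "results") and everything in it.
SCP_UPLOAD_DIR="scp -r results ${REMOTE}"

# Download a file from the project directory into the current local directory.
SCP_DOWNLOAD="scp ${REMOTE}/test.txt ."

# Print the commands; run them yourself once the values are correct.
echo "$SCP_UPLOAD_DIR"
echo "$SCP_DOWNLOAD"
```

Note the colon between the host name and the remote path; without it, scp treats the whole argument as a local file name.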

Consult the man pages for more information on scp.

localhost$ man scp

rsync

The rsync command is a reliable method to transfer files, especially if you transfer files in stages, because it automatically compares the files in the source and destination and sends only what differs. It is recommended for users who need checksum verification, since it calculates a checksum for each transferred file to make sure that the transfer is complete and accurate.

The following command uses rsync to transfer a file named "myfile.c" from the current location on your desktop to your project directory in DesignSafe:

localhost$ rsync myfile.c username@data.tacc.utexas.edu:/corral-repl/projects/NHERI/projects/project_id
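
As a hedged illustration of how rsync decides what to send: by default it skips a file whose size and modification time already match at the destination, while the "-c" (--checksum) option compares file contents instead. The sketch below demonstrates the difference between two temporary local directories, so nothing is sent to DesignSafe; the directory layout and file contents are made up for the demonstration.

```shell
#!/bin/sh
# Local demonstration only: source and destination are temporary directories.
CK=$(mktemp -d)
mkdir -p "$CK/src" "$CK/dst"
printf 'AAAA\n' > "$CK/src/myfile.c"
printf 'BBBB\n' > "$CK/dst/myfile.c"

# Give the destination copy the same modification time as the source,
# so the two files differ in content but not in size or timestamp.
touch -r "$CK/src/myfile.c" "$CK/dst/myfile.c"

# Default check (size + modification time): the file is considered up to date.
rsync -t "$CK/src/myfile.c" "$CK/dst/"
CK_AFTER_DEFAULT=$(cat "$CK/dst/myfile.c")

# With -c, rsync compares checksums of the contents and re-sends the file.
rsync -tc "$CK/src/myfile.c" "$CK/dst/"
CK_AFTER_CHECKSUM=$(cat "$CK/dst/myfile.c")

echo "after default run: $CK_AFTER_DEFAULT"
echo "after -c run:      $CK_AFTER_CHECKSUM"
```

Checksum comparison reads every file on both sides, so it is slower than the default check; it is most useful when timestamps cannot be trusted.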

An entire directory can also be transferred from source to destination with rsync. For directory transfers, the options "-avtr" transfer the files recursively ("-r" option), keep their modification times ("-t" option), and use archive mode ("-a" option) to preserve symbolic links, devices, attributes, permissions, ownerships, etc. Archive mode already implies "-r" and "-t", so "-av" alone behaves the same way. The "-v" option (verbose) increases the amount of information displayed during a transfer. The following example uses the "-avtr" options to transfer a directory named "Nheri" from the current location on your desktop to DesignSafe's project area.

localhost$ rsync -avtr Nheri \
username@data.tacc.utexas.edu:/corral-repl/projects/NHERI/projects/project_id
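
One detail worth checking before a large directory transfer is the trailing slash on the source. As a minimal sketch, the script below copies a sample "Nheri" directory between temporary local directories (nothing is sent to DesignSafe) to show that "Nheri" copies the directory itself, while "Nheri/" copies only its contents. The file name data.txt and the destination names are hypothetical.

```shell
#!/bin/sh
# Local demonstration only: all paths live under a temporary directory.
TS=$(mktemp -d)
mkdir -p "$TS/Nheri" "$TS/dest_a" "$TS/dest_b"
echo "sample" > "$TS/Nheri/data.txt"

# No trailing slash: the directory itself is copied -> dest_a/Nheri/data.txt
rsync -a "$TS/Nheri" "$TS/dest_a"

# Trailing slash: only the contents are copied -> dest_b/data.txt
rsync -a "$TS/Nheri/" "$TS/dest_b"

# List the resulting files to compare the two layouts.
find "$TS/dest_a" "$TS/dest_b" -type f
```

The same rule applies when the destination is a remote project directory, so the command above creates a "Nheri" folder inside the project.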

For more rsync options and command details, run the command "rsync -h" or:

localhost$ man rsync

Note: If rsync is used as the mode of data transfer, existing data in the staging area is overwritten only if the contents have changed. Use --ignore-existing to skip files that already exist at the destination. If any other data transfer protocol is used, files or folders with an existing name will be overwritten.
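
The overwrite behavior described in the note can be tried safely on local files first. The sketch below, which assumes nothing beyond a local rsync installation, syncs the same source into two temporary destinations that already hold an older copy of a file: once with default settings and once with --ignore-existing. The file name report.txt and its contents are made up for the demonstration.

```shell
#!/bin/sh
# Local demonstration only: all paths live under a temporary directory.
IE=$(mktemp -d)
mkdir -p "$IE/src" "$IE/dst_default" "$IE/dst_keep"
echo "new version from source" > "$IE/src/report.txt"
echo "old" > "$IE/dst_default/report.txt"
echo "old" > "$IE/dst_keep/report.txt"

# Default behavior: the changed file at the destination is overwritten.
rsync -a "$IE/src/" "$IE/dst_default/"

# --ignore-existing: files already present at the destination are left alone.
rsync -a --ignore-existing "$IE/src/" "$IE/dst_keep/"

IE_DEFAULT=$(cat "$IE/dst_default/report.txt")
IE_KEEP=$(cat "$IE/dst_keep/report.txt")
echo "default:           $IE_DEFAULT"
echo "--ignore-existing: $IE_KEEP"
```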

Cyberduck

Cyberduck is an open-source graphical user interface tool for large data transfers.

Click the "Open Connection" button in the top right corner of the Cyberduck window to open a connection configuration window (as shown below). Choose the transfer mechanism and type in the server name "data.tacc.utexas.edu". Add your username and password in the spaces provided. If the "more options" area is not shown, click the small triangle or button to expand the window; this lets you enter the path to your project area so that you see your data immediately when Cyberduck opens the connection. Then click the "Connect" button to open your connection.

Once connected, you can navigate through your remote file hierarchy using familiar graphical navigation techniques. You may also drag and drop files into and out of the Cyberduck window to transfer files to and from DesignSafe.

 

Last update: August 15, 2017