File Storage and Transfer¶
Now that we can get onto the cluster, we want to get our data and files onto it as well.
There are four main areas where you may want to store data/files:
1. Home
2. Scratch
3. Depot
4. Fortress
We will discuss in more detail what each of these areas is in a later section. For now, we will put everything into our home directories, as that is where we land whenever we log into the clusters.
There are six main ways to get data and files onto and off of the clusters:
1. Open on Demand
2. scp
3. sftp
4. rsync
5. SMB
6. Globus
Open on Demand¶
In the Files tab of the Open on Demand page, there are upload and download buttons, but they are limited in what they can do (e.g., uploads have a file size limit of 100 GB). Transfers run through your browser, so if your connection is flaky at all, a large transfer can fail partway and have to be started over.
scp¶
scp stands for Secure Copy Protocol and is the network version of cp. Like cp, it needs a source and a destination, but either one may be on a remote server.
Copying to a cluster:
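As a minimal sketch of what such commands could look like, the snippet below builds three scp command lines and prints them for review. The username, cluster name, and file names (myusername, mycluster, results.csv, my_project) are hypothetical placeholders, not values from this guide; substitute your own.

```shell
# Hypothetical placeholders: substitute your own username and cluster name.
USERNAME="myusername"
CLUSTER="mycluster"
REMOTE="${USERNAME}@${CLUSTER}.rcac.purdue.edu"

# Copy a local file into your home directory on the cluster.
TO_CLUSTER="scp results.csv ${REMOTE}:~/"

# Copy a file from your cluster home directory into the current local directory.
FROM_CLUSTER="scp ${REMOTE}:~/results.csv ."

# Copy a whole directory with -r (recursive).
DIR_TO_CLUSTER="scp -r my_project ${REMOTE}:~/"

# Print the commands for review; paste one into your terminal to run it.
printf '%s\n' "$TO_CLUSTER" "$FROM_CLUSTER" "$DIR_TO_CLUSTER"
```

The pattern is always scp SOURCE DESTINATION, where the remote side is written as USERNAME@HOST:PATH. Run these from your local computer, not from a shell on the cluster.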
sftp¶
sftp stands for SSH File Transfer Protocol (also commonly called Secure File Transfer Protocol) and is a reliable way to transfer files between the cluster and another computer.
Essentially, sftp starts a file transfer shell on a remote computer. Use the command sftp USERNAME@CLUSTER.rcac.purdue.edu to start the file transfer session:
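Once the session is open, you type transfer commands at the sftp> prompt. The sketch below collects a few common ones (lpwd, pwd, put, get, bye) into a batch file and prints the command that would run them. The username, cluster name, and file names are hypothetical placeholders. Batch mode (-b) cannot prompt for a password, so this form assumes key-based login is set up; otherwise, type the same commands interactively.

```shell
# Hypothetical placeholders: substitute your own username and cluster name.
USERNAME="myusername"
CLUSTER="mycluster"

# Commands exactly as you would type them at the sftp> prompt:
#   lpwd  print the current directory on your local computer
#   pwd   print the current directory on the cluster
#   put   upload a local file to the cluster
#   get   download a file from the cluster
#   bye   end the session
BATCH_FILE="$(mktemp)"
cat > "$BATCH_FILE" <<'EOF'
lpwd
pwd
put results.csv
get notes.txt
bye
EOF

# -b reads commands from the batch file instead of prompting interactively.
SFTP_CMD="sftp -b ${BATCH_FILE} ${USERNAME}@${CLUSTER}.rcac.purdue.edu"

# Print the command and the batch file contents for review.
echo "$SFTP_CMD"
cat "$BATCH_FILE"
```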
When using sftp, transfers on the side of your local computer are relative to the directory you were in when you started the sftp session.
rsync¶
rsync is similar to scp, but much more fully featured. For example, it can resume a transfer after a disconnection instead of starting over:
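A minimal sketch of an rsync upload, printed for review rather than executed. The username, cluster name, and directory name are hypothetical placeholders. The flags are standard rsync options: -a (archive mode: recursive, preserving timestamps and permissions), -v (verbose), and -P (shorthand for --partial --progress, which keeps partially transferred files so that rerunning the same command picks up where it left off).

```shell
# Hypothetical placeholders: substitute your own username and cluster name.
USERNAME="myusername"
CLUSTER="mycluster"
REMOTE="${USERNAME}@${CLUSTER}.rcac.purdue.edu"

# Upload the contents of my_project/ to ~/my_project/ on the cluster.
# The trailing slash on the source means "the contents of this directory".
RSYNC_CMD="rsync -avP my_project/ ${REMOTE}:~/my_project/"

# Print the command for review; paste it into your terminal to run it.
echo "$RSYNC_CMD"
```

If the connection drops, running the same command again transfers only what is missing or incomplete.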
SMB¶
SMB (Server Message Block), often referred to as Samba after its open-source implementation, is a way to connect a remote drive to your computer so that you can transfer files to and from the clusters graphically.
To learn more about this option, please visit SMB drives.
Globus¶
For transferring large amounts of data to the cluster, you will want to use the Globus transfer service. If you want to transfer files from your local machine to the cluster, you will need to install the Globus Connect Personal software on your local computer.
From the Globus transfer service, you can select a source and a destination. It will handle the actual transferring of the file(s) for you, resuming if there are network connectivity problems.
Helpful RCAC Programs¶
The following two programs can be helpful as you navigate using the clusters:
myquota¶
myquota is run without any arguments and lists the storage spaces where you have access to read and write files, along with your usage and quota in each:
flost¶
RCAC regularly backs up data in home and depot spaces. The flost program helps you recover lost files from those backups.