Transferring files to and from OSC
1 Introduction
There are several ways of transferring files between your computer and OSC:
Method | Transfer size | CLI or GUI | Ease of use | Flexibility/options |
---|---|---|---|---|
OnDemand Files menu | smaller (<1GB) | GUI | Easy | Limited |
Remote transfer commands | smaller (<1GB) | CLI | Moderate | Extensive |
SFTP | larger (>1GB) | Either | Moderate | Limited |
Globus | larger (>1GB) | GUI | Moderate 1 | Extensive |
This page will cover each of those in more detail below.
If you need files that are at a publicly accessible location on the internet (for example, NCBI reference genome data), you don’t need to download these to your computer and then upload them to OSC.
Instead, you can use commands for downloading files directly to OSC, like wget
or curl
. This will be covered in one of the main sessions.
3 Remote transfer commands
For small transfers, you can also use a remote transfer command like scp
, or a more advanced one like rsync
or rclone
. Such commands can provide a more convenient transfer method than OnDemand if you want to keep certain directories synced between OSC and your computer.
The reason you shouldn’t use this for very large transfers is that the transfer will happen using a login node.
3.1 scp
One option is scp
(secure copy), which works much like the regular cp
command, including that you’ll need -r
for recursive transfers.
The key difference is that we have to somehow refer to a path on a remote computer, and we do so by starting with the remote computer’s address, followed by :
, and then the path:
# Copy from remote (OSC) to local (your computer):
scp <user>@pitzer.osc.edu:<remote-path> <local-path>
# Copy from local (your computer) to remote (OSC)
scp <local-path> <user>@pitzer.osc.edu:<remote-path>
Here are two examples of copying from OSC to your local computer:
# Copy a file from OSC to a local computer - namely, to your current working dir ('.'):
scp jelmer@pitzer.osc.edu:/fs/ess/PAS0471/jelmer/mcic-scripts/misc/fastqc.sh .
# Copy a directory from OSC to a local computer - namely, to your home dir ('~'):
scp -r jelmer@pitzer.osc.edu:/fs/ess/PAS0471/jelmer/mcic-scripts ~
And two examples of copying from your local computer to OSC:
# Copy a file from your computer to OSC --
# namely, a file in from your current working dir to your home dir at OSC:
scp fastqc.sh jelmer@pitzer.osc.edu:~
# Copy a file from my local computer's Desktop to the Scratch dir for PAS0471:
scp /Users/poelstra.1/Desktop/fastqc.sh jelmer@pitzer.osc.edu:/fs/scratch/PAS0471
Some nuances for remote copying:
As the above code implies, in both cases (remote-to-local and local-to-remote), you will issue the copying commands from your local computer.
For the remote computer (OSC), the path should always be absolute, whereas that for your local computer can be either relative or absolute.
Since all files can be accessed at the same paths at Pitzer and at Owens, it doesn’t matter whether you use
@pitzer.osc.edu
or@owens.osc.edu
in thescp
command.
If your OneDrive is mounted on or synced to your local computer (i.e., if you can see it in your computer’s file brower), you can also transfer directly between OSC and OneDrive.
For example, the path to my OneDrive files on my laptop is:
/Users/poelstra.1/Library/CloudStorage/OneDrive-TheOhioStateUniversity
.
So if I had a file called fastqc.sh
in my top-level OneDrive dir, I could transfer it to my Home dir at OSC as follows:
scp /Users/poelstra.1/Library/CloudStorage/OneDrive-TheOhioStateUniversity jelmer@pitzer.osc.edu:~
3.2 rsync
Another option, which I can recommend, is the rsync
command, especially when you have directories that you repeatedly want to sync: rsync
won’t copy any files that are identical in source and destination.
A useful combination of options is -avz --progress
:
-a
enables archival mode (among other things, this makes it work recursively).-v
increases verbosity — tells you what is being copied.-z
enables compressed file transfer (=> generally faster).--progress
to show transfer progress for individual files.
The way to refer to remote paths is the same as with scp
. For example, I could copy a dir_with_results
in my local Home dir to my OSC Home dir as follows:
rsync -avz --progress ~/dir_with_results jelmer@owens.osc.edu:~
rsync
One tricky aspect of using rsync
is that the presence/absence of a trailing slash for source directories makes a difference for its behavior. The following commands work as intended — to create a backup copy of a scripts
dir inside a dir called backup
2:
# With trailing slash: copy the *contents* of source "scripts" into target "scripts":
rsync -avz scripts/ backup/scripts
# Without trailing slash: copy the source dir "scripts" into target dir "backup"
rsync -avz scripts backup
But these commands don’t:
# This would result in a dir 'backup/scripts/scripts':
rsync -avz scripts backup/scripts
# This would copy the files in "scripts" straight into "backup":
rsync -avz scripts/ backup
4 SFTP
The first of two options for larger transfers is SFTP. You can use the sftp
command when you have access to a Unix shell on your computer, and this what I’ll cover below.
If you have Windows without e.g. WSL or Git Bash (see the top of the SSH page on this site for more details), you can use a GUI-based SFTP client instead like WinSCP, Cyberduck, or FileZilla. CyberDuck also works on Mac, and FileZilla works on all operating systems, if you prefer to do SFTP transfers with a GUI, but I won’t cover their usage here.
4.1 Logging in
To log in to OSC’s SFTP server, issue the following command in your local computer’s terminal, substituting <user>
by your OSC username:
sftp <user>@sftp.osc.edu # E.g., 'jelmer@sftp.osc.edu'
The authenticity of host 'sftp.osc.edu (192.148.247.136)' can't be established.
ED25519 key fingerprint is SHA256:kMeb1PVZ1XVDEe2QiSumbM33w0SkvBJ4xeD18a/L0eQ.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])?
If this is your first time connecting to OSC SFTP server, you’ll get a message like the one shown above: you should type yes
to confirm.
Then, you may be asked for your OSC password, and after that, you should see a “welcome” message like this:
******************************************************************************
This system is for the use of authorized users only. Individuals using
this computer system without authority, or in excess of their authority,
are subject to having all of their activities on this system monitored
and recorded by system personnel. In the course of monitoring individuals
improperly using this system, or in the course of system maintenance,
the activities of authorized users may also be monitored. Anyone using
this system expressly consents to such monitoring and is advised that if
such monitoring reveals possible evidence of criminal activity, system
personnel may provide the evidence of such monitoring to law enforcement
officials.
******************************************************************************
Connected to sftp.osc.edu.
Now, you will have an sftp
prompt (sftp>
) instead of a regular shell prompt.
Familiar commands like ls
, cd
, and pwd
will operate on the remote computer (OSC, in this case), and there are local counterparts for them: lls
, lcd
, lpwd
— for example:
# NOTE: I am prefacing sftp commands with the 'sftp>' prompt to make it explicit
# these should be issued in an sftp session; but don't type that part.
sftp> pwd
Remote working directory: /users/PAS0471/jelmer
sftp> lpwd
Local working directory: /Users/poelstra.1/Desktop
4.2 Uploading files to OSC
To upload files to OSC, use sftp
’s put
command.
The syntax is put <local-path> <remote-path>
, and unlike with scp
etc., you don’t need to include the address to the remote (because in an stfp
session, you are simultaneously connected to both computers). But like with cp
and scp
, you’ll need the -r
flag for recursive transfers, i.e. transferring a directory and its contents.
# Upload fastqc.sh in a dir 'scripts' on your local computer to the PAS0471 Scratch dir:
sftp> put scripts/fastqc.sh /fs/scratch/PAS0471/sandbox
# Use -r to transfer directories:
sftp> put -r scripts /fs/scratch/PAS0471/sandbox
# You can use wildcards to upload multiple files:
sftp> put scripts/*sh /fs/scratch/PAS0471/sandbox
sftp
is primitive
The ~
shortcut to your Home directory does not work in sftp
!
sftp
is generally quite primitive and you also cannot use, for example, tab completion or the recalling of previous commands with the up arrow.
4.3 Downloading files from OSC
To download files from OSC, use the get
command, which has the syntax get <remote-path> <local-path>
(this is the other way around from put
in that the remote path comes first, but the same in that both use the order <source> <target>
, like cp
and so on).
For example:
sftp> get /fs/scratch/PAS0471/mcic-scripts/misc/fastqc.sh .
sftp> get -r /fs/scratch/PAS0471/sandbox/ .
4.4 Closing the SFTP connection
When you’re done, you can type exit
or press Ctrl+D to exit the sftp
prompt.
5 Globus
The second option for large transfers is Globus, which has a browser-based GUI, and is especially your best bet for very large transfers. Some advantages of using Globus are that:
- It checks whether all files were transferred correctly and completely
- It can pause and resume automatically when you e.g. turn off your computer for a while
- It can be used to share files from OSC directly with collaborators even at different institutions.
Globus does need some setup, including the installation of a piece of software that will run in the background on your computer.
- Globus installation and configuration instructions: Windows / Mac / Linux
- Globus transfer instructions
- OSC’s page on Globus