OSC, SLURM, and Conda

Topic overview


Relevant course modules


Using SLURM

Submitting batch (non-interactive) jobs

SLURM directives can be provided in two places: as #SBATCH lines at the top of the script, or as options to the sbatch command when submitting the script.

If the same directive is provided in both places, the command-line (sbatch call) value will override the one in the script.
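For example, if myscript.sh contains the time directive below, a different value passed to sbatch on the command line takes precedence:

# Inside myscript.sh:
#SBATCH --time=00:30:00

# At submission time, this 2-hour limit overrides the 30 minutes set in the script:
sbatch --time=2:00:00 myscript.sh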

Most important sbatch options

For more options, see the SLURM documentation.

| Resource / use | Short | Long | Default | Example / remarks |
|----------------|-------|------|---------|-------------------|
| Project to be billed | -A | --account | N/A | --account=PAS0471 or -A PAS1855 |
| Time limit | -t | --time | 1 hour | -t 60 (60 min); -t 2:30 (2 min and 30 sec); -t 5:00:00 (5 h); -t 2-12 (2 days and 12 h); --time=60 (60 min) |
| Number of nodes | -N | --nodes | 1 | --nodes=2. Only ask for >1 node if you have explicit parallelization, e.g. with MPI (uncommon in bioinformatics). |
| Number of cores | -c | --cpus-per-task | 1 | --cpus-per-task=4. For jobs with multithreading (common). |
| Number of "tasks" (processes) | -n | --ntasks | 1 | --ntasks=2. For jobs with multiple processes (not as common). |
| Number of tasks per node | (none) | --ntasks-per-node | 1 | --ntasks-per-node=2. For jobs with multiple processes (not as common). |
| Memory limit per node | (none) | --mem | (4G) | --mem=40G. The default unit is MB (megabytes); use "G" for GB. |
| Log output file | -o | --output | slurm-%j.out | --output=slurm-fastqc-%j.out. It is useful to include a descriptive name, but be sure to also include %j, the job number. |
| Error output file | -e | --error | N/A | --error=slurm-fastqc-%j.err. By default, stderr is included with stdout in -o / --output; use -e / --error to separate them. |
| Job name | (none) | --job-name | N/A | --job-name=fastqc. Useful to distinguish jobs when looking at the queue. |
| Partition (queue type) | (none) | --partition | any | --partition=longserial. See these OSC docs for more info. |
| Get email when job starts/ends/fails | (none) | --mail-type | N/A | --mail-type=BEGIN (when the job starts); --mail-type=END (when the job ends); --mail-type=FAIL (when the job fails); --mail-type=ALL (any of these events) |
| Job can't start until the specified time | (none) | --begin | N/A | --begin=2021-02-01T12:00:00 |
| Job can't start until a dependency job has finished | (none) | --dependency | N/A | --dependency=afterany:123456 |
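As an illustration of --dependency, jobs can be chained: the --parsable option makes sbatch print only the job ID, which can then be captured and passed on. The script names below are just placeholders:

# Submit the first job and capture its job ID:
first_job=$(sbatch --parsable align.sh)

# Submit a second job that may only start once the first has finished:
sbatch --dependency=afterany:"$first_job" count.sh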

Submitting the job with sbatch

To submit a script to the queue with sbatch, prepend sbatch, optionally followed by sbatch options, to the call to the script:

sbatch [sbatch-options] <script> [script-arguments]

Some examples with and without sbatch options and script arguments:

# No sbatch options (must be provided in script) and no script arguments:
sbatch myscript.sh

# No sbatch options, one script argument:
sbatch myscript.sh sampleA.fastq.gz

# Two sbatch options, no script arguments:
sbatch -t 60 -A PAS1855 --mem=20G myscript.sh

# Two sbatch options, one script argument:
sbatch -t 60 -A PAS1855 --mem=20G myscript.sh sampleA.fastq.gz

Example script header with SLURM directives

Note: Because SLURM directives are a special type of comment, they need to occur before any lines that are executed in order to be parsed. For instance, they should be placed above the set line in the header below:

#!/bin/bash

#SBATCH --account=PAS1855
#SBATCH --time=00:45:00
#SBATCH --mem=8G

set -e -u -o pipefail
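
As a fuller sketch, a complete job script could look as follows; the fastqc module name and the way the input file is passed are assumptions for illustration only:

#!/bin/bash
#SBATCH --account=PAS1855
#SBATCH --time=00:45:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=8G
#SBATCH --job-name=fastqc
#SBATCH --output=slurm-fastqc-%j.out

set -e -u -o pipefail

# Load the software (module name assumed; check with "module spider fastqc"):
module load fastqc

# The input FASTQ file is passed as the first argument to the script,
# e.g.: sbatch fastqc.sh sampleA.fastq.gz
fastq_file="$1"

fastqc "$fastq_file"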

SLURM environment variables

Inside the script, SLURM environment variables will be available, such as:

| Variable | Corresponding option | Description |
|----------|----------------------|-------------|
| $SLURM_SUBMIT_DIR | N/A | Path to the directory from which the job was submitted. |
| $TMPDIR | N/A | Path to a temporary directory available during the job (fast I/O). |
| $SLURM_JOB_ID | N/A | Job ID assigned by SLURM. |
| $SLURM_JOB_NAME | --job-name | Job name supplied by the user. |
| $SLURM_CPUS_ON_NODE | -c / --cpus-per-task | Number of CPUs (~ cores/threads) available on one node. |
| $SLURM_NTASKS | -n / --ntasks | Number of tasks (processes). |
| $SLURM_MEM_PER_NODE | --mem | Memory per node. |
| $SLURMD_NODENAME | N/A | Name of the node running the job. |

As an example of how these environment variables can be useful, the command below, taken from inside a job script, passes $SLURM_CPUS_ON_NODE to the program STAR:

STAR --runThreadN "$SLURM_CPUS_ON_NODE" --genomeDir ...

This way, we don't risk a mismatch between the resources requested and the resources the program tries to use, and we only have to modify the number of threads in one place (the resource request to SLURM).
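
Similarly, $TMPDIR and $SLURM_SUBMIT_DIR can be combined to run I/O-heavy steps on fast, job-local storage; the file names below are just placeholders:

# Copy the input data to the fast, job-local $TMPDIR and work there:
cp "$SLURM_SUBMIT_DIR"/sampleA.fastq.gz "$TMPDIR"
cd "$TMPDIR"

# ... run the analysis ...

# Copy the results back before the job ends ($TMPDIR is typically removed afterwards):
cp results.txt "$SLURM_SUBMIT_DIR"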


Submitting interactive jobs

| Command | Explanation | Example |
|---------|-------------|---------|
| sinteractive | OSC convenience wrapper around srun / salloc. Only accepts short options (e.g. -A, not --account). Default time is 30 minutes; maximum time is 60 minutes. | sinteractive -A PAS1855 -t 60 |
| srun | Start an interactive job with any set of options; needs --pty /bin/bash to enter a Bash shell on the reserved node. | srun -A PAS1855 -t 60 --pty /bin/bash |
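
For instance, to request more than the defaults for an interactive session (the values here are just an example):

# A 2-hour interactive job with 2 cores and 8 GB of memory:
srun -A PAS1855 -t 2:00:00 -c 2 --mem=8G --pty /bin/bash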


Monitoring and managing SLURM jobs

| Command | Explanation | Example |
|---------|-------------|---------|
| squeue | Check the SLURM job queue: only shows queued and running jobs, not finished jobs. (Use -u or you will see everyone's jobs!) | squeue -u $USER (show all my jobs); squeue -u $USER -l (long format, more info) |
| scancel | Cancel one or more jobs. | scancel 2526085 (cancel job 2526085); scancel -u $USER (cancel all my jobs) |
| scontrol | Information about any job: mostly a summary of the resources available to the job and of other options that were implicitly or explicitly set. | scontrol show job 2526085 (show stats for job 2526085); scontrol show job $SLURM_JOB_ID (for usage inside a script) |
| sstat | Information about running jobs, including memory usage (so it can be used inside a script). | sstat -j $SLURM_JOB_ID --format=jobid,avecpu,averss,maxrss,ntasks |
| sacct | Information about finished jobs, including memory usage. | sacct -j 2978487 |
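
For finished jobs, sacct can also report selected fields; for example, to check run time and peak memory usage (field names as in standard sacct):

sacct -j 2978487 --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS,State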

Notes:


Using modules to load software

| Command | Explanation | Example |
|---------|-------------|---------|
| module spider | List all modules (software available through the module system). | module spider (see all modules); module spider python (all modules matching "python") |
| module avail | List the modules that can be loaded given the current software environment. | module avail (see all loadable modules); module avail python (all loadable modules matching "python") |
| module load | Load a specific module. After loading the module, the software is available in your $PATH and can thus be called directly. | module load python (load the default version); module load python/3.7-2019.10 (load a specific version) |
| module list | List the currently loaded modules. | module list |
| module unload | Unload a currently loaded module. | module unload python |
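For example, to confirm that a loaded module has indeed put its software in your $PATH:

module load python/3.7-2019.10
which python       # Should point to the module's installation directory
python --version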

Software management with Conda

To use Conda at OSC, first load its module with module load python/3.6-conda5.2.

Conda commands and options:

| Command | Explanation | Example |
|---------|-------------|---------|
| create | Create a new Conda environment. | conda create -n my-env; conda create -n cutadapt-env cutadapt |
| env create | Create a new Conda environment from a YAML file describing the environment (see env export below). | conda env create --file environment.yml |
| -y | Don't prompt for confirmation when installing/removing things. | conda install -y cutadapt |
| activate | Activate a specific Conda environment, so you can use the software installed in that environment. Note: these examples use source activate rather than conda activate; see below for the additional setup needed to enable conda activate. | source activate cutadapt-env; source activate --stack <second-env-name> (activate, i.e. "stack", a second environment) |
| install | Install software into the currently active environment. | conda install python=3.7; conda install -c bioconda cutadapt (specify the channel for the installation) |
| config | Configure Conda (see below). | conda config --add channels bioconda |
| env export | Export a YAML file that describes the environment. | conda env export > environment.yml (export the active environment); conda env export -n multiqc-env > multiqc-env.yml (export any environment by name) |
| env list | List all your environments. | conda env list |
| list | List all packages (software) in an environment. | conda list -n multiqc-env |
| deactivate | Deactivate the currently active environment. | conda deactivate |
| env remove | Remove an environment entirely. | conda env remove -n multiqc-env |
| search | Search for a software package. | conda search 'bwa*' |
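
Putting these commands together, a typical sequence at OSC could look like this (the environment and package names are just examples):

# Make Conda available (once per session):
module load python/3.6-conda5.2

# Create an environment with cutadapt from the bioconda channel:
conda create -y -n cutadapt-env -c bioconda cutadapt

# Activate the environment and run the software:
source activate cutadapt-env
cutadapt --version

# Deactivate when done:
conda deactivate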

Conda channels

Conda "channels" are like repositories, each of which carries a partially overlapping set of software packages. As a one-time setup step, set the channel priorities in the generally recommended order by running these lines in the following order:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge  # Highest priority
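
You can check the resulting channel order (highest priority listed first) with:

conda config --show channels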

Channels can also be specified for individual installations, in order to override these defaults:

conda install -c bioconda cutadapt

Conda setup to enable conda activate

To enable conda activate to work (in addition to source activate), add the following lines to your Bash configuration file at ~/.bashrc (which you can open with VS Code or Nano and edit):

if [ -f /apps/python/3.6-conda5.2/etc/profile.d/conda.sh ]; then
      source /apps/python/3.6-conda5.2/etc/profile.d/conda.sh
elif [ -f /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh ]; then
      source /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh
fi

For the changes to take effect immediately in your current shell, source the ~/.bashrc file:

source ~/.bashrc

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".