Topic overview

SLURM directives can be provided in two ways: as options to the sbatch command when submitting the job (see below), or as #SBATCH lines at the top of the script itself (see below). If the same directive is provided in both places, the command-line (sbatch call) value will override the one in the script.

sbatch options

For more options, see the SLURM documentation.
| Resource / use | Short | Long | Default | Example / remarks |
|---|---|---|---|---|
| Project to be billed | -A | --account | N/A | --account=PAS0471 or -A PAS1855 |
| Time limit | -t | --time | 1 hour | -t 60 (60 min), -t 2:30 (2 min and 30 sec), -t 5:00:00 (5 h), -t 2-12 (2 days and 12 h), --time=60 (60 min) |
| Number of nodes | -N | --nodes | 1 | --nodes=2 Only ask for >1 node if you have explicit parallelization, e.g. with MPI (uncommon in bioinformatics). |
| Number of cores | -c | --cpus-per-task | 1 | --cpus-per-task=4 For jobs with multi-threading (common). |
| Number of "tasks" (processes) | -n | --ntasks | 1 | --ntasks=2 For jobs with multiple processes (not as common). |
| Number of tasks per node | - | --ntasks-per-node | 1 | --ntasks-per-node=2 For jobs with multiple processes (not as common). |
| Memory limit per node | - | --mem | (4G) | --mem=40G The default unit is MB (megabytes); use "G" for GB. |
| Log output file | -o | --output | slurm-%j.out | --output=slurm-fastqc-%j.out (It's useful to include a descriptive name, but be sure to also include %j, the job number.) |
| Error output file | -e | --error | N/A | --error=slurm-fastqc-%j.err (Note: by default, stderr is included with stdout in -o / --output; use -e / --error to separate them.) |
| Job name | - | --job-name | N/A | --job-name=fastqc (Useful to distinguish jobs when looking at the queue.) |
| Partition (queue type) | - | --partition | any | --partition=longserial See the OSC docs for more info. |
| Get an email when the job starts/ends/fails | - | --mail-type | N/A | --mail-type=BEGIN (when the job starts), --mail-type=END (when the job ends), --mail-type=FAIL (when the job fails), --mail-type=ALL (any event) |
| Job can't start until the specified time | - | --begin | N/A | --begin=2021-02-01T12:00:00 |
| Job can't start until a dependency job has finished | - | --dependency | N/A | --dependency=afterany:123456 |
sbatch

To submit a script to the queue with sbatch, prepend sbatch (optionally followed by sbatch options) to the call to the script:
sbatch [sbatch-options] <script> [script-arguments]
Some examples with and without sbatch options and script arguments:
# No sbatch options (must be provided in script) and no script arguments:
sbatch myscript.sh
# No sbatch options, one script argument:
sbatch myscript.sh sampleA.fastq.gz
# Two sbatch options, no script arguments:
sbatch -t 60 -A PAS1855 --mem=20G myscript.sh
# Two sbatch options, one script argument:
sbatch -t 60 -A PAS1855 --mem=20G myscript.sh sampleA.fastq.gz
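
As noted above, an option given on the sbatch command line overrides the corresponding #SBATCH line inside the script. A minimal sketch (the script name and time limit are just examples):

# Overrides any #SBATCH --time line inside myscript.sh:
sbatch --time=2:00:00 myscript.sh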
Note: Because SLURM directives are a special type of comment, they have to occur before any executed lines in order to be parsed. For instance, they should be placed above the set line in the script header below:
#!/bin/bash
#SBATCH --account=PAS1855
#SBATCH --time=00:45:00
#SBATCH --mem=8G
set -e -u -o pipefail
Inside the script, SLURM environment variables will be available, such as:
| Variable | Corresponding option | Description |
|---|---|---|
| $SLURM_SUBMIT_DIR | N/A | Path to the directory from which the job was submitted. |
| $TMPDIR | N/A | Path to a directory available during the job (fast I/O). |
| $SLURM_JOB_ID | N/A | Job ID assigned by SLURM. |
| $SLURM_JOB_NAME | --job-name | Job name supplied by the user. |
| $SLURM_CPUS_ON_NODE | -c / --cpus-per-task | Number of CPUs (~ cores/threads) available on one node. |
| $SLURM_NTASKS | -n / --ntasks | Number of tasks (processes). |
| $SLURM_MEM_PER_NODE | --mem | Memory per node. |
| $SLURMD_NODENAME | N/A | Name of the node running the job. |
As an example of how these environment variables can be useful, the command below uses $SLURM_CPUS_ON_NODE
in its call to the program STAR inside the script:
STAR --runThreadN "$SLURM_CPUS_ON_NODE" --genomeDir ...
This way, we don't risk a mismatch between the resources requested from SLURM and the resources the program tries to use, and we only have to modify the number of threads in one place (the resource request to SLURM).
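
The other variables can be used in the same way. As a minimal sketch, $TMPDIR can hold intermediate files that need fast I/O (the file and directory names below are just placeholders):

# Copy the input to the node-local scratch dir, work there, then copy results back:
cp "$SLURM_SUBMIT_DIR"/sampleA.fastq.gz "$TMPDIR"
cd "$TMPDIR"
gunzip sampleA.fastq.gz                          # any I/O-heavy step benefits from $TMPDIR
cp sampleA.fastq "$SLURM_SUBMIT_DIR"/results/    # copy results back before the job ends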
To start an interactive shell job on a compute node:

| Command | Explanation | Example |
|---|---|---|
| sinteractive | OSC convenience wrapper around srun / salloc. Only accepts short options (e.g. -A, not --account). The default time limit is 30 minutes and the maximum is 60 minutes. | sinteractive -A PAS1855 -t 60 |
| srun | Start an interactive job with any set of options; needs --pty /bin/bash to enter a Bash shell on the reserved node. | srun -A PAS1855 -t 60 --pty /bin/bash |
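
With srun, any sbatch-style option can be added. A sketch of a longer interactive session with more resources (the account and resource values are just examples):

srun -A PAS1855 -t 2:00:00 -c 4 --mem=8G --pty /bin/bash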
To monitor and manage jobs:

| Command | Explanation | Example |
|---|---|---|
| squeue | Check the SLURM job queue; this only shows queued and running jobs, not finished jobs. (Use -u or you will see everyone's jobs!) | squeue -u $USER (show all my jobs), squeue -u $USER -l (long format, more info) |
| scancel | Cancel one or more jobs. | scancel 2526085 (cancel job 2526085), scancel -u $USER (cancel all my jobs) |
| scontrol | Information about any job, mostly a summary of the resources available to the job and of other options that were set implicitly or explicitly. | scontrol show job 2526085 (show stats for job 2526085), scontrol show job $SLURM_JOB_ID (for use inside a script) |
| sstat | Information about running jobs, including memory usage (so it can be included in a script). | sstat -j $SLURM_JOB_ID --format=jobid,avecpu,averss,maxrss,ntasks |
| sacct | Information about finished jobs, including memory usage. | sacct -j 2978487 |
Notes:
If you need to check the CPU and memory usage of your jobs, see also the XDMoD tool on OSC OnDemand.
In sstat and sacct output, MaxRSS is the maximum amount of memory used by the job.
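
For example, to see MaxRSS and a few other fields for a finished job, sacct's --format option can be used (the job ID is just a placeholder):

sacct -j 2978487 --format=JobID,JobName,Elapsed,MaxRSS,State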
To find and load software installed as modules at OSC:

| Command | Explanation | Example |
|---|---|---|
| module spider | See all modules (software available through the module system). | module spider (see all modules), module spider python (all modules matching "python") |
| module avail | See modules that can be loaded given the current software environment. | module avail (see all loadable modules), module avail python (all loadable modules matching "python") |
| module load | Load a specific module. After loading the module, the software will be available in your $PATH and can thus be called directly. | module load python (load the default version), module load python/3.7-2019.10 (load a specific version) |
| module list | List currently loaded modules. | module list |
| module unload | Unload a currently loaded module. | module unload python |
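
As a minimal sketch of how this fits into a job script (the module version is an example; check module spider python for the versions actually installed):

module load python/3.7-2019.10
python --version    # the loaded Python is now first in your $PATH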
To use Conda at OSC, first load the module python/3.6-conda5.2 using module load python/3.6-conda5.2.
Conda commands and options:
| Command | Explanation | Example |
|---|---|---|
| create | Create a new Conda environment. | conda create -n my-env, conda create -n cutadapt-env cutadapt |
| env create | Create a new Conda environment using a YAML file describing the environment (see conda env export below). | conda env create --file environment.yml |
| -y | Don't prompt for confirmation when installing/removing things. | conda install -y cutadapt |
| activate | Activate a specific Conda environment, so you can use the software installed in that environment. (Note: the examples use source activate; to use conda activate, see the setup below.) | source activate cutadapt-env, activate ("stack") a second environment: source activate --stack <second-env-name> |
| install | Install software into the currently active environment. | conda install python=3.7, specify a channel for the installation: conda install -c bioconda cutadapt |
| config | Configure Conda (see below). | conda config --add channels bioconda |
| export | Export a YAML file that describes the environment. | Export the active environment: conda env export > environment.yml, export any environment by name: conda env export -n multiqc-env > multiqc-env.yml |
| env list | List all your environments. | conda env list |
| list | List all packages (software) in an environment. | conda list -n multiqc-env |
| deactivate | Deactivate the currently active environment. | conda deactivate |
| remove | Remove an environment entirely. | conda env remove -n multiqc-env |
| search | Search for a software package. | conda search 'bwa*' |
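
As a sketch of how these commands fit together (the environment and package names are just examples):

module load python/3.6-conda5.2                        # make Conda available at OSC
conda create -y -n cutadapt-env -c bioconda cutadapt   # create an environment with cutadapt from bioconda
source activate cutadapt-env                           # activate it
cutadapt --version                                     # the software is now available
conda env export > cutadapt-env.yml                    # record the environment in a YAML file
conda deactivate                                       # deactivate when done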
Conda "channels" are like repositories, each carrying overlapping sets of software. As a one-time setup step, set the channel priorities in the generally desired order by running these lines in the following order:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge # Highest priority
Channels can also be specified for individual installations to override these defaults:
conda install -c bioconda cutadapt
conda activate

To enable conda activate to work (in addition to source activate), add the following lines to your Bash configuration file at ~/.bashrc (which you can open and edit with VS Code or Nano):
if [ -f /apps/python/3.6-conda5.2/etc/profile.d/conda.sh ]; then
source /apps/python/3.6-conda5.2/etc/profile.d/conda.sh
elif [ -f /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh ]; then
source /usr/local/python/3.6-conda5.2/etc/profile.d/conda.sh
fi
For the changes to take effect in your current shell, you'll need to source the ~/.bashrc file:
source ~/.bashrc
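
After that, conda activate should work, e.g. (using the example environment name from above):

conda activate cutadapt-env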
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".