Introduction to Slurm

All work on Cheaha must be submitted to the queueing system, Slurm. This doc gives a basic overview of Slurm and how to use it.

Slurm is software that gives researchers fair allocation of the cluster's resources. It schedules jobs based on resource requests such as the number of CPUs, maximum memory (RAM) required per CPU, maximum run time, and more.
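
For example, these kinds of requests can be passed directly to sbatch as command-line options. The values and the wrapped command below are illustrative placeholders only:

```bash
# Request 1 task with 4 CPUs, 4 GB of memory per CPU, and a 2 hour time limit.
# The resource values and the wrapped command are placeholders for illustration.
sbatch --ntasks=1 --cpus-per-task=4 --mem-per-cpu=4G --time=02:00:00 \
       --wrap="echo 'replace with your analysis command'"
```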

The main Slurm documentation can be found at the Slurm site. The Slurm Quickstart can also be helpful for orienting researchers new to queueing systems on the cluster.

Batch Job Workflow

  1. Stage data to $USER_DATA, $USER_SCRATCH, or a /data/project/... directory.
  2. Research how to run your directives in 'batch' mode. In other words, how to run your analysis pipeline from the command line, with no GUIs or researcher input.
  3. Identify the appropriate resources necessary to run the jobs (CPUs, time, memory, etc.)
  4. Write a job script specifying these parameters using Slurm directives (see the example script after this list)
  5. Submit the job (sbatch)
  6. Monitor the job (squeue)
  7. Review the results, and modify/rerun if necessary (sacct and seff)
  8. Remove data from $USER_SCRATCH
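
The sketch below ties steps 4 and 5 together. The partition name, module, script name, and input path are placeholders; adjust them to match your pipeline and the partitions available on Cheaha.

```bash
#!/bin/bash
#SBATCH --job-name=example-job        # name shown in squeue
#SBATCH --ntasks=1                    # one task
#SBATCH --cpus-per-task=4             # CPUs for that task
#SBATCH --mem-per-cpu=4G              # memory per CPU
#SBATCH --time=02:00:00               # wall-clock limit (HH:MM:SS)
#SBATCH --partition=express           # placeholder; choose a partition that fits your time limit
#SBATCH --output=%x-%j.out            # output file named from job name (%x) and job ID (%j)

# Placeholder commands; replace with your batch-mode pipeline.
module load Python                    # example module; adjust to your software
python my_analysis.py --input "$USER_SCRATCH/input_data"
```

Submit the script with `sbatch job.sh`; Slurm prints the job ID it assigns, which you can then use with squeue, sacct, and seff.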

For more details, please see Submitting Jobs.

For details on managing and reviewing jobs, please see Job Management.
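
As a quick sketch of those commands (the job ID 12345 is a placeholder for the ID printed by sbatch):

```bash
squeue -u $USER                                               # your pending and running jobs
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS,ExitCode   # accounting record after the job finishes
seff 12345                                                    # CPU and memory efficiency summary
```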

The Slurm Queue

When working on Cheaha and with Research Computing, you will often hear references to the Slurm Queue. From its name, you might think that the Slurm Queue is a first-in-first-out (FIFO) queue, like waiting in line at an event or place of business. Some institutions do use a FIFO queue, as it is the default configuration for Slurm.

At UAB Research Computing, we use a multifactor priority queue, meaning that jobs with the highest priority are the first to receive service, regardless of when they entered the queue.

Slurm measures priority as a single number, and the job with the highest value is generally the first to receive service. Multiple factors contribute to this value. The most important factors are given in the table below, in no particular order.

| Factor | Description | What Gives Higher Priority | Example |
| --- | --- | --- | --- |
| Age | Length of time the job has spent in the queue. | Longer time in queue | A job in the queue for 2 days will start faster than one queued for 4 hours. |
| Resources | Quantity of resources requested for the job. | Requesting fewer resources | 1 CPU will start faster than 4 CPUs. A 2 hour time limit will start faster than 10 hours. |
| Partition | Which partition was requested for the job. | Lower-resource partitions | The Express partition will start faster than Long. NOT related to Priority Tier. |
| Fair Share | What fraction of cluster resources are already being used by you. | Fewer jobs currently running | Having 1 job already running will start faster than having 10 of the same job. |

The fastest way to get a job started is to request minimal resources and time, have a smaller share of total resources already in use, and use the shortest partition possible.

Given two or more jobs with equal priority, the job on the partition with the largest "Priority Tier" value goes first.
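
To see how these factors combine for your own pending jobs, Slurm's `sprio` utility reports the per-factor breakdown, and `scontrol` shows each partition's Priority Tier. The job ID and partition name below are placeholders:

```bash
sprio -u $USER -l                                          # per-factor priority breakdown for your pending jobs
sprio -j 12345                                             # breakdown for one pending job (placeholder ID)
scontrol show partition express | grep -o 'PriorityTier=[0-9]*'   # the partition's Priority Tier value
```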

The scheduler cannot predict the future. If a job enters the queue with a higher priority than yours, it will start before yours. This may lead to a situation where your job no longer fits on any of the nodes. If this happens, your job will have to wait until sufficient space opens, regardless of its priority value. A possible strategy to minimize the risk of this kind of delay is to request fewer resources per node, to more readily fill available space (see the sketch below).
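
As a sketch of that idea, the two illustrative requests below ask for the same total of 16 CPUs; the second is easier for the scheduler to place because no single node has to supply all 16:

```bash
# Alternative 1: all 16 tasks must fit on a single node.
#SBATCH --ntasks=16
#SBATCH --nodes=1

# Alternative 2: 4 tasks on each of 4 nodes, which fits into smaller gaps across the cluster.
#SBATCH --ntasks=16
#SBATCH --ntasks-per-node=4
```

Note that spreading tasks across nodes only helps when your pipeline can run as multiple coordinated tasks (for example, with MPI); a single multithreaded process still needs all of its CPUs on one node.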

If you are unsure of the best queueing strategy for your workflow, please Contact Us for a consultation; we are happy to help.