SLURM
The Elysium HPC system uses SLURM as resource manager, scheduler, and accounting system to guarantee a fair share of the computing resources.
If you are looking for technical details regarding the usage and underlying mechanisms of SLURM, we recommend participating in the Introduction to HPC training course.
Examples of job scripts for different job types that are tailored to the Elysium cluster can be found in the Training Section.
List of Partitions
All nodes in the Elysium cluster are grouped by hardware type and job submission type. This way users can request specific computing hardware, and multi-node jobs are guaranteed to run on nodes with the same setup.
To get a list of the available partitions, their current state, and the available nodes, the sinfo command can be used.
[login_id@login001 ~]$ sinfo
PARTITION      AVAIL TIMELIMIT   NODES STATE  NODELIST
cpu            up    7-00:00:00      4 alloc  cpu[033-034,037-038]
cpu            up    7-00:00:00    280 idle   cpu[001-032,035-036,039-284]
cpu_filler     up    3:00:00         4 alloc  cpu[033-034,037-038]
cpu_filler     up    3:00:00       280 idle   cpu[001-032,035-036,039-284]
fat_cpu        up    2-00:00:00     13 idle   fatcpu[001-013]
fat_cpu_filler up    3:00:00        13 idle   fatcpu[001-013]
gpu            up    2-00:00:00     20 idle   gpu[001-020]
gpu_filler     up    1:00:00        20 idle   gpu[001-020]
fat_gpu        up    2-00:00:00      1 drain* fatgpu005
fat_gpu        up    2-00:00:00      5 mix    fatgpu[001,003-004,006-007]
fat_gpu        up    2-00:00:00      1 idle   fatgpu002
fat_gpu_filler up    1:00:00         1 drain* fatgpu005
fat_gpu_filler up    1:00:00         5 mix    fatgpu[001,003-004,006-007]
fat_gpu_filler up    1:00:00         1 idle   fatgpu002
vis            up    1-00:00:00      3 idle   vis[001-003]
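To limit the output to a single partition, sinfo accepts the standard --partition filter; for example, for the gpu partition shown above:
[login_id@login001 ~]$ sinfo --partition=gpu
PARTITION AVAIL TIMELIMIT  NODES STATE NODELIST
gpu       up    2-00:00:00    20 idle  gpu[001-020]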
Requesting Nodes of a Partition
SLURM provides two commands to request resources. srun is used to start an interactive session.
[login_id@login001 ~]$ srun --partition=cpu --job-name=test --time=00:05:00 --account=testproj_0000 --pty bash
[login_id@cpu001 ~]$
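An interactive session on a GPU node is requested the same way by choosing a GPU partition. The following is a sketch that assumes the GPUs are exposed as a generic resource (GRES) named gpu; the exact GRES name on Elysium may differ:
[login_id@login001 ~]$ srun --partition=gpu --gres=gpu:1 --job-name=test --time=00:05:00 --account=testproj_0000 --pty bash
[login_id@gpu001 ~]$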
sbatch is used to request resources that will execute a job script.
[login_id@login001 ~]$ sbatch --partition=cpu --job-name=test --time=00:05:00 --account=testproj_0000 myscript.sh
Submitted batch job 10290
For sbatch the submission flags can also be incorporated into the job script itself, as sketched below. More information about job scripts, including the required and some optional flags, can be found in the Training/SLURM Header section.
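A minimal sketch of what myscript.sh could look like, using the flags from the example above as #SBATCH directives plus the standard --nodes and --ntasks resource flags:
#!/bin/bash
#SBATCH --partition=cpu           # partition from the sinfo list
#SBATCH --job-name=test           # name shown in squeue
#SBATCH --time=00:05:00           # walltime limit
#SBATCH --account=testproj_0000   # project account (see rub-acclist below)
#SBATCH --nodes=1                 # number of nodes
#SBATCH --ntasks=1                # number of tasks (MPI ranks)

# commands to run on the allocated resources
srun hostname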
Use spredict myscript.sh to estimate the start time of your job.
List of Currently Running and Pending Jobs
If the requested resources are currently not available, jobs are queued and will start as soon as the resources become available again.
To check which jobs are currently running, and which ones are pending and for what reason, the squeue command can be used.
For privacy reasons only the user's own jobs are displayed.
[login_id@login001 ~]$ squeue
 JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
 10290       cpu     test login_id  R  2:51     1 cpu001
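To list only the jobs that are still waiting, together with the scheduler's reason in the NODELIST(REASON) column, the standard state filter of squeue can be used:
[login_id@login001 ~]$ squeue --states=PENDING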
List of Computing Resource Shares
Users/Projects/Groups/Institutes are billed for the computing resources they use.
To check how many resources a user is entitled to and how many they have already used, the sshare command is used.
For privacy reasons only the user's own shares are displayed.
[login_id@login001 ~]$ sshare
Account                    User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
testproj_0000          login_id       1000    0.166667    20450435      0.163985   0.681818
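If a user is a member of several projects, the output can be restricted to a single project with sshare's standard --accounts filter (shown here for the example project from above):
[login_id@login001 ~]$ sshare --accounts=testproj_0000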
List of Project Accounts
For technical reasons the projects on Elysium have rather cryptic names, based on the loginID of the project manager and a number.
To make it easier to select a project account for the --account flag of srun or sbatch, and to check the share and usage of projects, the RUB-exclusive rub-acclist command can be used.
[login_id@login001 ~]$ rub-acclist
Project ID    | Project Description
--------------+--------------------------------------------------
testproj_0000 | The fundamental interconnectedness of all things
testproj_0001 | The translated quaternion for optimal pivoting
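The value in the Project ID column is exactly what the --account flag of srun and sbatch expects, for example:
[login_id@login001 ~]$ sbatch --partition=cpu --job-name=test --time=00:05:00 --account=testproj_0001 myscript.sh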