Like Balena, Janus (the phase 1 cloud HPC service) uses a scheduler to manage how jobs are run and how resources are allocated.
If multiple jobs ran on a single node at the same time, they would compete for the same resources and each job would take longer to run. A scheduler instead queues individual jobs and allocates each one the resources it needs as they become available. This results in higher overall throughput and more consistent performance.
The scheduler used by Janus is the same as that used by Balena: Slurm (_Simple Linux Utility for Resource Management_).
You interact with the scheduler from the terminal using a set of commands.
Some key Slurm commands are listed below:
| Slurm command | Function |
|---|---|
| `sinfo` | View information about Slurm nodes and partitions |
| `squeue` | List the status of jobs in the queue |
| `squeue --user [userid]` | List jobs belonging to a user |
| `squeue --job [jobid]` | List a job by its job ID |
| `sbatch [jobscript]` | Submit a job script to the scheduler |
| `scancel [jobid]` | Cancel a job in the queue |
| `scontrol hold [jobid]` | Hold a job in the queue |
| `scontrol release [jobid]` | Release a held job |
| `scontrol show job [jobid]` | View information about a job |
| `scontrol show node [nodename]` | View information about a node |
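
As an illustration of how these commands fit together, the following is a minimal sketch of a submit-and-monitor workflow. The file name `example_job.slurm`, the job name, and the resource requests are hypothetical values chosen for the example and are not Janus defaults; no partition is specified, so the sketch assumes the cluster's default partition is acceptable. The submission step is skipped if `sbatch` is not available on the machine.

```shell
#!/bin/bash
# Sketch of a submit-and-monitor workflow using the commands listed above.
# The file name, job name, and resource requests are illustrative assumptions.

jobscript="example_job.slurm"   # hypothetical file name

# Write a minimal job script. Lines starting with #SBATCH are read by sbatch
# as options; everything after them runs on the allocated node.
cat > "$jobscript" <<'EOF'
#!/bin/bash
# Name shown in the squeue listing (assumed value)
#SBATCH --job-name=example
# Request one node and one task
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit of five minutes
#SBATCH --time=00:05:00
# Output file; %j expands to the job ID
#SBATCH --output=example_%j.out

hostname
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print the job ID (followed by ";cluster" if set)
    jobid=$(sbatch --parsable "$jobscript" | cut -d';' -f1)
    squeue --job "$jobid"          # check the job's status in the queue
    scontrol show job "$jobid"     # view the job's full details
    # scancel "$jobid"             # uncomment to cancel the job
else
    echo "sbatch not found: run this on a system where Slurm is available"
fi
```

Capturing the job ID at submission time avoids having to look it up with `squeue --user [userid]` before running the other commands.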