University of Bath

Research Computing Team (DDaT)

Anatra HPC Documentation

Running GPU Jobs

What is CUDA?

CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform that allows software to use GPUs for general-purpose processing. Many scientific applications in molecular dynamics, machine learning, and computational chemistry use CUDA to accelerate calculations that would take much longer on CPUs alone.

GPU Partitions on Anatra

Anatra provides two GPU partitions depending on your research area:

  • lifescigpu - For Life Sciences users - Uses NVIDIA Hopper (H100, sm_90) architecture
  • chemgpu - For Chemistry users - Uses NVIDIA Ada Lovelace (L40S, sm_89) and NVIDIA Ampere (A40, sm_86) architectures
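To see which nodes and GPU resources (GRES) each partition offers, you can query Slurm directly. The sketch below uses the standard sinfo format specifiers %n (node hostname) and %G (generic resources):

# List nodes and their GPU resources for each GPU partition
sinfo -p lifescigpu -o "%n %G"
sinfo -p chemgpu -o "%n %G"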

Example GPU Job Script

Below is an example batch script for a GPU job. Submit batch scripts from your project's working directory: for Chemistry users this is the /scratch/projects/[project-code] storage area, and for Life Sciences users it is the /lifesciences/[project-code] storage area.

#!/bin/bash
#SBATCH --job-name=cuda-gpu-job
# Chemistry users: change this to --partition=chemgpu
#SBATCH --partition=lifescigpu
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00
#SBATCH --output=cuda_output_%j.out

set -euo pipefail

module load cuda/13.0.1

INPUT_FILE="${1:-}"
if [ -z "$INPUT_FILE" ] || [ ! -f "$INPUT_FILE" ]; then
    echo "Error: Usage: sbatch gpu_job.slm <file.cu>"
    exit 1
fi

PARTITION="${SLURM_JOB_PARTITION:-lifescigpu}"

if [ "$PARTITION" = "lifescigpu" ]; then
    # Life Sciences → H100
    CUDA_ARCH="sm_90"
    GPU_DESC="NVIDIA H100 (Hopper)"

elif [ "$PARTITION" = "chemgpu" ]; then
    # Chemistry → A40 or L40S
    GPU_NAME=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1 | xargs)

    case "$GPU_NAME" in
        *L40S*)
            CUDA_ARCH="sm_89"
            GPU_DESC="NVIDIA L40S (Lovelace)"
            ;;
        *A40*)
            CUDA_ARCH="sm_86"
            GPU_DESC="NVIDIA A40 (Ampere)"
            ;;
        *)
            echo "Error: Unsupported GPU detected in chemgpu: $GPU_NAME"
            exit 1
            ;;
    esac

else
    echo "Error: Unsupported partition: $PARTITION"
    exit 1
fi

echo "Partition : $PARTITION"
echo "GPU       : $GPU_DESC"
echo "CUDA Arch : $CUDA_ARCH"

nvcc -o job_exec "$INPUT_FILE" -arch="$CUDA_ARCH"
./job_exec

Assuming the script is saved as gpu_job.slm, it can be submitted with the command:

sbatch gpu_job.slm input_file.cu
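If you do not yet have CUDA source code to test with, a minimal vector-addition kernel such as the sketch below can serve as input_file.cu. All file and variable names here are illustrative, not part of any Anatra software:

// input_file.cu - minimal CUDA example (illustrative only)
#include <cstdio>

__global__ void vector_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory keeps the example short; cudaMalloc/cudaMemcpy also work
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}

The example job script will compile this with nvcc using the architecture flag matched to the partition's GPUs, then run the resulting executable.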

Key Points to Remember

  1. Always specify the correct partition for your research area
  2. Request appropriate GPU resources with --gres=gpu:N
  3. Load the CUDA module before compiling or running GPU code
  4. Set reasonable time limits to avoid unnecessary queuing
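Before committing to a long batch job, it can help to test interactively on a GPU node. A sketch (resource values are examples only; adjust partition and limits for your project):

# Request an interactive shell with one GPU for 10 minutes
srun --partition=lifescigpu --gres=gpu:1 --time=00:10:00 --pty bash

# Then, on the allocated node:
module load cuda/13.0.1
nvidia-smi    # confirm which GPU you have been allocated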

Checking Job Status

squeue -u $USER        # Check your running jobs
squeue -p lifescigpu   # Check lifescigpu partition queue
squeue -p chemgpu      # Check chemgpu partition queue
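Two related Slurm commands are often useful (the job ID 123456 below is a placeholder):

scancel 123456                                        # Cancel a queued or running job
sacct -j 123456 --format=JobID,State,Elapsed,MaxRSS   # Accounting info for a finished job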

Getting Help

If you encounter GPU-related issues:

  • Check that you're using the correct partition
  • Verify your architecture flag matches your partition's GPUs
  • Ensure CUDA modules are loaded correctly
  • Review your job output file for error messages