The scheduler currently has two partitions, which are meant for different purposes.
| Partition   | Description |
|-------------|-------------|
| shared      | TBD         |
| longrunning | TBD         |
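Jobs are directed to a partition with the `-p`/`--partition` option of `sbatch` and `srun`, which is used throughout the examples below. As a minimal illustration (`my_job.sbatch` is just a placeholder name):

```bash
# Submit a batch script to the longrunning partition
sbatch -p longrunning my_job.sbatch
```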
If a job you are submitting uses a large number of CPUs to process some data, consider whether the work is data parallel. Such jobs are usually better expressed as an array of smaller jobs. For example:
Generate some faux data:

```bash
for i in {1..100}; do echo $i >> data.txt; done
```
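The resulting file should contain 100 lines, one number per line, which you can verify with:

```bash
wc -l data.txt   # should print: 100 data.txt
```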
Then create a batch script named `split_data_100.sbatch` for splitting:
```bash
#!/bin/bash
#SBATCH --job-name=split_data_100
#SBATCH -p shared
#SBATCH --mem-per-cpu=4G
#SBATCH --time=00:30:00
#SBATCH --output=split_data_100_%J.log   # %J is job ID

# Simulate a longer-running job, then split data.txt into 100 numbered chunks
sleep 30
split --numeric-suffixes=1 -n100 --additional-suffix=.txt data.txt data.
```
And submit the batch script:

```
$ sbatch split_data_100.sbatch
Submitted batch job 854907
```
For real data this might take some time, and you don't want to wait, so you can submit the next job with a dependency on it. Call this next one `example_array_job.sbatch`:
```bash
#!/bin/bash
#SBATCH --job-name=example_array_job
#SBATCH --mem-per-cpu=1G
#SBATCH --time=00:10:00   # 10 min time limit; a short time limit decreases wait time in the queue
#SBATCH --output=example_array_job_log.%A_%a.log   # %a is array index, %A is job ID
#SBATCH --array=1-100%5   # 100 array tasks, max 5 running concurrently (i.e. limits IO)

# Zero-pad the array task ID to build the input chunk file name
J=$(printf "%03d" $SLURM_ARRAY_TASK_ID)
# python my_process_data_script.py data.${J}.txt > processed_data.${J}.txt
sleep 20
cat data.${J}.txt > processed_data.${J}.txt
```
And submit it with a dependency on the first one:

```
$ split_job_id=$(squeue --noheader --format=%i --name=split_data_100)
$ sbatch --dependency=afterok:${split_job_id} example_array_job.sbatch
Submitted batch job 854908
```
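As an alternative to looking the job ID up with `squeue`, the `--parsable` option of `sbatch` prints just the job ID at submission time, so it can be captured directly:

```bash
# Capture the job ID when submitting, then use it for the dependency
split_job_id=$(sbatch --parsable split_data_100.sbatch)
sbatch --dependency=afterok:${split_job_id} example_array_job.sbatch
```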
Check to see what the queue looks like:
```
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  854908_[1-100%5]    shared example_  rkjaran PD       0:00      1 (Dependency)
            854907    shared split_da  rkjaran  R       0:12      1 terra
```
Then check again once the split job has finished:
```
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  854908_[6-100%5]    shared example_  rkjaran PD       0:00      1 (JobArrayTaskLimit)
          854908_1    shared example_  rkjaran  R       0:17      1 terra
          854908_2    shared example_  rkjaran  R       0:17      1 terra
          854908_3    shared example_  rkjaran  R       0:17      1 terra
          854908_4    shared example_  rkjaran  R       0:17      1 terra
          854908_5    shared example_  rkjaran  R       0:17      1 terra
```
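Once the array tasks have finished they disappear from `squeue`. If job accounting is enabled on the cluster, `sacct` can show their final state, and the per-chunk outputs can then be gathered into a single file (the output file name here is just an example):

```bash
# Final state of every array task (requires accounting to be enabled)
sacct -j 854908 --format=JobID,State,Elapsed

# Concatenate the per-chunk results
cat processed_data.*.txt > processed_data_all.txt
```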
In SLURM-land GPUs are a GRES (Generic RESource). A job will not be allocated a GRES unless requested with the `--gres` option to `sbatch` or `srun`. A GRES resource specifier has the format `name[:type[:count]]`. For example, to request a single GPU of any type for an interactive job:
```bash
srun -p shared --gres=gpu:1 --pty /bin/bash
```
For a batch job, a SLURM directive is sometimes a better choice:
```bash
#!/bin/bash
#SBATCH -p shared
#SBATCH --gres=gpu:1
#SBATCH --mem=11G
#SBATCH --output=cool-model-log.log

python train-my-cool-model.py
```
This can then be submitted like so:

```bash
sbatch my-gpu-slurm-job.sbatch
```
Using `sinfo` we can discover what GPUs are available:

```
$ sinfo -O partition,nodelist,gres:30
PARTITION           NODELIST            GRES
shared*             gaia                (null)
shared*             terra               gpu:titanx:2,gpu:gtx1080ti:4
shared*             torpaq              gpu:rtx2080ti:4
longrunning         gaia                (null)
longrunning         torpaq              gpu:rtx2080ti:4
login               gaia                (null)
```
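For more detail on a particular node (CPUs, memory, configured GRES), `scontrol` can be used; the node name below is taken from the listing above:

```bash
# Show the full configuration of a single node, including its Gres line
scontrol show node torpaq
```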
And we can now request a specific type of GPU. For our interactive job we want two RTX 2080Ti GPUs:

```bash
srun -p shared --gres=gpu:rtx2080ti:2 --pty /bin/bash
```
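Inside the allocated shell you can check which GPUs the job actually received. This assumes the cluster's GPU GRES setup exports `CUDA_VISIBLE_DEVICES` (the common configuration) and that the NVIDIA tools are installed on the node:

```bash
# GPU indices SLURM has made visible to this job
echo $CUDA_VISIBLE_DEVICES
# Details for the allocated GPUs
nvidia-smi
```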
Kaldi comes with a SLURM wrapper, `utils/slurm.pl`, which can be used as the `cmd` script. Put the following in `conf/slurm.conf`:
```
command sbatch --export=PATH --ntasks-per-node=1
option time=* --time $0
option mem=* --mem-per-cpu $0
option mem=0          # Do not add anything to qsub_opts
option num_threads=* --cpus-per-task $0 --ntasks-per-node=1
option num_threads=1 --cpus-per-task 1 --ntasks-per-node=1
default gpu=0
option gpu=0
option gpu=* --gres=gpu:$0  # This has to be figured out
# note: the --max-jobs-run option is supported as a special case
# by slurm.pl and you don't have to handle it in the config file.
```
and the following in `cmd.sh` (or something similar):

```bash
export train_cmd="utils/slurm.pl --mem 6G --time 05:00:00"
export decode_cmd="utils/slurm.pl --mem 4G"
export mkgraph_cmd="utils/slurm.pl --mem 4G"
export big_memory_cmd="utils/slurm.pl --mem 8G"
export cuda_cmd="utils/slurm.pl --gpu 1"
```
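These variables are then picked up by the standard Kaldi recipe scripts. As a rough illustration of the wrapper's interface (the program name and paths below are hypothetical), a parallel invocation looks like:

```bash
# Run 10 jobs through SLURM; slurm.pl substitutes JOB with 1..10 in the
# log file name and in the command itself.
$train_cmd JOB=1:10 exp/example/log/process.JOB.log \
  my_kaldi_program --some-option data/split10/JOB/input ark:exp/example/out.JOB.ark
```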