Ticket 279

Summary: Add ability to specify different GPU types
Product: Slurm Reporter: Moe Jette <jette>
Component: Scheduling Assignee: Moe Jette <jette>
Status: RESOLVED FIXED QA Contact:
Severity: 5 - Enhancement    
Priority: --- CC: da
Version: 2.5.x   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Version Fixed: 14.11.0-pre1 Target Release: ---

Description Moe Jette 2013-05-02 10:48:37 MDT
A user should be able to specify the type of GPU needed for a job. One workaround is a separate GRES name per type, for example "sbatch --gres=gpu_gtx560:1 ./job.sh". But then SLURM will not set CUDA_VISIBLE_DEVICES=<correct GPU id>, because the gpu plugin is not used in this case.

If the user instead specifies "sbatch --gres=gpu:1 ..", then CUDA_VISIBLE_DEVICES will be set to the first free GPU, which could be a GPU type that the job does not support.

I believe the only reliable approach is to map each GPU device to a string that denotes its GPU type. For example, in gres.conf:

Name=gpu Type=gtx560 File=/dev/nvidia0
Name=gpu Type=gtx560 File=/dev/nvidia1
Name=gpu Type=tesla File=/dev/nvidia2

The user would then request a particular GPU type as follows:

sbatch --gres=gpu:tesla:1 ./my.job

and the SLURM gpu plugin would then set CUDA_VISIBLE_DEVICES=2 for this job.
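
The requested mapping can be illustrated with a small Python sketch. This is not SLURM's implementation (the gpu plugin is written in C and its internals are not described in this ticket); it only shows the intended behaviour under stated assumptions: the device id is taken from the trailing digits of the File= path, and the helper names parse_gres_conf, device_index and select_devices are hypothetical.

```python
import re

# The gres.conf lines proposed in the description above.
GRES_CONF = """\
Name=gpu Type=gtx560 File=/dev/nvidia0
Name=gpu Type=gtx560 File=/dev/nvidia1
Name=gpu Type=tesla File=/dev/nvidia2
"""


def parse_gres_conf(text):
    """Parse 'Key=Value Key=Value' lines into a list of dicts (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        entries.append(dict(field.split("=", 1) for field in line.split()))
    return entries


def device_index(path):
    """Assumption: the GPU id is the trailing number of the device file path."""
    match = re.search(r"(\d+)$", path)
    if match is None:
        raise ValueError(f"no device number in {path!r}")
    return int(match.group(1))


def select_devices(entries, name, gres_type, count, busy=()):
    """Return ids of the first `count` free devices of the requested type.

    gres_type=None models a plain "--gres=gpu:N" request that accepts any type.
    """
    free = [
        device_index(e["File"])
        for e in entries
        if e.get("Name") == name
        and (gres_type is None or e.get("Type") == gres_type)
        and device_index(e["File"]) not in busy
    ]
    if len(free) < count:
        raise RuntimeError(f"only {len(free)} free {name}:{gres_type} device(s)")
    return free[:count]


entries = parse_gres_conf(GRES_CONF)

# Models "sbatch --gres=gpu:tesla:1 ./my.job"
ids = select_devices(entries, "gpu", "tesla", 1)
print("CUDA_VISIBLE_DEVICES=" + ",".join(str(i) for i in ids))  # → CUDA_VISIBLE_DEVICES=2
```

With gres_type=None the same function returns the first free device regardless of type, which reproduces the problem described above for "--gres=gpu:1".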
Comment 1 Moe Jette 2014-04-08 06:22:59 MDT
I'm going to get started on this with an eye toward adding some other GRES enhancements requested by customers.
Comment 2 Moe Jette 2014-04-14 09:20:41 MDT
The logic to provide this functionality is now in v14.11 with a multitude of commits.