Ticket 11529

Summary: Set CUDA_DEVICE_ORDER when AutoDetect=nvml is used
Product: Slurm
Reporter: Michael Hinton <hinton>
Component: GPU
Assignee: Director of Support <support>
Status: RESOLVED WONTFIX
Severity: 4 - Minor Issue
CC: kilian
Version: 21.08.x
Hardware: Linux
OS: Linux
See Also: https://bugs.schedmd.com/show_bug.cgi?id=10827
https://bugs.schedmd.com/show_bug.cgi?id=10933
Site: SchedMD

Description Michael Hinton 2021-05-04 15:02:13 MDT
It might be good if Slurm automatically set CUDA_DEVICE_ORDER=PCI_BUS_ID as a convenience, to guarantee that CUDA applications see the same GPU order as NVML/nvidia-smi (and eventually Slurm itself, pending progress in bug 10933).

Things we still need to think through:

1) Should setting CUDA_DEVICE_ORDER=PCI_BUS_ID happen automatically whenever any GPU is requested, or only when AutoDetect=nvml is specified? I think the former makes the most sense.

2) Would there be any case where a user would want CUDA_DEVICE_ORDER to be something *other* than PCI_BUS_ID? If so, we would need to check whether it is already set to something else before blindly setting it, and perhaps emit a warning when it is.
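The check-before-set behavior in point (2) could look something like the following. This is a hypothetical sketch only, written in shell for illustration (Slurm itself would implement this in C inside the task environment setup):

```shell
# Hypothetical sketch of point (2): set CUDA_DEVICE_ORDER only if it is
# unset, and warn if the user already set it to something other than
# PCI_BUS_ID, since CUDA's numbering would then diverge from NVML's.
if [ -z "${CUDA_DEVICE_ORDER+x}" ]; then
    export CUDA_DEVICE_ORDER=PCI_BUS_ID
elif [ "${CUDA_DEVICE_ORDER}" != "PCI_BUS_ID" ]; then
    echo "warning: CUDA_DEVICE_ORDER=${CUDA_DEVICE_ORDER}; CUDA GPU order may not match NVML/nvidia-smi" >&2
fi
```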

See bug 10827 comment 83 for more context.
Comment 1 Michael Hinton 2022-01-26 16:42:47 MST
Hey Kilian,

We are going to go ahead and leave CUDA_DEVICE_ORDER alone. In most cases, how it is set won't matter, and for the cases where it could matter, we have this documented:

"For this numbering to match the numbering reported by CUDA, the CUDA_DEVICE_ORDER environmental variable must be set to CUDA_DEVICE_ORDER=PCI_BUS_ID." 
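So a user who needs CUDA's numbering to match can set the variable themselves before launching. A minimal sketch (the srun invocation is illustrative only):

```shell
# Export CUDA_DEVICE_ORDER so CUDA enumerates GPUs in PCI bus order,
# matching the NVML/nvidia-smi numbering that Slurm's AutoDetect=nvml uses.
export CUDA_DEVICE_ORDER=PCI_BUS_ID
# Then launch the CUDA application as usual, e.g.:
# srun --gpus=1 ./my_cuda_app
echo "CUDA_DEVICE_ORDER=${CUDA_DEVICE_ORDER}"
```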

The CUDA documentation also states that there are two possible values for CUDA_DEVICE_ORDER - FASTEST_FIRST and PCI_BUS_ID - and that the default is FASTEST_FIRST. See https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars.

So we are going to err on the side of flexibility and backwards compatibility and leave it up to the CUDA application developer to change CUDA_DEVICE_ORDER. Of course, if you have a compelling counterpoint, feel free to elaborate.

Thanks!
-Michael