On a Cray internal system with GPUs, a user reported a change in behavior between 17.11.8 and 18.08.0. It appears that --gres now requires a count to be specified.

On 18.08.0:

lanton@tiger:~/cray/gpu-omp-tests> srun -C P100 --gres=gpu -n 1 hostname
srun: error: Unable to allocate resources: Invalid generic resource (gres) specification
lanton@tiger:~/cray/gpu-omp-tests> srun -C P100 --gres=gpu:1 -n 1 hostname
nid00192

On 17.11.8:

dgloe@tiger:~> srun --gres=gpu hostname
nid00012
dgloe@tiger:~> srun --version
slurm 17.11.8

We have the GPU gres defined as so:

slurm.conf:
NodeName=nid000[12-15,20-23] Sockets=1 CoresPerSocket=12 ThreadsPerCore=2 Gres=craynetwork:4,gpu Feature=K40 # RealMemory=32768
NodeName=nid000[24-35] Sockets=1 CoresPerSocket=10 ThreadsPerCore=2 Gres=craynetwork:4,gpu Feature=K20 # RealMemory=32768
NodeName=nid000[36-43] Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=craynetwork:4 # RealMemory=65536
NodeName=nid000[44-47] Sockets=1 CoresPerSocket=10 ThreadsPerCore=2 Gres=craynetwork:4,gpu Feature=K20 # RealMemory=32768
NodeName=nid000[48-59] Sockets=1 CoresPerSocket=12 ThreadsPerCore=2 Gres=craynetwork:4,gpu Feature=K40 # RealMemory=32768
NodeName=nid00[192-203,224-231] Sockets=1 CoresPerSocket=18 ThreadsPerCore=2 Gres=craynetwork:4,gpu Feature=P100 # RealMemory=65536

gres.conf:
NodeName=nid000[12-15,20-23,36-43] Name=craynetwork Count=4
NodeName=nid000[12-15,20-23] Name=gpu File=/dev/nvidia0
NodeName=nid000[24-35,44-47] Name=craynetwork Count=4
NodeName=nid000[24-35,44-47] Name=gpu File=/dev/nvidia0
NodeName=nid000[48-59] Name=craynetwork Count=4
NodeName=nid000[48-59] Name=gpu File=/dev/nvidia0
NodeName=nid00[192-203,224-235] Name=craynetwork Count=4
NodeName=nid00[192-203,224-235] Name=gpu File=/dev/nvidia0
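Until a fixed build is installed, one possible workaround is to add the count explicitly before the request reaches srun, since --gres=gpu:1 is accepted on 18.08.0. The following is a minimal sketch only, assuming the documented name[[:type]:count] entry format and plain integer counts. normalize_gres is a hypothetical helper for a submission wrapper and is not part of Slurm.

```python
def normalize_gres(spec: str) -> str:
    """Return a --gres value in which every entry carries an explicit count.

    Assumes entries look like name[[:type]:count] with an integer count;
    count suffixes and other extensions are deliberately not handled.
    """
    entries = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            raise ValueError(f"empty gres entry in {spec!r}")
        fields = entry.split(":")
        # The last field is treated as a count only if it is purely numeric;
        # otherwise it is a resource name or type and a count of 1 is appended.
        if len(fields) > 1 and fields[-1].isdigit():
            entries.append(entry)
        else:
            entries.append(entry + ":1")
    return ",".join(entries)


if __name__ == "__main__":
    # Builds the argument for the failing invocation from the report.
    print("--gres=" + normalize_gres("gpu"))
```

With this helper, the failing request --gres=gpu would be submitted as --gres=gpu:1, which matches the invocation that succeeded on 18.08.0 above. Entries that already carry a count, such as craynetwork:4, pass through unchanged.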
Hi,

I am working on it. I will let you know when this is fixed.

Dominik
Hi,

This commit should fix this issue:
https://github.com/SchedMD/slurm/commit/8042bb5fdc076b4

I'm marking this ticket as resolved/info given. As always, please feel free to reopen if you have additional questions.

Dominik