Ticket 11083 - Erratic GPU allocation
Summary: Erratic GPU allocation
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 20.11.2
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Dominik Bartkiewicz
 
Reported: 2021-03-15 01:46 MDT by Greg Wickham
Modified: 2021-04-28 11:28 MDT
CC: 5 users

See Also:
Site: KAUST
Version Fixed: 20.11.6


Attachments
slurm.conf (4.37 KB, text/plain), 2021-03-15 02:36 MDT, Ahmed Essam ElMazaty
gres.conf (9.00 KB, text/plain), 2021-03-15 02:36 MDT, Ahmed Essam ElMazaty
Fragments related to two jobs from Slurmctld (9.18 KB, text/plain), 2021-03-16 07:53 MDT, Greg Wickham
Testing submission - slurmctld log (28.75 KB, text/plain), 2021-03-16 08:02 MDT, Greg Wickham
Partitions.conf (6.28 KB, text/plain), 2021-03-16 08:11 MDT, Greg Wickham

Description Greg Wickham 2021-03-15 01:46:13 MDT
Allocating resources with:

    srun --gpus-per-task=1 --ntasks=2 --nodes=2 --time 00:10:00 --pty /bin/bash -i

has resulted in different allocations across otherwise identical submissions.

JobID|AllocTRES
14726565|billing=8,cpu=8,gres/gpu=3,mem=16G,node=2
14726565.extern|billing=8,cpu=8,gres/gpu=3,mem=16G,node=2
14726565.0|cpu=2,gres/gpu:gtx1080ti=2,gres/gpu=2,mem=0,node=2
14733058|billing=8,cpu=8,gres/gpu=5,mem=16G,node=2
14733058.extern|billing=8,cpu=8,gres/gpu=5,mem=16G,node=2
14733058.0|cpu=2,gres/gpu:gtx1080ti=2,gres/gpu=2,mem=0,node=2


Job # 14726565 was allocated 3 GPUs
Job # 14733058 was allocated 5 GPUs

Expected behavior is 1 task on each node, with each task being allocated 1 GPU.

   -Greg
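The mismatch above can be spotted mechanically from `sacct` output. Below is a minimal sketch (a hypothetical helper, not part of Slurm) that parses the pipe-delimited `AllocTRES` records and flags job-level entries whose `gres/gpu` count differs from the expected `ntasks * gpus_per_task`:

```python
# Parse "sacct -P --format=jobid,alloctres" output and flag jobs whose
# job-level gres/gpu count differs from ntasks * gpus_per_task.
# Hypothetical helper for illustration only; not part of Slurm.

def parse_tres(tres: str) -> dict:
    """Turn 'billing=8,cpu=8,gres/gpu=3,mem=16G,node=2' into a dict."""
    out = {}
    for field in tres.split(","):
        key, _, value = field.partition("=")
        out[key] = value
    return out

def gpu_mismatches(sacct_lines, ntasks, gpus_per_task):
    """Yield (jobid, allocated_gpus) for job-level records with unexpected counts."""
    expected = ntasks * gpus_per_task
    for line in sacct_lines:
        jobid, tres = line.split("|", 1)
        if "." in jobid:          # skip .extern / .0 step records
            continue
        gpus = int(parse_tres(tres).get("gres/gpu", 0))
        if gpus != expected:
            yield jobid, gpus

sample = [
    "14726565|billing=8,cpu=8,gres/gpu=3,mem=16G,node=2",
    "14733058|billing=8,cpu=8,gres/gpu=5,mem=16G,node=2",
]
print(list(gpu_mismatches(sample, ntasks=2, gpus_per_task=1)))
# [('14726565', 3), ('14733058', 5)]
```

Run against the two jobs quoted above, both are flagged, since each should have received exactly 2 GPUs.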
Comment 1 Ahmed Essam ElMazaty 2021-03-15 02:36:12 MDT
Created attachment 18426 [details]
slurm.conf
Comment 2 Ahmed Essam ElMazaty 2021-03-15 02:36:40 MDT
Created attachment 18427 [details]
gres.conf
Comment 5 Marcin Stolarek 2021-03-16 02:38:37 MDT
Could you please set SlurmctldDebug to at least verbose, enable the Gres debug flag, and share slurmctld logs from the time the jobs are submitted and started?

cheers,
Marcin
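For reference, the settings requested above would look roughly like this in slurm.conf (illustrative values; adjust to the site's existing configuration):

```
# slurm.conf: raise controller log verbosity and enable GRES debugging
SlurmctldDebug=verbose
DebugFlags=Gres
```

Both can also be changed on a running controller without a restart, e.g. `scontrol setdebug verbose` and `scontrol setdebugflags +gres`.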
Comment 6 Greg Wickham 2021-03-16 07:53:29 MDT
Created attachment 18468 [details]
Fragments related to two jobs from Slurmctld
Comment 7 Greg Wickham 2021-03-16 08:02:03 MDT
Created attachment 18469 [details]
Testing submission - slurmctld log

$ srun --gpus-per-task=1 --ntasks=2 --nodes=2 --time 00:10:00 --pty /bin/bash -i
srun: job 590 queued and waiting for resources
srun: job 590 has been allocated resources


$ scontrol show -d job=590
JobId=590 JobName=bash
   UserId=wickhagj(100302) GroupId=g-wickhagj(1100302) MCS_label=N/A
   Priority=889 Nice=0 Account=root QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:00:13 TimeLimit=00:10:00 TimeMin=N/A
   SubmitTime=2021-03-16T16:55:24 EligibleTime=2021-03-16T16:55:24
   AccrueTime=2021-03-16T16:55:24
   StartTime=2021-03-16T16:55:24 EndTime=2021-03-16T17:05:24 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-03-16T16:55:24
   Partition=batch AllocNode:Sid=slurm-02:2418
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=dgpu502-[29,33]
   BatchHost=dgpu502-29
   NumNodes=2 NumCPUs=8 NumTasks=2 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=8,mem=16G,node=2,billing=8,gres/gpu=8
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   JOB_GRES=gpu:8
     Nodes=dgpu502-[29,33] CPU_IDs=0-3 Mem=8192 GRES=gpu:4(IDX:0-3)
   MinCPUsNode=1 MinMemoryCPU=2G MinTmpDiskNode=0
   Features=nolmem DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/bin/bash
   WorkDir=/home/wickhagj
   Power=
   CpusPerTres=gpu:4
   TresPerTask=gpu:1
   NtasksPerTRES:0

$ sacct -j 590 -P --format=jobid,alloctres
JobID|AllocTRES
590|billing=8,cpu=8,gres/gpu=8,mem=16G,node=2
590.extern|billing=8,cpu=8,gres/gpu=8,mem=16G,node=2
590.0|cpu=2,gres/gpu:gtx1080ti=2,gres/gpu=2,mem=0,node=2
Comment 8 Dominik Bartkiewicz 2021-03-16 08:08:58 MDT
Hi

Could you send us partitions.conf?

Dominik
Comment 9 Greg Wickham 2021-03-16 08:10:50 MDT
The full debug logs will be uploaded tomorrow.
Comment 10 Greg Wickham 2021-03-16 08:11:24 MDT
Created attachment 18470 [details]
Partitions.conf
Comment 11 Dominik Bartkiewicz 2021-03-16 09:04:10 MDT
Hi

I can reproduce this issue.
I will let you know when a fix is available.

Dominik
Comment 18 Dominik Bartkiewicz 2021-03-31 09:41:49 MDT
Hi

This commit should fix the issue. It will be included in Slurm 20.11.6 and later.
https://github.com/SchedMD/slurm/commit/bdf66674f9e0f03

Dominik
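Since the fix landed in 20.11.6, confirming that a cluster has it reduces to a version comparison. A minimal sketch (hypothetical helper) that parses a version string of the form `slurm 20.11.6`, as printed by `slurmctld -V`:

```python
# Check whether a Slurm version string includes the fix (>= 20.11.6).
# Hypothetical helper for illustration only.

def has_fix(version_output: str, fixed=(20, 11, 6)) -> bool:
    """version_output is e.g. 'slurm 20.11.2'; compare numerically, not as text."""
    version = tuple(int(p) for p in version_output.split()[-1].split("."))
    return version >= fixed

print(has_fix("slurm 20.11.2"))  # False
print(has_fix("slurm 20.11.6"))  # True
```

Tuple comparison avoids the string-comparison pitfall where "20.11.10" would sort before "20.11.6".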
Comment 19 Dominik Bartkiewicz 2021-04-02 04:35:50 MDT
Hi

Is there anything else I can do to help, or is it OK to close this ticket?

Dominik
Comment 20 Greg Wickham 2021-04-04 00:37:43 MDT
Hi Dominik,

If the bug has been resolved, the ticket can be closed.

thanks,

   -greg
Comment 22 Greg Wickham 2021-04-28 11:28:13 MDT
We upgraded to 20.11.6 today and it's working great.

Thanks Dominik.

   -Greg