Ticket 9788 - Partition QOS with DenyOnLimit blocking multiple partition jobs when one partition is valid
Summary: Partition QOS with DenyOnLimit blocking multiple partition jobs when one partition is valid
Status: RESOLVED DUPLICATE of ticket 7375
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 20.02.4
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: Marshall Garey
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2020-09-09 13:29 MDT by Trey Dockendorf
Modified: 2020-10-06 17:06 MDT

See Also:
Site: Ohio State OSC
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
slurm.conf (181.26 KB, text/plain)
2020-09-09 13:29 MDT, Trey Dockendorf

Description Trey Dockendorf 2020-09-09 13:29:44 MDT
Created attachment 15819
slurm.conf

We submit GPU jobs to multiple partitions via a job submit filter so users don't have to know which GPU partition to choose. We have a partition QOS on each GPU partition to try to avoid jobs landing on a node type that cannot satisfy them. Right now the node types are dual-GPU with 48 cores and quad-GPU with 48 cores. The partition QOS settings are shown in the sacctmgr output below.
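
For reference, the partition-to-QOS wiring this implies would look roughly like the following in slurm.conf. This is only a sketch, not the actual attached config: the partition and QOS names come from the ticket, while the node lists and any other options are placeholders.

# Sketch only - see the attached slurm.conf for the real definitions.
# gpuserial-48core holds the dual-GPU 48-core nodes, gpuserial-quad the quad-GPU nodes.
PartitionName=gpuserial-48core QOS=pitzer-gpuserial-partition Nodes=<dual-gpu nodes> ...
PartitionName=gpuserial-quad QOS=pitzer-gpu-quad-partition Nodes=<quad-gpu nodes> ...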

If I remove DenyOnLimit, the job works even in the case that fails below: the job is correctly started on the only partition that can satisfy the request, gpuserial-quad.

I would expect EnforcePartLimits=ANY to mean that if DenyOnLimit blocked one partition, the other, valid partition would still be used.
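
The setting in question is the cluster-wide slurm.conf option below (assumed from the description; the attached slurm.conf is authoritative), and the running value can be confirmed with scontrol:

EnforcePartLimits=ANY   # cluster-wide option in slurm.conf

$ scontrol show config | grep EnforcePartLimits   # should report ANY for this setup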


# sacctmgr show qos format=Name,Flags,MaxTRESPerJob,MaxTRESPerNode,MinTresPerJob --parsable
Name|Flags|MaxTRES|MaxTRESPerNode|MinTRES|
pitzer-gpuserial-partition|DenyOnLimit|gres/gpu=2||gres/gpu=1|
pitzer-gpu-quad-partition|DenyOnLimit||gres/gpu=4|gres/gpu=3|
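
For completeness, limits like these are normally set through sacctmgr along the lines of the commands below. This is a sketch of equivalent commands, not the exact command history used at the site.

$ sacctmgr modify qos pitzer-gpuserial-partition set Flags=DenyOnLimit MaxTRESPerJob=gres/gpu=2 MinTRESPerJob=gres/gpu=1   # sketch, not actual history
$ sacctmgr modify qos pitzer-gpu-quad-partition set Flags=DenyOnLimit MaxTRESPerNode=gres/gpu=4 MinTRESPerJob=gres/gpu=3   # sketch, not actual history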

$ sbatch --gpus-per-node=4 -p gpuserial-48core,gpuserial-quad --wrap 'scontrol show job=$SLURM_JOB_ID'
sbatch: error: QOSMaxGRESPerJob
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

$ sbatch --gpus-per-node=4 -p gpuserial-quad --wrap 'scontrol show job=$SLURM_JOB_ID'
Submitted batch job 27781
Comment 1 Trey Dockendorf 2020-09-09 13:32:20 MDT
It appears the order matters: if I put gpuserial-quad first in the list, the job is accepted:

$ sbatch --gpus-per-node=4 -p gpuserial-quad,gpuserial-48core --wrap 'scontrol show job=$SLURM_JOB_ID'
Submitted batch job 27819

The same issue occurs if I reduce to --gpus-per-node=2 and try to submit to gpuserial-48core:

$ sbatch --gpus-per-node=2 -p gpuserial-quad,gpuserial-48core --wrap 'scontrol show job=$SLURM_JOB_ID'
sbatch: error: QOSMinGRES
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

$ sbatch --gpus-per-node=2 -p gpuserial-48core,gpuserial-quad --wrap 'scontrol show job=$SLURM_JOB_ID'
Submitted batch job 27818
Comment 3 Marshall Garey 2020-09-09 16:57:38 MDT
Hi Trey, I'm looking into this. I think it might be a duplicate of another bug. I'll check on that and get back to you.
Comment 4 Marshall Garey 2020-10-06 17:06:31 MDT
Trey,

I've confirmed that this is indeed a duplicate of bug 7375. I see you've already commented on that bug, so you're already aware of it. I hope to get that bug fixed by the release of 20.11, and though I can't guarantee the fix will go into 20.02, I could give you a patch to test when it's ready.

Let me know if you have any more questions. For now I'm closing this as a duplicate of bug 7375.

*** This ticket has been marked as a duplicate of ticket 7375 ***