Ticket 23127

Summary: Change in 25.05: If new QoSs are created with Flags=-1, then we are unable to submit jobs that use the QoS
Product: Slurm    Reporter: Omen Wild <omen>
Component: slurmdbd    Assignee: Benjamin Witham <benjamin.witham>
Status: OPEN    QA Contact: ---
Severity: 4 - Minor Issue
Priority: ---    CC: benjamin.witham
Version: 25.05.0
Hardware: Linux
OS: Linux
Site: UC Davis

Description Omen Wild 2025-06-30 17:49:08 MDT
We have an account provisioning system that adds new QoSs like this:

> sacctmgr -iQ add qos pigrp-gpu-6000_ada-h-qos GrpTres=cpu=128,mem=1536000,gres/gpu=8 MaxTRESPerUser=cpu=-1,mem=-1,gres/gpu=-1 MaxTresPerJob=cpu=-1,mem=-1,gres/gpu=-1 Flags=-1 Priority=0

Then adds new associations like this:

> sacctmgr -iQ add user user=user1 account=pigrp partition=gpu-6000_ada-h qos=pigrp-gpu-6000_ada-h-qos
> sacctmgr -iQ add user user=user2 account=pigrp partition=gpu-6000_ada-h qos=pigrp-gpu-6000_ada-h-qos

This worked with Slurm 24.05, but broke when we upgraded to 25.05. When users submit jobs, they get this error:

> srun: error: Unable to allocate resources: Invalid qos specification

And slurmctld would log this line: 

> error: QOS pigrp-gpu-6000_ada-h-qos is relative and used as a Partition QOS. This prohibits it from being used as a job's QOS

We do NOT have a QoS set on the partition; it shows "QoS=N/A".
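
For reference, a quick way to check that (using the partition name from the commands above) is:

> scontrol show partition gpu-6000_ada-h | grep -i qos

which shows QoS=N/A here.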

After a bunch of testing, we found that if we removed "Flags=-1" from the "add qos" command, then everything would work correctly and users could submit jobs.
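
For completeness, the same "add qos" command with only "Flags=-1" dropped (everything else unchanged from above) is:

> sacctmgr -iQ add qos pigrp-gpu-6000_ada-h-qos GrpTres=cpu=128,mem=1536000,gres/gpu=8 MaxTRESPerUser=cpu=-1,mem=-1,gres/gpu=-1 MaxTresPerJob=cpu=-1,mem=-1,gres/gpu=-1 Priority=0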

Is there a reason the behavior of creating new QoSs with "Flags=-1" changed like this?
Comment 1 Benjamin Witham 2025-07-01 17:26:10 MDT
Hello Omen,

I'm able to replicate this, and I'm working towards a solution. It looks like a code change led to this unintended regression. I'll send updates when I know more.
Comment 2 Benjamin Witham 2025-07-03 15:12:27 MDT
Hello Omen, 

After investigating further, I have to retract this statement:
> It looks like a code change led to this unintended regression.
There is a problem with the handling of the QOS flags, but the code change that I referenced above did not cause the issue.

For some context behind the issue: Flags=-1 is used to remove all flags from an existing QOS. It works by setting all of the flag bits and also setting a remove flag, which signals to the slurmctld and slurmdbd to remove those flags from the existing QOS. When Flags=-1 is used while creating a new QOS, all of the flags and the remove flag are still set. The slurmdbd recognizes the remove flag and adjusts appropriately, but the slurmctld has no such logic, so it creates a QOS that has ALL flags set instead of none of them! If the slurmctld is restarted, the correct QOS is pulled from the slurmdbd.
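
To make that mismatch visible, here is a rough sketch (the exact filters and output format may differ a bit between versions). What the slurmdbd has stored for the QOS:

> sacctmgr show qos pigrp-gpu-6000_ada-h-qos format=Name,Flags

versus what the slurmctld currently has cached in memory:

> scontrol show assoc_mgr qos=pigrp-gpu-6000_ada-h-qos flags=qos

Until a restart, the second should show all of the flags set while the first shows none. For completeness, clearing the flags on an existing QOS (the intended use of Flags=-1) would look like:

> sacctmgr -i modify qos pigrp-gpu-6000_ada-h-qos set Flags=-1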

When I was reproducing your issue, I must have been unintentionally restarting the slurmctld, because I can no longer reproduce it without a slurmctld restart; until the restart, jobs are simply denied with an INVALID_QOS error.

Do you often restart your slurmctld? If not, are you able to upload your slurm.conf as well as a typical job command from a user? I'm confident that a patch I have will fix the issue, but I'd like to check to confirm we're hitting your issue and not a similar but unrelated one.
Comment 4 Omen Wild 2025-07-03 18:34:46 MDT
Hi Benjamin,

An example srun that was failing is:

> srun --time=1:00:00 --account=bnbaileygrp --partition=gpu-6000_ada-h --gres=gpu:1 --cpus-per-task=8 --mem=64G --pty bash

This is the same cluster as another open ticket. The slurm.conf is here: https://support.schedmd.com/attachment.cgi?id=42344

We typically only restart slurmctld when we add new nodes, which could be a couple of times a month, but sometimes not for 3-6 months.