Ticket 23127

Summary: Change in 25.05: If new QoSs are created with Flags=-1, then we are unable to submit jobs that use the QoS
Product: Slurm    Reporter: Omen Wild <omen>
Component: slurmdbd    Assignee: Benjamin Witham <benjamin.witham>
Status: OPEN    QA Contact: ---
Severity: 4 - Minor Issue
Priority: ---    CC: benjamin.witham
Version: 25.05.0
Hardware: Linux
OS: Linux
Site: UC Davis

Description Omen Wild 2025-06-30 17:49:08 MDT
We have an account provisioning system that adds new QoSs like this:

> sacctmgr -iQ add qos pigrp-gpu-6000_ada-h-qos GrpTres=cpu=128,mem=1536000,gres/gpu=8 MaxTRESPerUser=cpu=-1,mem=-1,gres/gpu=-1 MaxTresPerJob=cpu=-1,mem=-1,gres/gpu=-1 Flags=-1 Priority=0

Then adds new associations like this:

> sacctmgr -iQ add user user=user1 account=pigrp partition=gpu-6000_ada-h qos=pigrp-gpu-6000_ada-h-qos
> sacctmgr -iQ add user user=user2 account=pigrp partition=gpu-6000_ada-h qos=pigrp-gpu-6000_ada-h-qos

This worked with Slurm 24.05, but broke when we upgraded to 25.05. When users submit jobs, they get this error:

> srun: error: Unable to allocate resources: Invalid qos specification

And slurmctld would log this line: 

> error: QOS pigrp-gpu-6000_ada-h-qos is relative and used as a Partition QOS. This prohibits it from being used as a job's QOS

We do NOT have a QoS set on the partition; it shows "QoS=N/A".
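
For reference, a quick way to check that (using the partition name from the commands above) is:

> scontrol show partition gpu-6000_ada-h | grep -i qos

which shows QoS=N/A here.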

After a bunch of testing, we found that if we removed "Flags=-1" from the "add qos" command, then everything would work correctly and users could submit jobs.
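
For completeness, the same "add qos" command with only "Flags=-1" dropped (everything else unchanged from above) is:

> sacctmgr -iQ add qos pigrp-gpu-6000_ada-h-qos GrpTres=cpu=128,mem=1536000,gres/gpu=8 MaxTRESPerUser=cpu=-1,mem=-1,gres/gpu=-1 MaxTresPerJob=cpu=-1,mem=-1,gres/gpu=-1 Priority=0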

Is there a reason the behavior of creating new QoSs with "Flags=-1" changed like this?
Comment 1 Benjamin Witham 2025-07-01 17:26:10 MDT
Hello Omen,

I'm able to replicate this, and I'm working towards a solution. It looks like a code change led to this unintended regression. I'll send updates when I know more.
Comment 2 Benjamin Witham 2025-07-03 15:12:27 MDT
Hello Omen, 

After investigating further, I have to retract this statement:
> It looks like a code change led to this unintended regression.
There is a problem with the handling of the QOS flags, but the code change that I referenced above did not cause the issue.

For some context behind the issue: Flags=-1 is used to remove all flags from an existing QOS. It works by setting all of the flag bits and also setting a remove flag, which signals to the slurmctld and slurmdbd to remove those flags from the existing QOS. When Flags=-1 is used while creating a new QOS, all of the flags and the remove flag are still set. The slurmdbd recognizes the remove flag and adjusts appropriately, but the slurmctld has no such logic, so it creates a QOS that has ALL flags set instead of none of them! If the slurmctld is restarted, the correct QOS is pulled from the slurmdbd.
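
To make that mismatch visible, here is a rough sketch (the exact filters and output format may differ a bit between versions). What the slurmdbd has stored for the QOS:

> sacctmgr show qos pigrp-gpu-6000_ada-h-qos format=Name,Flags

versus what the slurmctld currently has cached in memory:

> scontrol show assoc_mgr qos=pigrp-gpu-6000_ada-h-qos flags=qos

Until a restart, the second should show all of the flags set while the first shows none. For completeness, clearing the flags on an existing QOS (the intended use of Flags=-1) would look like:

> sacctmgr -i modify qos pigrp-gpu-6000_ada-h-qos set Flags=-1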

When I was reproducing your issue, I must have been unintentionally restarting the slurmctld, because I can no longer reproduce it without a slurmctld restart; until the restart, jobs are simply denied with an INVALID_QOS error.

Do you often restart your slurmctld? If not, are you able to upload your slurm.conf as well as a typical job command from a user? I'm confident that a patch I have will fix the issue, but I'd like to check to confirm we're hitting your issue and not a similar but unrelated one.
Comment 4 Omen Wild 2025-07-03 18:34:46 MDT
Hi Benjamin,

An example srun that was failing is:

> srun --time=1:00:00 --account=bnbaileygrp --partition=gpu-6000_ada-h --gres=gpu:1 --cpus-per-task=8 --mem=64G --pty bash

This is the same cluster as another open ticket. The slurm.conf is here: https://support.schedmd.com/attachment.cgi?id=42344

We typically only restart slurmctld when we add new nodes, which could be a couple of times a month, but sometimes not for 3-6 months.