Ticket 3570

Summary: GrpNodes seems non-functional
Product: Slurm
Reporter: Stephen Fralich <sjf4>
Component: Scheduling
Assignee: Danny Auble <da>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 3 - Medium Impact
Priority: ---
Version: 17.02.1
Hardware: Linux
OS: Linux
Site: University of Washington
Attachments: slurm.conf
Output from show assoc

Description Stephen Fralich 2017-03-10 17:03:07 MST
Created attachment 4192 [details]
slurm.conf

I'm trying to set up floating partitions.

I've set up a partition with the QOS parameter:
PartitionName=hpc-256 Nodes=n2[001-133] Default=NO AllowAccounts=hpc QOS=hpc-256 AllowQos=hpc-256 Priority=1000 OverSubscribe=EXCLUSIVE PreemptMode=OFF

I've set up a QOS with the GrpNodes parameter:
sacctmgr modify qos hpc-256 set GrpNodes=5

And yet I can submit an arbitrary number of jobs to the given partition and they all start:
sbatch --cpus-per-task 28 -N 1 -t 1:00:00 -A hpc -p hpc-256 test-sleep.sh
squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                84   hpc-256 test-sle     sjf4  R       6:25      1 n2026
                85   hpc-256 test-sle     sjf4  R       6:25      1 n2060
                86   hpc-256 test-sle     sjf4  R       6:25      1 n2068
                87   hpc-256 test-sle     sjf4  R       6:25      1 n2071
                88   hpc-256 test-sle     sjf4  R       6:25      1 n2081
                83   hpc-256 test-sle     sjf4  R       6:28      1 n2001

I've added my slurm.conf and will add the output from show assoc.
Comment 1 Stephen Fralich 2017-03-10 17:03:31 MST
Created attachment 4193 [details]
Output from show assoc
Comment 2 Tim Wickberg 2017-03-10 17:10:48 MST
(Apologies for brevity, on a plane at the moment.)

If you turn on
AccountingStorageEnforce=safe,qos
I think you'll see things start working as desired. Without that setting, none of the limits set through sacctmgr will affect anything.
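
For reference, a minimal sketch of that change (assuming limits are kept in the accounting database, as in the attached slurm.conf):

```
# slurm.conf -- enable enforcement of association and QOS limits.
# "safe" implies "limits,associations"; "qos" makes QOS limits
# such as GrpNodes take effect.
AccountingStorageEnforce=safe,qos
```

Changing AccountingStorageEnforce generally requires restarting slurmctld for the new value to take effect.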
Comment 3 Danny Auble 2017-03-10 17:14:19 MST
Stephen, as Tim noted, you will need to enforce limits for them to be enforced ;).

You can read about the other settings for AccountingStorageEnforce here...

https://slurm.schedmd.com/slurm.conf.html#OPT_AccountingStorageEnforce

I will note that multiple jobs running from the same association will count nodes multiple times, even if the jobs run on the same node. I.e., if jobs 1 and 2 from the same association both run on node0, that counts as 2 nodes against GrpNodes.

I noticed you don't appear to be sharing any nodes (OverSubscribe=EXCLUSIVE is set in all partitions), so this shouldn't be an issue. But if you ever do share nodes, I would suggest using GrpCPUs instead (or the more modern GrpTRES=cpu=$COUNT), as it will work correctly in either case, since CPUs are not usually shared simultaneously.
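
As a sketch, the limit from the original report could be expressed either way (the CPU count of 140 is an assumption based on the 28-core nodes used in the sbatch command above):

```
# Node-based limit: counts each running job's nodes, so two jobs
# sharing one node consume 2 of the 5 "nodes".
sacctmgr modify qos hpc-256 set GrpNodes=5

# CPU-based equivalent (5 nodes x 28 CPUs); behaves correctly
# whether or not jobs share nodes.
sacctmgr modify qos hpc-256 set GrpTRES=cpu=140

# Verify the limits took effect:
sacctmgr show qos hpc-256 format=Name,GrpNodes,GrpTRES
```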

Let us know if you need anything more or if this is sufficient.
Comment 4 Stephen Fralich 2017-03-10 17:28:37 MST
That sure did it. Thanks a lot.