Created attachment 4192 [details] slurm.conf

I'm trying to set up floating partitions. I've set up a partition with the QOS parameter:

    PartitionName=hpc-256 Nodes=n2[001-133] Default=NO AllowAccounts=hpc QOS=hpc-256 AllowQos=hpc-256 Priority=1000 OverSubscribe=EXCLUSIVE PreemptMode=OFF

I've set up a QOS with the GrpNodes parameter:

    sacctmgr modify qos hpc-256 set GrpNodes=5

And yet I can submit an arbitrary number of jobs to the given partition and they all start:

    sbatch --cpus-per-task 28 -N 1 -t 1:00:00 -A hpc -p hpc-256 test-sleep.sh

    squeue
    JOBID PARTITION     NAME USER ST TIME NODES NODELIST(REASON)
       84   hpc-256 test-sle sjf4  R 6:25     1 n2026
       85   hpc-256 test-sle sjf4  R 6:25     1 n2060
       86   hpc-256 test-sle sjf4  R 6:25     1 n2068
       87   hpc-256 test-sle sjf4  R 6:25     1 n2071
       88   hpc-256 test-sle sjf4  R 6:25     1 n2081
       83   hpc-256 test-sle sjf4  R 6:28     1 n2001

I've added my slurm.conf and will add the output from show assoc.
Created attachment 4193 [details] Output from show assoc
(Apologies for brevity, on a plane at the moment.)

If you turn on AccountingStorageEnforce=safe,qos I think you'll see things start working as desired. Without that setting, none of the limits set through sacctmgr are enforced.
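For reference, the suggested change is a one-line addition to slurm.conf. This is only a sketch of where it would go; the attached slurm.conf is not reproduced in this thread, so the placement among the other accounting options and the comments are assumptions.

```
# slurm.conf (fragment)
# Enforce limits set through sacctmgr. "qos" enables QOS limits such as
# GrpNodes; "safe" additionally holds back jobs that could not finish
# within the remaining group limits.
AccountingStorageEnforce=safe,qos
```

slurmctld has to pick up the new setting before it takes effect (typically via a restart of the daemon or `scontrol reconfigure`; which of the two is sufficient is not stated in this thread).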
Stephen, as Tim noted, you will need to enforce limits for them to be enforced ;). You can read about the other settings for AccountingStorageEnforce here:

https://slurm.schedmd.com/slurm.conf.html#OPT_AccountingStorageEnforce

I will note that multiple jobs running from the same association are counted multiple times against GrpNodes, even if they run on the same node. That is, if jobs 1 and 2 from the same association both run on node0, that counts as 2 nodes.

I noticed you don't appear to be sharing any nodes (OverSubscribe=EXCLUSIVE is set on all partitions), so this shouldn't be an issue. If you ever do share nodes, I would suggest using GrpCPUs instead (or the more modern GrpTRES=cpu=$COUNT), as it works correctly in either case, since a CPU is usually not shared between jobs at the same time.

Let us know if you need anything more or if this is sufficient.
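As a rough sketch of the CPU-based alternative: the snippet below derives a CPU limit equivalent to 5 whole nodes and prints the matching sacctmgr command instead of running it. The figure of 28 CPUs per node is an assumption taken from the `--cpus-per-task 28` in the original submission, not from the node definitions, so check it against slurm.conf before using the result.

```shell
#!/bin/sh
# Assumption: every node in hpc-256 has 28 CPUs (inferred from
# "--cpus-per-task 28" in the sbatch line, not confirmed in this thread).
NODES=5
CPUS_PER_NODE=28

# CPU limit equivalent to NODES whole nodes.
CPU_LIMIT=$((NODES * CPUS_PER_NODE))

# Print the command rather than executing it, so it can be reviewed first.
# GrpNodes=-1 clears the existing node-based limit on the QOS.
echo "sacctmgr modify qos hpc-256 set GrpNodes=-1 GrpTRES=cpu=${CPU_LIMIT}"
```

After applying the printed command, `sacctmgr show qos hpc-256 format=Name,GrpTRES` should list the new limit.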
That sure did it. Thanks a lot.