slurm.conf:
* SelectType=select/cons_tres
* MaxCpusPerNode limit set on a partition.

Easy reproducer:
* Set MaxCpusPerNode=4.
* Submit four single-core jobs (assuming 1 thread per core) to a specific node (-w flag).
* Three jobs run. The fourth pends with reason Resources because slurmctld incorrectly concludes the limit has been reached.

If only one more CPU is left before the MaxCpusPerNode limit is reached, slurmctld does not run the job. This only exists in select/cons_tres. I have a patch and will submit it to the review queue.
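The reproducer above can be sketched as follows. This is a hedged illustration, not the reporter's exact setup: the partition name "debug", node name "node01", and the sleep payload are hypothetical; only SelectType=select/cons_tres and MaxCpusPerNode=4 come from the report.

```shell
# slurm.conf fragment (hypothetical partition/node names):
#   SelectType=select/cons_tres
#   PartitionName=debug Nodes=node01 MaxCpusPerNode=4

# Submit four single-core jobs pinned to the same node:
for i in 1 2 3 4; do
    sbatch -p debug -w node01 -n 1 --wrap="sleep 300"
done

# Expected with the bug: three jobs run, the fourth pends with
# Reason=Resources even though a fourth CPU is still available
# under the MaxCpusPerNode=4 cap.
squeue -p debug -o "%i %t %R"
```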
We fixed this in 21.08 in commit 288631a9cf, with some additional cleanup in master (for 22.05). Closing this as fixed.