Reproduced a reported problem. The problem was reported against version 16.05.5, but I saw the same behavior on 17.02.9, as follows:

I created a new account (test) and associated a user (dparisek) with that account. I first set MaxWall to 1 minute (and later MaxWallDurationPerJob to 1 minute, to see if that made a difference), with AccountingStorageEnforce=associations,limits set. User dparisek then ran a 2-minute sleep job, and the job kept running for the entire 2 minutes.

========================================================================
sacctmgr modify user dparisek set MaxWallDurationPerJob=1
 Modified user associations...
  C = cluster5 A = test U = dparisek
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

[trek0] (slurm) dhp> sacctmgr -s show user where user=dparisek format=user,maxw
      User     MaxWall
---------- -----------
  dparisek    00:01:00

[trek0] (slurm) dhp> scontrol show config | grep Accounting
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = associations,limits

<< srun sleep 120& >>
Ran entire 2 mins - maxwall not enforced!
========================================================================

Then I created a new QoS, associated that QoS with the user, and set MaxWall=1 min on the QoS. This DID work!

[trek0] (slurm) dhp> sacctmgr add qos qosA
 Adding QOS(s)
  qosa
 Settings
  Description = qosa
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

[trek0] (slurm) dhp> sacctmgr modify user dparisek set qos=qosa
 Modified user associations...
  C = cluster5 A = test U = dparisek
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

[trek0] (slurm) dhp> sacctmgr -s show user where user=dparisek format=user,maxw,qos
      User     MaxWall                  QOS
---------- ----------- --------------------
  dparisek    00:01:00                 qosa

[trek0] (slurm) dhp> sacctmgr modify qos set maxwall=1 where user=dparisek

<< srun sleep 120& >>
Maxwall was enforced - job was killed after 1 min
========================================================================

Questions:
Did I miss something in the first scenario, where no QoS was associated?
Is associating a QoS the only way to enforce MaxWall (and maybe other limits)? If so, what is the point of allowing sacctmgr to set the limit without a QoS?
Is there a bug here?

Thanks.
This has already been fixed in bug 4681. Marking as duplicate. Please reopen if it doesn't address your problem.

Specifically:
https://github.com/SchedMD/slurm/commit/9143c7c964

and more work done here:
https://github.com/SchedMD/slurm/commit/2ef56d4b96f93e0854

*** This ticket has been marked as a duplicate of ticket 4681 ***
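
For anyone re-running the test from the report, here is a rough Python sketch of one way to check whether a job overran its MaxWall. It is not part of Slurm: the function names and the grace_seconds parameter are made up for illustration, and the grace window is only an assumption, since a job is not necessarily killed at the exact second the limit is reached. The elapsed value is assumed to be copied by hand from something like `sacct -j <jobid> --format=Elapsed`, and the limit from the MaxWall column shown by sacctmgr.

```python
def slurm_time_to_seconds(text):
    """Convert a Slurm-style time string to seconds.

    Handles: minutes, minutes:seconds, hours:minutes:seconds,
    days-hours, days-hours:minutes, days-hours:minutes:seconds.
    """
    text = text.strip()
    days = 0
    if "-" in text:
        # With a day prefix, the remainder is hours[:minutes[:seconds]].
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        parts += [0] * (3 - len(parts))
        hours, minutes, seconds = parts
    else:
        parts = [int(p) for p in text.split(":")]
        if len(parts) == 1:
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:
            hours, minutes, seconds = 0, parts[0], parts[1]
        elif len(parts) == 3:
            hours, minutes, seconds = parts
        else:
            raise ValueError("unrecognized time string: %r" % text)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def limit_enforced(elapsed, max_wall, grace_seconds=60):
    """Return True if the job stopped within max_wall plus a grace window.

    grace_seconds is an assumed tolerance, not a Slurm setting.
    """
    return slurm_time_to_seconds(elapsed) <= (
        slurm_time_to_seconds(max_wall) + grace_seconds
    )


if __name__ == "__main__":
    # First scenario in the report: 2 minutes elapsed against MaxWall 00:01:00.
    print(limit_enforced("00:02:00", "00:01:00", grace_seconds=30))
```

With the numbers from the first scenario (elapsed 00:02:00, MaxWall 00:01:00) the check fails, which matches what was observed before the fix.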