Ticket 16523

Summary: Limiting the number of jobs and cores per user per partition
Product: Slurm Reporter: Navneet Khetrapal <navneet.khetrapal>
Component: Limits Assignee: Jacob Jenson <jacob>
Status: OPEN --- QA Contact:
Severity: 6 - No support contract    
Priority: ---    
Version: 21.08.8   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Linux Distro: Rocky Linux Machine Name: Cruntch4

Description Navneet Khetrapal 2023-04-13 15:05:44 MDT
Hi Everyone,

I am having trouble enforcing limits on the maximum number of jobs and cores per user per partition. I tried using a QOS for this purpose on one partition:

1) Created a test account: 
sacctmgr add account testac

2) Added a user to the account:
sacctmgr add user nsk0051 account=testac

3) Added a qos:
sacctmgr add qos share.64

4) Limited the number of jobs and CPUs for the qos:
sacctmgr modify qos share.64 set GrpJobs=1 GrpTRES=cpu=64

5) Checked the QOS settings:
sacctmgr show qos share.64 format=name,GrpJobs,GrpTRES
      Name GrpJobs       GrpTRES
---------- ------- -------------
  share.64       1        cpu=64

6) Added QOS=share.64 and AccountingStorageEnforce=qos to the partition line in slurm.conf.

Ideally, the maximum number of running jobs should be 1 and the maximum number of CPUs should be 64 in this case, but users are able to exceed both. The QOS limits do not seem to be enforced. Can anyone please help me resolve this issue?
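
Note: one possible direction, offered as a sketch and not a confirmed fix for this site. GrpJobs and GrpTRES cap the QOS as a whole (all users combined), while MaxJobsPerUser and MaxTRESPerUser apply to each user separately. AccountingStorageEnforce is a global slurm.conf parameter and not a partition option, and QOS limits are only enforced when its value includes "limits". The partition name "share", the node placeholder, and the run helper below are assumptions made for illustration. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them
# on a host where sacctmgr is available.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

QOS_NAME="share.64"   # QOS created in the steps above
MAX_JOBS=1            # running jobs allowed per user
MAX_CPUS=64           # CPUs allowed per user

# Per-user limits (GrpJobs/GrpTRES would instead cap all users together).
run sacctmgr -i modify qos "$QOS_NAME" set \
    MaxJobsPerUser="$MAX_JOBS" MaxTRESPerUser=cpu="$MAX_CPUS"

# Confirm what is stored for the QOS.
run sacctmgr show qos "$QOS_NAME" format=Name,MaxJobsPU,MaxTRESPU

# Assumed slurm.conf layout (partition name and node list are placeholders):
#   AccountingStorageEnforce=limits,qos          <- global line
#   PartitionName=share Nodes=<nodes> QOS=share.64
```

After editing slurm.conf, slurmctld would typically need to be restarted for a change to AccountingStorageEnforce to take effect.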