Ticket 17820 - Limiting CPUs/GPUs per user
Summary: Limiting CPUs/GPUs per user
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Limits
Version: 23.02.4
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Benjamin Witham
 
Reported: 2023-10-02 14:33 MDT by carlos
Modified: 2023-10-03 14:50 MDT
Site: Concordia University
Version Fixed: 23.02.4


Description carlos 2023-10-02 14:33:18 MDT
Hi

I'm trying to limit the number of CPUs and GPUs each user can use. I used sacctmgr to define the limits for one test user:

sacctmgr -i modify User carlos set MaxTRES=gres/gpu=1
sacctmgr -i modify User carlos set MaxTRES=cpu=1

Account              User       MaxTRES          QOS
-------------------- ---------- ---------------- --------------------
root                                             normal
 root                carlos     cpu=1,gres/gpu=1 normal


This is the script:

#!/encs/bin/tcsh
#SBATCH --job-name=test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
##SBATCH --gres=gpu:2
#SBATCH --time=00:05:00
...

Whether I comment out --gres=gpu:2 to test the CPU limit or --cpus-per-task=2 to test the GPU limit, the job runs and bypasses the limits configured in sacctmgr.
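
To make the expectation concrete, here is a minimal sketch that compares a job's request against a MaxTRES string like the one shown by sacctmgr above. The helpers `tres_limit` and `exceeds_limit` are hypothetical names invented for illustration and are not part of Slurm; the sketch only does the arithmetic and does not reproduce how slurmctld evaluates limits.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): look up one TRES limit in a
# MaxTRES string such as "cpu=1,gres/gpu=1". Prints the limit, or nothing
# if that TRES has no limit in the string.
tres_limit() {
    max_tres="$1"; name="$2"
    printf '%s\n' "$max_tres" | tr ',' '\n' | while IFS= read -r entry; do
        case "$entry" in
            "$name="*) printf '%s\n' "${entry#"$name="}" ;;
        esac
    done
}

# Hypothetical helper: exit 0 when the requested amount exceeds the limit.
exceeds_limit() {
    limit=$(tres_limit "$1" "$2")
    [ -n "$limit" ] && [ "$3" -gt "$limit" ]
}

# The script above asks for 2 CPUs (1 task x 2 CPUs per task), or 2 GPUs
# when the --gres line is enabled; both are over the per-job limit of 1.
exceeds_limit "cpu=1,gres/gpu=1" cpu 2 && echo "cpu request over limit"
exceeds_limit "cpu=1,gres/gpu=1" gres/gpu 2 && echo "gpu request over limit"
```

With limits enforced, neither variant of the script should be able to run as submitted.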

In slurm.conf:
AccountingStorageEnforce=associations,limits

What am I missing? Can this be done without using a QOS? The limits for some users (not always the same ones) will change from time to time.

Thanks
Comment 1 Benjamin Witham 2023-10-03 10:24:56 MDT
> Can this be done without using a QOS? The limits for some users (not always
> the same ones) will change from time to time.

Given your needs, user limits look like your best option, especially since these limits will change per user. For a change to AccountingStorageEnforce to take effect, a slurmctld restart is required. Have you restarted slurmctld since changing this parameter?

https://slurm.schedmd.com/slurm.conf.html#OPT_AccountingStorageEnforce
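
As a rough illustration of the check described above, the sketch below tests whether a slurm.conf file asks for limit enforcement. The function name `enforces_limits` is hypothetical and not part of Slurm, and the assumption that `safe` and `all` also imply `limits` is taken from the slurm.conf documentation linked above. It only reads the file, so it cannot tell whether the running slurmctld has picked up the setting.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): succeed when the given
# slurm.conf sets AccountingStorageEnforce to a value that enables limits.
# Usage: enforces_limits /path/to/slurm.conf
enforces_limits() {
    conf_file="$1"
    # Take the first uncommented AccountingStorageEnforce line, keep the
    # text after '=', drop whitespace, and lowercase it.
    value=$(grep -i '^[[:space:]]*AccountingStorageEnforce[[:space:]]*=' "$conf_file" \
            | head -n 1 | cut -d= -f2- | tr -d '[:space:]' | tr 'A-Z' 'a-z')
    # Assumption from the slurm.conf docs: "safe" and "all" imply "limits".
    case ",$value," in
        *,limits,*|*,safe,*|*,all,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live cluster, `scontrol show config | grep AccountingStorageEnforce` reports the value the running slurmctld is using. If it differs from the file, restart the daemon, for example with `systemctl restart slurmctld` on a systemd-managed install (an assumption about the site's setup).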
Comment 2 carlos 2023-10-03 14:46:45 MDT
(In reply to Benjamin Witham from comment #1)

Thanks, it is working fine now.
Comment 3 carlos 2023-10-03 14:48:14 MDT
No more help is needed in this case.
Again, thank you for your time.

Closing ticket