| Summary: | Limit TRES per user across the cluster? | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Greg Wickham <greg.wickham> |
| Component: | Accounting | Assignee: | Skyler Malinowski <skyler> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 20.11.2 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | KAUST | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
|
Description
Greg Wickham
2021-02-09 03:01:41 MST
Hi Greg,

You can use `sacctmgr` to set a `grpTRES` on an account to constrain it and all of its children. This acts independently of QOS.

> sacctmgr modify account root set grpTRES=gres/gpu=1

In this example I have set an arbitrary limit of `grpTRES=gres/gpu=1`. Please replace it with your desired limits. Below you can see what it would look like when applied (albeit in a simple environment).

> sacctmgr show assoc format=account,user,GrpTRES

   Account       User       GrpTRES
---------- ---------- -------------
      root               gres/gpu=1
      root       root
        qa
        qa malinowski

Hi Skyler,

From the documentation:

GrpTRES= The total count of TRES able to be used at any given time from jobs running from an association and its children or QOS. If this limit is reached, new jobs will be queued but only allowed to run after resources have been relinquished from this group.

GrpTRES is the limit on the association, hence your example would place a limit of a maximum of 1 GPU across all users of the 'root' account. I'm after a "per user" limit.

-Greg

Hi Greg,

You are correct in that `GrpTRES` is not a per-user constraint. Unfortunately there is not a `MaxTRESPerUser` option for account associations at this time. So no, you cannot apply a per-user, cluster-wide limit independent of QOS at this time. An option is to have the partition-bound QOS set `MaxTRESPerUser`. Depending on your configuration, this may meet what you want. Anything beyond that would be a feature request.

Regards,
Skyler

Hi Skyler,

Thanks. Please consider this a feature request.

-greg

Hi Greg,
There may be another option. You can run a similar command on all the users (per cluster). I know this may not be very ergonomic, especially with thousands of users and multiple clusters.
> sacctmgr modify user malinowski set grpTRES=gres/gpu=1 where cluster=qa
Does this cover your use case? Would you prefer this to be a single setting on a cluster?
Regards,
Skyler
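If this per-user route were taken at scale, the command above could be scripted over every association on a cluster. The following is only a sketch, not from the ticket: the `DRY_RUN` switch and the placeholder user names are mine, and the use of `sacctmgr -i` (commit immediately, no prompt) and `sacctmgr -nP show assoc` (no header, parsable output) to enumerate users is an assumption about how one might wire it up.

```shell
# Dry-run sketch: apply one per-user GrpTRES limit on every user of a
# cluster. With DRY_RUN=1 the sacctmgr commands are only printed; set it
# to 0 on a real Slurm system to execute them.
DRY_RUN=1
CLUSTER=qa
LIMIT="grpTRES=gres/gpu=1"

if [ "$DRY_RUN" = 1 ]; then
    users="malinowski wickham"   # placeholder names standing in for real output
else
    # Distinct user names holding an association on this cluster.
    users=$(sacctmgr -nP show assoc cluster="$CLUSTER" format=user | sort -u)
fi

for u in $users; do
    cmd="sacctmgr -i modify user $u set $LIMIT where cluster=$CLUSTER"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"
    else
        $cmd
    fi
done
```

As noted in the thread, though, this still sets the limit per association rather than per user, so it does not solve the multiple-accounts problem on its own.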
Hi Skyler,

Your suggestion won't work for us, as an individual user is associated with multiple accounts, and hence the GrpTRES would apply distinctly for each account.

-Greg

Hi Greg,

I will need some more information regarding your configuration and what specifically you are attempting to accomplish. Please provide a sample configuration based on your site and a situation that illustrates the dilemma. We (SchedMD) need to better understand why account and QOS limits cannot satisfy your use case before we can proceed with a formal feature request.

Thanks,
Skyler

Hi Greg,

There is internal push-back on a feature like `MaxTRESPerUser` for accounts, unless it can be shown that there is no other method to accomplish the desired outcome. Hence your feature request is on hold until more information proves the feature necessary.

After reviewing the ticket, I believe that a QOS on the partition with `MaxTRESPerUser`, together with the optional flag `OverPartQOS`, may work for you. In the following example I create a QOS that limits GRES cluster-wide.

> sacctmgr add qos global_limits
> sacctmgr modify qos global_limits set MaxTRESPerUser=gres/gpu=1
> sacctmgr mod qos global_limits set flags=overpartqos  # optional: if set, jobs using this QOS will be able to override the limits of the requested partition's QOS.

In your `slurm.conf`, add the following line before the partition definitions:

> PartitionName=DEFAULT Qos=global_limits

Or attach it to select partitions as normal:

> PartitionName=debug Qos=global_limits

Then reconfigure `slurmctld`:

> scontrol reconfigure

The above example limits GRES on a per-user basis across all partitions, hence the entire cluster.

Regards,
Skyler

Hi Skyler,

Before this is implemented, please let me explain our environment:

- we have many partitions
- we have many accounts
- we have multiple QOS
- users can access all partitions
- users can belong to multiple accounts

Will your suggestion work if there are multiple partitions?
Is TRES tracking for a partition QOS only accounting for usage on that partition? The primary issue we are facing is users belonging to multiple accounts: by choosing different accounts when they submit jobs, they can exceed the desired global limit.

-greg

Hi Greg,

> Will your suggestion work if there are multiple partitions? Is TRES tracking for a partition QOS only accounting for usage on that partition?

A partition QOS will track all TRES across all partitions that share that same partition QOS. If you need partition-level control, you could make a QOS for each partition, giving you even more fine-grained control.

> The primary issue we are facing is users belonging to multiple accounts: by choosing different accounts when they submit jobs, they can exceed the desired global limit.

Users can still submit from multiple accounts and QOS, but the partition QOS will never be breached unless the user submits with a QOS that has `Flags=OverPartQOS`.

Does that help and clear things up?

Regards,
Skyler

Hi Skyler,

Many thanks. A global QOS has been set up and is working a treat.

-Greg

Hi Greg,

Great to hear! I am glad I could find you a solution.

Regards,
Skyler
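For reference, the working recipe from this thread can be condensed into one script. This is only a sketch: the `run` wrapper, the `APPLY` switch, and the `-i` (immediate, non-interactive) flag on `sacctmgr` are my additions; the QOS name `global_limits` and the `gres/gpu=1` limit come from the example in the ticket, and the `slurm.conf` line still has to be added by hand.

```shell
# Sketch of the cluster-wide, per-user GRES limit recipe. With APPLY unset
# (the default) every command is only printed, so this is safe to run on a
# machine without Slurm; set APPLY=1 to execute for real.
APPLY=${APPLY:-0}

run() {
    if [ "$APPLY" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

run sacctmgr -i add qos global_limits
run sacctmgr -i modify qos global_limits set MaxTRESPerUser=gres/gpu=1
# Optional: allow jobs submitted with this QOS to override partition QOS limits.
run sacctmgr -i modify qos global_limits set flags=overpartqos

# slurm.conf, before the partition definitions (edit by hand):
#   PartitionName=DEFAULT Qos=global_limits

run scontrol reconfigure
```

Because the limit lives on the partition QOS rather than on any account association, it holds regardless of which of a user's accounts a job is submitted under, which is what resolved this ticket.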