Ticket 2669 - Different per-account TRES limits in different QoS
Summary: Different per-account TRES limits in different QoS
Status: RESOLVED DUPLICATE of ticket 2242
Alias: None
Product: Slurm
Classification: Unclassified
Component: Limits
Version: 15.08.6
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Tim Wickberg
Reported: 2016-04-27 05:53 MDT by Stephane Thiell
Modified: 2016-04-28 09:46 MDT

Site: Stanford


Description Stephane Thiell 2016-04-27 05:53:14 MDT
Hi there,

We have set GrpTRES limits per account (as shown by sacctmgr list assoc tree) to prevent a single group from allocating too many resources on the cluster.

I tried to set up a preemptable QoS so that some users can still run preemptable jobs on the whole cluster. By setting a QoS GrpTRES that matches the whole cluster's resources (cpu+gpu), the QoS limit does override the assoc GrpTRES (good). However, when preemptable jobs are running and I submit a job using another QoS that is allowed to preempt, I hit the AssocGrpCpuLimit first.

Indeed, the documentation makes that clear:

"NOTE: The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is being considered for being allocated resources. If starting a job would cause any of its group limit to be exceeded, that job will not be considered for scheduling even if that job might preempt other jobs which would release sufficient group resources for the pending job to be initiated."
(from http://slurm.schedmd.com/sacctmgr.html)
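
To illustrate the behavior described in the NOTE, here is a minimal Python sketch. It is a toy model under stated assumptions, not Slurm's actual code: the function name, the dictionaries, and all the numbers are hypothetical. It only shows that a check against current group usage rejects the job without crediting resources that preemption could release.

```python
# Toy model only: this is NOT Slurm source code.
def would_exceed_group_limit(grp_limit, grp_usage, request):
    """Return True if starting the job would push any TRES over its group limit.

    Only current usage is considered. Resources that preemption could
    free are deliberately not subtracted, as the NOTE describes.
    """
    return any(
        grp_usage.get(tres, 0) + amount > grp_limit[tres]
        for tres, amount in request.items()
        if tres in grp_limit
    )


# Hypothetical numbers: the account's association allows 64 CPUs.
assoc_limit = {"cpu": 64}
# The account's preemptable jobs currently hold 200 CPUs.
assoc_usage = {"cpu": 200}
# A job in a preempting QoS asks for 16 CPUs.
job_request = {"cpu": 16}

blocked = would_exceed_group_limit(assoc_limit, assoc_usage, job_request)
print(blocked)  # True: 200 + 16 > 64, so the job is not considered
```

In this model the pending job is blocked even though preempting the running jobs would bring the account back under its limit.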

Still, is there a way to have different per-account TRES limits in different QoSes? That would be very useful in our case.

Thanks!
Stephane Thiell
Stanford Research Computing
Comment 1 Tim Wickberg 2016-04-28 09:41:01 MDT
> Still, is there a way to have different per-account TRES limits in different QoSes? That would be very useful in our case.

There will be in 16.05. We've (courtesy of FHCRC's sponsorship) added a new QOS limit, MaxTRESPerAccount, which is designed for exactly this use case.
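
As a rough illustration of what a per-QOS, per-account cap means, here is a small Python sketch. It is a toy model, not Slurm's implementation: the class, the QOS names ("normal", "preemptable"), the account name ("lab_a"), and the numbers are all hypothetical, and it assumes usage is counted separately for each (QOS, account) pair. Check the 16.05 sacctmgr documentation for the actual semantics.

```python
from collections import defaultdict


# Toy model only: this is NOT Slurm source code.
class QosAccountLimits:
    def __init__(self, max_tres_per_account):
        # QOS name -> {tres: cap applied to each account within that QOS}
        self.max_tres_per_account = max_tres_per_account
        # (qos, account) -> {tres: amount currently in use}
        self.usage = defaultdict(lambda: defaultdict(int))

    def try_start(self, qos, account, request):
        """Start the job if it fits under this QOS's per-account cap."""
        caps = self.max_tres_per_account.get(qos, {})
        used = self.usage[(qos, account)]
        for tres, amount in request.items():
            if tres in caps and used[tres] + amount > caps[tres]:
                return False  # would exceed the cap for this account in this QOS
        for tres, amount in request.items():
            used[tres] += amount
        return True


# Hypothetical caps: small in "normal", cluster-sized in "preemptable".
limits = QosAccountLimits({
    "normal": {"cpu": 64},
    "preemptable": {"cpu": 1024},
})

a = limits.try_start("preemptable", "lab_a", {"cpu": 512})
b = limits.try_start("normal", "lab_a", {"cpu": 48})
c = limits.try_start("normal", "lab_a", {"cpu": 32})
print(a, b, c)  # True True False
```

Under these assumptions, the account's usage in the preemptable QOS does not count against its cap in the other QOS, which is the separation asked about above.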

- Tim

*** This ticket has been marked as a duplicate of ticket 2242 ***
Comment 2 Stephane Thiell 2016-04-28 09:46:40 MDT
Hi Tim,

Awesome! Exactly what we need indeed!

Thanks much!

Stephane