| Summary: | Set group GPU limits for an account | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Damien <damien.leong> |
| Component: | Limits | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | | |
| Version: | 17.11.5 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | Monash University | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Damien 2018-05-09 02:48:35 MDT
(In reply to Damien from comment #0)
> For a particular group, I am trying to limit them more, but I am trying to
> do this on their account directly, not via another QOS, like:
>
> sacctmgr update Account Boo44 set GrpTRES=gres/gpu=8
>
> Is this possible? If not, what is the correct syntax, or a proper method to
> do this?

Yes, that's the correct way to do it.

Thanks. Yes, this is working.

That's good to hear. I'm closing this ticket as resolved/infogiven.

Hi Marshall,

Actually, this is not working as we expected:

--
sacctmgr update Account Boo44 set GrpTRES=gres/gpu=8
--

This command and feature works; it really is blocking users in Boo44 from getting more than 8 GPUs. BUT the same set of users are asking for additional CPU-only jobs, and those are being blocked because the group has already used 8 GPUs. Which, by the Slurm documentation, is correct:

- GrpTRES= The total count of TRES able to be used at any given time from jobs running from an association and its children or QOS. If this limit is reached new jobs will be queued but only allowed to run after resources have been relinquished from this group. -

So let me rephrase the question: is there a way to limit the number of GPUs a specific group can use (8 GPUs) without blocking them when they submit additional jobs that do not need GPUs (for example, CPU-only jobs)? Please let me know if you need clarification on this. Thanks.

Cheers,
Damien

(In reply to Damien from comment #4)
> --
> sacctmgr update Account Boo44 set GrpTRES=gres/gpu=8
> --
> This command and feature works; it really is blocking users in Boo44 from
> getting more than 8 GPUs. BUT the same set of users are asking for
> additional CPU-only jobs, and those are being blocked because the group has
> already used 8 GPUs.
>
> Which, by the Slurm documentation, is correct:
> GrpTRES= The total count of TRES able to be used at any given time from jobs
> running from an association and its children or QOS.
> If this limit is reached new jobs will be queued but only allowed to run
> after resources have been relinquished from this group.

This is only for the TRES you set a limit on - it doesn't limit any other TRES.

I have defined a limit of 4 gres/gpu on my account test:

$ sacctmgr show assoc where account=test format=account,user,grptres
   Account       User       GrpTRES
---------- ---------- -------------
      test             gres/gpu=4
      test   marshall

I request 4 GPUs in a job and hit the limit, so the second job pends:

$ srun --gres=gpu:4 sleep 789&
$ srun --gres=gpu:4 sleep 789&
srun: job 538119 queued and waiting for resources
marshall@voyager:~/slurm/17.11/byu$ squeue
 JOBID PARTITION  NAME     USER ST  TIME NODES NODELIST(REASON)
538119     debug sleep marshall PD  0:00     1 (AssocGrpGRES)
538118     debug sleep marshall  R  0:04     1 v1

But I can still run non-GPU jobs just fine:

$ srun sleep 78&
$ squeue
 JOBID PARTITION  NAME     USER ST  TIME NODES NODELIST(REASON)
538119     debug sleep marshall PD  0:00     1 (AssocGrpGRES)
538121     debug sleep marshall  R  0:01     1 v1
538118     debug sleep marshall  R  1:37     1 v1

So something else is going on here. Can you upload an example job submission that pends, as well as the output of squeue and scontrol show job <jobid> for the job that is pending?

Hi,

Thanks for this. In addition, we have 3 different types of GPU within our cluster. Using this mechanism, can we lock things down even further?

For example, we have K10, K20, K80:

--
sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K10=2,gres/gpu:K20=2,gres/gpu:K80=2
--

And if this is possible and logical: the above uses an 'and' operator. How can we use 'or' instead? Kindly advise. Thanks.

Cheers,
Damien

(In reply to Damien from comment #6)
> In addition, we have 3 different types of GPU within our cluster. Using
> this mechanism, can we lock things down even further?
> For example, we have K10, K20, K80:
>
> sacctmgr update Account Boo44 set
> GrpTRES=gres/gpu:K10=2,gres/gpu:K20=2,gres/gpu:K80=2

This is kind of possible, but with the caveat that if a user requests a generic gres/gpu (for example, srun --gres=gpu:4 <job>), they can exceed the limit on the specific types of gres. See bug 4767 and commit c2c06468, which is now live on our website:

https://slurm.schedmd.com/resource_limits.html

That commit specifically talks about QOS limits, but it also applies to association limits. I will update the documentation to clarify this. I recommend using the suggested approach on the resource_limits page - that is, use a job submit plugin to force the user to always request specific GPU types.

> And if this is possible and logical: the above uses an 'and' operator. How
> can we use 'or' instead?

There isn't a way to enforce one limit or the other - it will enforce all limits.

Damien,

We've updated the documentation to clarify that this isn't just a limitation on QOS limits. See commit 1e1cd45ee86c45c4c. Is there anything else we can help you with for this ticket?

- Marshall

Hi,

We are still keen to explore TRES limits on a selected account:

--
sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K10=2
sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K20=2
sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K30=10
--

We hope to implement these concurrently. Does this make sense, and is it logical?

Cheers,
Damien

Hi,

My testing of this does not work:

---
sacctmgr update Account boo6 set GrpTRES=gres/gpu:K80=3
 Unknown option: GrpTRES=gres/gpu:K80=3
 Use keyword 'where' to modify condition
---

If I choose just the generic GPU limit, it works:

---
sacctmgr update Account boo6 set GrpTRES=gres/gpu=7
 Modified account associations...
  C = m3         A = boo6 of p001
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
---

So the command does not allow me to restrict by GPU type? Is this correct? Kindly advise. Thanks.
Cheers,
Damien

(In reply to Damien from comment #14)
> My testing of this does not work:
> ---
> sacctmgr update Account boo6 set GrpTRES=gres/gpu:K80=3
> Unknown option: GrpTRES=gres/gpu:K80=3
> Use keyword 'where' to modify condition

What's the output of the following?

sacctmgr show tres

I suspect it doesn't include the specific types of TRES (gpu:K80) but does include the generic gres/gpu. You have to put the specific types of gres in AccountingStorageTRES in slurm.conf to make that work:

AccountingStorageTRES=gres/gpu,gres/gpu:K10,gres/gpu:K20,...

Then they should show up in sacctmgr show tres and you should be able to set the limits.

(In reply to Damien from comment #13)
> We are still keen to explore TRES limits on a selected account:
>
> sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K10=2
> sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K20=2
> sacctmgr update Account Boo44 set GrpTRES=gres/gpu:K30=10
>
> We hope to implement these concurrently.

This is fine. Just make sure you understand the limitation and workaround mentioned at the bottom of the resource limits page (and in comment 7):

https://slurm.schedmd.com/resource_limits.html

To sum it up again: the limitation is that jobs requesting generic GPUs will be able to exceed the limit imposed on specific GPUs. The recommended workaround is to use a job submit plugin that enforces the policy that all jobs must specify the type of GPU.

Hi Marshall,
These are our TRES:

sacctmgr: show tres
    Type            Name     ID
-------- --------------- ------
     cpu                      1
     mem                      2
  energy                      3
    node                      4
 billing                      5
    gres             gpu   1001

scontrol show config | grep AccountingStorageTRES
AccountingStorageTRES   = cpu,mem,energy,node,billing,gres/gpu

We need to make another config change.
Cheers
Damien
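The change discussed here can be sketched as a slurm.conf fragment. This is an illustration, not a tested config: the K10/K20/K80 type names are taken from earlier in this thread, and should be replaced with the GPU types actually defined in the cluster's gres configuration.

```
# slurm.conf - track per-type GPU TRES in accounting so that per-type
# GrpTRES limits can be created with sacctmgr.
# Type names (K10, K20, K80) are assumptions from this thread; adjust
# to match the types defined in gres.conf on this cluster.
AccountingStorageTRES=cpu,mem,energy,node,billing,gres/gpu,gres/gpu:K10,gres/gpu:K20,gres/gpu:K80
```

After editing slurm.conf, `scontrol reconfigure` propagates the change, and the new TRES should then appear in `sacctmgr show tres`.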
Yes, that's what I thought. Go ahead and add the additional types of GPUs to AccountingStorageTRES. You can just use scontrol reconfigure to propagate that change to the cluster - you don't have to restart the controller. Then you should see the new TRES with sacctmgr show tres and should be able to create the limits. Let us know if you have any additional questions, or if you're able to do it successfully.

Hi Marshall,

We wanted to switch to fairtree priority as mentioned in the previous notes, but I don't see it under '/opt/slurm-17.11.4/lib/slurm'. Does it need a separate .so file, or is this built in? Please advise. Thanks.

Cheers,
Damien

(In reply to Damien from comment #18)
> We wanted to switch to fairtree priority as mentioned in the previous notes,
> but I don't see it under '/opt/slurm-17.11.4/lib/slurm'. Does it need a
> separate .so file, or is this built in?

I'm guessing the "previous notes" you mention are in a different ticket, perhaps bug 5176? Can you bring this up in a separate ticket (perhaps 5176)? I'd like to keep this bug focused on GPU limits. If you have no further questions about GPU limits, I'd like to close this ticket.

Hi Marshall,

Sorry, I am confused by the number of questions we have asked.

Cheers,
Damien

No worries. Closing as resolved/infogiven.
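The workaround recommended above - a job submit plugin forcing users to request specific GPU types - could be sketched as a `job_submit/lua` script along the following lines. This is a hedged sketch only, not a tested plugin: the exact `job_desc` field names available vary by Slurm version, so check the `job_submit/lua` example shipped with the site's 17.11 source before deploying anything like this.

```lua
-- job_submit.lua (sketch): reject jobs that request generic "gpu"
-- without a type, so per-type GrpTRES limits cannot be bypassed.
-- The job_desc.gres field name is an assumption for Slurm 17.11;
-- verify against the distributed job_submit/lua example.

function slurm_job_submit(job_desc, part_list, submit_uid)
    local gres = job_desc.gres
    if gres ~= nil then
        -- "gpu" or "gpu:4" (untyped) should be rejected;
        -- "gpu:K80" or "gpu:K80:4" (typed) is allowed.
        for spec in gres:gmatch("[^,]+") do
            local name, rest = spec:match("^([^:]+):?(.*)$")
            if name == "gpu" and (rest == "" or rest:match("^%d+$")) then
                slurm.log_user("Please request a specific GPU type, " ..
                               "e.g. --gres=gpu:K80:1")
                return slurm.ERROR
            end
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

Enabling it would mean setting `JobSubmitPlugins=lua` in slurm.conf and placing the script next to slurm.conf as job_submit.lua.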