| Summary: | How to restrict to min of 1 gpu per node | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | UCF ARCC <stokes.arcc> |
| Component: | Limits | Assignee: | Jason Booth <jbooth> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 18.08.3 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | UCF | Slinky Site: | --- |
Description
UCF ARCC
2018-12-12 11:25:27 MST
Comment 1, Jason Booth (emailed Wed, Dec 12, 2018 17:01)

Hi Paul,

From https://slurm.schedmd.com/tres.html:

AccountingStorageTRES
Used to define which TRES are to be tracked on the system. By default Billing, CPU, Energy, Memory and Node are tracked. This will be the case whether specified or not.

So by adding gres/gpu to the slurm.conf you can then add these into the database.

    AccountingStorageTRES=gres/gpu,cpu,node

    jason@nh-blue:~/slurm/master$ sacctmgr modify qos big set GrpTRESMins=gres/gpu=120000
    Modified qos...
      big
    Would you like to commit changes? (You have 30 seconds to decide)
    (N/y): y

Let me know if this answers your question.

-Jason

Reply from Paul Wiegand (by email)

I believe this is how I tried to initially set things up last Summer (when we were on a 17.x version), but the problem was that people could still request resources that *didn't* have a GPU and basically have indefinite use of that node as long as they didn't want the GPU. I want it so that a user *cannot* get any resource without also requesting a GPU.

But I will try this again later tonight when I get a chance to and let you know. Maybe I misremember my tests last Summer.

Paul

Comment 3, Jason Booth (emailed Wed, Dec 12, 2018 17:32)

Hi Paul,

I apologize for not responding to your other question. I read into the last few lines as that being the primary issue.

> I tried doing this:
>     sacctmgr add qos somePIuser priority=100 GrpTRESMins=gres/gpu=120000 MinTRESPerJob=gres/gpu:1
> But that gave me an error. I tried setting MinTRES type things on the partition, but that did not work as I had expected. What is the correct way to do this?

There is a second half to this setup. The first is to set your GrpTRESMins. The second part would be to set up the job_submit plugin to add a gres to the job. You can do this in lua or in c. There is an example of how to append the gres to a job in c under: src/plugins/job_submit/cray/job_submit_cray.c. In that code, they are adding craynetwork:1 to job_desc->tres_per_node but you can make use of this with your own gres. You should also be able to do this in the lua version as well.

Does this help?

-Jason

Reply from Paul Wiegand (by email)

Understood. I think that addresses my question. Thanks!

Paul

Resolving this issue
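
As a rough illustration of the second half Jason describes (having a job_submit plugin append a gres to each job), here is a minimal C sketch. It deliberately does not use Slurm headers: `job_desc_stub`, `has_gres`, and `ensure_gres` are hypothetical names, the struct models only the `tres_per_node` field named in the thread, and the plain `name:count` string format follows the `craynetwork:1` example quoted above, which may differ between Slurm versions. A real plugin would work on Slurm's own job descriptor and string helpers, as in src/plugins/job_submit/cray/job_submit_cray.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Slurm's job descriptor. Only the field
 * mentioned in the thread is modeled; this is NOT the real struct. */
struct job_desc_stub {
    char *tres_per_node; /* heap-allocated, comma-separated list, or NULL */
};

/* Return 1 if the comma-separated `list` has an entry whose name is
 * exactly `name` (e.g. "gpu", "gpu:2", "gpu:tesla:1"), else 0. */
int has_gres(const char *list, const char *name)
{
    size_t n = strlen(name);
    const char *p = list;

    while (p && *p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* Match "name" alone or "name:..." but not "nameXYZ". */
        if (len >= n && strncmp(p, name, n) == 0 &&
            (len == n || p[n] == ':'))
            return 1;
        p = end ? end + 1 : NULL;
    }
    return 0;
}

/* Append "<name>:1" to the job's per-node request unless an entry for
 * `name` is already present. Returns 0 on success, -1 if allocation
 * fails (the job is left unchanged in that case). */
int ensure_gres(struct job_desc_stub *job, const char *name)
{
    assert(job != NULL && name != NULL);

    if (has_gres(job->tres_per_node, name))
        return 0; /* user already asked for this gres */

    size_t old = job->tres_per_node ? strlen(job->tres_per_node) : 0;
    /* old text + optional "," + name + ":1" + terminating NUL */
    size_t need = old + (old ? 1 : 0) + strlen(name) + 2 + 1;
    char *buf = malloc(need);
    if (!buf)
        return -1;

    snprintf(buf, need, "%s%s%s:1",
             old ? job->tres_per_node : "",
             old ? "," : "",
             name);
    free(job->tres_per_node);
    job->tres_per_node = buf;
    return 0;
}
```

Calling `ensure_gres(&job, "gpu")` on a job with no gres request leaves `tres_per_node` set to `gpu:1`, and calling it again changes nothing. With the GPU always part of the request, the `GrpTRESMins=gres/gpu=...` limit on the QOS then applies to every job, which is the behavior Paul was after. A site that prefers to refuse such jobs rather than modify them could return an error from the plugin instead of appending.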