| Summary: | add new MaxTRESPerAccount limit | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Tim Wickberg <tim> |
| Component: | Accounting | Assignee: | Danny Auble <da> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | hermes, mrg, pedmon, sthiell |
| Version: | 16.05.x | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | http://bugs.schedmd.com/show_bug.cgi?id=1556 | ||
| Site: | FHCRC - Fred Hutchinson Cancer Research Center | Alineos Sites: | --- |
| Version Fixed: | 16.05.0-pre1 | Target Release: | 16.05 |
| DevPrio: | 1 - Paid | Emory-Cloud Sites: | --- |
| Attachments: |
15.08 patch for max tres functionality per account
fix issues in comment 8
fix issues in comment 8
15.08 patch for max tres functionality per account |
||
|
Description
Tim Wickberg
2015-12-10 04:37:36 MST
I'm hoping to have this done before Valentine's Day. The commit will be in 16.05, but I will give you a patch for 15.08. Please let me know otherwise.
This sounds good- I'll be able to use the patch and then figure on an upgrade to 16.05 in the May/June timeframe.
Created attachment 2630 [details]
15.08 patch for max tres functionality per account
Michael, attached you will find a patch that will convert your 15.08 install (based off 15.08.7) to use the new MaxTres functionality for accounts added to QOS. The 3 options added are these...
MaxTRESPerAccount
MaxJobsPerAccount
MaxSubmitJobsPerAccount
Please let me know if you have any questions or issues. I'll check this into the master branch after you have verified it working.
Note that if you want to go back to vanilla 15.08 after this, you may hit minor hiccups: for example, the slurmctld's read of the association cache at startup will fail. That isn't a big deal, as it will just get the correct 15.08 information from the reverted slurmdbd.
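For anyone following along, here is a minimal Python sketch of what the three new per-account QOS limits are meant to express. It is a simplified model for illustration only, not slurmctld's actual logic: the `QosLimits` class, the `can_submit` and `can_start` helpers, and the job dictionaries are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class QosLimits:
    # Hypothetical container mirroring the three new per-account QOS options.
    max_tres_per_account: Dict[str, int] = field(default_factory=dict)  # e.g. {"cpu": 2}
    max_jobs_per_account: Optional[int] = None
    max_submit_jobs_per_account: Optional[int] = None


def can_submit(limits: QosLimits, account: str, jobs: List[dict]) -> bool:
    """True if the account may queue one more job (pending or running)."""
    if limits.max_submit_jobs_per_account is None:
        return True
    queued = sum(1 for j in jobs if j["account"] == account)
    return queued < limits.max_submit_jobs_per_account


def can_start(limits: QosLimits, account: str, request: Dict[str, int],
              jobs: List[dict]) -> bool:
    """True if starting `request` keeps the account within its running caps."""
    running = [j for j in jobs if j["account"] == account and j["state"] == "R"]
    # Cap on the number of running jobs per account.
    if (limits.max_jobs_per_account is not None
            and len(running) >= limits.max_jobs_per_account):
        return False
    # Cap on each trackable resource (TRES) summed over the account's running jobs.
    for tres, cap in limits.max_tres_per_account.items():
        in_use = sum(j["tres"].get(tres, 0) for j in running)
        if in_use + request.get(tres, 0) > cap:
            return False
    return True
```

Under this model, an account with `cpu=2` and two running 1-CPU jobs cannot start a third, while a different account under the same QOS is unaffected.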
Great! I'll drop it on the test cluster in the morning. Should have a report for you early next week. PS- someone on the list was asking about this feature. Cool if I let him know to expect it in 16.05?
Thanks. I will put it in our test build here as well. -Paul Edmon-
Paul's the guy from the list :). But you can tell him if you would like on the list ;).
Yup, if you want too as some other people might be curious about this feature. -Paul Edmon-
Hi
I've successfully built and deployed this patch against 15.08.7 (thought I might do the point upgrade while I'm at it).
The only two things I see currently are:
a) squeue doesn't show the reason correctly
slapshot[~/tutorial]: squeue
JOBID USER ACCOUNT PARTITION QOS NAME ST TIME NODES CPUS MIN_ NODELIST(REASON)
27147336 mrg scicomp campus normal sleeper.sh PD 0:00 1 1 1 (167)
27147337 mrg scicomp campus normal sleeper.sh PD 0:00 1 1 1 (Priority)
27147338 mrg scicomp campus normal sleeper.sh PD 0:00 1 1 1 (Priority)
27147335 mrg scicomp campus normal sleeper.sh R 0:03 1 1 1 gizmof368
27147334 mrg scicomp campus normal sleeper.sh R 0:06 1 1 1 gizmof368
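A bare number such as `(167)` in the reason column suggests the squeue client had no string for the new reason code. As a rough way to spot this on a patched test cluster, the Python sketch below scans squeue-style lines for pending jobs whose reason is purely numeric. The helper name `unresolved_reasons` is hypothetical, and the parsing assumes the 12-column layout shown above with no whitespace inside any field.

```python
import re

# Sample lines copied from the squeue output above.
SQUEUE_LINES = [
    "27147336 mrg scicomp campus normal sleeper.sh PD 0:00 1 1 1 (167)",
    "27147337 mrg scicomp campus normal sleeper.sh PD 0:00 1 1 1 (Priority)",
    "27147335 mrg scicomp campus normal sleeper.sh R 0:03 1 1 1 gizmof368",
]

# A reason that is only digits inside parentheses, e.g. "(167)".
NUMERIC_REASON = re.compile(r"^\((\d+)\)$")


def unresolved_reasons(lines):
    """Return (job_id, code) pairs for pending jobs whose reason is a bare number."""
    found = []
    for line in lines:
        fields = line.split()
        # Expect 12 columns; column 7 (index 6) is the job state.
        if len(fields) < 12 or fields[6] != "PD":
            continue
        match = NUMERIC_REASON.match(fields[-1])
        if match:
            found.append((fields[0], int(match.group(1))))
    return found
```

Calling `unresolved_reasons(SQUEUE_LINES)` on the sample reports only job 27147336, since `(Priority)` is already a readable reason and the running job is skipped.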
b) sacctmgr doesn't show this TRES by default, but will show when specified in "format":
slapshot[~/tutorial]: sacctmgr show qos where name=normal
Name Priority GraceTime Preempt PreemptMode Flags UsageThres UsageFactor GrpTRES GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit GrpWall MaxTRES MaxTRESPerNode MaxTRESMins MaxWall MaxTRESPU MaxJobsPU MaxSubmitPU MinTRES
---------- ---------- ---------- ---------- ----------- ---------------------------------------- ---------- ----------- ------------- ------------- ------------- ------- --------- ----------- ------------- -------------- ------------- ----------- ------------- --------- ----------- -------------
normal 0 00:00:00 cluster 1.000000 cpu=250 cpu=300 5000
slapshot[~/tutorial]: sacctmgr show qos where name=normal format=maxtresperaccount
MaxTRESPA
-------------
cpu=2
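Until the default listing includes the new column, the limit can be requested explicitly through `format=`, as in the second command above. The Python sketch below wraps that query; it assumes sacctmgr's `--noheader` and `--parsable2` options to get pipe-delimited output, and the function names and the sample string `normal|cpu=2` are hypothetical illustrations rather than captured output.

```python
import subprocess

DEFAULT_FIELDS = ("Name", "MaxTRESPerAccount")


def qos_limit_command(qos_name, fields=DEFAULT_FIELDS):
    """Build the sacctmgr argument list for one QOS with explicit format fields."""
    return [
        "sacctmgr", "--noheader", "--parsable2",
        "show", "qos", "where", f"name={qos_name}",
        f"format={','.join(fields)}",
    ]


def parse_parsable2(output, fields=DEFAULT_FIELDS):
    """Turn pipe-delimited sacctmgr output into a list of dicts keyed by field."""
    rows = []
    for line in output.splitlines():
        if line.strip():
            rows.append(dict(zip(fields, line.split("|"))))
    return rows


def show_qos_limits(qos_name, fields=DEFAULT_FIELDS):
    """Run sacctmgr (must be on PATH) and return the parsed rows."""
    result = subprocess.run(
        qos_limit_command(qos_name, fields),
        check=True, capture_output=True, text=True,
    )
    return parse_parsable2(result.stdout, fields)
```

On a cluster with the patch applied, `show_qos_limits("normal")` would be expected to return one row whose `MaxTRESPerAccount` value matches what `format=maxtresperaccount` prints.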
Looks real good so far, though. Thanks
M
Cool, both should be easy to fix, I'll get you a patch tomorrow. Let me know if you find anything else.
Created attachment 2642 [details]
fix issues in comment 8
Hey Michael, attached you will find a patch that can be used on top of the normal patch that fixes the issues you noticed. Please let me know if you find anything else. I'll also update the 15.08 patch after this with a full patch set for future releases so you don't have to patch things multiple times if/when you update to future 15.08 releases.
Created attachment 2644 [details]
fix issues in comment 8
Sorry, the previous patch was for 16.05; this one is for 15.08.
Created attachment 2645 [details]
15.08 patch for max tres functionality per account
This is an updated full patch for 15.08. Attachment 2644 [details] is not needed with this, only with attachment 2630 [details].
Michael, if all is well, I would like to push this into the master branch. Let me know if you have found any other issues before then. Thanks!
Yep- built and installed in test. Looks to be working as intended; the two problems I'd noticed (reason display and sacctmgr output) appear correct.
This has been committed to 16.05, commit be68e87c7493. If you have any problems please open a new bug for the related topic.
*** Ticket 2669 has been marked as a duplicate of this ticket. ***
|