Summary: | Make Infiniband and Lustre Accounting Node Specific | ||
---|---|---|---|
Product: | Slurm | Reporter: | Paul Edmon <pedmon> |
Component: | Accounting | Assignee: | Unassigned Developer <dev-unassigned> |
Status: | OPEN --- | QA Contact: | |
Severity: | 5 - Enhancement | ||
Priority: | --- | CC: | alex, sts |
Version: | 19.05.1 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | Harvard University | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA SIte: | --- | NoveTech Sites: | --- |
Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Tzag Elita Sites: | --- |
Linux Distro: | --- | Machine Name: | |
CLE Version: | Version Fixed: | ||
Target Release: | --- | DevPrio: | --- |
Emory-Cloud Sites: | --- |
Description
Paul Edmon
2019-08-13 09:34:28 MDT
Hi Paul. Thanks for logging this issue with us. There is some enhancement work involved to make these configurable on a per-node basis. Is this something your site is interested in sponsoring development for?

As a workaround you could try setting the following:

JobAcctGatherFrequency=[filesystem|network]=0

which, according to the documentation: "An interval of 0 disables sampling of the specified type. If the task sampling interval is 0, accounting information is collected only at job termination (reducing Slurm interference with the job)."

Right now we have just removed those completely from our config, so they aren't even polling anymore.

As for sponsoring: as it stands, not right now. I know we have a number of pending feature requests, some of which are more broadly applicable to the community and have significant interest. Right now we are just dropping these in here so that they may be done for the sake of general improvement as you or others have time. Given the number of outstanding requests we have (20+ at this point), we may look at sponsoring some work in the future, but I will need to talk it over with my management first. So as it stands, do not expect anything from us other than these suggestions for future improvement, which would greatly aid us and the community.

-Paul Edmon-

On 8/13/19 1:40 PM, bugs@schedmd.com wrote:
> Comment # 2 <https://bugs.schedmd.com/show_bug.cgi?id=7566#c2> on bug 7566 <https://bugs.schedmd.com/show_bug.cgi?id=7566> from Jason Booth <mailto:jbooth@schedmd.com>
>
> Hi Paul. Thanks for logging this issue with us. There is some enhancement work involved to make these configurable on a per-node basis. Is this something your site is interested in sponsoring development for?
>
> As a workaround you could try setting the following:
>
> JobAcctGatherFrequency=[filesystem|network]=0 , which according to the documentation:
>
> "An interval of 0 disables sampling of the specified type. If the task sampling interval is 0, accounting information is collected only at job termination (reducing Slurm interference with the job)."

Thanks, Paul - converting this over to an enhancement.
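
For reference, the workaround suggested in this thread could be written in slurm.conf roughly as follows. This is a sketch, not a tested configuration: the task=30 interval is an assumed value for illustration and does not come from this thread. The setting applies to the whole cluster rather than to individual nodes, which is the limitation this request is about.

```
# slurm.conf (illustrative fragment)
# task=30 is an assumed sampling interval, in seconds, used only as an example.
# filesystem=0 and network=0 disable sampling of those data types.
JobAcctGatherFrequency=task=30,filesystem=0,network=0
```

The same datatype=interval syntax is accepted per job through the --acctg-freq option of sbatch and srun, which may help when only some jobs need the sampling turned off.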