Summary: | Gating GPU Memory | ||
---|---|---|---|
Product: | Slurm | Reporter: | Paul Edmon <pedmon> |
Component: | Limits | Assignee: | Unassigned Developer <dev-unassigned> |
Status: | OPEN --- | QA Contact: | |
Severity: | 5 - Enhancement | ||
Priority: | --- | ||
Version: | 17.11.x | ||
Hardware: | Linux | ||
OS: | Linux | ||
See Also: | https://bugs.schedmd.com/show_bug.cgi?id=13907 | ||
Site: | Harvard University | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA Site: | --- | NoveTech Sites: | --- |
Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Tzag Elita Sites: | --- |
Linux Distro: | --- | Machine Name: | |
CLE Version: | Version Fixed: | ||
Target Release: | --- | DevPrio: | --- |
Emory-Cloud Sites: | --- |
Description
Paul Edmon
2017-06-21 08:12:51 MDT
Looks like an interesting idea, but AFAICT there is no API for us to build such support off of at present.

The Linux cgroup system only lets us block access to the device files - there is no equivalent of the cgroup memory controller tailored for the GPU. At a quick glance, I don't see any obvious equivalent through the nvidia-smi command or their other tools.

If you're aware of something that would enforce this that'd give us a viable approach please update the bug, otherwise this is likely to go unresolved.

- Tim

Yeah, sadly I'm not aware of any method for this either. About the only solution I have would be to gate access to the full GPU card or make GPU jobs use the full node.

-Paul Edmon-

On 06/21/2017 11:01 AM, bugs@schedmd.com wrote:
> Tim Wickberg <mailto:tim@schedmd.com> changed bug 3915 <https://bugs.schedmd.com/show_bug.cgi?id=3915>
>
> What      Removed              Added
> Severity  4 - Minor Issue      5 - Enhancement
> Assignee  support@schedmd.com  dev-unassigned@schedmd.com
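
Since the thread concludes that nothing can enforce a GPU memory limit, the closest user-space approximation is probably to detect overuse after the fact. The following is a minimal sketch of that idea, not something Slurm provides: it parses the per-process memory figures that `nvidia-smi --query-compute-apps=pid,used_memory --format=csv,noheader,nounits` reports (in MiB) and lists the PIDs above a limit. The function names `parse_compute_apps`, `over_limit`, and `poll_once`, and the per-PID limit table, are hypothetical and exist only for illustration; how a site would map PIDs to jobs or act on a violation is left open.

```python
import csv
import io
import subprocess

# nvidia-smi query for per-process GPU memory; values are reported in MiB.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse 'pid, used_memory' CSV rows into {pid: total MiB across GPUs}."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # blank or malformed line
        pid, mib = row[0].strip(), row[1].strip()
        if not (pid.isdigit() and mib.isdigit()):
            continue  # e.g. "[N/A]" when the driver cannot report usage
        usage[int(pid)] = usage.get(int(pid), 0) + int(mib)
    return usage


def over_limit(usage, limits_mib):
    """Return sorted PIDs whose usage exceeds their (hypothetical) limit."""
    return sorted(
        pid
        for pid, mib in usage.items()
        if pid in limits_mib and mib > limits_mib[pid]
    )


def poll_once(limits_mib):
    """Run nvidia-smi once and report PIDs over their limit (detection only)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return over_limit(parse_compute_apps(out), limits_mib)
```

This only observes usage between polls; a process can still exhaust the card's memory before the next check, which is why it does not replace a real controller of the kind discussed above.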