Summary: | unexplained _oom_event_monitor: oom-kill event count: 1 event | ||
---|---|---|---|
Product: | Slurm | Reporter: | Todd Merritt <tmerritt> |
Component: | slurmstepd | Assignee: | Marshall Garey <marshall> |
Status: | RESOLVED DUPLICATE | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | mcmullan |
Version: | 19.05.6 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | U of AZ | Alineos Sites: | --- |
Description
Todd Merritt
2020-07-13 07:18:47 MDT
Hi Todd,

Thanks for the information. It turns out we've already encountered this bug and have bug 9202 open to handle it. It was originally private but it's public now and I've marked bug 9202 comment 0 as public.

About this bug - we think that there weren't actually any OOM events but that there is a bug in the extern slurmstepd. We have reproduced it but aren't reproducing it consistently.

Feel free to post comments or questions on bug 9202. I'm marking this bug as a duplicate of bug 9202.

*** This ticket has been marked as a duplicate of ticket 9202 ***