Ticket 10349

Summary: OOM errors for all jobs
Product: Slurm
Reporter: Ahmed Essam ElMazaty <ahmed.mazaty>
Component: Other
Assignee: Felip Moll <felip.moll>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 20.11.0   
Hardware: Linux   
OS: Linux   
Site: KAUST

Description Ahmed Essam ElMazaty 2020-12-03 12:31:55 MST
Hello team,
We're affected by the same issue as in bug https://bugs.schedmd.com/show_bug.cgi?id=10255, and it has affected all jobs since we upgraded to 20.11.0.
Is there a patch to fix this issue until 20.11.1 is released? Also, is there an estimated release date for 20.11.1?
Best regards,
Ahmed
Comment 1 Jason Booth 2020-12-03 14:46:57 MST
Ahmed - I will have Felip reply to you in more detail on the status of this issue. In regard to your other question, we are tentatively targeting the 20.11.1 release for next week.
Comment 2 Felip Moll 2020-12-04 04:59:12 MST
Hi Ahmed,

This is an exact duplicate of bug 10336. If you don't mind, I am closing this one as a duplicate. You can apply the patch from this commit:

https://github.com/SchedMD/slurm/commit/272c636d507e1dc59d987da478d42f6713d88ae1

or just wait until 20.11.1 is released.
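
For anyone who wants to try the commit before 20.11.1 is out, here is a minimal sketch of one way to fetch and apply it. It is not an official procedure. It relies on GitHub serving a commit as a patch when ".patch" is appended to the commit URL, and on GNU patch being installed. The source directory name "slurm-20.11.0" and the helper names are assumptions made for illustration only.

```python
import subprocess
import urllib.request

# Commit referenced in this ticket.
COMMIT_URL = (
    "https://github.com/SchedMD/slurm/commit/"
    "272c636d507e1dc59d987da478d42f6713d88ae1"
)


def patch_url(commit_url: str) -> str:
    """Return the URL at which GitHub serves the commit as a patch file."""
    return commit_url.rstrip("/") + ".patch"


def patch_command(dry_run: bool = True) -> list:
    """Build a GNU patch command line that reads the patch from stdin.

    -p1 strips the leading a/ and b/ path components of a git-style diff.
    --dry-run (GNU patch) only checks that the patch applies cleanly.
    """
    cmd = ["patch", "-p1"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def apply_commit(commit_url: str, source_dir: str, dry_run: bool = True) -> None:
    """Download the patch and feed it to patch inside source_dir."""
    with urllib.request.urlopen(patch_url(commit_url)) as resp:
        patch_bytes = resp.read()
    # check=True raises CalledProcessError if the patch does not apply.
    subprocess.run(patch_command(dry_run), cwd=source_dir,
                   input=patch_bytes, check=True)


if __name__ == "__main__":
    # Hypothetical path to an unpacked 20.11.0 source tree.
    apply_commit(COMMIT_URL, "slurm-20.11.0", dry_run=True)
```

Run it with dry_run=True first. If that succeeds, run it again with dry_run=False, then rebuild and reinstall Slurm from the patched tree for the change to take effect.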

Please post any further questions in bug 10336.

Thanks!

*** This ticket has been marked as a duplicate of ticket 10336 ***