Ticket 13950

Summary: fix rss in sacct to include tmpfs and match memcg behaviour
Product: Slurm
Reporter: Robin Humble <robin.humble+slurm>
Component: Accounting
Assignee: Tim Wickberg <tim>
Status: OPEN ---
QA Contact:
Severity: C - Contributions
Priority: ---
CC: csamuel, scrosby
Version: 21.08.7
Hardware: Linux
OS: Linux
Site: -Other-
Attachments: make accounting rss include tmpfs usage to match memcg behaviour

Description Robin Humble 2022-04-28 00:49:31 MDT
Created attachment 24712
make accounting rss include tmpfs usage to match memcg behaviour

Hi,

we've been running with a patch for a few years that fixes the rss that sacct reports from memcg.

the issue is that the kernel's memcg counts rss + tmpfs when making its OOM decisions, but the rss in sacct doesn't include tmpfs, so the two numbers don't always line up.

if a job uses a lot of shared memory or writes plain files to /dev/shm, it can be OOM-killed, but the rss in sacct doesn't count that usage, so it's confusing why the job was killed.
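
to make the mismatch concrete, here's a rough Python sketch (not the attached patch, which is C inside Slurm). It assumes a memcg memory.stat-style text with `rss` and `shmem` lines, as found on reasonably recent kernels. The helper names and the sample numbers are made up for illustration.

```python
# Illustrative only: hypothetical helpers, not code from the attached patch.

def parse_memory_stat(text):
    """Parse 'key value' lines (memory.stat style) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def rss_with_tmpfs(stats):
    """Return rss plus shmem (tmpfs) bytes; a missing field counts as 0."""
    return stats.get("rss", 0) + stats.get("shmem", 0)


# Made-up sample: 1 GiB of anonymous memory, 3 GiB of files in /dev/shm.
SAMPLE = """\
cache 3221225472
rss 1073741824
shmem 3221225472
"""

stats = parse_memory_stat(SAMPLE)
print("rss only:   ", stats["rss"])
print("rss + tmpfs:", rss_with_tmpfs(stats))
```

with these sample numbers a job under a 4 GiB memcg limit would be right at the limit, while an rss-only figure would show only 1 GiB.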

here's a patch to fix it.

TBH I kinda thought I'd submitted this patch a few years ago. hopefully this isn't a dup.

cheers,
robin