Ticket 13950

Summary:	fix rss in sacct to include tmpfs and match memcg behaviour
Product:	Slurm	Reporter:	Robin Humble <robin.humble+slurm>
Component:	Accounting	Assignee:	Tim Wickberg <tim>
Status:	OPEN ---	QA Contact:
Severity:	C - Contributions
Priority:	---	CC:	csamuel, scrosby
Version:	21.08.7
Hardware:	Linux
OS:	Linux
Site:	-Other-
Attachments:	make accounting rss include tmpfs usage to match memcg behaviour

Description Robin Humble 2022-04-28 00:49:31 MDT

Created attachment 24712
make accounting rss include tmpfs usage to match memcg behaviour

Hi,

we've been using a patch for a few years to make rss in sacct match what memcg accounts.

the issue is that the kernel's memcg includes rss+tmpfs in its OOM decision making, but rss in sacct doesn't include tmpfs, so they don't always line up.

if jobs use a lot of shared mem or write plain files to /dev/shm, they can be killed by the OOM killer, but the rss reported by sacct leaves out that usage, so it's confusing why the job was killed.
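
a rough illustration of the mismatch (this is not the attached patch, just a sketch): it parses text in the style of a memcg memory.stat file and compares rss alone against rss + shmem. the counter names "rss" and "shmem" are an assumption here, since the exact names depend on the cgroup version and kernel, and the sample numbers are made up.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines in the style of a memcg memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <unsigned integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def charged_bytes(stats, fields=("rss", "shmem")):
    """Sum the given counters; counters that are absent count as zero.

    The default field names are an assumption, not taken from the patch.
    """
    return sum(stats.get(name, 0) for name in fields)


# Made-up sample: 1 GiB of anonymous memory plus 3 GiB written to /dev/shm.
SAMPLE = """\
cache 3221225472
rss 1073741824
shmem 3221225472
"""

stats = parse_memory_stat(SAMPLE)
rss_only = stats["rss"]
rss_plus_tmpfs = charged_bytes(stats)

print(rss_only // 2**20, "MiB (rss only)")
print(rss_plus_tmpfs // 2**20, "MiB (rss + tmpfs)")
```

with a 2 GiB limit, the first number looks comfortably under the limit while the second is well over it, which is the confusing case described above.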

here's a patch to fix it.

TBH I kinda thought I'd submitted this patch a few years ago. hopefully this isn't a dup.

cheers,
robin