Created attachment 24712
make accounting rss include tmpfs usage to match memcg behaviour

Hi,

we've been using a patch to fix rss in sacct from memcg for a few years.

the issue is that the kernel's memcg includes rss+tmpfs in its OOM decision making, but rss in sacct doesn't include tmpfs, so they don't always line up.

if jobs use a lot of shared mem or write plain files to /dev/shm, then jobs can be killed by OOM, but rss in sacct is wrong so it's confusing why the job was killed.

here's a patch to fix it.

TBH I kinda thought I'd submitted this patch a few years ago. hopefully this isn't a dup.

cheers,
robin
(In reply to Robin Humble from comment #0)

Hello Robin,

thank you for your contribution. Unfortunately, we are not going to add behavioral changes to cgroup/v1, as we have frozen its development in favor of cgroup/v2.

I understand your concerns. In cgroup/v2, memory is accounted based on memory.current, which includes shmem, so accounting should now be what you expect.

I hope you understand. I am closing this issue now.

Thanks again
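
For illustration, the accounting difference discussed in this thread can be sketched by summing anonymous memory and shmem from a cgroup's memory.stat contents. This is a minimal sketch, not the attached patch and not Slurm's implementation: the helper names `parse_memory_stat` and `rss_with_shmem` and the sample values are hypothetical, and it assumes the kernel's memory.stat field names (`anon` and `shmem` in cgroup v2, `rss` and `shmem` in cgroup v1).

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def rss_with_shmem(stats):
    """Return anonymous memory plus shmem/tmpfs usage, in bytes.

    Assumption: cgroup v2 reports anonymous memory as "anon",
    while cgroup v1 reports it as "rss".
    """
    anon = stats.get("anon", stats.get("rss", 0))
    return anon + stats.get("shmem", 0)


# Made-up sample in cgroup v2 memory.stat format (values in bytes).
sample = "anon 1048576\nfile 4194304\nshmem 2097152\n"
print(rss_with_shmem(parse_memory_stat(sample)))
```

With the sample above, a job holding 1 MiB of anonymous memory and 2 MiB in /dev/shm is reported as using 3 MiB, which is closer to what the memcg OOM logic charges than anonymous memory alone. Note that memory.current covers more than these two fields (page cache, for example), so this sum is only a lower bound on it.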