Our Slurm doesn't seem to clean up the freezer cgroup. What did we do wrong?

[akmalm@kud13 ~]$ cat /d/sw/slurm/etc/cgroup.conf
CgroupAutomount=yes
CgroupMountpoint=/cgroup
CgroupReleaseAgentDir=/.slurm-release-agent/

[akmalm@kud13 ~]$ ll /.slurm-release-agent/
total 4
-rwxr-xr-x 1 root root 3355 Mar 31 17:34 release_common
lrwxrwxrwx 1 root root   14 Jun 11  2015 release_cpuset -> release_common
lrwxrwxrwx 1 root root   14 Jun 11  2015 release_freezer -> release_common

[akmalm@kud13 ~]$ ll /cgroup/freezer/slurm/uid_1419/
total 0
--w--w--w- 1 root root 0 Mar 31 17:02 cgroup.event_control
-rw-r--r-- 1 root root 0 Mar 31 17:02 cgroup.procs
-rw-r--r-- 1 root root 0 Mar 31 17:02 freezer.state
drwxr-xr-x 2 root root 0 Mar 31 17:06 job_533287
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533288
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533289
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533290
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533291
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533292
drwxr-xr-x 2 root root 0 Mar 31 17:03 job_533293

We are using the example release_common provided. I tried adding "echo something > /tmp/somewhere" at the top of release_common, but it doesn't seem to be executed.
Akmal, could you try as root:

    echo "/path/to/your/release_freezer" > /cgroup/freezer/release_agent

Then submit a job and, when it finishes, verify whether the hierarchy is cleaned up under /cgroup/freezer?
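The manual step above can be sketched as a small shell function that only writes the agent path when it is not already set. This is an illustration, not part of Slurm: ensure_release_agent is a hypothetical helper name, and the paths in the usage comment are the ones from this report and may differ on other systems.

```shell
#!/bin/sh
# Sketch only: ensure_release_agent is a hypothetical helper, not a Slurm tool.
# It writes the release agent path into a cgroup v1 hierarchy's release_agent
# file unless that path is already configured.

ensure_release_agent() {
    hierarchy_root=$1   # root of the mounted hierarchy, e.g. /cgroup/freezer
    agent_path=$2       # e.g. /.slurm-release-agent/release_freezer
    agent_file="$hierarchy_root/release_agent"

    # The release_agent file only exists at the root of a mounted hierarchy.
    if [ ! -e "$agent_file" ]; then
        echo "no release_agent file under $hierarchy_root" >&2
        return 1
    fi

    current=$(cat "$agent_file")
    if [ "$current" = "$agent_path" ]; then
        echo "unchanged"
        return 0
    fi

    printf '%s\n' "$agent_path" > "$agent_file" || return 1
    echo "updated"
}

# Usage (as root on the compute node, paths assumed from this report):
#   ensure_release_agent /cgroup/freezer /.slurm-release-agent/release_freezer
```

Taking the hierarchy root as an argument keeps the same function usable for the cpuset hierarchy, and lets it be tried against a scratch directory before touching a live node.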
Hi Alejandro, When I put the path in /cgroup/freezer/release_agent, the cgroup is cleaned. Akmal
Hi Alejandro, Why is this happening? Is it caused by a configuration error? Akmal
I'm investigating why the path has to be set manually in /cgroup/freezer/release_agent in 14.11. I just tested 15.08 and there is no need to do that there: the hierarchy is cleaned up with just the Slurm configuration.
I think this was fixed in the following commit:

https://github.com/SchedMD/slurm/commit/c2ce30c2c1879da4d3b02622ca82819163c90d30

So if you are not planning to upgrade to 15.08, you can use the workaround I suggested in comment #1. Please let me know if you have any more questions.
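To check whether the workaround is taking effect on a 14.11 node, one option is to list job cgroups that are empty but still present. The sketch below assumes the slurm/uid_*/job_* layout shown in the original report; list_stale_job_cgroups is a hypothetical helper name, and treating "no child cgroups and no PIDs" as stale is an assumption of this example.

```shell
#!/bin/sh
# Sketch only: list_stale_job_cgroups is a hypothetical helper, not a Slurm tool.
# It prints job cgroups that have no child cgroups and no PIDs left, i.e. ones
# the release agent would be expected to have removed already.

list_stale_job_cgroups() {
    hierarchy_root=$1   # root of the mounted hierarchy, e.g. /cgroup/freezer

    for dir in "$hierarchy_root"/slurm/uid_*/job_*; do
        [ -d "$dir" ] || continue

        # Skip job cgroups that still contain child cgroups (e.g. step directories).
        has_child=no
        for sub in "$dir"/*/; do
            if [ -d "$sub" ]; then
                has_child=yes
                break
            fi
        done
        if [ "$has_child" = yes ]; then
            continue
        fi

        # cgroupfs files report a size of 0, so read the contents instead of
        # testing the file size.
        if [ -z "$(cat "$dir/cgroup.procs" 2>/dev/null)" ]; then
            echo "$dir"
        fi
    done
}

# Usage (path assumed from this report):
#   list_stale_job_cgroups /cgroup/freezer
```

If the list is empty shortly after jobs finish, the release agent is being invoked; directories that keep showing up point to the same symptom as in the original report.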
Hi Akmal. Can we close this bug? Do you have any more questions? Thanks.
Hi Alejandro, yes we can close this bug. Thanks