On sbatch jobs, we're consistently seeing this error at the end of job output:

    slurmstepd: task/cgroup: unable to remove step memcg : Device or resource busy

This results in leftover memory control groups with no tasks, like this one:

    # cat /dev/mcgroup/slurm/uid_29597/job_7045/step_4294967294/tasks
    #

There must be some process left in the memory control group when it's being removed.

We're using TaskPlugin=task/affinity,task/cgroup,task/cray.
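To see how widespread the leftovers are on a node, something like the following sketch can enumerate step memcg directories whose tasks file is empty. The helper name and the example root are illustrative (the root follows the /dev/mcgroup path in the cat output above), not part of Slurm.

```shell
# List step memcg directories whose tasks file is empty, i.e. the
# orphaned cgroups described above. find_leftover_memcgs and the
# default root in the example are illustrative, not a Slurm interface.
find_leftover_memcgs() {
    # $1: root of the Slurm memory hierarchy, e.g. /dev/mcgroup/slurm
    find "$1" -path '*step_*' -name tasks 2>/dev/null |
    while read -r t; do
        # An empty tasks file means no process is attached any more.
        [ -s "$t" ] || dirname "$t"
    done
}

# Example: find_leftover_memcgs /dev/mcgroup/slurm
```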
Could you please upload the slurmd log file? Do you use the release agent provided with Slurm?

On 03/31/2014 12:45 PM, bugs@schedmd.com wrote:
> Site: CRAY
> Bug ID: 671 <http://bugs.schedmd.com/show_bug.cgi?id=671>
> Summary: sbatch error: unable to remove step memcg
> Product: SLURM
> Version: 14.11.x
> Hardware: Linux
> OS: Linux
> Status: UNCONFIRMED
> Severity: 4 - Minor Issue
> Priority: ---
> Component: slurmd daemon
> Assignee: david@schedmd.com
> Reporter: dgloe@cray.com
> CC: da@schedmd.com, david@schedmd.com, jette@schedmd.com
Created attachment 723 [details]
slurmd log file showing the problem

We don't use the Slurm release agent. Here's our cgroup.conf:

    ###
    #
    # Slurm cgroup support configuration file
    #
    # See man slurm.conf and man cgroup.conf for further
    # information on cgroup configuration parameters
    #--
    CgroupAutomount=yes
    #CgroupReleaseAgentDir="/etc/slurm/cgroup"
    CgroupMountpoint="/dev"
    ConstrainCores=yes
    ConstrainRAMSpace=yes
    TaskAffinity=no
Thanks, I am investigating the problem.
The messages you see indicate that there are still processes using the cgroup, or jobs belonging to the same user running on the machine. The release agent provided with Slurm should clean those directories after the last process exits. Why are you not using the release agent?

David
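For context, the release agent David refers to is plain cgroup v1 plumbing: when a cgroup with notify_on_release set becomes empty, the kernel execs the program named in the hierarchy root's release_agent file, passing the emptied cgroup's path (relative to the root) as its single argument. A minimal sketch of wiring that up; the helper name and the example paths are illustrative, not a Slurm interface.

```shell
# Enable the kernel's empty-cgroup notification for one hierarchy
# (cgroup v1). enable_release_agent is an illustrative helper.
enable_release_agent() {
    # $1: hierarchy root (e.g. /dev/mcgroup)
    # $2: absolute path of the agent the kernel should exec when a
    #     cgroup in this hierarchy becomes empty
    printf '%s\n' "$2" > "$1/release_agent"
    printf '1\n'       > "$1/notify_on_release"
}

# Example: enable_release_agent /dev/mcgroup /etc/slurm/cgroup/release_memory
```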
(In reply to David Bigagli from comment #4)
> Why are you not use the release agent?

We don't use the Slurm release agent due to some quirkiness on our compute nodes. Slurm is installed in a chroot environment, along with its release agent, so the release agent path (which is based on the actual root, not the chroot) isn't set correctly in the cgroup by Slurm.

We do have our own release agent we use for cpusets; I'll try setting the mcgroup release agent to that as well and see if that helps at all.
One thing we do in our release agent is to lock the entire subsystem, using the flock command, before doing anything; since the code that creates the cgroups uses the flock() syscall as well, this allows for mutual exclusion.
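That locking pattern might look roughly like the sketch below inside a release agent: take an exclusive flock on a per-subsystem lock file so the removal never races with cgroup creation, which holds the same lock. The helper name, lock-file name, and paths are assumptions for illustration, not the actual Cray or Slurm agent.

```shell
# Remove one emptied cgroup under an exclusive lock so the release
# agent is serialized against the code that creates cgroups (which is
# assumed to flock the same file). release_one is illustrative.
release_one() {
    # $1: subsystem root (e.g. /dev/mcgroup)
    # $2: cgroup path relative to that root, as passed by the kernel
    #     to the release agent (with a leading '/')
    lock="$1/release_agent.lock"
    (
        flock -x 9 || exit 1
        # Remove only if the directory is still empty; ignore failure.
        rmdir "$1$2" 2>/dev/null || true
    ) 9> "$lock"
}
```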
I've dug up a potential cause: http://linux-kernel.2935.n7.nabble.com/PATCH-cgroup-fix-rmdir-EBUSY-regression-in-3-11-td710818.html I'm looking to see if we have that patch on our compute nodes.
David, do you have an update on this issue? Thanks, David
After some more investigation, this is definitely a Cray kernel issue. I don't believe Slurm is at fault unless it's doing something strange that results in the memory control group having memory usage remaining when no tasks are left.
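One way to check that theory on an affected node: under the cgroup v1 memory controller, a directory can show an empty tasks file yet still report a nonzero charge in memory.usage_in_bytes (typically residual page cache), and on kernels without the EBUSY fix linked earlier that residual charge may be what blocks the rmdir. A diagnostic sketch; the helper name is illustrative, and the example path is the directory from the original report.

```shell
# Print the residual memcg charge of a cgroup that has no attached
# tasks. memcg_residual is an illustrative helper; cgroup v1 memory
# controller assumed (tasks and memory.usage_in_bytes files).
memcg_residual() {
    # $1: a memcg directory, e.g. .../job_7045/step_4294967294
    [ -s "$1/tasks" ] || cat "$1/memory.usage_in_bytes"
}

# On an affected node one might run:
#   memcg_residual /dev/mcgroup/slurm/uid_29597/job_7045/step_4294967294
# If the printed charge is nonzero, writing to memory.force_empty asks
# the kernel to reclaim it, after which the rmdir should go through:
#   echo 0 > .../memory.force_empty && rmdir .../step_4294967294
```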