During a recent training session we reorganized our configuration files to arrive at a slurm configuration with the following setting in slurm.conf:

TaskPlugin=task/affinity,task/cgroup
TaskPluginParam=task/affinity,task/cgroup

This resulted in:

scontrol: error: Bad TaskPluginParam: task/affinity
scontrol: fatal: Unable to process configuration file

Assuming the configuration settings were in error, I changed this to:

TaskPlugin=task/affinity,task/cgroup
TaskPluginParam=Cores,Cpusets

While this cleared up the configuration error, I next ran into the following error on the compute node when running a simple job:

Sep 9 14:56:28 s0014 slurmstepd[81325]: Munge cryptographic signature plugin loaded
Sep 9 14:56:28 s0014 slurmstepd[81325]: error: slurm_build_cpuset: mkdir(/dev/cpuset/slurm2131): No such file or directory
Sep 9 14:56:28 s0014 slurmstepd[81325]: error: task_p_pre_setuid: slurm_build_cpuset() failed
Sep 9 14:56:28 s0014 slurmstepd[81325]: error: _spawn_job_container: Failed to invoke task plugins: one of task_p_pre_setuid functions returned error
Sep 9 14:56:28 s0014 slurmd[80884]: Launching batch job 2131 for UID 1209
Sep 9 14:56:28 s0014 slurmstepd[81331]: task affinity plugin loaded with CPU mask 0000000000...0ffffffffffff
Sep 9 14:56:28 s0014 slurmstepd[81331]: Munge cryptographic signature plugin loaded
Sep 9 14:56:28 s0014 slurmstepd[81331]: error: slurm_build_cpuset: mkdir(/dev/cpuset/slurm2131): No such file or directory
Sep 9 14:56:28 s0014 slurmstepd[81331]: error: task_p_pre_setuid: slurm_build_cpuset() failed
Sep 9 14:56:28 s0014 slurmstepd[81331]: error: Failed to invoke task plugins: one of task_p_pre_setuid functions returned error
Sep 9 14:56:28 s0014 slurmstepd[81331]: error: job_manager exiting abnormally, rc = 4020
Sep 9 14:56:28 s0014 slurmstepd[81331]: job 2131 completed with slurm_rc = 4020, job_rc = 0
Sep 9 14:56:28 s0014 slurmstepd[81331]: sending REQUEST_COMPLETE_BATCH_SCRIPT, error:4020 status 0
Sep 9 14:56:28 s0014 slurmd[80884]: error: task_p_slurmd_release_resources: rmdir(/dev/cpuset/slurm2131) failed No such file or directory

This, of course, results in the node being drained due to the error. What is strange about this error is, since the compute node is a RHEL 7 system, there is no /dev/cpuset. Removing the "cpusets" setting from the configuration to:

TaskPluginParam=Cores

permits the job to succeed. It's not clear to me how to proceed to resolve this issue. Please advise.
(In reply to Anthony DelSorbo from comment #0)
> During a recent training session we reorganized our configuration files to
> arrive at a slurm configuration with the following setting in slurm.conf:
>
> TaskPlugin=task/affinity,task/cgroup
> TaskPluginParam=task/affinity,task/cgroup

The config I sent over to you had removed TaskPluginParam as those settings are not needed with TaskPlugin=task/cgroup added in now. I'm not sure where you got that setting from?

> It's not clear to me how to proceed to resolve this issue. Please advise.

Please delete the TaskPluginParam config line entirely.
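For reference, the change suggested above would leave the task plugin settings in slurm.conf looking roughly like the fragment below. This is only a sketch built from the lines quoted in this ticket; the rest of the site's slurm.conf, and anything in cgroup.conf, is not shown here and is assumed to be unchanged.

```
# slurm.conf (fragment, as discussed in this ticket)
# Keep both task plugins enabled.
TaskPlugin=task/affinity,task/cgroup
# No TaskPluginParam line: it was deleted entirely, per the advice above.
```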
Updating to resolved/infogiven, please reopen if you have any further questions. - Tim