In chasing down a problem with cancelling a job step on Sequoia, I was distracted by the following errors in the slurmd.log:

not distracting:
[2013-02-25T10:56:19-08:00] debug: Sending signal 9 to step 394271.1

distracting:
[2013-02-25T10:56:19-08:00] debug: _step_connect: connect: No such file or directory
[2013-02-25T10:56:19-08:00] debug: signal for nonexistant 394271.1 stepd_connect failed: No such file or directory

The signal got through even though _signal_jobstep() returns early due to the -1 return status of stepd_connect().

The errors are reported due to the absence of nodename_job.step directories in the SlurmdSpoolDir on BlueGene systems. Don
(In reply to comment #0)
[...]
> The errors are reported due to the absence of nodename_job.step directories
> in the SlurmdSpoolDir on BlueGene systems.

correction: nodename_job.step "sockets"
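
For anyone wanting to reproduce the "No such file or directory" message outside of slurmd, here is a minimal sketch. It is not Slurm code: try_step_connect() is a hypothetical helper, and the only thing taken from this report is the nodename_job.step naming of the socket under the spool directory. It assumes a Unix-like system and simply shows that connect() on a Unix domain socket whose path does not exist fails with ENOENT.

```python
import errno
import os
import socket
import tempfile


def try_step_connect(spool_dir, nodename, jobid, stepid):
    """Hypothetical helper: return 0 on success, else the errno from connect()."""
    # Socket path follows the nodename_job.step naming mentioned in this report.
    path = os.path.join(spool_dir, f"{nodename}_{jobid}.{stepid}")
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return 0
    except OSError as exc:
        # A missing socket file surfaces here as ENOENT.
        return exc.errno
    finally:
        sock.close()


if __name__ == "__main__":
    # An empty temporary directory stands in for a spool dir with no step socket.
    with tempfile.TemporaryDirectory() as spool:
        err = try_step_connect(spool, "node0", 394271, 1)
        print(err == errno.ENOENT, os.strerror(err))
```

A caller that treats any nonzero result as "step does not exist" would log the same kind of misleading message seen above, even if the signal was delivered by some other path.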
This should be fixed in ac7c76ab6ffa0e1d049a42463752c1d6f9cb6d6d.