Hi!

After finding a simple reproducible case and performing some diagnostics, we believe that we've discovered a bug in step creation when OverTimeLimit is in use.

In summary: job step requests within an allocation may be erroneously rejected with the error "Job/step already completing or completed" while the job is still running and is within the limit of "time limit + OverTimeLimit".

This can be reproduced as follows:

1. Set an OverTimeLimit for a partition (or in general) of 2 (minutes) (note that any value greater than 2 [including UNLIMITED] will suffice for this repro).

2. Create a job script such as the following:

---cut---
#!/bin/bash
date
sleep 10
echo "Step 1"
date
srun hostname
sleep 60
echo "Step 2"
date
srun hostname
sleep 60
echo "Step 3"
date
srun hostname
echo "Script complete"
---cut---

3. Submit the job script with a time limit of 2 (minutes):

    sbatch --time=2 test.sh

Expected results:

All steps should run, and the job should run to completion; the first two steps should start within the soft time limit of 2 minutes specified for the job (at ~20s in and ~80s in), while the third step should start within the first additional minute allowed by the OverTimeLimit configuration (at ~140s in).

Actual results:

The first two steps run as expected. The third step fails with an error:

    srun: error: Unable to create step for job <id>: Job/step already completing or completed

Diagnosis:

Lines 2400-2402 of slurmctld/step_mgr.c (in the 'step_create' function) read as follows:

    if (IS_JOB_FINISHED(job_ptr) ||
        ((job_ptr->end_time <= time(NULL)) && !IS_JOB_CONFIGURING(job_ptr)))
            return ESLURM_ALREADY_DONE;

Ref: https://github.com/SchedMD/slurm/blob/master/src/slurmctld/step_mgr.c#L2400-2402

Here job_ptr->end_time is compared against the current time without taking any configured OverTimeLimit into account, so once the job's own time limit has passed, step creation is refused with ESLURM_ALREADY_DONE even though the job is still allowed to run.
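To illustrate the diagnosis, here is a minimal standalone sketch of a check that honours the grace period. It is not Slurm code and is not meant to describe how an actual fix should be written: the function name `step_request_expired`, the `OVER_TIME_UNLIMITED` constant and the parameter names are all hypothetical, and the only assumption taken from above is that OverTimeLimit is expressed in minutes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for an UNLIMITED OverTimeLimit; not Slurm's constant. */
#define OVER_TIME_UNLIMITED UINT32_MAX

/*
 * Return true if a step request arriving at `now` should be rejected because
 * the job is past its end time plus the grace period.
 *
 * end_time            - the job's end time derived from its own time limit
 * now                 - the current time
 * over_time_limit_min - grace period in minutes (illustrative parameter)
 */
bool step_request_expired(time_t end_time, time_t now,
                          uint32_t over_time_limit_min)
{
    /* With an unlimited grace period the time check can never reject. */
    if (over_time_limit_min == OVER_TIME_UNLIMITED)
        return false;

    /* Extend the deadline by the grace period before comparing. */
    time_t hard_end = end_time + (time_t)over_time_limit_min * 60;
    return hard_end <= now;
}
```

With the numbers from the repro (end time at 120s, third step requested at about 140s), a grace period of 0 rejects the request, which matches the behaviour of the current comparison, whereas a grace period of 2 minutes or the unlimited value lets it through.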
As mentioned above, larger values of OverTimeLimit (including UNLIMITED) exhibit the same failure, which unfortunately means that time-limited jobs cannot make use of OverTimeLimit together with srun (including srun invocations by OpenMPI).

Please let me know if you'd like any further details.

Thanks,
Mark.
This has finally been resolved after two and a half years by https://github.com/SchedMD/slurm/commit/4c381542a85678e7e97ddcae1a14bbbd432b776d and https://github.com/SchedMD/slurm/commit/cc424ea9ae869e8566cfd59e43386b8ea04efc3f