Hi Bjørn,
I reproduced it but I had to partially fill one node, because otherwise my task placement is different and the test works.
So basically I needed to force this to happen:
> SLURM_JOB_CPUS_PER_NODE=1,8
I am not sure this is related to the new behavior of --exact.
Still looking into that but looks more like something that still happened before.
Have you tried specifying --mem per each step?
According to the changes described in the RELEASE_NOTES, you're right that you must use --exact in place of --exclusive; otherwise the step will try to use all the resources in the allocation. So this is the right way (possibly also adding --mem):
srun -n4 --exact my-binary A &
srun -n3 --exact my-binary B &
srun -n1 --exact my-binary C &
srun -n1 --exact my-binary D &
I also suggest using "-v" with srun to see exactly what is being requested, and running "scontrol show steps". I am doing my tests inside an salloc, which is more interactive than working within an sbatch.
I am still looking into it; let me know if you see any oddity or if --mem fixes anything.
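For instance, a minimal interactive reproduction of this kind could look like the following (the task counts, memory and the sleep payload are placeholders, not commands taken from this ticket):

---- snip ----
salloc --ntasks=9 --mem-per-cpu=1G --time=10

# inside the allocation, submit the steps with verbose output
srun -v -n4 --exact sleep 30 &
srun -v -n3 --exact sleep 30 &
srun -v -n1 --exact sleep 30 &
srun -v -n1 --exact sleep 30 &

# while the steps are running (or stuck pending), inspect them
scontrol show steps
wait
---- snip ----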
------------
-- By default, a step started with srun will be granted exclusive (or non-
overlapping) access to the resources assigned to that step. No other
parallel step will be allowed to run on the same resources at the same
time. This replaces one facet of the '--exclusive' option's behavior, but
does not imply the '--exact' option described below. To get the previous
default behavior - which allowed parallel steps to share all resources -
use the new srun '--overlap' option.
-- In conjunction to this non-overlapping step allocation behavior being the
new default, there is an additional new option for step management
'--exact', which will allow a step access to only those resources requested
by the step. This is the second half of the '--exclusive' behavior.
Otherwise, by default all non-gres resources on each node in the allocation
will be used by the step, making it so no other parallel step will have
access to those resources unless both steps have specified '--overlap'.
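
As an illustration of the difference (a sketch only; the program names and task counts are placeholders, not part of the release notes), a job script could exercise the three behaviors like this:

---- snip ----
#!/bin/bash
#SBATCH --ntasks=8

# New default: each step is handed all non-gres resources on its nodes,
# so these two steps will not run on the same node at the same time.
srun -n4 prog1 &
srun -n4 prog2 &
wait

# --exact: each step only gets the resources it requested, so both
# steps can run side by side within the allocation.
srun -n4 --exact prog1 &
srun -n4 --exact prog2 &
wait

# --overlap: steps may share resources with other steps that also
# specify --overlap (the pre-20.11 default behavior).
srun -n4 --overlap prog1 &
srun -n4 --overlap prog2 &
wait
---- snip ----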
------------

(In reply to Felip Moll from comment #2)
> I reproduced it but I had to partially fill one node, because otherwise my
> task placement is different and the test works.
> So basically I needed to force this to happen:
>
> > SLURM_JOB_CPUS_PER_NODE=1,8

Yes, it does depend on the task placement for the job.

> I am not sure this is related to the new behavior of --exact.
>
> Still looking into that but looks more like something that still happened
> before.

I'm quite sure we were able to run examples like that (with the --exact) in
version 19.05 and earlier and get all tasks to start at the same time.

> Have you tried specifying --mem per each step?

I hadn't before, but now I've tried, and it did not help. In one case, it
actually delayed the start of steps even more than without --mem. Also, using
--mem without specifying the explicit distribution of tasks over nodes doesn't
seem like a good idea (our jobs are typically submitted with --mem-per-cpu).

Here is an excerpt of a run with "-v". It landed on five nodes:

SLURM_JOB_NODELIST=c5-[37,39,42-44]
SLURM_TASKS_PER_NODE=1(x3),3(x2)

# output from
# echo Submitting parallel steps, with --exact:
# srun -v -n4 --exact my-binary A &
# srun -v -n3 --exact my-binary B &
# srun -v -n1 --exact my-binary C &
# srun -v -n1 --exact my-binary D &

Submitting parallel steps, with --exact:
Done submitting. Waiting...
srun: Warning: can't run 4 processes on 5 nodes, setting nnodes to 4
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: exact : set
srun: jobid : 4565098
srun: job-name : srun_parallel_from_man.sm
srun: mem-per-cpu : 1G
srun: nodes : 4
srun: ntasks : 4
srun: verbose : 1
srun: -------------------- --------------------
srun: end of defined options
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: Warning: can't run 1 processes on 5 nodes, setting nnodes to 1
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: Warning: can't run 3 processes on 5 nodes, setting nnodes to 3
srun: exact : set
srun: defined options
srun: jobid : 4565098
srun: -------------------- --------------------
srun: job-name : srun_parallel_from_man.sm
srun: (null) : c5-[37,39,42-44]
srun: mem-per-cpu : 1G
srun: exact : set
srun: nodes : 1
srun: jobid : 4565098
srun: ntasks : 1
srun: job-name : srun_parallel_from_man.sm
srun: verbose : 1
srun: mem-per-cpu : 1G
srun: -------------------- --------------------
srun: nodes : 3
srun: end of defined options
srun: ntasks : 3
srun: verbose : 1
srun: -------------------- --------------------
srun: end of defined options
srun: launching StepId=4565098.4 on host c5-37, 1 tasks: 0
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: launching StepId=4565098.4 on host c5-39, 1 tasks: 1
srun: launching StepId=4565098.4 on host c5-42, 1 tasks: 2
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: launching StepId=4565098.4 on host c5-43, 1 tasks: 3
srun: route/default: init: route default plugin loaded
srun: Warning: can't run 1 processes on 5 nodes, setting nnodes to 1
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: exact : set
srun: jobid : 4565098
srun: job-name : srun_parallel_from_man.sm
srun: mem-per-cpu : 1G
srun: nodes : 1
srun: ntasks : 1
srun: verbose : 1
srun: -------------------- --------------------
srun: end of defined options
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launching StepId=4565098.5 on host c5-44, 1 tasks: 0
srun: route/default: init: route default plugin loaded
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launching StepId=4565098.6 on host c5-43, 1 tasks: 0
srun: route/default: init: route default plugin loaded
srun: launch/slurm: _task_start: Node c5-42, 1 tasks started
srun: launch/slurm: _task_start: Node c5-43, 1 tasks started
srun: launch/slurm: _task_start: Node c5-37, 1 tasks started
srun: launch/slurm: _task_start: Node c5-39, 1 tasks started
srun: launch/slurm: _task_start: Node c5-44, 1 tasks started
srun: launch/slurm: _task_start: Node c5-43, 1 tasks started
2021-12-09T10:45:20 - Arg: A - Step ID: 4 - Host: c5-42 - CPUs on node: 1
2021-12-09T10:45:20 - Arg: A - Step ID: 4 - Host: c5-37 - CPUs on node: 1
2021-12-09T10:45:20 - Arg: A - Step ID: 4 - Host: c5-39 - CPUs on node: 1
2021-12-09T10:45:20 - Arg: C - Step ID: 5 - Host: c5-44 - CPUs on node: 1
2021-12-09T10:45:20 - Arg: A - Step ID: 4 - Host: c5-43 - CPUs on node: 1
2021-12-09T10:45:20 - Arg: D - Step ID: 6 - Host: c5-43 - CPUs on node: 1
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.4 (status=0x0000).
srun: launch/slurm: _task_finish: c5-42: task 2: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.4 (status=0x0000).
srun: launch/slurm: _task_finish: c5-39: task 1: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.5 (status=0x0000).
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.4 (status=0x0000).
srun: launch/slurm: _task_finish: c5-44: task 0: Completed
srun: launch/slurm: _task_finish: c5-37: task 0: Completed
srun: Job 4565098 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.4 (status=0x0000).
srun: launch/slurm: _task_finish: c5-43: task 3: Completed
srun: Job 4565098 step creation still disabled, retrying (Requested nodes are busy)
srun: Step created for job 4565098
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.6 (status=0x0000).
srun: launch/slurm: _task_finish: c5-43: task 0: Completed
srun: launching StepId=4565098.7 on host c5-37, 1 tasks: 0
srun: launching StepId=4565098.7 on host c5-39, 1 tasks: 1
srun: launching StepId=4565098.7 on host c5-42, 1 tasks: 2
srun: route/default: init: route default plugin loaded
srun: launch/slurm: _task_start: Node c5-42, 1 tasks started
srun: launch/slurm: _task_start: Node c5-37, 1 tasks started
srun: launch/slurm: _task_start: Node c5-39, 1 tasks started
2021-12-09T10:45:51 - Arg: B - Step ID: 7 - Host: c5-37 - CPUs on node: 1
2021-12-09T10:45:51 - Arg: B - Step ID: 7 - Host: c5-42 - CPUs on node: 1
2021-12-09T10:45:51 - Arg: B - Step ID: 7 - Host: c5-39 - CPUs on node: 1
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.7 (status=0x0000).
srun: launch/slurm: _task_finish: c5-42: task 2: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.7 (status=0x0000).
srun: launch/slurm: _task_finish: c5-39: task 1: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.7 (status=0x0000).
srun: launch/slurm: _task_finish: c5-37: task 0: Completed

# output from
# echo Submitting parallel steps, with --exact and --mem:
# srun -v -n4 --exact --mem=1G my-binary A &
# srun -v -n3 --exact --mem=1G my-binary B &
# srun -v -n1 --exact --mem=1G my-binary C &
# srun -v -n1 --exact --mem=1G my-binary D &

Submitting parallel steps, with --exact and --mem:
Done submitting. Waiting...
srun: Warning: can't run 3 processes on 5 nodes, setting nnodes to 3
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: exact : set
srun: jobid : 4565098
srun: job-name : srun_parallel_from_man.sm
srun: mem : 1G
srun: nodes : 3
srun: ntasks : 3
srun: verbose : 1
srun: -------------------- --------------------
srun: end of defined options
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: Warning: can't run 1 processes on 5 nodes, setting nnodes to 1
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: exact : set
srun: jobid : 4565098
srun: job-name : srun_parallel_from_man.sm
srun: mem : 1G
srun: nodes : 1
srun: ntasks : 1
srun: verbose : 1
srun: -------------------- --------------------
srun: Warning: can't run 4 processes on 5 nodes, setting nnodes to 4
srun: end of defined options
srun: defined options
srun: -------------------- --------------------
srun: (null) : c5-[37,39,42-44]
srun: exact : set
srun: jobid : 4565098
srun: Warning: can't run 1 processes on 5 nodes, setting nnodes to 1
srun: job-name : srun_parallel_from_man.sm
srun: defined options
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: mem : 1G
srun: -------------------- --------------------
srun: nodes : 4
srun: (null) : c5-[37,39,42-44]
srun: ntasks : 4
srun: exact : set
srun: verbose : 1
srun: jobid : 4565098
srun: -------------------- --------------------
srun: job-name : srun_parallel_from_man.sm
srun: end of defined options
srun: mem : 1G
srun: nodes : 1
srun: ntasks : 1
srun: verbose : 1
srun: -------------------- --------------------
srun: end of defined options
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: jobid 4565098: nodes(5):`c5-[37,39,42-44]', cpu counts: 1(x3),3(x2)
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launching StepId=4565098.12 on host c5-37, 1 tasks: 0
srun: launching StepId=4565098.12 on host c5-39, 1 tasks: 1
srun: launching StepId=4565098.12 on host c5-42, 1 tasks: 2
srun: route/default: init: route default plugin loaded
srun: launching StepId=4565098.13 on host c5-43, 1 tasks: 0
srun: route/default: init: route default plugin loaded
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launching StepId=4565098.14 on host c5-44, 1 tasks: 0
srun: route/default: init: route default plugin loaded
srun: launch/slurm: _task_start: Node c5-37, 1 tasks started
srun: launch/slurm: _task_start: Node c5-42, 1 tasks started
srun: launch/slurm: _task_start: Node c5-39, 1 tasks started
srun: launch/slurm: _task_start: Node c5-43, 1 tasks started
srun: launch/slurm: _task_start: Node c5-44, 1 tasks started
2021-12-09T10:46:55 - Arg: B - Step ID: 12 - Host: c5-37 - CPUs on node: 1
2021-12-09T10:46:55 - Arg: C - Step ID: 13 - Host: c5-43 - CPUs on node: 1
2021-12-09T10:46:55 - Arg: B - Step ID: 12 - Host: c5-42 - CPUs on node: 1
2021-12-09T10:46:55 - Arg: D - Step ID: 14 - Host: c5-44 - CPUs on node: 1
2021-12-09T10:46:55 - Arg: B - Step ID: 12 - Host: c5-39 - CPUs on node: 1
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.13 (status=0x0000).
srun: launch/slurm: _task_finish: c5-43: task 0: Completed
srun: Job 4565098 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.12 (status=0x0000).
srun: launch/slurm: _task_finish: c5-37: task 0: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.12 (status=0x0000).
srun: launch/slurm: _task_finish: c5-42: task 2: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.12 (status=0x0000).
srun: launch/slurm: _task_finish: c5-39: task 1: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.14 (status=0x0000).
srun: launch/slurm: _task_finish: c5-44: task 0: Completed
srun: Job 4565098 step creation still disabled, retrying (Requested nodes are busy)
srun: Step created for job 4565098
srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
srun: launching StepId=4565098.15 on host c5-37, 1 tasks: 0
srun: launching StepId=4565098.15 on host c5-39, 1 tasks: 1
srun: launching StepId=4565098.15 on host c5-42, 1 tasks: 2
srun: launching StepId=4565098.15 on host c5-43, 1 tasks: 3
srun: route/default: init: route default plugin loaded
srun: launch/slurm: _task_start: Node c5-42, 1 tasks started
srun: launch/slurm: _task_start: Node c5-37, 1 tasks started
srun: launch/slurm: _task_start: Node c5-39, 1 tasks started
srun: launch/slurm: _task_start: Node c5-43, 1 tasks started
2021-12-09T10:47:26 - Arg: A - Step ID: 15 - Host: c5-37 - CPUs on node: 1
2021-12-09T10:47:26 - Arg: A - Step ID: 15 - Host: c5-39 - CPUs on node: 1
2021-12-09T10:47:26 - Arg: A - Step ID: 15 - Host: c5-42 - CPUs on node: 1
2021-12-09T10:47:26 - Arg: A - Step ID: 15 - Host: c5-43 - CPUs on node: 1
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.15 (status=0x0000).
srun: launch/slurm: _task_finish: c5-37: task 0: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.15 (status=0x0000).
srun: launch/slurm: _task_finish: c5-42: task 2: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.15 (status=0x0000).
srun: launch/slurm: _task_finish: c5-39: task 1: Completed
srun: launch/slurm: _task_finish: Received task exit notification for 1 task of StepId=4565098.15 (status=0x0000).
srun: launch/slurm: _task_finish: c5-43: task 3: Completed

Apart from the "srun: mem-per-cpu : 1G" versus "srun: mem : 1G" and the
ordering of the steps, I see no substantial difference. The last step to start
seems to refuse to start until it can start a single task per node.

------------

(In reply to Bjørn-Helge Mevik from comment #4)
> (In reply to Felip Moll from comment #2)
>
> > I am not sure this is related to the new behavior of --exact.
> >
> > Still looking into that but looks more like something that still happened
> > before.
>
> I'm quite sure we were able to run examples like that (with the --exact) in
> version 19.05 and earlier and get all tasks to start at the same time.
Sorry, that should have been "(with the --exclusive)".

------------

(In reply to Bjørn-Helge Mevik from comment #5)
> (In reply to Bjørn-Helge Mevik from comment #4)
> > (In reply to Felip Moll from comment #2)
> >
> > > I am not sure this is related to the new behavior of --exact.
> > >
> > > Still looking into that but looks more like something that still happened
> > > before.
> >
> > I'm quite sure we were able to run examples like that (with the --exact) in
> > version 19.05 and earlier and get all tasks to start at the same time.
>
> Sorry, that should have been "(with the --exclusive)".

Hi Bjørn,

Can you repeat the test and while the issue is appearing and the job
running, do:

'scontrol show jobs'
'scontrol show nodes'
'scontrol show steps'

and upload/paste it here?

Thanks

------------

(In reply to Felip Moll from comment #6)

Hi, Felip,

> Can you repeat the test and while the issue is appearing and the job
> running, do:
>
> 'scontrol show jobs'
> 'scontrol show nodes'
> 'scontrol show steps'
>
> and upload/paste it here?

This is on a production cluster, and I don't feel comfortable with uploading
info about every running job on the cluster to a public place like this. Is
there somewhere I can send the output instead?

------------

Ok Bjørn,

Please defer the tests for now. I will try to work more on my testbed, since I
have more ideas, and will get back to you soon.

------------

Bjørn,

I have finally figured out the issue, and it turns out to be expected behavior.

If you are inside an allocation, srun takes its default values from the
parameters of the allocation. In your case, since the example you showed needs
two nodes, the minimum number of nodes used by default for further sruns will
be 2. Setting --nodes=1-$SLURM_JOB_NUM_NODES makes it work because it lowers
the minimum number of nodes to 1; without it, the minimum would be two.

You can verify this by enabling the STEPS debugflag in slurmctld and looking
for a line similar to this one:

[2021-12-14T20:24:59.862] STEPS: _pick_step_nodes: step pick 2-2 nodes, avail:node2 idle: picked:NONE

This was noticed in bug 11589 too, and a solution was introduced in 21.08: pass
"--distribution=pack" to srun (I have checked that it works). You can also set
SelectTypeParameters=CR_PACK_NODES to make this the default. See the slurm.conf
man page for 21.08, or bug 11589 (commit e942cadb345), for more details.

In 20.11 you can use the workaround you found, --nodes=1-$SLURM_JOB_NUM_NODES.
Another option is to cherry-pick the patch from bug 11589 and apply it to
20.11, in case you can patch your Slurm installation.

In 20.02 this didn't happen because the exclusive flag was flawed; after the
fixes we made in 20.11, this is the new behavior. We are working to document
this situation better, including your case, in bug 11310.

Does it make sense?

------------

Ok, thanks for the info!

We are planning to upgrade to 21.08 in the near future, so in the meantime,
I'll simply document the workaround --nodes=1-$SLURM_JOB_NUM_NODES.

------------

(In reply to Bjørn-Helge Mevik from comment #18)
> Ok, thanks for the info!
>
> We are planning to upgrade to 21.08 in the near future, so in the meantime,
> I'll simply document the workaround --nodes=1-$SLURM_JOB_NUM_NODES.

Ok Bjørn,

Please reopen this bug when you upgrade if you still have issues.

Thanks for your patience.
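For reference, a sketch (not taken from this ticket) of what the 21.08 solution described above could look like, applied to the step submissions used in this report:

---- snip ----
srun -n4 --exact --distribution=pack my-binary A &
srun -n3 --exact --distribution=pack my-binary B &
srun -n1 --exact --distribution=pack my-binary C &
srun -n1 --exact --distribution=pack my-binary D &
wait
---- snip ----

Alternatively, packing can be made the site-wide default by adding CR_PACK_NODES to SelectTypeParameters in slurm.conf; CR_Core_Memory below is only an example base value, keep whatever the site already uses:

SelectTypeParameters=CR_Core_Memory,CR_PACK_NODES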
------------

Created attachment 22402
Main slurm config file.

After upgrading from 19.05.7 to 20.11.8, we've discovered that the way that
we've recommended for running tasks in parallel with srun does not work any
more. In 19.05.7 and earlier, we used "srun --exclusive", based on an example
in the srun man page:

> cat my.script
#!/bin/bash
srun --exclusive -n4 prog1 &
srun --exclusive -n3 prog2 &
srun --exclusive -n1 prog3 &
srun --exclusive -n1 prog4 &
wait

As I understand, the behaviour of srun changed with 20.11.x, and now the
example in the man page says

$ cat my.script
#!/bin/bash
srun -n4 prog1 &
srun -n3 prog2 &
srun -n1 prog3 &
srun -n1 prog4 &
wait

However, in some of our partitions, we hand out cpu and memory, not whole
nodes, and as I understand it, the default for srun is now that each run gets
access to the whole job allocation, which means that only one srun will run at
a time on each node.

We've verified this with the following job script:

---- snip ----
#!/bin/bash
#SBATCH -A nn9999k --time=10 --mem-per-cpu=1G
#SBATCH -o out/%x-%j.out
#SBATCH --ntasks=9

echo Starting.
echo
env | grep SLURM | sort
echo
echo Submitting parallel steps, default:
srun -n4 my-binary A &
srun -n3 my-binary B &
srun -n1 my-binary C &
srun -n1 my-binary D &
echo Done submitting. Waiting...
wait
echo
echo Submitting parallel steps, with --exact:
srun -n4 --exact my-binary A &
srun -n3 --exact my-binary B &
srun -n1 --exact my-binary C &
srun -n1 --exact my-binary D &
echo Done submitting. Waiting...
wait
---- snip ----

"my-binary" is just a small script printing the date, the command line
argument, the $SLURM_STEP_ID, hostname and $SLURM_CPUS_ON_NODE, and then
sleeping a little:

---- snip ----
#!/bin/bash
echo $(date +%FT%T) - Arg: $1 - Step ID: $SLURM_STEP_ID - Host: $(hostname) - CPUs on node: $SLURM_CPUS_ON_NODE
sleep 30
---- snip ----

With this, the (relevant) output is

---- snip ----
SLURM_JOB_CPUS_PER_NODE=1,8
[...]
SLURM_JOB_NODELIST=c5-[1,5]
SLURM_JOB_NUM_NODES=2
[...]
SLURM_TASKS_PER_NODE=1,8
[...]

Submitting parallel steps, default:
Done submitting. Waiting...
srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1
srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1
2021-11-23T09:42:02 - Arg: A - Step ID: 0 - Host: c5-1 - CPUs on node: 1
2021-11-23T09:42:02 - Arg: A - Step ID: 0 - Host: c5-5 - CPUs on node: 8
2021-11-23T09:42:02 - Arg: A - Step ID: 0 - Host: c5-5 - CPUs on node: 8
2021-11-23T09:42:02 - Arg: A - Step ID: 0 - Host: c5-5 - CPUs on node: 8
srun: Job 4448178 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: Job 4448178 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: Job 4448178 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: Step created for job 4448178
srun: Step created for job 4448178
2021-11-23T09:42:32 - Arg: C - Step ID: 2 - Host: c5-5 - CPUs on node: 8
2021-11-23T09:42:32 - Arg: D - Step ID: 1 - Host: c5-1 - CPUs on node: 1
srun: Job 4448178 step creation still disabled, retrying (Requested nodes are busy)
srun: Job 4448178 step creation still disabled, retrying (Requested nodes are busy)
srun: Step created for job 4448178
2021-11-23T09:43:03 - Arg: B - Step ID: 3 - Host: c5-5 - CPUs on node: 8
2021-11-23T09:43:03 - Arg: B - Step ID: 3 - Host: c5-1 - CPUs on node: 1
2021-11-23T09:43:03 - Arg: B - Step ID: 3 - Host: c5-5 - CPUs on node: 8

Submitting parallel steps, with --exact:
Done submitting. Waiting...
srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1
srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1
2021-11-23T09:43:33 - Arg: A - Step ID: 4 - Host: c5-1 - CPUs on node: 1
2021-11-23T09:43:33 - Arg: A - Step ID: 4 - Host: c5-5 - CPUs on node: 3
2021-11-23T09:43:33 - Arg: C - Step ID: 5 - Host: c5-5 - CPUs on node: 1
2021-11-23T09:43:33 - Arg: D - Step ID: 6 - Host: c5-5 - CPUs on node: 1
2021-11-23T09:43:33 - Arg: A - Step ID: 4 - Host: c5-5 - CPUs on node: 3
2021-11-23T09:43:33 - Arg: A - Step ID: 4 - Host: c5-5 - CPUs on node: 3
srun: Job 4448178 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: Job 4448178 step creation still disabled, retrying (Requested nodes are busy)
srun: Job 4448178 step creation still disabled, retrying (Requested nodes are busy)
srun: Step created for job 4448178
2021-11-23T09:44:04 - Arg: B - Step ID: 7 - Host: c5-5 - CPUs on node: 2
2021-11-23T09:44:04 - Arg: B - Step ID: 7 - Host: c5-5 - CPUs on node: 2
2021-11-23T09:44:04 - Arg: B - Step ID: 7 - Host: c5-1 - CPUs on node: 1

Done.
---- snip ----

As can be seen, by default, each task of each step sees all the cpus on the
node it runs on, so only one step can run on each node at the same time.

Adding --exact to the srun command lines fixes that particular problem, but
still the last step (7, argument "B") refuses to use the three available CPUs
on c5-5, and instead waits until it can run on two nodes.

The only reliable way to hand out cpus to parallel sruns that I can find is
using "srun --exact --nodes=1-$SLURM_JOB_NUM_NODES ...", like this:

srun -n4 --exact --nodes=1-$SLURM_JOB_NUM_NODES my-binary A &
srun -n3 --exact --nodes=1-$SLURM_JOB_NUM_NODES my-binary B &
srun -n1 --exact --nodes=1-$SLURM_JOB_NUM_NODES my-binary C &
srun -n1 --exact --nodes=1-$SLURM_JOB_NUM_NODES my-binary D &

From a different but similar run (two nodes, 1 + 8 cpus) to the one above,
this gave:

---- snip ----
Submitting parallel steps, with --exact and --nodes range:
Done submitting. Waiting...
2021-11-25T10:25:36 - Arg: D - Step ID: 8 - Host: c11-7 - CPUs on node: 1
2021-11-25T10:25:36 - Arg: C - Step ID: 9 - Host: c11-60 - CPUs on node: 1
2021-11-25T10:25:36 - Arg: A - Step ID: 10 - Host: c11-7 - CPUs on node: 4
2021-11-25T10:25:36 - Arg: B - Step ID: 11 - Host: c11-7 - CPUs on node: 3
2021-11-25T10:25:36 - Arg: A - Step ID: 10 - Host: c11-7 - CPUs on node: 4
2021-11-25T10:25:36 - Arg: B - Step ID: 11 - Host: c11-7 - CPUs on node: 3
2021-11-25T10:25:36 - Arg: A - Step ID: 10 - Host: c11-7 - CPUs on node: 4
2021-11-25T10:25:36 - Arg: B - Step ID: 11 - Host: c11-7 - CPUs on node: 3
2021-11-25T10:25:36 - Arg: A - Step ID: 10 - Host: c11-7 - CPUs on node: 4
Done.
---- snip ----

(This also gets rid of the warnings about number of nodes.)

Is this how one is supposed to run parallel sruns when handing out memory and
cpus?

Regards,
Bjørn-Helge Mevik