Description
Marshall Garey
2021-04-06 16:45:10 MDT
Bug 11824 has another example of how this can be confusing - exclusive allocation of CPUs to steps is the default behavior, but --exact is not the default. However, if you use --exclusive, then --exact is implied. We should document this.

=================================================================================
Bug 11824 comment 1:
=================================================================================
I believe I can answer your question. The confusion here is that the --exclusive option does more than just grant exclusive allocation of resources. It also implies the --exact flag, which means srun is allocated exactly the number of CPUs it requested.

Looking at your examples:

(1) Without --exclusive:
```
$ ## start a step requesting a subset of the job's resources, without `--exclusive`, in the background:
$
$ srun -l -n 1 -c 2 sleep 1000 &
[1] 32509
$ ## check the allocated resources: it shows 20 CPUs, everything that was allocated to the job:
$
$ sacct -j $SLURM_JOBID --format user,jobid,start,end,ntasks,reqcpus,ncpus,reqmem
     User        JobID               Start                 End   NTasks  ReqCPUS      NCPUS     ReqMem
--------- ------------ ------------------- ------------------- -------- -------- ---------- ----------
   kilian 26302313     2021-06-14T13:21:25             Unknown                20         20     4000Mc
          26302313.in+ 2021-06-14T13:21:25             Unknown        1       20         20     4000Mc
          26302313.ex+ 2021-06-14T13:21:25             Unknown        1       20         20     4000Mc
          26302313.0   2021-06-14T13:23:48 2021-06-14T13:23:49        1       20         20     4000Mc
          26302313.1   2021-06-14T13:23:58             Unknown        1       20         20     4000Mc
```
Here, srun is given all of the CPUs in the allocation because it did not use --exact (or --exclusive, which implies --exact). However, srun is also given exclusive access to these CPUs. If you tried to run srun --overlap in the allocation, those sruns would not start until this step completed. (Well, they would also not run because there's no memory available, but you can either not enforce memory or just use --mem to ensure that there's enough memory for all the sruns that you want.)

(2) With --exclusive:
```
$ ## start a new step with the same resource requirements as before, but with `--exclusive`:
$
$ srun -l -n 1 -c 2 --exclusive sleep 1000 &
[1] 311
$ ## check the allocated resources:
$
$ sacct -j $SLURM_JOBID --format user,jobid,start,end,ntasks,reqcpus,ncpus,reqmem
     User        JobID               Start                 End   NTasks  ReqCPUS      NCPUS     ReqMem
--------- ------------ ------------------- ------------------- -------- -------- ---------- ----------
   kilian 26302313     2021-06-14T13:21:25             Unknown                20         20     4000Mc
          26302313.in+ 2021-06-14T13:21:25             Unknown        1       20         20     4000Mc
          26302313.ex+ 2021-06-14T13:21:25             Unknown        1       20         20     4000Mc
          26302313.0   2021-06-14T13:23:48 2021-06-14T13:23:49        1       20         20     4000Mc
          26302313.1   2021-06-14T13:23:58 2021-06-14T13:25:11        1       20         20     4000Mc
          26302313.2   2021-06-14T13:25:21             Unknown        1        2          2     4000Mc
```
That one shows that it only allocated the requested resources for the step (2 CPUs).

Here, because you used --exclusive, --exact was implied, so srun was only given 2 CPUs.

A couple of thoughts:

(1) This is confusing - we say that exclusive allocation is the default, yet the default does not imply --exact, while explicitly specifying --exclusive does imply --exact, which gives you different behavior. I'm going to research what we actually want. We probably need to update the documentation at least.

(2) As of bug 11275, specifying --cpus-per-task implies --exact. However, because this was a change in behavior, we only pushed it to 21.08. This means that in your first example you would see the behavior you expect - srun would only get 2 CPUs. However, if you did not use --cpus-per-task nor --exclusive, then srun would get all the CPUs in the allocation.
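To make the comparison concrete, here is a rough sketch - not taken from the reporter's session, and with illustrative CPU counts inside a hypothetical 20-CPU allocation - of how the three cases compare on 21.08:

```
$ ## inside a 20-CPU job allocation (e.g. salloc -n 20):
$
$ ## no --exact, --exclusive, or --cpus-per-task: the step is handed every CPU in the job
$ srun -n 1 hostname
$
$ ## --exact (or --exclusive, which implies it): the step gets only the 2 CPUs it asked for
$ srun -n 1 -c 2 --exact hostname
$
$ ## on 21.08 or newer, --cpus-per-task (-c) by itself also implies --exact,
$ ## so this step is likewise limited to 2 CPUs; on 20.11 it would get all 20
$ srun -n 1 -c 2 hostname
$
$ ## NCPUS should then read 20, 2, 2 for the three steps
$ sacct -j $SLURM_JOBID --format jobid,reqcpus,ncpus
```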
Does this answer your question? Would updating the documentation be sufficient?

=================================================================================
Bug 11824 comment 2:
=================================================================================
> Would updating the documentation be sufficient?

Yes, I don't think that the actual behavior needs to be changed, but I strongly believe that a documentation update (well, more like a brand new section, maybe?) is in order. Given the number of recent bug reports in this area since 20.11, it would likely benefit many Slurm sysadmins and end users. Ideally, a general explanation of the options and a list of simple examples would go a very long way, because right now it's hard to guess the behavior you'll get from the option names alone. :)

*** Ticket 11824 has been marked as a duplicate of this ticket. ***

*** Ticket 12850 has been marked as a duplicate of this ticket. ***

Hi all,

We've pushed a small improvement to the srun man page. There is more we still need to do, like adding some examples and maybe a whole new section.

commit 934f3b543b6bc9f3335d1cc6813b8d95cb2c49b4
Author: Marshall Garey <marshall@schedmd.com>
Date:   Wed Nov 24 11:28:30 2021 -0700

    Docs - Clarify default behavior of srun --exclusive

    Bug 11310

I was going to make this note private, but I'll just make it public since it's good information.

The following commit has some good examples, and also shows how --mem-per-cpu is affected. I think they'd be good to incorporate into the documentation:
https://github.com/schedMD/slurm/commit/9c7d36b44f

I've copied the examples here for convenience.

Some expectations on a 16-core, 2-threaded node:

salloc --exclusive --mem-per-cpu=5
We expect 32 CPUs and 160M of memory allocated.

srun shostname
Here we expect all CPUs and memory from the job.

srun -c2 --exact -n1 whereami
Here we expect 2 CPUs and 10M of memory.

srun -c1 --exact -n1 whereami
Here we expect 1 CPU and 5M of memory (though we actually have access to the other thread on the core).

srun -c1 --exact -n1 --threads-per-core=1 whereami
Here we expect 2 CPUs, since we don't want something else starting on the other thread on the core, and 5M of memory.

srun -c2 --exact -n1 --threads-per-core=1 whereami
Here we expect 4 CPUs for the same reason as above, and 10M of memory.

sacct -j $SLURM_JOBID -o jobid,alloctres -p
JobID|AllocTRES|
152422|billing=32,cpu=32,gres/gpu:k80=4,gres/gpu:tesla=4,gres/gpu=8,mem=160M,node=1|
152422.interactive|cpu=32,gres/gpu:k80=4,gres/gpu:tesla=4,gres/gpu=8,mem=160M,node=1|
152422.0|cpu=32,mem=160M,node=1|
152422.1|cpu=2,mem=10M,node=1|
152422.2|cpu=1,mem=5M,node=1|
152422.3|cpu=2,mem=5M,node=1|
152422.4|cpu=4,mem=10M,node=1|
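For whoever reworks these examples into the docs, the expectations above can be sanity-checked on a live system with something like the following sketch. This is only an illustration: it assumes a cgroup or affinity TaskPlugin so that Cpus_allowed_list reflects the step's CPU binding, and any command can stand in for whereami (which appears to be a local test helper).

```
$ salloc --exclusive --mem-per-cpu=5
$
$ ## each step prints the CPU IDs it is actually confined to on the node
$ srun -c2 --exact -n1 grep Cpus_allowed_list /proc/self/status
$ srun -c2 --exact -n1 --threads-per-core=1 grep Cpus_allowed_list /proc/self/status
$
$ ## then compare against what accounting recorded per step
$ sacct -j $SLURM_JOBID -o jobid,alloctres -p
```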