| Summary: | Exclusive allocation of CPUs is not the default for job steps | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Kilian Cavalotti <kilian> |
| Component: | Other | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED DUPLICATE | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 20.11.7 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=11310 | ||
| Site: | Stanford | Slinky Site: | --- |
Description
Kilian Cavalotti
2021-06-14 14:34:26 MDT
Hi Kilian,
I believe I can answer your question. The confusion, I think, is that the --exclusive option does more than grant exclusive access to resources: it also implies the --exact flag, which means srun is allocated exactly the number of CPUs it requested.
Looking at your examples:
(1) Without --exclusive:
```
$ ## start a step requesting a subset of the job's resources, without `--exclusive`, in the background:
$
$ srun -l -n 1 -c 2 sleep 1000 &
[1] 32509
$ ## check the allocated resources: it shows 20 CPUs, everything that was allocated to the job:
$
$ sacct -j $SLURM_JOBID --format user,jobid,start,end,ntasks,reqcpus,ncpus,reqmem
User JobID Start End NTasks ReqCPUS NCPUS ReqMem
--------- ------------ ------------------- ------------------- -------- -------- ---------- ----------
kilian 26302313 2021-06-14T13:21:25 Unknown 20 20 4000Mc
26302313.in+ 2021-06-14T13:21:25 Unknown 1 20 20 4000Mc
26302313.ex+ 2021-06-14T13:21:25 Unknown 1 20 20 4000Mc
26302313.0 2021-06-14T13:23:48 2021-06-14T13:23:49 1 20 20 4000Mc
26302313.1 2021-06-14T13:23:58 Unknown 1 20 20 4000Mc
```
Here, srun is given all of the CPUs in the allocation because it did not use --exact (or --exclusive, which implies --exact). However, srun is also given exclusive access to these CPUs: if you tried to run additional srun --overlap steps in the allocation, they would not start until this step completed. (Well, they would also not run because there's no memory available, but you can either not enforce memory or use --mem to ensure that there's enough memory for all the sruns that you want to run.)
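The --mem suggestion comes down to simple arithmetic. A rough sketch, assuming the job's memory is 20 CPUs at 4000 MB per CPU (the `4000Mc` in the ReqMem column), all on one node, and a hypothetical target of 4 concurrent steps; `per_step_mem` is an illustrative helper, not a Slurm command:

```shell
# Illustrative only: split the job's memory evenly across concurrent steps.
# usage: per_step_mem JOB_CPUS MEM_PER_CPU_MB NSTEPS
per_step_mem() {
    cpus=$1 mem_per_cpu_mb=$2 nsteps=$3
    # total job memory (MB) divided by the number of steps, rounded down
    echo $(( cpus * mem_per_cpu_mb / nsteps ))
}

# 20 CPUs x 4000 MB = 80000 MB total; 4 steps -> 20000 MB each
echo "--mem=$(per_step_mem 20 4000 4)M"
```

Each step would then pass the printed value to srun so that the steps together stay within the job's memory.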
(2) With --exclusive:
```
$ ## start a new step with the same resource requirements as before, but with `--exclusive`:
$
$ srun -l -n 1 -c 2 --exclusive sleep 1000 &
[1] 311
$ ## check the allocated resources:
$
$ sacct -j $SLURM_JOBID --format user,jobid,start,end,ntasks,reqcpus,ncpus,reqmem
User JobID Start End NTasks ReqCPUS NCPUS ReqMem
--------- ------------ ------------------- ------------------- -------- -------- ---------- ----------
kilian 26302313 2021-06-14T13:21:25 Unknown 20 20 4000Mc
26302313.in+ 2021-06-14T13:21:25 Unknown 1 20 20 4000Mc
26302313.ex+ 2021-06-14T13:21:25 Unknown 1 20 20 4000Mc
26302313.0 2021-06-14T13:23:48 2021-06-14T13:23:49 1 20 20 4000Mc
26302313.1 2021-06-14T13:23:58 2021-06-14T13:25:11 1 20 20 4000Mc
26302313.2 2021-06-14T13:25:21 Unknown 1 2 2 4000Mc
$ ## that one shows that it only allocated the requested resources for the step (2 CPUs)
```
Here, because you used --exclusive, which implies --exact, srun was given only 2 CPUs.
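The difference between the two examples can be summarized as a small rule. This is only a model of the behavior described in this comment (it is not Slurm source code, and it ignores any rounding to whole cores); `step_cpus` and its arguments are hypothetical names:

```shell
# Model of the rule described above, not Slurm's implementation.
# usage: step_cpus ALLOC_CPUS NTASKS CPUS_PER_TASK EXACT
#   EXACT=1 when --exact or --exclusive is given, 0 otherwise
step_cpus() {
    alloc=$1 ntasks=$2 cpt=$3 exact=$4
    if [ "$exact" -eq 1 ]; then
        echo $(( ntasks * cpt ))   # only what the step requested
    else
        echo "$alloc"              # everything the job was allocated
    fi
}

step_cpus 20 1 2 0   # step 26302313.1: srun -n 1 -c 2              -> 20
step_cpus 20 1 2 1   # step 26302313.2: srun -n 1 -c 2 --exclusive  -> 2
```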
A couple of thoughts:
(1) This is confusing: we say exclusive allocation is the default, but the default doesn't imply --exact, whereas explicitly specifying --exclusive does imply --exact, which gives you different behavior. I'm going to research this and see what behavior we actually want. At the very least, we probably need to update the documentation.
(2) As of bug 11275, specifying --cpus-per-task implies --exact. However, because this was a change in behavior, we only pushed it to 21.08. This means that in 21.08 your first example would show the behavior you expect: srun would get only 2 CPUs. However, if you used neither --cpus-per-task nor --exclusive, srun would still get all the CPUs in the allocation.
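The version dependence in (2) can be sketched as a small check. This is a simplified model of what this comment describes, not how srun parses its options; `implies_exact` is a hypothetical helper, and it only recognizes the separated `-c N` and `--cpus-per-task[=N]` spellings:

```shell
# Model only: does the step get just the CPUs it requested?
# usage: implies_exact MAJOR MINOR SRUN_OPTIONS...
implies_exact() {
    major=$1 minor=$2
    shift 2
    for opt in "$@"; do
        case $opt in
            --exact|--exclusive)
                echo yes; return ;;
            -c|--cpus-per-task|--cpus-per-task=*)
                # per bug 11275, this implication only applies from 21.08 on
                if [ "$major" -gt 21 ] || { [ "$major" -eq 21 ] && [ "$minor" -ge 8 ]; }; then
                    echo yes; return
                fi ;;
        esac
    done
    echo no
}

implies_exact 20 11 -n 1 -c 2   # no:  the step gets the whole allocation
implies_exact 21 8 -n 1 -c 2    # yes: the step gets 2 CPUs
implies_exact 21 8 -n 1         # no:  neither -c nor --exclusive given
```

To get the same result on both versions, pass --exact (or --exclusive) explicitly rather than relying on the implied behavior.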
Does this answer your question? Would updating the documentation be sufficient?
Hi Marshall,

Thank you very much for the explanation, that definitely clarifies things.

Now, yes, totally agree with (1), this is extremely confusing.

> we say exclusive allocation is the default, but the default doesn't imply --exact, but specifying --exclusive does imply --exact which gives you different behavior.

Yes! And not only that, but the very fact that the exact same option (--exclusive) has completely different meanings for sbatch and srun has already been confusing for years. The added `--exact` switch makes it combinatorially more perplexing. :)

> Would updating the documentation be sufficient?

Yes, I don't think that the actual behavior needs to be changed, but I strongly believe that a documentation update (well, more like a brand new section, maybe?) is in order. Given the number of recent bug reports in this area since 20.11, it would likely benefit many Slurm sysadmins and end-users. Ideally, a general explanation of the options and a list of simple examples would go a very long way. Because right now, it's hard to guess the behavior you'll get from the option names only. :)

Thanks!
-- Kilian

Kilian, I already have bug 11310 open about improving this documentation, and that bug links to yet another bug where there were questions about the number of CPUs that would be allocated to steps. So I'm making this bug a duplicate of bug 11310. Feel free to add yourself to CC on 11310 and feel free to comment on that one as well.

*** This ticket has been marked as a duplicate of ticket 11310 ***

On Thu, Jun 17, 2021 at 9:26 AM <bugs@schedmd.com> wrote:

> Kilian, I already have bug 11310 open about improving this documentation, and
> that bug links to yet another bug where there were questions about the number
> of CPUs that would be allocated to steps. So I'm making this bug a duplicate of
> bug 11310. Feel free to add yourself to CC on 11310 and feel free to comment on
> that one as well.

Sounds perfect, thank you!

Cheers,
-- Kilian