| Summary: | Changes to "srun --overlap" coming in Slurm 22.05, new "--overlap=force" option | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Tim Wickberg <tim> |
| Component: | User Commands | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | kevin.mooney, lyeager, tim |
| Version: | 22.05.x | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=12880 https://bugs.schedmd.com/show_bug.cgi?id=12462 | ||
| Site: | SchedMD | Slinky Site: | --- |
|
Description
Tim Wickberg
2022-03-09 17:41:47 MST
Hi Tim,

Thank you for the forewarning, it's much appreciated. We should be able to support this in Arm Forge (DDT & MAP) before May. Will there be a preview that we could test against?

Kevin

Hi Kevin,

The changes are already upstream on GitHub in the Slurm master branch:
https://github.com/SchedMD/slurm/

These are the relevant commits:
fe9f416ec2 Add --overlap=force option to srun
751b1b4288 Steps may only overlap with steps that also used --overlap
84d602dd7f Pack/unpack cpus_overlap
cfbd78601b Add a way to track overlapped cpus in a job (--overlap)

Please let us know if you have any problems testing this change.

I'd like to elaborate on the motivation behind the change to --overlap.

The behavior change to --overlap: Steps that specify --overlap cannot overlap with steps that do not specify --overlap.

In Slurm 21.08 and 20.11 (--overlap did not exist prior to 20.11), the following two srun steps will run in parallel:

> $ sbatch -N1 -c2 --mem 1G --wrap "srun sleep 300"
> Submitted batch job 72
> $ srun -N1 -c2 --mem 0 --overlap --pty --jobid=72 /bin/bash

However, the following two steps will *not* run in parallel:

> $ sbatch -N1 -c2 --mem 1G --wrap "srun --overlap sleep 300"
> Submitted batch job 72
> $ srun -N1 -c2 --mem 0 --pty --jobid=72 /bin/bash # Not started in parallel

Why does it work this way in 21.08? Steps that don't specify --overlap (and therefore have exclusive access to resources) won't use CPUs that are already being used. They don't know whether those CPUs are being used by steps with --overlap or not.

This is confusing and inconsistent. Therefore, we decided to change --overlap so that steps that specify it may only overlap other steps that also specify --overlap.

This change breaks some users' use of --overlap, and breaks some current debugging tools such as Arm Forge (DDT and MAP) that rely on --overlap to create a "debugging" step. Hence this bug to communicate this change.
A final piece of motivation for adding --overlap=force is that --overlap only causes CPUs to be shared, but not other resources (memory, GRES). So, the 20.11/21.08 behavior of --overlap didn't really work as a "debugging" or "zero-allocation" step.

Kevin,

We made a proposal to the site sponsoring this change to swap the "--overlap" and "--overlap=force" behaviors, like so:

* --overlap=force becomes --overlap. Therefore, using --overlap will get this new overlap behavior of overlapping on all resources (by not being counted against the job's allocation).
* --overlap becomes --overlap=mutual. This would be here to opt into the 20.11/21.08 behavior but with the fixes to ensure that these steps only overlap with other steps that specify --overlap=mutual.

They agreed that this would make more sense to them. We also really don't want to break things. So, we're going to make this change. This means that the Arm Forge tools shouldn't be broken by 22.05 anymore; in fact, they should be better with the new --overlap behavior.

Do you have any questions about this?

Hi Marshall,

That's great to hear that the interface we use won't change.

We currently use --mem-per-cpu=0 with --overlap to launch our debug step. Will this be no longer necessary with --overlap's new behaviour?

Kevin

(In reply to Kevin Mooney from comment #4)
> Hi Marshall,
>
> That's great to hear that the interface we use won't change.
>
> We currently use --mem-per-cpu=0 with --overlap to launch our debug step.
> Will this be no longer necessary with --overlap's new behaviour?

It will still be necessary. The new --overlap behavior just means that whatever resources are allocated to this step can also be allocated to any other step. However, the step is still allocated exactly what it asks for. So by requesting --mem-per-cpu=0 (or --mem=0, which is equivalent), the step requests all of the memory in the job allocation.
So, steps that request --overlap still need to request whatever resources (CPUs, memory, nodes, GRES) they need. The new --overlap behavior means that they won't block other steps from running on those resources (previously it was just CPUs and didn't even work properly).

Quick update: Implementing --overlap=mutual properly turned out to be more difficult and had bad performance, so we're throwing that away. We are still keeping the new behavior for --overlap (where it overlaps all resources, not just CPUs).

Closing this as infogiven. |
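For readers who want to see the debug-step launch discussed in this thread as one command line, here is a minimal dry-run sketch. It only prints the srun command; it does not run it. The helper name `build_debug_step` is hypothetical, the job ID 72 is taken from the examples in the thread, and the flags are the ones quoted there. This is not Arm Forge's actual launch line.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) an srun command that attaches a
# "debugging" step to an existing job. build_debug_step is a hypothetical
# helper name used only for illustration.
build_debug_step() {
    jobid=$1
    shift
    # --overlap : let this step share its resources with other steps
    #             (per the thread: CPUs only in 20.11/21.08, all resources
    #             with the new behavior).
    # --mem=0   : still required; per the thread it requests all of the
    #             memory in the job allocation (same as --mem-per-cpu=0).
    printf 'srun --jobid=%s -N1 --mem=0 --overlap %s\n' "$jobid" "$*"
}

# Example: an interactive shell inside job 72.
build_debug_step 72 --pty /bin/bash
```

Printing the command instead of running it keeps the sketch usable on a machine without Slurm; on a cluster, the printed line can be pasted into a shell once the job ID is adjusted.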