| Summary: | srun --whole input variable (SLURM_WHOLE is not documented) | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Josko Plazonic <plazonic> |
| Component: | User Commands | Assignee: | Marcin Stolarek <cinek> |
| Status: | RESOLVED FIXED | QA Contact: | Ben Roberts <ben> |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | cinek, kilian, sts, uemit.seren |
| Version: | 20.11.1 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Princeton (PICSciE) | Slinky Site: | --- |
| CLE Version: | | Version Fixed: | 20.11.2, 21.08pre1 |
Description
Josko Plazonic
2020-12-12 10:17:42 MST
Josko,
I checked the code and it looks like a documentation-only issue. In my simple test, using the SLURM_WHOLE input variable works as expected:
>(copy-paste from Bug 10383 comment 26)
>[salloc] bash-4.2# srun --ntasks-per-node=1 /bin/bash -c 'if [ $SLURM_NODEID -eq 0 ]; then scontrol show step; fi'
>StepId=58889.0 UserId=0 StartTime=2020-12-14T12:09:34 TimeLimit=UNLIMITED
> State=RUNNING Partition=par1 NodeList=test[01,08]
> Nodes=2 CPUs=2 Tasks=2 Name=bash Network=(null)
> TRES=cpu=2,mem=0,node=2
> ResvPorts=12043-12044
> CPUFreqReq=Default Dist=Cyclic
> SrunHost:Pid=slurmctl:28541
>[salloc] bash-4.2# export SLURM_WHOLE=1
>[salloc] bash-4.2# srun --ntasks-per-node=1 /bin/bash -c 'if [ $SLURM_NODEID -eq 0 ]; then scontrol show step; fi'
>StepId=58889.1 UserId=0 StartTime=2020-12-14T12:09:40 TimeLimit=UNLIMITED
> State=RUNNING Partition=par1 NodeList=test[01,08]
> Nodes=2 CPUs=64 Tasks=2 Name=bash Network=(null)
> TRES=cpu=64,mem=0,node=2
> ResvPorts=12045-12046
> CPUFreqReq=Default Dist=Cyclic
> SrunHost:Pid=slurmctl:28616
I'll prepare a documentation fix and keep you posted on the progress. Let me know, though, if you notice any issue with the input variable's functionality.
cheers,
Marcin
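The comparison in the quoted output can also be checked mechanically. The following is a minimal sketch, not part of the original report: `step_cpus` is a hypothetical helper that extracts the `CPUs=` field from `scontrol show step` text, and the two sample records are abbreviated copies of the output quoted above. In a real allocation the text would instead come from `scontrol show step`, run once without and once with `SLURM_WHOLE=1` exported.

```shell
#!/bin/sh
# step_cpus: hypothetical helper (not a Slurm command). Reads
# `scontrol show step` text on stdin and prints the first CPUs= value.
step_cpus() {
    sed -n 's/.*CPUs=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Abbreviated sample records copied from the output quoted above.
without_whole='StepId=58889.0 UserId=0
 Nodes=2 CPUs=2 Tasks=2 Name=bash Network=(null)'
with_whole='StepId=58889.1 UserId=0
 Nodes=2 CPUs=64 Tasks=2 Name=bash Network=(null)'

cpus_default=$(printf '%s\n' "$without_whole" | step_cpus)
cpus_whole=$(printf '%s\n' "$with_whole" | step_cpus)

echo "step CPUs without SLURM_WHOLE: $cpus_default"
echo "step CPUs with SLURM_WHOLE=1:  $cpus_whole"
```

A larger CPU count in the second step, for the same node list and task count, is what indicates that the step was given whole nodes rather than one CPU per task.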
Great, I verified that it does work. It still might be nice to have it in sbatch as well, but this is good enough for us. Thanks!

Josko,
The documentation fix is now merged[1]. We prefer not to add the option to sbatch, since it is not really related to the allocation or the batch step, and we think that exclusive steps with isolated resources are a more intuitive and better way to go in the future. We may reconsider if we see more customers interested.
Should you have any questions, please reopen.
cheers,
Marcin
[1]https://github.com/SchedMD/slurm/commit/4c36b604451172bb6bea9c5e931273efe80275b3