| Summary: | "SLURM_NTASKS" sometimes isn't set when using sbatch | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Greg Wickham <greg.wickham> |
| Component: | User Commands | Assignee: | Marcin Stolarek <cinek> |
| Status: | RESOLVED DUPLICATE | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | cinek |
| Version: | 20.11.6 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | KAUST | ||
Greg,

It's one of the issues we're working on in Bug 10620, where we try to fully understand the consequences and fix the code or documentation appropriately. If this is just a concern/question for explanation I'll go ahead and mark this as duplicate. Are you impacted somehow by the behavior?

cheers,
Marcin

Hi Marcin,

Please mark this as a duplicate. Apologies for not catching 10620 when filing the bug - it's obviously the same one.

-greg

> Apologies for not catching 10620 when

That's fine, we're here to track tickets too.

cheers,
Marcin

*** This ticket has been marked as a duplicate of ticket 10620 ***
SLURM_NTASKS isn't always being set when using "--ntasks":

    $ cat fail6.sh
    #!/bin/bash
    #SBATCH --time=00:10:00
    #SBTACH --ntasks=2
    echo "BEGIN ${SLURM_JOBID}"
    env | egrep 'N(TASK|PROC)'
    echo "END ${SLURM_JOBID}"
    exit 0
    $ sbatch fail6.sh
    Submitted batch job 15199216
    $ cat slurm-15199216.out
    BEGIN 15199216
    END 15199216
    $

It is being set when other sbatch parameters are used as well (e.g. --tasks-per-node=):

    $ cat ok7.sh
    #!/bin/bash
    #SBATCH --time=00:10:00
    #SBTACH --ntasks=2
    #SBTACH --tasks-per-node=2
    echo "BEGIN ${SLURM_JOBID}"
    env | egrep 'N(TASK|PROC)'
    echo "END ${SLURM_JOBID}"
    exit 0
    $ sbatch ok7.sh
    Submitted batch job 15199311
    $ cat slurm-15199311.out
    BEGIN 15199311
    SLURM_NTASKS=2
    SLURM_NPROCS=2
    END 15199311
    $

and, for example, `--cpus-per-task=`:

    $ cat ok8.sh
    #!/bin/bash
    #SBATCH --time=00:10:00
    #SBTACH --ntasks=2
    #SBTACH --cpus-per-task=2
    echo "BEGIN ${SLURM_JOBID}"
    env | egrep 'N(TASK|PROC)'
    echo "END ${SLURM_JOBID}"
    exit 0
    $ sbatch ok8.sh
    Submitted batch job 15199313
    $ cat slurm-15199313.out
    BEGIN 15199313
    SLURM_NTASKS=2
    SLURM_NPROCS=2
    END 15199313
    $

And finally, the failing example above works fine using srun:

    $ srun --ntasks 2 --time 00:10:00 --pty /bin/bash -i
    srun: job 15199314 queued and waiting for resources
    srun: job 15199314 has been allocated resources
    $ env | egrep 'N(TASK|PROC)'
    SLURM_NTASKS=2
    SLURM_NPROCS=2
    $
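
A batch script that depends on the task count may want to guard against the variable being absent, as in the first job above. The following is only a sketch of such a guard, not a fix for the underlying behavior: the helper name `resolve_ntasks` and the fallback default of 1 are assumptions made for illustration, and `SLURM_NPROCS` is consulted only because it appears alongside `SLURM_NTASKS` in the outputs above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): print a task count even when
# SLURM_NTASKS is missing from the batch environment.
# Usage: resolve_ntasks [default]   (default is an assumed fallback, 1 if omitted)
resolve_ntasks() {
    if [ -n "${SLURM_NTASKS:-}" ]; then
        # Preferred source when sbatch exported it.
        echo "${SLURM_NTASKS}"
    elif [ -n "${SLURM_NPROCS:-}" ]; then
        # Second choice: the variable that accompanies it in the outputs above.
        echo "${SLURM_NPROCS}"
    else
        # Neither variable is set: fall back to the caller's default.
        echo "${1:-1}"
    fi
}

NTASKS="$(resolve_ntasks 1)"
echo "Using ${NTASKS} task(s)"
```

Using `${VAR:-}` keeps the checks safe under `set -u`, so the script does not abort when the variable is unset.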