Ticket 11507

Summary: "SLURM_NTASKS" sometimes isn't set when using sbatch
Product: Slurm
Reporter: Greg Wickham <greg.wickham>
Component: User Commands
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
CC: cinek
Version: 20.11.6
Hardware: Linux
OS: Linux
Site: KAUST

Description Greg Wickham 2021-05-02 05:18:56 MDT
SLURM_NTASKS isn't always being set when using "--ntasks":

$ cat fail6.sh 
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --ntasks=2
echo "BEGIN ${SLURM_JOBID}"
env | egrep 'N(TASK|PROC)'
echo "END ${SLURM_JOBID}"
exit 0
$ sbatch fail6.sh
Submitted batch job 15199216
$ cat slurm-15199216.out 
BEGIN 15199216
END 15199216
$

It is set when other sbatch parameters are added (e.g. --tasks-per-node=):

$ cat ok7.sh 
#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --ntasks=2
#SBATCH --tasks-per-node=2

echo "BEGIN ${SLURM_JOBID}"
env | egrep 'N(TASK|PROC)'
echo "END ${SLURM_JOBID}"

exit 0
$ sbatch  ok7.sh
Submitted batch job 15199311
$ cat slurm-15199311.out 
BEGIN 15199311
SLURM_NTASKS=2
SLURM_NPROCS=2
END 15199311
$

and, for example, with `--cpus-per-task=`:

$ cat ok8.sh
#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2

echo "BEGIN ${SLURM_JOBID}"
env | egrep 'N(TASK|PROC)'
echo "END ${SLURM_JOBID}"

exit 0
$ sbatch ok8.sh 
Submitted batch job 15199313
$ cat slurm-15199313.out
BEGIN 15199313
SLURM_NTASKS=2
SLURM_NPROCS=2
END 15199313
$

Finally, the failing example above works fine with srun:

$ srun --ntasks 2 --time 00:10:00 --pty /bin/bash -i 
srun: job 15199314 queued and waiting for resources
srun: job 15199314 has been allocated resources
$ env | egrep 'N(TASK|PROC)'
SLURM_NTASKS=2
SLURM_NPROCS=2
$
Comment 1 Marcin Stolarek 2021-05-03 04:06:22 MDT
Greg,

It's one of the issues we're working on in Bug 10620, where we're trying to fully understand the consequences and fix the code or documentation appropriately.

If this is just a question seeking an explanation, I'll go ahead and mark this as a duplicate.

Are you impacted somehow by the behavior?

cheers,
Marcin
Comment 2 Greg Wickham 2021-05-03 11:26:14 MDT
Hi Marcin,

Please mark this as a duplicate.

Apologies for not catching 10620 when filing the bug - it's obviously the same one.

   -greg
Comment 3 Marcin Stolarek 2021-05-03 23:46:13 MDT
>Apologies for not catching 10620 when

That's fine, we're here to track tickets too.

cheers,
Marcin

*** This ticket has been marked as a duplicate of ticket 10620 ***
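
For batch scripts that read SLURM_NTASKS and may hit the behavior reported here, one possible defensive pattern is to fall back to another value when the variable is unset. The sketch below is only an illustration, not something proposed in this ticket: the helper name `resolve_ntasks` and the default of 1 are assumptions, and SLURM_NPROCS is used as a secondary source only because it appears alongside SLURM_NTASKS in the outputs above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): pick a task count even when
# SLURM_NTASKS is missing from the batch environment.
resolve_ntasks() {
    # $1 is an optional caller-supplied default; assume 1 if omitted.
    local fallback="${1:-1}"
    # Prefer SLURM_NTASKS, then SLURM_NPROCS, then the fallback.
    echo "${SLURM_NTASKS:-${SLURM_NPROCS:-$fallback}}"
}

NTASKS="$(resolve_ntasks 1)"
echo "Running with ${NTASKS} task(s)"
```

Whether a silent fallback is appropriate depends on the job; a script could instead exit with an error when the variable is missing.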