Ticket 4973

Summary: Slurm 17.11 seems to ignore SLURM_EXIT_IMMEDIATE variable
Product: Slurm
Reporter: Christian Peter <peter>
Component: User Commands
Assignee: Jacob Jenson <jacob>
Status: RESOLVED INVALID
QA Contact:
Severity: 6 - No support contract
Priority: ---
Version: 17.11.1
Hardware: Linux
OS: Linux
Site: -Other-

Description Christian Peter 2018-03-22 12:19:27 MDT
According to the docs, "srun" should terminate with exit code $SLURM_EXIT_IMMEDIATE if an "srun --immediate" job cannot start immediately.

However, this does not work for me.

In my observations, it typically terminates with exit code 1 instead (or with $SLURM_EXIT_ERROR if that variable is set).

I could reproduce this behaviour on two different Slurm 17.11 installations.

Steps to reproduce:

$ export SLURM_EXIT_IMMEDIATE=120; srun -N 150 -I /bin/true; echo $?
srun: error: Unable to allocate resources: Required node not available (down, drained or reserved)
1

$ export SLURM_EXIT_ERROR=120; srun -N 150 -I /bin/true; echo $?
srun: error: Unable to allocate resources: Required node not available (down, drained or reserved)
120

- christian
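
For anyone scripting around this, the check in the steps above can be sketched as a small helper that compares an exit status against the two variables. This is only an illustration: the function name classify_srun_exit and its output labels are hypothetical, not part of Slurm, and it assumes the two variables are set to distinct values (otherwise the two cases cannot be told apart).

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): classify an exit status
# against SLURM_EXIT_IMMEDIATE and SLURM_EXIT_ERROR as set by the caller.
# Assumes the two variables hold distinct values; if they are equal,
# the "immediate" branch wins because it is checked first.
classify_srun_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ -n "${SLURM_EXIT_IMMEDIATE:-}" ] && [ "$status" -eq "$SLURM_EXIT_IMMEDIATE" ]; then
        echo "immediate"
    elif [ -n "${SLURM_EXIT_ERROR:-}" ] && [ "$status" -eq "$SLURM_EXIT_ERROR" ]; then
        echo "error"
    else
        echo "other"
    fi
}
```

It would be called right after srun, e.g. `srun -N 150 -I /bin/true; classify_srun_exit $?`. With only SLURM_EXIT_IMMEDIATE=120 set, the exit status 1 seen in the first run above would be reported as "other" rather than "immediate".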