Ticket 8944 - $SLURM_DISTRIBUTION is affecting submitted jobs
Summary: $SLURM_DISTRIBUTION is affecting submitted jobs
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 19.05.6
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: Tim McMullan
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2020-04-26 20:26 MDT by hooverdm@helix.nih.gov
Modified: 2020-05-08 11:22 MDT

See Also:
Site: NIH
Version Fixed: 20.02.3


Description hooverdm@helix.nih.gov 2020-04-26 20:26:38 MDT
A job (--ntasks=4 --nodes=1) submitted from within an allocation with SLURM_DISTRIBUTION=cyclic inherits this property, and runs with SLURM_DISTRIBUTION=cyclic.  If the environment variable SLURM_DISTRIBUTION is unset before the job is submitted, then the job runs with SLURM_DISTRIBUTION=block.

This violates the documentation, which states that SBATCH_DISTRIBUTION is the input environment variable that can change the properties of a subsequent job. It also violates the documented default that, when the number of tasks exceeds the number of nodes, the distribution is block.

Isn't there supposed to be a difference in behavior between environment variables that begin with SBATCH_ and those that begin with SLURM_?

David
Comment 4 Tim McMullan 2020-04-29 11:54:05 MDT
Hi David,

Would you be able to clarify where the behavior is being seen?  What method are you using to get an allocation, is it something "salloc --ntasks=4 -m cyclic" or are you working in an sbatch script?

Thanks!
--Tim
Comment 5 hooverdm@helix.nih.gov 2020-04-29 13:35:26 MDT
The problem stems from an application that fails when run under distribution=cyclic, but runs normally under distribution=block.

We have a wrapper script called sinteractive that first runs salloc with user-supplied options, and then runs srun with those same options.

The default action is to allocate 1 task on 1 node, and thus set distribution=cyclic (if sinteractive is called with --ntasks > --nodes, then distribution=block as expected).  The environment variable $SLURM_DISTRIBUTION is set to cyclic.

If then a script is created and submitted using sbatch, that job is also run using distribution=cyclic, no matter what the --ntasks and --nodes values, UNLESS:

* the script is submitted with the commandline option --distribution=(cyclic|block), this overrides the environment variable
* the script is modified to contain #SBATCH --distribution=(cyclic|block), this also overrides the environment variable
* the SLURM_DISTRIBUTION variable is unset
* the SLURM_DISTRIBUTION variable is set to 'block'

Indeed, all that matters is the variable SLURM_DISTRIBUTION.  I've tested it, and it doesn't matter whether the sbatch submission is run from within another job or from the login host.

If the SLURM_DISTRIBUTION variable is set to either cyclic or block, and no command-line option or sbatch directive overrides it, the distribution of the submitted sbatch script is determined by its value, regardless of the --ntasks or --nodes values. This (as far as I know) is not what the documentation for sbatch states. This is unexpected behavior; I thought only variables that begin with SBATCH_ affected sbatch allocations.
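The precedence described above can be modeled with a short sketch. This is only an illustrative model of the observed behavior; the function name and structure are hypothetical, not Slurm's actual implementation:

```python
def effective_distribution(cli_option=None, sbatch_directive=None,
                           env=None, ntasks=1, nnodes=1):
    """Toy model of the distribution precedence observed in this ticket.

    Order (highest wins): a --distribution command-line option, an
    #SBATCH --distribution directive in the script, an inherited
    SLURM_DISTRIBUTION environment variable, and finally the
    task/node-based default (block when tasks exceed nodes).
    """
    env = env or {}
    if cli_option:
        return cli_option
    if sbatch_directive:
        return sbatch_directive
    if "SLURM_DISTRIBUTION" in env:
        return env["SLURM_DISTRIBUTION"]
    return "block" if ntasks > nnodes else "cyclic"

# The inherited variable beats the ntasks/nodes default:
print(effective_distribution(env={"SLURM_DISTRIBUTION": "cyclic"},
                             ntasks=4, nnodes=1))  # cyclic
# Unsetting it restores the documented default:
print(effective_distribution(ntasks=4, nnodes=1))  # block
```

In this model, only an explicit option or script directive beats the inherited environment variable, matching the override list above.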

David
Comment 6 Tim McMullan 2020-05-06 13:22:54 MDT
Hi David,

Thank you for the extra information on this, it was very helpful in seeing what was going on and where the confusion here is.

As part of the job submission, sbatch ships the user environment along with the job. The --export option for sbatch can partially control this, but the SLURM_* environment variables are always preserved (this is intentional, but is not clear from the sbatch man page). That behavior is responsible for what you are observing. The SBATCH_* variables serve as a way to override the SLURM_* variables, and while sbatch doesn't "parse" the SLURM_* variables, they still have an impact.
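The shipping behavior described above can be sketched as a toy environment filter. This is an illustrative model only (the function is hypothetical, not Slurm's code), assuming --export takes either NONE or a comma-separated list of variable names:

```python
def shipped_environment(user_env, export="ALL"):
    """Sketch of which variables sbatch ships with a job, per the
    behavior described in this ticket: --export=NONE (or a variable
    list) restricts the user environment, but SLURM_* variables are
    always preserved. Illustrative only, not Slurm's implementation.
    """
    if export == "ALL":
        return dict(user_env)
    kept = {name for name in export.split(",") if name != "NONE"}
    return {k: v for k, v in user_env.items()
            if k in kept or k.startswith("SLURM_")}

env = {"PATH": "/usr/bin", "SLURM_DISTRIBUTION": "cyclic", "FOO": "bar"}
# Even with --export=NONE, SLURM_DISTRIBUTION rides along:
print(shipped_environment(env, export="NONE"))
```

Because SLURM_* always survives the filter in this model, the reliable ways to control the distribution are an explicit --distribution option, an #SBATCH directive, or the SBATCH_DISTRIBUTION variable.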

I will work on a documentation update to make sure it's clear that the SLURM_* variables are always shipped along with the job, since this is expected behavior. I would suggest using one of the options you identified (--distribution or SBATCH_DISTRIBUTION) as a good way to handle this.

Thanks!
--Tim
Comment 9 Tim McMullan 2020-05-08 11:22:16 MDT
We have landed a patch in 20.02 that clarifies that the SLURM_* variables are always propagated by sbatch so this behavior is better documented.

I'm going to resolve this ticket for now, but please feel free to re-open if you have any other questions on this!

Thanks!
--Tim