Ticket 1432 - Undocumented environment variables in Prolog/Epilog
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Documentation
Version: 14.03.11
Hardware: Linux
: 6 - No support contract
Assignee: Jacob Jenson
Reported: 2015-02-05 01:08 MST by Pär Lindfors
Modified: 2017-11-03 15:45 MDT (History)
Site: -Other-
Version Fixed: 14.03.12


Description Pär Lindfors 2015-02-05 01:08:40 MST
In Slurm 14.03.11 and 14.11.3, the following environment variables are
available in Prolog/Epilog but not documented in the slurm.conf man
page (under "Prolog and Epilog Scripts")

  Epilog and Prolog
  -----------------
  SLURM_CONF
  SLURM_JOBID
  SLURMD_NODENAME
  SLURM_NODELIST
  SLURM_UID

  Prolog only
  -----------
  SLURM_STEP_ID
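
A quick way to confirm which of these variables a given slurmd actually exports is to have the prolog or epilog dump them. The following is a minimal Python sketch, not anything shipped with Slurm; the helper name `collect_slurm_vars` is hypothetical, and the variable names are simply the ones listed above.

```python
import os

# Variable names taken from the list above; SLURM_STEP_ID was only
# observed in Prolog, so it may be unset when run as an Epilog.
REPORTED_VARS = (
    "SLURM_CONF",
    "SLURM_JOBID",
    "SLURMD_NODENAME",
    "SLURM_NODELIST",
    "SLURM_UID",
    "SLURM_STEP_ID",
)


def collect_slurm_vars(environ=None):
    """Return {name: value or None} for each reported variable.

    `environ` defaults to the process environment; passing a plain dict
    makes the helper easy to try outside of a real prolog/epilog.
    """
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in REPORTED_VARS}


if __name__ == "__main__":
    for name, value in collect_slurm_vars().items():
        print(f"{name}={value if value is not None else '<unset>'}")
```

Running the same script as both Prolog and Epilog and comparing the output should reproduce the split between the two lists above.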

I believe that most of these should be documented. Let me know if you
would like me to write this documentation and submit a patch. Before I
could do that, some details need to be clarified; see below:

SLURM_CONF
 - Added in commit b2b5b908, needed by spank plugins. Not 100% sure
   this should be documented.

SLURM_JOBID
 - Deprecated name, replaced by SLURM_JOB_ID. Don't document?

SLURMD_NODENAME
 - Originally added in commit b4191579. Very useful when running
   multiple slurmd. Should be documented.

SLURM_NODELIST
 - Very useful, should be documented. However, in
   PrologSlurmctld/EpilogSlurmctld the node list is in SLURM_JOB_NODELIST,
   which is documented. Is it really wise to use different names for
   the same thing in PrologSlurmctld/EpilogSlurmctld compared to
   Prolog/Epilog?

SLURM_UID
 - Deprecated name, replaced by SLURM_JOB_UID. Don't document?

SLURM_STEP_ID
 - Added in commit 34da34fd, references bug 607. Useful and should be
   documented. However should this have a special value for batch jobs
   or not?
   
   In comment 10 ( http://bugs.schedmd.com/show_bug.cgi?id=607#c10 )
   Moe wrote that for batch jobs this will be SLURM_STEP_ID=4294967294,
   and that is the case in 14.03. In Slurm 14.11 the variable is set to 0, which
   also makes sense IMHO, since the batch script is step 0. Intended
   behavior change, or bug?
Comment 1 Pär Lindfors 2015-02-05 02:28:21 MST
Two other fixes for the same part of the man page are already in a pull request on GitHub: https://github.com/SchedMD/slurm/pull/100
Comment 2 David Bigagli 2015-02-05 04:02:15 MST
Merged by Brian. Please note that we don't plan any more 14.03 releases. However,
these patches will be merged in 14.11.

David
Comment 3 Pär Lindfors 2015-02-05 04:16:30 MST
(In reply to David Bigagli from comment #2)
> Merged by Brian. Please note that we don't plan any more 14.03 releases.
> However, these patches will be merged in 14.11.

Great.

However, I only mentioned the pull request here because it touches the same part of the same man page. Merging it does not fix the documentation issues that were reported in this bug. I am setting the status of this one back to UNCONFIRMED.
Comment 4 David Bigagli 2015-02-05 04:45:21 MST
Indeed, I closed it too hastily. Please go ahead and generate the documentation 
patch if you like. We need to keep both SLURM_JOBID and SLURM_JOB_ID for
backward compatibility, and the same goes for the other variables.
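
Since both spellings stay around, a script that has to run under different Slurm versions or in different prolog/epilog contexts can try the names in order. The following is a rough Python sketch under that assumption; the `NAME_PAIRS` table and the `lookup` helper are hypothetical and only use the variable names mentioned in this ticket.

```python
import os

# Preferred name first, alternative name second. All six names come from
# this ticket; the grouping into pairs is an illustration only.
NAME_PAIRS = {
    "job_id": ("SLURM_JOB_ID", "SLURM_JOBID"),
    "job_uid": ("SLURM_JOB_UID", "SLURM_UID"),
    "nodelist": ("SLURM_JOB_NODELIST", "SLURM_NODELIST"),
}


def lookup(key, environ=None):
    """Return the first non-empty value among the names for `key`, else None."""
    if environ is None:
        environ = os.environ
    for name in NAME_PAIRS[key]:
        value = environ.get(name)
        if value:
            return value
    return None
```

For example, `lookup("job_id")` returns the value of SLURM_JOB_ID when it is set and falls back to SLURM_JOBID otherwise.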

SLURM_STEP_ID is 0 for a single srun task and 4294967294 for a batch step.
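
The batch value is easier to recognize once written as 2**32 - 2 (0xFFFFFFFE). A prolog could use it to tell the two cases apart, as in the Python sketch below. The constant name `BATCH_STEP_ID` and the helper `describe_step` are hypothetical, and the check assumes the behavior stated in this comment; the description above reports 0 for batch jobs on 14.11, so it may not hold on every version.

```python
# Value quoted in this ticket for a batch step:
# 4294967294 == 2**32 - 2 == 0xFFFFFFFE.
BATCH_STEP_ID = 4294967294


def describe_step(step_id_text):
    """Classify a SLURM_STEP_ID string as 'unset', 'batch', or 'step N'."""
    if step_id_text is None or step_id_text == "":
        return "unset"
    step_id = int(step_id_text)
    if step_id == BATCH_STEP_ID:
        return "batch"
    return f"step {step_id}"
```

A prolog would call it as `describe_step(os.environ.get("SLURM_STEP_ID"))` after importing `os`.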

Thanks,
       David
Comment 5 Moe Jette 2015-02-18 05:14:04 MST
Note that I copied the environment variable information from slurm.conf to the prolog/epilog web page as part of bug 1458. I added a couple of new fields, but did not check the information previously listed. The commit is here:

https://github.com/SchedMD/slurm/commit/2e95c20b3bf9bcddd9b0fe0048e222fb8306c90b