| Summary: | sbatch jobs do not source scripts in /etc/profile.d/*.sh at startup | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Ole.H.Nielsen <Ole.H.Nielsen> |
| Component: | slurmd | Assignee: | Tim Wickberg <tim> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | ||
| Version: | 16.05.10 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | DTU Physics | Alineos Sites: | --- |
|
Description
Ole.H.Nielsen@fysik.dtu.dk
2017-03-22 10:16:30 MDT
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #0)
> We try to set up the environment in sbatch jobs by adding scripts to the
> /etc/profile.d/ directory, but they don't seem to be sourced by the bash
> shell in the job spawned by slurmd (we assume the bash shell on CentOS 7.3
> systems).
>
> The FAQ https://slurm.schedmd.com/faq.html#user_env says that user .profile
> and .bashrc are not sourced, but how about the system files in
> /etc/profile.d/*.sh? Also, the bash man-page explains under INVOCATION how
> startup is done, but I don't know how slurmd does the startup.
>
> We would like to set up automatically for the users the batch job
> environment (which may well differ from the environment on the login nodes)
> such as this one:
>
> # cat /etc/profile.d/cpu_arch.sh
> export CPU_ARCH="broadwell"
> function cpu_arch { echo $CPU_ARCH; }
>
> But in the batch job this obviously hasn't been set up:
> type cpu_arch
> /bin/bash: line 3: type: cpu_arch: not found
>
> Question 1: Can we use /etc/profile.d/*.sh scripts to set up the environment
> in jobs?
>
> If your answer is that /etc/profile.d/*.sh scripts are ignored (and then
> please add this to the FAQ), then I don't understand why the Lmod "module"
> command actually works inside the job (and seen if you do "srun bash -x"):
>
> type module
> module is a function
> module ()
> {
> eval $($LMOD_CMD bash "$@");
> [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
> }
>
> which I believe is initialized by the script /etc/profile.d/z00_lmod.sh that
> eventually sources /usr/share/lmod/lmod/init/sh for defining the module()
> function. I'm really puzzled here!

The user profile is captured at job submission time, then restored verbatim.
Profile scripts and the like are not run when setting up the user
environment. This is also what allows lmod to continue functioning when
restored on the job.

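One way to picture the answer above is with bash's exported functions. The sketch below assumes (the thread does not state this) that the function is marked for export in the submitting shell, as with `export -f`, so that it travels in the captured environment. The names `submit_exported` and `local_only` are hypothetical stand-ins, and the child `bash -c` plays the role of the job shell.

```shell
#!/bin/bash
# Hypothetical stand-in for a function exported by the submitting shell.
submit_exported() { echo "available in child"; }
export -f submit_exported

# Hypothetical function that is defined but never exported.
local_only() { echo "never reaches the child"; }

# A child bash that reads no profile scripts still sees the exported function,
# because it arrives through the environment.
bash -c 'type -t submit_exported'                            # prints: function

# The un-exported function is absent from the child.
bash -c 'type -t local_only || echo "local_only: not found"' # prints: local_only: not found
```

If that assumption holds, it would explain why `module` is found in the job without any script under /etc/profile.d/ being run on the compute node, while `cpu_arch` is not.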
> Question 2: Can you suggest better ways to set up a default environment in
> jobs, which will differ from the environment in the login nodes? The goal
> is for this to be automatic so that users won't have to worry about setting
> the correct CPU architecture and variables like OMP_NUM_THREADS=1.

Look into using a TaskProlog script.
https://slurm.schedmd.com/prolog_epilog.html

(In reply to Tim Wickberg from comment #1)
> The user profile is captured at job submission time, then restored verbatim.
> Profile scripts and the like are not run when setting up the user
> environment. This is also what allows lmod to continue functioning when
> restored on the job.

Thanks, now this makes sense to me!

> > Question 2: Can you suggest better ways to set up a default environment in
> > jobs, which will differ from the environment in the login nodes? The goal
> > is for this to be automatic so that users won't have to worry about setting
> > the correct CPU architecture and variables like OMP_NUM_THREADS=1.
>
> Look into using a TaskProlog script.
> https://slurm.schedmd.com/prolog_epilog.html

OK, it's time to dive into yet another Slurm feature :-)

You may close this case now, thanks. |
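For readers following up on the TaskProlog suggestion, here is a minimal sketch of what such a script might look like. It relies on the documented TaskProlog behavior that lines of the form `export NAME=value` written to standard output are added to the task's environment. The script path, the function name `emit_task_env`, and the hard-coded values are assumptions for illustration only; the values are taken from the cpu_arch.sh example and the OMP_NUM_THREADS=1 goal in the report.

```shell
#!/bin/bash
# Hypothetical TaskProlog script. The path is an assumption; it would be
# enabled in slurm.conf with a line such as:
#   TaskProlog=/etc/slurm/task_prolog.sh
#
# slurmd reads this script's standard output; each "export NAME=value" line
# becomes an environment variable of the task being launched.

emit_task_env() {
    # CPU architecture of this node type (value from the report's example).
    echo "export CPU_ARCH=broadwell"
    # Default to a single OpenMP thread, as requested in Question 2.
    echo "export OMP_NUM_THREADS=1"
}

emit_task_env
```

This mechanism sets environment variables only. A shell function such as `cpu_arch` would not be defined this way, so jobs would read `$CPU_ARCH` directly.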