We are trying to set up the environment in sbatch jobs by adding scripts to the /etc/profile.d/ directory, but they don't seem to be sourced by the bash shell in the job spawned by slurmd (we assume the bash shell on CentOS 7.3 systems).

The FAQ https://slurm.schedmd.com/faq.html#user_env says that the user's .profile and .bashrc are not sourced, but what about the system files in /etc/profile.d/*.sh? Also, the bash man page explains under INVOCATION how startup is done, but I don't know how slurmd starts the shell.

We would like to set up the batch job environment automatically for users (it may well differ from the environment on the login nodes), for example with a script such as this one:

    # cat /etc/profile.d/cpu_arch.sh
    export CPU_ARCH="broadwell"
    function cpu_arch { echo $CPU_ARCH; }

But in the batch job this obviously hasn't been set up:

    type cpu_arch
    /bin/bash: line 3: type: cpu_arch: not found

Question 1: Can we use /etc/profile.d/*.sh scripts to set up the environment in jobs?

If your answer is that /etc/profile.d/*.sh scripts are ignored (in which case please add this to the FAQ), then I don't understand why the Lmod "module" command actually works inside the job (as can also be seen by running "srun bash -x"):

    type module
    module is a function
    module ()
    {
        eval $($LMOD_CMD bash "$@");
        [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
    }

I believe this function is initialized by the script /etc/profile.d/z00_lmod.sh, which eventually sources /usr/share/lmod/lmod/init/sh to define the module() function. I'm really puzzled here!

Question 2: Can you suggest better ways to set up a default environment in jobs, one that differs from the environment on the login nodes? The goal is for this to be automatic, so that users won't have to worry about setting the correct CPU architecture and variables like OMP_NUM_THREADS=1.
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #0)
> We try to set up the environment in sbatch jobs by adding scripts to the
> /etc/profile.d/ directory, but they don't seem to be sourced by the bash
> shell in the job spawned by slurmd (we assume the bash shell on CentOS 7.3
> systems).
>
> The FAQ https://slurm.schedmd.com/faq.html#user_env says that user .profile
> and .bashrc are not sourced, but how about the system files in
> /etc/profile.d/*.sh? Also, the bash man-page explains under INVOCATION how
> startup is done, but I don't know how slurmd does the startup.
>
> We would like to set up automatically for the users the batch job
> environment (which may well differ from the environment on the login nodes)
> such as this one:
>
> # cat /etc/profile.d/cpu_arch.sh
> export CPU_ARCH="broadwell"
> function cpu_arch { echo $CPU_ARCH }
>
> But in the batch job this obviously hasn't been set up:
> type cpu_arch
> /bin/bash: line 3: type: cpu_arch: not found
>
> Question 1: Can we use /etc/profile.d/*.sh scripts to set up the environment
> in jobs?
>
> If your answer is that /etc/profile.d/*.sh scripts are ignored (and then
> please add this to the FAQ), then I don't understand why the Lmod "module"
> command actually works inside the job (and seen if you do "srun bash -x"):
>
> type module
> module is a function
> module ()
> {
> eval $($LMOD_CMD bash "$@");
> [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
> }
>
> which I believe is initialized by the script /etc/profile.d/z00_lmod.sh that
> eventually sources /usr/share/lmod/lmod/init/sh for defining the module()
> function. I'm really puzzled here!

The user profile is captured at job submission time, then restored verbatim. Profile scripts and the like are not run when setting up the user environment. This is also what allows lmod to continue functioning when restored on the job.
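The difference between the two functions can be reproduced outside Slurm. The sketch below is only an illustration, and it rests on an assumption the thread does not state: that Lmod's init script exports module() with "export -f", so the function is part of the environment captured at submission, whereas cpu_arch was defined without being exported. The names plain_fn, exported_fn and child_sees are hypothetical.

```shell
#!/bin/bash
# A function that is only defined in the current shell (like cpu_arch).
plain_fn() { echo "plain"; }

# A function that is also exported into the environment
# (assumed to be what Lmod does for module()).
exported_fn() { echo "exported"; }
export -f exported_fn

# Ask a fresh, non-login child bash whether it knows the given name.
# No profile scripts are sourced by "bash -c"; the child only sees
# what it inherits through the environment.
child_sees() {
    bash -c "type -t $1" 2>/dev/null || echo "not found"
}

child_sees plain_fn      # not found
child_sees exported_fn   # function
```

If that assumption holds, a function defined in a profile script on the submit host would only reach the job when it is exported there, which is consistent with the "type cpu_arch: not found" error reported above.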
> Question 2: Can you suggest better ways to set up a default environment in
> jobs, which will differ from the environment in the login nodes? The goal
> is for this to be automatic so that users won't have to worry about setting
> the correct CPU architecture and variables like OMP_NUM_THREADS=1.

Look into using a TaskProlog script.
https://slurm.schedmd.com/prolog_epilog.html
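As a rough sketch of what such a TaskProlog script might look like: per the prolog/epilog guide linked above, Slurm reads the standard output of the TaskProlog program, and lines of the form "export NAME=value" set NAME in the environment of the user task. The script path and the helper name emit_job_env below are assumptions made for illustration; the values are the ones from the original question.

```shell
#!/bin/bash
# Hypothetical TaskProlog script. It would be referenced from slurm.conf
# with something like (the path is an assumption):
#   TaskProlog=/etc/slurm/task_prolog.sh
#
# Each "export NAME=value" line printed on stdout is picked up by Slurm
# and applied to the environment of the task about to start.

# Print the export lines for a given CPU architecture string.
emit_job_env() {
    local arch="$1"
    echo "export CPU_ARCH=${arch}"
    echo "export OMP_NUM_THREADS=1"
}

# On a real cluster the architecture would differ per node type;
# "broadwell" is simply the value used in the question.
emit_job_env "broadwell"
```

Note that this mechanism sets environment variables only. A shell function such as cpu_arch is not covered by this sketch and would probably have to be provided another way, for example by having job scripts source a site-wide file.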
(In reply to Tim Wickberg from comment #1)
> The user profile is captured at job submission time, then restored verbatim.
> Profile scripts and the like are not run when setting up the user
> environment. This is also what allows lmod to continue functioning when
> restored on the job.

Thanks, now this makes sense to me!

> > Question 2: Can you suggest better ways to set up a default environment in
> > jobs, which will differ from the environment in the login nodes? The goal
> > is for this to be automatic so that users won't have to worry about setting
> > the correct CPU architecture and variables like OMP_NUM_THREADS=1.
>
> Look into using a TaskProlog script.
> https://slurm.schedmd.com/prolog_epilog.html

OK, it's time to dive into yet another Slurm feature :-)

You may close this case now, thanks.