Ticket 6781

Summary: --export=NONE behavior difference between srun and sbatch
Product: Slurm Reporter: Alex Mamach <alex.mamach>
Component: User Commands Assignee: Broderick Gardner <broderick>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: marshall
Version: 18.08.5   
Hardware: Linux   
OS: Linux   
Site: Northwestern Slinky Site: ---
Alineos Sites: --- Atos/Eviden Sites: ---
Confidential Site: --- Coreweave sites: ---
Cray Sites: --- DS9 clusters: ---
Google sites: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- NoveTech Sites: ---
Nvidia HWinf-CS Sites: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Tzag Elita Sites: ---
Linux Distro: RHEL Machine Name:
CLE Version: Version Fixed:
Target Release: --- DevPrio: ---
Emory-Cloud Sites: ---

Description Alex Mamach 2019-03-29 15:20:12 MDT
Hello,

I was hoping to understand a difference in the behavior of the --export=NONE option between srun and sbatch.

When used with sbatch, attempting to execute mpirun commands leads to launch-failure errors, presumably because some environment variable is unset.

However, doing the same via an interactive srun job leads to mpirun commands executing successfully.

Is there a difference in how --export=NONE behaves via srun vs sbatch?

We're hoping to understand why mpirun works under srun so we can advise users who are having trouble with sbatch but still want the "clean" environment they get with --export=NONE.
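For illustration, here is a minimal sketch of the effect --export=NONE has on the job environment. It uses `env -i` purely as an analogy (this is not how Slurm implements the option, and `MYVAR` is a made-up variable name):

```shell
# Analogy only: --export=NONE starts the job with a stripped environment,
# much as `env -i` does for a child process. A variable set in the
# submitting shell is not visible inside the stripped environment:
MYVAR=hello env -i /bin/sh -c 'echo "MYVAR=${MYVAR:-unset}"'   # MYVAR=unset
# Without the stripping, the child process sees it:
MYVAR=hello /bin/sh -c 'echo "MYVAR=${MYVAR:-unset}"'          # MYVAR=hello
```

If mpirun's launcher depends on variables that get stripped this way, the launch fails; that is consistent with the sbatch errors described above.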
Comment 1 Broderick Gardner 2019-04-01 14:47:13 MDT
I'm investigating this to make sure I give you correct information. I should be able to post that information within a couple days. 

Thanks
Comment 2 Broderick Gardner 2019-04-03 15:21:13 MDT
Will you post exactly what you are running and what errors you get? Specifically, what do you expect to work? (in detail; show me example commands and shell scripts)

mpirun inside an interactive srun doesn't work for me.
❯ srun --mpi=pmix -N4 -n4 --preserve-env --pty /usr/bin/bash
[broderick@caesar caesar]$ mpirun mpi/xthi
--------------------------------------------------------------------------
While trying to determine what resources are available, the SLURM
resource allocator expects to find the following environment variables:

    SLURM_NODELIST
    SLURM_TASKS_PER_NODE

However, it was unable to find the following environment variable:

    SLURM_TASKS_PER_NODE
...

This looks like a bug; let me know if you have seen the same thing or if it is working for you.
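One way to narrow this down (a diagnostic sketch; the variable names are the ones mpirun's error message lists) is to check which SLURM_* variables the interactive step actually exports:

```shell
# Run inside the shell started by srun. mpirun's Slurm resource
# allocator reads SLURM_NODELIST and SLURM_TASKS_PER_NODE, so check
# whether they are present in the step's environment:
printenv | grep '^SLURM_' | sort
```

If SLURM_TASKS_PER_NODE is missing from that output, mpirun will fail exactly as shown above.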


What MPI implementation and version are you using? OpenMPI, IntelMPI, etc

What version of PMI are you using, if any?

Is there a particular reason you are using `mpirun` instead of `srun`?
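For reference, launching the binary directly with srun sidesteps the mpirun launcher entirely. This is a sketch: it assumes both Slurm and the MPI library were built with PMIx support, and it reuses the `mpi/xthi` path from the session above.

```shell
# Direct launch: srun starts the MPI tasks itself, so mpirun's Slurm
# plugin (and the SLURM_* variables it reads) is never involved:
srun --mpi=pmix -N4 -n4 mpi/xthi
```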

Thanks
Comment 4 Marshall Garey 2019-04-23 16:10:55 MDT
Marking as duplicate of bug 6772.

*** This ticket has been marked as a duplicate of ticket 6772 ***