Yo David

We're hearing some squawks about PMI2 under Slurm that sound eerily like something we saw from Cray that causes lots of problems. Basically, it appears that PMI2 is opening some file descriptors to the spawned processes that are expected to "persist", even across fork/exec boundaries. In other words, if a process launched by Slurm wants to spawn a child process, the normal procedure is to (a) fork, (b) close all file descriptors other than 0-2, and then (c) exec. However, if you do this with PMI2 active, then PMI2 will barf.

Here is a very simple way to demonstrate the problem, courtesy of one user:

Here’s the PMI-only “this violates ‘no surprises’” demonstration. (Nice that I still had a couple of those PMI programs hanging around.)

    (18:15)m80<SALLOC:8on1>:~/upc$ cat pmi2-003.c
    /*
    cc -Wall -I/opt/slurm/include pmi2-003.c -L/opt/slurm/lib64 -lpmi2
    cc -Wall -I$SLURM_ROOT/include pmi2-003.c -L$SLURM_ROOT/lib64 -lpmi2
    */
    #include "slurm/pmi2.h"
    int main(int argc, char **argv)
    {
        int spawned = -1, size = -1, rank = -1, appnum = -1;
        return PMI2_Init(&spawned, &size, &rank, &appnum);
    }
    (18:15)m80<SALLOC:8on1>:~/upc$ cc -Wall -I$SLURM_ROOT/include pmi2-003.c -L$SLURM_ROOT/lib64 -lpmi2
    (18:16)m80<SALLOC:8on1>:~/upc$ srun -n 8 ./a.out
    (18:16)m80<SALLOC:8on1>:~/upc$ srun -n 8 bash -cf ./a.out
    (18:16)m80<SALLOC:8on1>:~/upc$ srun -n 8 csh -cf ./a.out
    srun: error: n016: tasks 3-4: Exited with exit code 14
    (18:16)m80<SALLOC:8on1>:~/upc$

Note that bash doesn't close fd's prior to exec, but csh does. We can all argue about which behavior is "correct", but the fact remains that closing fd's is a long acknowledged (and even taught!) best-practice.

Can you help us fix this mess? All that is required is for Slurm to pass an envar with the PMI2 server's socket, and for the PMI2 client to open its own socket during PMI2_Init to connect to the server.

Thanks
Ralph
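
For reference, steps (a) and (b) of the fork/close/exec procedure described above can be sketched in C roughly as follows. This is only an illustration of why an inherited descriptor disappears, not Slurm or PMI2 code. The helper names (`fd_is_open`, `close_fds_above_stderr`, `fd_survives_child_cleanup`) and the 65536 cap on the close loop are assumptions made up for this example. The child reports its finding through its exit status instead of calling exec.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fd is an open descriptor in the calling process, else 0. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Step (b): close every descriptor other than 0-2. */
static void close_fds_above_stderr(void)
{
    long max_fd = sysconf(_SC_OPEN_MAX);

    /* Assumption: cap the loop so a huge or unknown limit stays cheap. */
    if (max_fd < 0 || max_fd > 65536)
        max_fd = 65536;
    for (int fd = 3; fd < max_fd; fd++)
        close(fd);
}

/*
 * Steps (a) and (b): fork, clean up descriptors in the child, then check
 * whether inherited_fd is still usable there. A real launcher would call
 * exec() at the point where the child exits.
 * Returns 1 if the descriptor survived, 0 if it was closed, -1 on error.
 */
static int fd_survives_child_cleanup(int inherited_fd)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        close_fds_above_stderr();
        _exit(fd_is_open(inherited_fd) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Any descriptor numbered 3 or higher that the parent inherited is gone in the child after the cleanup loop, while the parent's own copy is unaffected. A library that expects to keep talking over such a descriptor after exec would find it closed.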
I think this is the same problem I dealt with back in the day. What is happening is that srun sets the environment variable PMI2_fd, which tells the PMI2 library which socket to use to talk to the PMI2 backend. Unfortunately, smart csh closes some of these file descriptors before starting the application. It does not always happen: csh definitely closes fd's 16 and 18, which I saw when testing with a 4-component job, but with only 2 components no problems occur.

I can see whether we can allocate a higher socket number on the srun side.

David
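
The two ideas in this message (a descriptor number handed over in an environment variable, and moving that descriptor to a higher number) can be sketched in C as below. This is a rough illustration under stated assumptions, not the Slurm implementation and not necessarily what the eventual fix does. The function names and the variable-name parameter are hypothetical, and the choice of a minimum descriptor number is left to the caller. Only standard POSIX calls (`fcntl` with `F_DUPFD` and `F_GETFD`, `setenv`, `getenv`, `strtol`) are used.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Launcher side: re-home fd at a descriptor number >= min_fd.
 * Returns the descriptor to use, or -1 on error. The old number is
 * closed once the duplicate exists.
 */
static int move_fd_at_or_above(int fd, int min_fd)
{
    if (fd >= min_fd)
        return fd;

    int new_fd = fcntl(fd, F_DUPFD, min_fd);
    if (new_fd < 0)
        return -1;
    close(fd);
    return new_fd;
}

/* Launcher side: publish the descriptor number in the environment. */
static int publish_fd(const char *var_name, int fd)
{
    char text[32];

    snprintf(text, sizeof(text), "%d", fd);
    return setenv(var_name, text, 1);
}

/*
 * Client side: read the descriptor number back and verify that it is
 * still open, since an intermediate shell may have closed it.
 * Returns the descriptor, or -1 if it is missing, malformed, or closed.
 */
static int fd_from_env(const char *var_name)
{
    const char *text = getenv(var_name);
    char *end = NULL;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0 || value > INT_MAX)
        return -1;
    if (fcntl((int)value, F_GETFD) == -1)
        return -1;
    return (int)value;
}
```

Moving the descriptor higher only helps against programs that close a limited range of descriptors. A program that closes everything above 2 would still break the connection, which is why the client-side validity check is shown as well.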
Fixed in commit: 084787c0d8f26

Thanks,
David