| Summary: | Linking libpmi.so.0 with libslurmfull.so leads to a bad debug function call and a segv of the application | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Regine Gaudin <regine.gaudin> |
| Component: | Other | Assignee: | Felip Moll <felip.moll> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | bart.oldeman, regine.gaudin, remi.lacroix |
| Version: | 18.08.6 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: |
https://bugs.schedmd.com/show_bug.cgi?id=7448 https://bugs.schedmd.com/show_bug.cgi?id=4918 |
||
| Site: | CEA | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | 20.11.0pre1 | |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
| Ticket Depends on: | 7448 | ||
| Ticket Blocks: | |||
Hi, we are currently aware of this issue and working on a solution. Note that this only happens with libpmi: libpmi2 does not need to link against libslurmfull.so (or any Slurm lib) and does not have this problem, but as you may already know, Intel MPI 2019 has dropped support for PMI-2.

See where and why we introduced this: https://bugs.schedmd.com/show_bug.cgi?id=4918#c47
And the discussion with Open MPI: https://github.com/open-mpi/ompi/issues/5145

I guess Intel MPI is having the same issue. I will inform you when we've committed a definitive solution.

"I guess Intel MPI is having the same issue."
I'm not sure, as there is no longer a "hard" versioning dependency between libpmi and Slurm; that dependency was due to some Slurm 17 spec files generating /usr/lib64/libslurm.la and /usr/lib64/libpmi.la for slurm-devel.

------------------------------------------------------------------------
SLURM 17: no problem with the reproducer.

We had also hit bug 4918 on Slurm 17, and removing the .la files was enough to suppress the mpi/openmpi versioned dependency, because the dependency versioning came from the .la files:

grep slurm /usr/lib64/libslurm.la
# libslurm.la - a libtool library file
dlname='libslurm.so.29'
library_names='libslurm.so.29.0.0 libslurm.so.29 libslurm.so'
old_library='libslurm.a'
# Version information for libslurm.

grep slurm /usr/lib64/libpmi.la
dependency_libs=' /usr/lib64/libslurm.la -ldl -lpthread'

We had removed these .la files. In Slurm 17, libpmi.so is linked with -lslurm; since libslurm.so is a symlink to the versioned library, there is no direct versioned dependency.
ls -altr /usr/lib64/libslurm*
1 root root 7383000 Mar 19 2019 /usr/lib64/libslurm.so.32.0.0
1 root root      18 Mar 20 2019 /usr/lib64/libslurm.so.32 -> libslurm.so.32.0.0
1 root root      18 Mar 20 2019 /usr/lib64/libslurm.so -> libslurm.so.32.0.0

Intel MPI + the reproducer application does not lead to the segv.

******************************************************************************
Now in Slurm 18, there is no dependency versioning of libpmi.so, for two reasons: no .la files are generated by your slurm.spec, and you have linked libpmi with libslurmfull.so (-lslurmfull). So I don't think linking with -lslurmfull is necessary; not generating the .la files is enough.

If I rebuild libpmi.so with -lslurm (as in Slurm 17), there is no dependency versioning between pmi/openmpi (as there are no .la files), and Intel MPI + the reproducer no longer segvs.

ldd ~gaudin/libpmi.so
libslurm.so.33 => /lib64/libslurm.so.33 (0x00002ae0e5343000)

        Sep 30 10:29 /usr/lib64/libslurm.so.33.0.0
1726536 Sep 30 10:29 /usr/lib64/libslurmdb.so.33.0.0
     18 Sep 30 10:33 /usr/lib64/libslurm.so -> libslurm.so.33.0.0

***************************************************************************
| | dependency versioning | libpmi.so linked with | reproducer |
|---|---|---|---|
| Slurm 17 | no (no .la files) | -lslurm | ok |
| Slurm 18 | no (no .la files) | -lslurmfull | segv |
| Slurm 18 (rebuilt) | no (no .la files) | -lslurm | ok |

Thanks, Regine

As the failure is a conjunction of calling a debug function in libslurmfull.so and the definition of a debug variable in the application, I have written another reproducer outside of MPI. I hope it makes sense:
cat hello.c
#include <stdio.h>
#include <slurm.h>
static int my_init_slurm_conf(const char *file_name)
{
char *name = (char *)file_name;
int rc = 0;
debug("Reading slurm.conf file: %s", name);
return rc;
}
int main(int argc, char *argv[])
{
if (my_init_slurm_conf("/etc/slurm/slurm.conf") == 0)
printf("read succeed \n");
else
printf("read failed \n");
}
icc -g -o hello hello.c -I/usr/include/slurm -L/usr/lib64/slurm -lslurmfull
ldd hello
linux-vdso.so.1 => (0x00007ffdcd968000)
libslurmfull.so => /usr/lib64/slurm/libslurmfull.so (0x00007f162a9d9000)
libm.so.6 => /lib64/libm.so.6 (0x00007f162a6c9000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f162a4b3000)
libc.so.6 => /lib64/libc.so.6 (0x00007f162a0f2000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f1629eed000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1629cd1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f162ada5000)
./hello
read succeed
Now, if I additionally link with a C file that just defines a debug variable, it segvs in the debug call:
cat just_define_debug.c
int debug = 0;
icc -fPIC -c -g just_define_debug.c
icc -shared just_define_debug.o -o ./libsup7intel.so
icc -g -o hello hello.c just_define_debug.c -I/usr/include/slurm -L/usr/lib64/slurm -lslurmfull -L$PWD -lsup7intel
ldd hello
linux-vdso.so.1 => (0x00007ffd027e2000)
libslurmfull.so => /usr/lib64/slurm/libslurmfull.so (0x00007fdb4ec2b000)
libsup7intel.so => /ccc/home/cont001/ocre/gaudinr/debug_so/libsup7intel.so (0x00007fdb4ea28000)
./hello
Segmentation fault
(gdb) where
#0 0x0000000000601030 in debug ()
#1 0x0000000000400674 in my_init_slurm_conf (file_name=0x40078c "/etc/slurm/slurm.conf") at hello.c:7
#2 0x0000000000400698 in main (argc=1, argv=0x7fffffffe1a8) at hello.c:15
If I rename the debug variable to debug2, it works fine:
cat just_define_debug.c
int debug2 = 0;
icc -fPIC -c -g just_define_debug.c
$ icc -shared just_define_debug.o -o ./libsup7intel.so
$ icc -g -o hello hello.c just_define_debug.c -I/usr/include/slurm -L/usr/lib64/slurm -lslurmfull -L$PWD -lsup7intel
./hello
read succeed
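The same failure mode can be shown with no Slurm or MPI involved at all. Below is a minimal sketch of the symbol clash; all file and symbol names (toylib.c, app.c, read_conf) are illustrative, not from Slurm, and it assumes gcc on Linux with default ELF interposition rules:

```shell
# A stand-in for libslurmfull.so: a shared library that exports a
# function named "debug" and an entry point that calls it via the PLT.
cat > toylib.c <<'EOF'
#include <stdio.h>
void debug(const char *msg) { printf("lib debug: %s\n", msg); }
void read_conf(void) { debug("reading conf"); }
EOF
gcc -shared -fPIC -o libtoy.so toylib.c

# A stand-in for the application: it defines a DATA object that is
# also named "debug", like the codes in this report.
cat > app.c <<'EOF'
int debug = 0;            /* clashes with the library's debug() */
void read_conf(void);
int main(void) { read_conf(); return 0; }
EOF
gcc -o app app.c -L. -ltoy -Wl,-rpath,'$ORIGIN'

# The dynamic linker resolves the library's call to "debug" by name
# only; the executable's int in .bss preempts the library function,
# so the call jumps into non-executable data and crashes.
./app || echo "crashed with status $?"
```

This is why renaming the variable to debug2 is enough to avoid the crash: ELF symbol lookup is purely by name, and a definition in the main executable preempts one in a shared library regardless of whether it is a function or a variable.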
I have to study your comments more, but the reason we link with the .la files is that OpenMPI (i.e. libtool) uses rpath/runpath linking. If the .la file is removed from the system, one will need to set LD_LIBRARY_PATH manually, because the rpath won't be set in the ELF file. This situation is undesirable since most people do not install the software in standard paths, unlike what you seem to be doing (/usr/lib64/).

You can read through https://github.com/open-mpi/ompi/issues/5145
See my comment there suggesting to remove the .la files, and the further responses: https://github.com/open-mpi/ompi/issues/5145#issuecomment-387350602

The other solution we are working on is to create a new API for PMI and make others link to this unversioned one, but we're still working on it. I will work on this issue more tomorrow. Thanks

Hi, just a note to confirm that we're working on this specific issue in the other internal bug 7980. I will let you know when it is fixed, or if you prefer, I can close this issue and you will be automatically notified when bug 7980 is resolved.

Hi, keep this one open and notify us when internal bug 7980 is fixed. Thanks, Regine

Just a note that we (Compute Canada, where the Cedar cluster recently upgraded to Slurm 19) have also been bitten by this bug, but using Open MPI instead. If Open MPI is compiled with --disable-dlopen, so that a compiled application links directly to libpmi.so, it's even worse and can be reproduced with this simple program:
#include <mpi.h>
int debug;
int main(int argc, char **argv) {
MPI_Init(&argc, &argv);
MPI_Finalize();
return 0;
}
We initially found it to crash with Gromacs 5.1.4, which also defines a "debug" variable.
Of course, as mentioned, with Open MPI, using PMIx or libpmi2 avoids the issue. But the underlying problem remains that libslurmfull.so makes all of its symbols global, unlike libslurm.so, which only exports the symbols listed in src/api/version.map:
{ global:
islurm_*;
slurm_*;
slurmdb_*;
plugin_context_*;
working_cluster_rec;
local: *;
};
so it seems to me the cure for fixing the .la versioning annoyance is worse than the disease...
Increasing priority, as another application is failing due to this bug. A workaround (WAR) replacing debug with slurm_debug in the _init_slurm_conf function is in the process of validation.

(In reply to Regine Gaudin from comment #9)
> Increase priority as another application is failing in this bug.
> WAR replacing debug by slurm_debug in _init_slurm_conf function is in a process of validation

Hi, I am trying to get the patch which fixes this issue committed upstream ASAP. I will let you know, and will see if I can provide it to you as a pre-patch.

(In reply to Felip Moll from comment #10)
> Hi, I am trying to get the patch which fixes this issue committed upstream ASAP.

Hi Regine, I am trying to push for committing the patch which will fix the issue in bug 7980. I just wanted to let you know that I haven't forgotten about it. It's quite a big modification, so I prefer that it be reviewed before giving it to you directly. Sorry for the inconvenience and the long delay.

Hi, I just want to inform you again that bug 7448 is in our internal review process; when the proposed patch is accepted by the QA team, it will fix your issues. I'll inform you when it is done. Thanks for your patience, and sorry for the delay.

Hi,

(In reply to Felip Moll from comment #13)
> I just want to inform you again that bug 7448 is in our internal review process and when the proposed patch is accepted by the QA team, it will fix your issues.

Any news about this? We are facing the same bug at IDRIS (a supercomputing center in France).
Best regards, Rémi Lacroix

(In reply to Rémi Lacroix from comment #14)
> Any news about this? We are facing the same bug at IDRIS (supercomputing center in France).

Hi Rémi, I don't have any news. We're in the middle of QA for the 20.02 release, and bug 7448 hasn't been considered for review yet. I'll inform you when it is. I don't have you in our contact list: are you working with Philippe Collinet, and are you authorized to open/interact with IDRIS bugs? Thanks

(In reply to Felip Moll from comment #15)
> I don't have any news. We're in the middle of QA for the 20.02 release, and bug 7448 hasn't been considered for review yet. I'll inform you when it is.

Ok, any chance of it being backported to the 19.X branch? I am not sure we are planning to upgrade to 20.X in the near future.

> I don't have you in our contact list: are you working with Philippe Collinet, and are you authorized to open/interact with IDRIS bugs?

I am working with Philippe, but I'm not a sysadmin at IDRIS; I'm part of the user support team (so working on MPI, HPC codes, that sort of thing). Philippe usually opens the bugs, but I interact directly with them afterwards (or when they already exist, like this one). Rémi

> Ok, any chance of it being backported to the 19.X branch? I am not sure we are planning to upgrade to 20.X in the near future.

Not officially, but you'll probably be able to backport it.
Hi, the issue has finally been fixed, and you shouldn't experience this anymore in versions starting at 20.11 (current master).

commit d3585a55d5820ee7ae0108d0b730ce9e9be661f8
Author:     Broderick Gardner <broderick@schedmd.com>
AuthorDate: Thu Sep 19 11:14:04 2019 -0600

    Add libslurm_pmi.so as unversioned lib.

    Allows libpmi.so to link to an unversioned slurm lib, which will
    help avoid issues when statically compiling OpenMPI.
    Bug 7448.
Hello, please find:

================= PROBLEM DESCRIPTION =====================

A Fortran + C application using Intel MPI fails with a segv after upgrading Slurm to 18.08.6. The problem does not appear with Open MPI.

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD srun -c 1 -n 1 -N 1-1 -p broadwell --time 1:0 ./toto
HERE
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
toto               0000000000403CD4  Unknown            Unknown     Unknown
libpthread-2.17.s  00002B87764C85D0  Unknown            Unknown     Unknown
libsup7_intel.so   00002B877507401C  debug              Unknown     Unknown
srun: error: eio_message_socket_accept: slurm_receive_msg[@ip]: Zero Bytes were transmitted or received
srun: error: machine17: task 0: Exited with exit code 174

Looking at the stack, the failure happens after going through libpmi.so and then libslurmfull.so, and then using the debug symbol defined in the application instead of calling Slurm's debug function. We have then made a reproducer:

(gdb)
#0  0x00002aaaab0db01c in debug () from ~gaudin/debug_so/libsup7_intel.so
#1  0x00002aaaaf0f2860 in _init_slurm_conf () from /usr/lib64/slurm/libslurmfull.so
#2  0x00002aaaaf0f83d1 in slurm_conf_lock () from /usr/lib64/slurm/libslurmfull.so
#3  0x00002aaaaf10bb0b in slurm_get_tcp_timeout () from /usr/lib64/slurm/libslurmfull.so
#4  0x00002aaaaf148d58 in slurm_open_stream () from /usr/lib64/slurm/libslurmfull.so
#5  0x00002aaaaf10d564 in slurm_open_msg_conn () from /usr/lib64/slurm/libslurmfull.so
#6  0x00002aaaaf10fea7 in slurm_send_recv_rc_msg_only_one () from /usr/lib64/slurm/libslurmfull.so
#7  0x00002aaaaf088d71 in slurm_send_kvs_comm_set () from /usr/lib64/slurm/libslurmfull.so
#8  0x00002aaaaee33c89 in PMI_KVS_Commit () from /usr/lib64/libpmi.so.0
#9  0x00002aaaab90882a in iPMI_Get_r2h_table (table=0x2aaaaf1b2595) at ../../src/pmi/simple/simple_pmi.c:1780
#10 0x00002aaaab90b5dd in iPMI_Init_Ext () at ../../src/pmi/simple/simple_pmi.c:360
#11 0x00002aaaab7d9701 in MPID_Init (argc=0x2aaaaf1b2595, argv=0x2aaaaf1a87dc, requested=1, provided=0x43, has_args=0x0, has_env=0x8) at ../../src/mpid/ch3/src/mpid_init.c:2141
#12 0x00002aaaab77a64b in MPIR_Init_thread (argc=0x2aaaaf1b2595, argv=0x2aaaaf1a87dc, required=1, provided=0x43) at ../../src/mpi/init/initthread.c:717
#13 0x00002aaaab767bdb in PMPI_Init (argc=0x2aaaaf1b2595, argv=0x2aaaaf1a87dc) at ../../src/mpi/init/init.c:253
#14 0x00002aaaab1b5240 in pmpi_init_ (ierr=0x2aaaaf1b2595) at ../../src/binding/fortran/mpif_h/initf.c:275
#15 0x00000000004039e9 in main () at toto.f90:10
#16 0x000000000040395e in main ()
^Csrun: interrupt (one more within 1 sec to abort)

======================= REPRODUCER ======================

************** C file just defining the debug variable ****
cat toto.c
int debug = 0;

************** Fortran file making basic MPI calls ****
cat toto.f90
PROGRAM main
  USE mpi
  IMPLICIT NONE
  INTEGER :: info, comm
  ! Begin MPI session
  PRINT*, "HERE"
  CALL MPI_INIT ( info )
  PRINT*, "SHERE2"
  comm = MPI_COMM_WORLD
  PRINT*, "SHERE"
  STOP
END PROGRAM

********************** compiling with Intel MPI ***********************
cat compil.sh
#!/bin/sh
module purge
module load intel/17.0.6.256
module load mpi/intelmpi/2018.0.3.222
export LIB=_intel
export CC=mpicc
rm -rf toto libsup7${LIB}.so
rm -f *.o
${CC} ${CFLAGS} -fPIC -c toto.c -g -O2
${CC} -shared toto.o -g -O2 -o ./libsup7${LIB}.so
rm -f *.o
mpif90 -g -O2 toto.f90 -o toto -L./ -lsup7${LIB}

*************** launching using the local lib sup7 *******
cat launch.sh
#!/usr/bin/env bash
module purge
module load intel/17.0.6.256
module load mpi/intelmpi/2018.0.3.222

(some ~ paths have been modified for confidentiality)

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD ldd toto
linux-vdso.so.1 => (0x00007fff8e138000)
/usr/$LIB/libshook.so => /usr/lib64/libshook.so (0x00002aab3953d000)
libsup7_intel.so => ~gaudin/debug_so/libsup7_intel.so (0x00002aab39748000)
libmpifort.so.12 => ~intelmpi-2018.0.3.222/system/default/lib64/libmpifort.so.12 (0x00002aab3994a000)
libmpi.so.12
=> ~intelmpi-2018.0.3.222/system/default/lib64/libmpi.so.12 (0x00002aab39cf3000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002aab3a982000)
librt.so.1 => /lib64/librt.so.1 (0x00002aab3ab86000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aab3ad8e000)
libm.so.6 => /lib64/libm.so.6 (0x00002aab3afaa000)
libc.so.6 => /lib64/libc.so.6 (0x00002aab3b2ac000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002aab3b679000)
liblustreapi.so => /lib64/liblustreapi.so (0x00002aab3b88f000)
/lib64/ld-linux-x86-64.so.2 (0x00002aab39319000)
libimf.so => /opt/intel/ifort-17.0.6.256/system/default/lib/intel64/libimf.so (0x00002aab3baaa000)
libsvml.so => /opt/intel/ifort-17.0.6.256/system/default/lib/intel64/libsvml.so (0x00002aab3bf97000)
libirng.so => /opt/intel/ifort-17.0.6.256/system/default/lib/intel64/libirng.so (0x00002aab3ceb5000)
libintlc.so.5 => /opt/intel/ifort-17.0.6.256/system/default/lib/intel64/libintlc.so.5 (0x00002aab3d228000)

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD srun -c 1 -n 4 -N 1-1 -p broadwell --time 1:0 ./toto
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
toto               0000000000403CD4  Unknown            Unknown     Unknown
libpthread-2.17.s  00002B68B0E2A5D0  Unknown            Unknown     Unknown
libsup7_intel.so   00002B68AF9D601C  debug              Unknown     Unknown
(the same SIGSEGV is reported for the other tasks)

======================= WAR =========================================

Linking libpmi.so with -lslurm instead of -lslurmfull (as it was in Slurm 17) solves the problem:

ldd ~gaudin/libpmi.so
linux-vdso.so.1 => (0x00007ffe3bf95000)
/usr/$LIB/libshook.so => /usr/lib64/libshook.so (0x00002ae0e4f34000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002ae0e513f000)
libslurm.so.33 => /lib64/libslurm.so.33 (0x00002ae0e5343000)   <=========
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ae0e56f3000)
libc.so.6 => /lib64/libc.so.6 (0x00002ae0e590f000)
liblustreapi.so => /lib64/liblustreapi.so (0x00002ae0e5cdc000)
/lib64/ld-linux-x86-64.so.2 (0x00002ae0e4b0a000)

export I_MPI_PMI_LIBRARY=~gaudin/libpmi.so
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD srun -c 1 -n 1 -N 1-1 -p broadwell -A root --time 1:0 ./toto
HERE
SHERE2
SHERE

while with:

ldd /usr/lib64/libpmi.so (with Slurm 18)
linux-vdso.so.1 => (0x00007ffe65def000)
/usr/$LIB/libshook.so => /usr/lib64/libshook.so (0x00002b3115979000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b3115b84000)
libslurmfull.so => /usr/lib64/slurm/libslurmfull.so (0x00002b3115d88000)   <=========
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b3116153000)
libc.so.6 => /lib64/libc.so.6 (0x00002b311636f000)
liblustreapi.so => /lib64/liblustreapi.so (0x00002b311673c000)
/lib64/ld-linux-x86-64.so.2 (0x00002b311554f000)

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD srun -c 1 -n 1 -N 1-1 -p broadwell -A root --time 1:0 ./toto
HERE
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
toto               0000000000403CD4  Unknown            Unknown     Unknown
libpthread-2.17.s  00002B87764C85D0  Unknown            Unknown     Unknown
libsup7_intel.so   00002B877507401C  debug              Unknown     Unknown
srun: error: eio_message_socket_accept: slurm_receive_msg[@ip]: Zero Bytes were transmitted or received
srun: error: machine1217: task 0: Exited with exit code 174

What is libslurmfull.so for? This is not the first time we have had an unresolved symbol or a bad symbol resolution when linking with it. Thanks, Regine
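As a side note on diagnosing this class of problem: the export difference behind the clash can be checked with nm -D. Here is a self-contained sketch with a toy library (illustrative names; on a real system one would compare nm -D /usr/lib64/slurm/libslurmfull.so against /usr/lib64/libslurm.so):

```shell
# Build a small library with no version script and list what it
# exports dynamically: without a version script, every non-static
# symbol (functions AND data) ends up in the dynamic symbol table,
# which is the libslurmfull.so situation.
cat > demo.c <<'EOF'
int internal_state = 0;
static void helper(void) { internal_state++; }
void public_entry(void) { helper(); }
EOF
gcc -shared -fPIC -o libdemo.so demo.c

# Both public_entry (T) and internal_state (B/D) are listed as
# global; only the static helper stays private.
nm -D --defined-only libdemo.so
```

Any global symbol shown by nm -D is a candidate for interposition by an identically named symbol in the application, which is exactly how the application's debug variable preempted the library's debug function.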