Ticket 10099

Summary: sbatch/srun/salloc all fail with "Plugin loading failed due to missing symbols. Plugin is corrupted."
Product: Slurm
Reporter: Chris Samuel (NERSC) <csamuel>
Component: User Commands
Assignee: Danny Auble <da>
Status: RESOLVED FIXED
QA Contact:
Severity: 3 - Medium Impact
Priority: ---
CC: dmjacobsen
Version: 20.11.x
Hardware: Linux
OS: Linux
Site: NERSC
Machine Name:
CLE Version:
Version Fixed: 20.11.0-pre1
Target Release: ---

Description Chris Samuel (NERSC) 2020-10-29 15:00:34 MDT
Hi there,

I'm carrying on with testing 20.11 from git and have run into a puzzle:

csamuel@gert01:/global/gscratch1/sd/csamuel/slurm/es-20.02> ./bin/sbatch --help
sbatch: fatal: plugin_load_and_link: Plugin loading failed due to missing symbols. Plugin is corrupted.

I thought I might have missed something when modifying our config to avoid referencing anything outside of the install, and that it was perhaps picking up the wrong object file, but strace doesn't seem to show that:

csamuel@gert01:/global/gscratch1/sd/csamuel/slurm/es-20.02> strace -f -e openat ./bin/sbatch --help    
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/tls/haswell/x86_64/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/tls/haswell/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/tls/x86_64/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/tls/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/haswell/x86_64/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/haswell/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/x86_64/libslurmfull.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libslurmfull.so", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libdl.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libresolv.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib64/libresolv.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/etc/slurm.conf", O_RDONLY) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libnss_compat.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libnss_nis.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libnsl.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libtirpc.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libgssapi_krb5.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libkrb5.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libk5crypto.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libcom_err.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libkrb5support.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libkeyutils.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY) = 3
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/etc/plugstack.conf", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/global/gscratch1/sd/csamuel/slurm/es-20.02/lib/slurm/cli_filter_user_defaults.so", O_RDONLY|O_CLOEXEC) = 3
sbatch: fatal: plugin_load_and_link: Plugin loading failed due to missing symbols. Plugin is corrupted.
+++ exited with 1 +++
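
The trace above shows the plugin file itself being opened successfully (the final openat returns 3), so the fatal message suggests the loader could not find one or more symbols it expects the plugin to export. As a rough illustration only, the sketch below does the same kind of check from Python with ctypes: it opens a shared object and reports which of a list of required names cannot be resolved. This is not Slurm's loader (which is written in C and does more than this), and the function name `missing_symbols` and the symbol names in the demo are hypothetical. The demo runs against the C library rather than a Slurm plugin, since a plugin may depend on symbols from libslurmfull.so and might not load on its own.

```python
import ctypes
import ctypes.util


def missing_symbols(library_path, required):
    """Return the names in `required` that the shared object does not export.

    `library_path` may be None, in which case the symbols already loaded
    into the current process are searched instead.
    """
    lib = ctypes.CDLL(library_path)  # dlopen() under the hood
    missing = []
    for name in required:
        try:
            getattr(lib, name)  # symbol lookup, like dlsym()
        except AttributeError:
            # ctypes raises AttributeError when the symbol is not found
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Demo against the C library; "definitely_not_a_symbol_12345" is a
    # made-up name used to show what a missing symbol looks like.
    libc_path = ctypes.util.find_library("c")
    print(missing_symbols(libc_path, ["printf", "definitely_not_a_symbol_12345"]))
```

A non-empty result corresponds to the "missing symbols" case: the file exists and can be opened, but it does not provide everything the caller asked for.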

This doesn't seem to impact squeue, scrontab, sinfo, sdiag or scontrol.

This is a pristine git tree of master, configured with:

configure --prefix=/global/gscratch1/sd/csamuel/slurm/es-20.02


Any ideas?

All the best,
Chris

Comment 1 Danny Auble 2020-10-29 15:23:02 MDT
Sorry Chris, thanks for reporting.  Luckily it was only this one cli_filter plugin that we missed.  Good news is you found this before 20.11 was out the door ;).

This has already been fixed in commit 8b9d4311cf8b.

Please reopen if you need anything else.
Comment 2 Chris Samuel (NERSC) 2020-10-29 15:27:12 MDT
Thanks Danny, I'll pull the fixes now and rebuild.
Comment 3 Chris Samuel (NERSC) 2020-10-29 15:36:15 MDT
Confirming that's working now.

csamuel@gert01:/global/gscratch1/sd/csamuel/slurm/es-20.02> ./bin/srun -q xfer -A nstaff hostname
gert02

Thanks Danny!