Ticket 4269 - Setting LD_BIND_NOW=1 results in undefined symbol in plugins
Status: RESOLVED INVALID
Alias: None
Product: Slurm
Classification: Unclassified
Component: Build System and Packaging
Version: - Unsupported Older Versions
Hardware: Linux Linux
: 6 - No support contract
Assignee: Jacob Jenson
Reported: 2017-10-17 04:38 MDT by James Sharpe
Modified: 2019-03-01 16:11 MST
Site: -Other-

Description James Sharpe 2017-10-17 04:38:59 MDT
This follows on from an old mailing list post (https://groups.google.com/d/topic/slurm-devel/eCrFsV60zQo/discussion),
where a workaround for this issue was used, but the workaround doesn't resolve the underlying problem.

I've observed this on a Slurm install at version 15.08 (a third-party site that I have no control over). I don't currently have a newer Slurm install to hand to check whether this is still an issue on current versions, but I will check when I have time.



Basically, the issue is that the Slurm plugins have unresolved symbols, so forcing them to be resolved at load time by setting LD_BIND_NOW=1 in the environment causes Slurm jobs to fail. The jobs then hang until they time out at their time limit, although the hang may also be due to a deadlock in the PMI implementation of Intel MPI.
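
For anyone wanting to poke at a single plugin outside of a job, here is a minimal sketch of the same load-time resolution. It opens a shared object with RTLD_NOW, which asks the dynamic loader to resolve every symbol before the load returns, and reports the loader's error if one is missing. The function name check_immediate_binding and the plugin path are hypothetical and not taken from the report. One caveat: a plugin may rely on symbols supplied by the program that normally loads it, so a failure seen from Python does not by itself prove the same failure inside Slurm.

```python
import ctypes
import os


def check_immediate_binding(path):
    """Load a shared object with RTLD_NOW and report any loader error.

    Returns None if the object loaded with all symbols resolved,
    otherwise the dynamic loader's error message as a string.
    """
    try:
        # RTLD_NOW makes the loader resolve all undefined symbols before
        # the load returns; setting LD_BIND_NOW=1 forces the same behaviour
        # even when the caller asked for lazy binding.
        ctypes.CDLL(path, mode=os.RTLD_NOW)
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Hypothetical plugin path (an assumption); adjust to the local install.
    plugin = "/usr/lib64/slurm/mpi_pmi2.so"
    error = check_immediate_binding(plugin)
    print(error if error else "all symbols resolved")
```

Running this against each plugin in the install's plugin directory should show which ones the loader rejects and which symbol it names in the error.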