Ticket 2131

Summary: fix lua dlopen calls to avoid mismatch when library + -dev packages aren't in sync
Product: Slurm
Reporter: Tim Wickberg <tim>
Component: slurmctld
Assignee: Unassigned Developer <dev-unassigned>
Status: RESOLVED FIXED
QA Contact:
Severity: 5 - Enhancement    
Priority: ---    
Version: 16.05.x   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=3681
https://bugs.schedmd.com/show_bug.cgi?id=8453
Site: SchedMD
Version Fixed: 17.02.4 17.11.0-pre1
Target Release: 16.05
Attachments: 0001-refactor-common-dlopen-calls-in-lua-plugins.patch
0002-make-lua-dlopen-conditional-on-version-found-at-buil.patch

Description Tim Wickberg 2015-11-10 10:02:31 MST
smd-server has:

ii  liblua5.1-0:amd64         
ii  liblua5.1-0-dev:amd64     
ii  liblua5.2-0:amd64         
ii  liblua5.2-rrd0            
ii  lua5.1                    
ii  lua5.2                  

Note that the newest Lua (5.2) doesn't have the dev headers installed, so the build can only find and link against 5.1. The current dlopen() call blindly tries to load the newest Lua version at runtime, which in our case does not match the version discovered and linked against during the build. Some sort of symbol mismatch then leads to errors like:

slurmctld: error: lua: /home/tim/15.08/etc/job_submit.lua: attempt to load a text chunk (mode is '')
slurmctld: error: Couldn't load specified plugin name for job_submit/lua: Plugin init() callback failed
slurmctld: error: cannot create job_submit context for job_submit/lua
slurmctld: fatal: failed to initialize job_submit plugin


The attached patches first refactor a common section in the two affected plugins; the second patch then adds some autotools magic to identify the correct version and match up the dlopen calls in the now-shared xlua_dlopen() function.

This should also make it easier to support lua5.3 in the future.
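
For illustration only, here is a minimal sketch of the idea behind a version-matched dlopen. It is not the contents of the attached patches. BUILD_LUA_VERSION_NUM, lua_candidates() and lua_dlopen_matching() are hypothetical names, and the library file names are assumptions about typical RHEL and Debian layouts. The point is that only names belonging to the version found at build time are ever tried:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical macro: a real build would have configure define this from the
 * Lua headers it found (lua.h exposes LUA_VERSION_NUM as 501, 502, 503). */
#ifndef BUILD_LUA_VERSION_NUM
#define BUILD_LUA_VERSION_NUM 501
#endif

/* Return a NULL-terminated list of candidate library names for one Lua
 * version. The names are assumptions, not taken from the patches. */
const char *const *lua_candidates(int version_num)
{
	static const char *const lua51[] = {
		"liblua-5.1.so", "liblua5.1.so", "liblua5.1.so.0", NULL
	};
	static const char *const lua52[] = {
		"liblua-5.2.so", "liblua5.2.so", "liblua5.2.so.0", NULL
	};
	static const char *const lua53[] = {
		"liblua-5.3.so", "liblua5.3.so", "liblua5.3.so.0", NULL
	};
	static const char *const none[] = { NULL };

	switch (version_num) {
	case 501: return lua51;
	case 502: return lua52;
	case 503: return lua53;
	default:  return none;
	}
}

/* Try each candidate for the given version in order. Returns the dlopen
 * handle of the first one that loads, or NULL if none could be loaded. */
void *lua_dlopen_matching(int version_num)
{
	const char *const *name;

	for (name = lua_candidates(version_num); *name; name++) {
		void *handle = dlopen(*name, RTLD_NOW | RTLD_GLOBAL);
		if (handle)
			return handle;
	}
	return NULL;
}
```

A plugin's init path would call lua_dlopen_matching(BUILD_LUA_VERSION_NUM) and fail cleanly on NULL instead of silently falling through to a newer library. Supporting another Lua release then only means adding one more candidate list.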
Comment 1 Tim Wickberg 2015-11-10 10:03:30 MST
Created attachment 2403
0001-refactor-common-dlopen-calls-in-lua-plugins.patch
Comment 2 Tim Wickberg 2015-11-10 10:03:55 MST
Created attachment 2404
0002-make-lua-dlopen-conditional-on-version-found-at-buil.patch
Comment 3 Tim Wickberg 2015-12-02 09:42:00 MST
Promote to Sev4 and assign over to Danny for review. It keeps causing problems for me on smd-server.
Comment 4 Danny Auble 2015-12-02 10:12:22 MST
Thanks Tim, these have been committed.  I ended up putting them in 15.08 FYI.
Comment 5 Tim Wickberg 2015-12-15 03:44:08 MST
Need to revisit in 16.05 with additional changes to the autoconf macros, and to properly handle differences in packaging between RHEL and Debian distributions.

Bug #2243 documents how this fix did not work as intended originally.
Comment 6 Danny Auble 2017-05-19 15:54:34 MDT
This is fixed in commit e75f6118540a