Ticket 2131 - fix lua dlopen calls to avoid mismatch when library + -dev packages aren't in sync
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 16.05.x
Hardware: Linux Linux
Importance: --- 5 - Enhancement
Assignee: Unassigned Developer
Reported: 2015-11-10 10:02 MST by Tim Wickberg
Modified: 2020-02-05 11:47 MST (History)
Site: SchedMD
Version Fixed: 17.02.4 17.11.0-pre1
Target Release: 16.05


Attachments
0001-refactor-common-dlopen-calls-in-lua-plugins.patch (11.09 KB, patch)
2015-11-10 10:03 MST, Tim Wickberg
0002-make-lua-dlopen-conditional-on-version-found-at-buil.patch (2.72 KB, patch)
2015-11-10 10:03 MST, Tim Wickberg

Description Tim Wickberg 2015-11-10 10:02:31 MST
smd-server has:

ii  liblua5.1-0:amd64         
ii  liblua5.1-0-dev:amd64     
ii  liblua5.2-0:amd64         
ii  liblua5.2-rrd0            
ii  lua5.1                    
ii  lua5.2                  

Note that the newest Lua (5.2) does not have its dev headers installed. The current dlopen() call blindly tries to load the newest Lua version it can find, which in our case does not match the version discovered and linked against during the build. This appears to cause some sort of symbol mismatch, which then leads to errors like:

slurmctld: error: lua: /home/tim/15.08/etc/job_submit.lua: attempt to load a text chunk (mode is '')
slurmctld: error: Couldn't load specified plugin name for job_submit/lua: Plugin init() callback failed
slurmctld: error: cannot create job_submit context for job_submit/lua
slurmctld: fatal: failed to initialize job_submit plugin


The attached patches address this. The first refactors a common section of the two affected plugins into a shared xlua_dlopen() function. The second adds some autotools magic to identify the Lua version found at build time and makes the dlopen() calls in xlua_dlopen() match it.

This should also make it easier to support lua5.3 in the future.
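
For readers without the attachments, here is a minimal sketch of the general idea, not the contents of the patches. It assumes the build system defines one version macro; the SKETCH_LUA_5_x macros, the sketch_* function names, and the candidate library names are all hypothetical and chosen for illustration only. The sketch defaults to 5.1 when no macro is set.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical macros: assume configure defines exactly one of
 * SKETCH_LUA_5_1 / SKETCH_LUA_5_2 / SKETCH_LUA_5_3 to record the Lua
 * version whose headers were found at build time. Default to 5.1 here
 * so the sketch compiles on its own.
 */
#if !defined(SKETCH_LUA_5_1) && !defined(SKETCH_LUA_5_2) && \
    !defined(SKETCH_LUA_5_3)
#define SKETCH_LUA_5_1 1
#endif

/* Candidate names for the build-time version only (illustrative). */
static const char *const sketch_lua_sonames[] = {
#if defined(SKETCH_LUA_5_3)
	"liblua5.3.so.0", "liblua5.3.so", "liblua-5.3.so",
#elif defined(SKETCH_LUA_5_2)
	"liblua5.2.so.0", "liblua5.2.so", "liblua-5.2.so",
#else
	"liblua5.1.so.0", "liblua5.1.so", "liblua-5.1.so",
#endif
	NULL
};

/*
 * Try each candidate in order. RTLD_GLOBAL makes the Lua symbols
 * visible to libraries loaded later. Returns the handle (or NULL) and
 * stores the name that was loaded (or NULL) in *loaded if non-NULL.
 */
void *sketch_lua_dlopen(const char **loaded)
{
	for (size_t i = 0; sketch_lua_sonames[i] != NULL; i++) {
		void *handle = dlopen(sketch_lua_sonames[i],
				      RTLD_NOW | RTLD_GLOBAL);
		if (handle != NULL) {
			if (loaded != NULL)
				*loaded = sketch_lua_sonames[i];
			return handle;
		}
	}
	if (loaded != NULL)
		*loaded = NULL;
	return NULL;
}

/* Return 1 if every candidate name contains the given version string. */
int sketch_candidates_match(const char *version)
{
	assert(version != NULL);
	for (size_t i = 0; sketch_lua_sonames[i] != NULL; i++) {
		if (strstr(sketch_lua_sonames[i], version) == NULL)
			return 0;
	}
	return 1;
}
```

Because the candidate list is fixed at compile time, a newer runtime library that lacks matching headers (5.2 on smd-server) is never tried. Supporting another version would then mean adding one more branch to the list. On older glibc the sketch needs to be linked with -ldl.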
Comment 1 Tim Wickberg 2015-11-10 10:03:30 MST
Created attachment 2403
0001-refactor-common-dlopen-calls-in-lua-plugins.patch
Comment 2 Tim Wickberg 2015-11-10 10:03:55 MST
Created attachment 2404
0002-make-lua-dlopen-conditional-on-version-found-at-buil.patch
Comment 3 Tim Wickberg 2015-12-02 09:42:00 MST
Promote to Sev4 and assign over to Danny for review. It keeps causing problems for me on smd-server.
Comment 4 Danny Auble 2015-12-02 10:12:22 MST
Thanks Tim, these have been committed.  I ended up putting them in 15.08 FYI.
Comment 5 Tim Wickberg 2015-12-15 03:44:08 MST
Need to revisit in 16.05 with additional changes to the autoconf macros and proper handling of the packaging differences between RHEL and Debian distributions.

Bug #2243 documents how this fix did not originally work as intended.
Comment 6 Danny Auble 2017-05-19 15:54:34 MDT
This is fixed in commit e75f6118540a.