Bug 2440 - undefined symbol powercap_get_cluster_current_cap
Summary: undefined symbol powercap_get_cluster_current_cap
Status: RESOLVED DUPLICATE of bug 2443
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 15.08.7
Hardware: Linux Linux
Importance: --- 4 - Minor Issue
Assignee: Director of Support
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2016-02-10 04:54 MST by Moe Jette
Modified: 2016-02-19 08:45 MST
CC List: 0 users

See Also:
Site: -Other-
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Description Moe Jette 2016-02-10 04:54:16 MST
Nenad Vukicevic <nenad@intrepid.com>:

I was able to build Slurm RPMs on Fedora 23 by following the suggestion in
Adam Huffman's post:
https://groups.google.com/forum/#!topic/slurm-devel/HiltSkNiGJU

However, slurmd fails to run with the following error:

----
Feb 09 18:55:34 dev slurmd[6700]: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_cons_res.so): /usr/lib64/slurm/select_cons_res.so: undefined symbol: powercap_get_cluster_current_cap
Feb 09 18:55:34 dev slurmd[6700]: error: Couldn't load specified plugin name for select/cons_res: Dlopen of plugin file failed
Feb 09 18:55:34 dev slurmd[6700]: fatal: Can't find plugin for select/cons_res
----

Our slurm uses scheduling type cons_res:

SelectType=select/cons_res

The powercap_get_cluster_current_cap() function is defined in
slurmctld/powercapping.c, which is not part of the slurmd link, so
slurmd has no definition of that symbol.

Any idea what is going on?  I am using the latest 15.08.7.
Comment 1 Moe Jette 2016-02-10 04:56:43 MST
There are several functions referenced in the select/cons_res plugin that only exist in the slurmctld, so there will be linking issues in some environments. For example, these functions exist only in slurmctld:

		if ((powercap_get_cluster_current_cap() != 0) &&
		    (which_power_layout() == 2)) {

Note this logic was added for the Bull-power management and is completely separate from the power plugin work for Cray systems.
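
For illustration only, and not necessarily how Slurm handles this: one general way for a plugin to tolerate a function that only some host programs provide is a weak reference (a GCC/Clang extension). The name host_only_current_cap below is hypothetical and stands in for a slurmctld-only function; when no loaded object defines it, the reference resolves to NULL instead of producing an undefined-symbol error, and the caller can fall back to a default.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical function that only some host programs define.
 * Declared weak: if nothing provides it, its address is NULL
 * rather than an unresolved-symbol error.
 */
extern uint32_t host_only_current_cap(void) __attribute__((weak));

/* Return the host's cap if the host provides one, otherwise 0. */
uint32_t current_cap_or_zero(void)
{
	if (host_only_current_cap == NULL)
		return 0;	/* host does not define the function */
	return host_only_current_cap();
}
```

In a program that does not define host_only_current_cap, current_cap_or_zero() takes the fallback branch. The trade-off is that every call site has to be guarded explicitly.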
Comment 2 Danny Auble 2016-02-10 05:13:43 MST
I find it strange it is trying to resolve these on load, it should be doing a lazy load from my understanding right?
Comment 3 Moe Jette 2016-02-10 05:25:21 MST
(In reply to Danny Auble from comment #2)
> I find it strange it is trying to resolve these on load, it should be doing
> a lazy load from my understanding right?

Yes (see src/common/plugin.c):
plug = dlopen(fq_path, RTLD_LAZY);

Perhaps Fedora 23 doesn't handle "lazy" very well.
Comment 4 Tim Wickberg 2016-02-10 05:35:25 MST
https://fedoraproject.org/wiki/Changes/Harden_All_Packages

"-z now is always passed to the linker."

So... they're resolving everything when the plugin is loaded, regardless of the RTLD_LAZY flag passed to dlopen().

Clearing "-z now" out of LDFLAGS looks like the simplest path forward, short of reworking Slurm's module loading system.
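
A minimal sketch of the binding modes discussed above (this is not Slurm's plugin loader; the helper name try_load is hypothetical). With RTLD_LAZY, function symbols are normally resolved on first call. With RTLD_NOW, all undefined symbols are resolved inside dlopen(), which fails if any is missing. A shared object linked with "-z now" is bound immediately even when the caller passes RTLD_LAZY, which is consistent with the error in the description.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Try to load a shared object with the given binding mode
 * (RTLD_LAZY or RTLD_NOW). Returns 0 on success, -1 on failure,
 * printing the dynamic linker's error message on failure.
 */
int try_load(const char *path, int flags)
{
	void *handle = dlopen(path, flags);

	if (handle == NULL) {
		fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
		return -1;
	}
	dlclose(handle);
	return 0;
}
```

Assuming a glibc system, try_load("libm.so.6", RTLD_LAZY) should succeed, and a path that does not exist should return -1. On older glibc versions the program needs to be linked with -ldl. Calling try_load() on a plugin with both RTLD_LAZY and RTLD_NOW from a program that lacks the referenced symbols is one way to check whether lazy binding is actually in effect for that build.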
Comment 5 Danny Auble 2016-02-10 05:38:44 MST
Yeah, it could be fairly tough otherwise to get everything working correctly.
Comment 6 Tim Wickberg 2016-02-19 08:45:03 MST
Marking as duplicate of 2443 as that one has a patch that should address the -z,now problem.

*** This bug has been marked as a duplicate of bug 2443 ***