Hi,

Building Slurm 19.05.5 within a CentOS 8 container was possible after applying the following patch to the slurm.spec file. However, we see problems with the Slurm shared libraries that do not occur when we build the same source within a CentOS 7 container.

diff org/slurm-19.05.5/slurm.spec centos8/slurm-19.05.5/slurm.spec
65c65
< BuildRequires: python
---
> BuildRequires: python3

Some commands, like scontrol, seem to work:

root@slurm_master / $ scontrol PING
Slurmctld(primary) at slurm_master is UP

but others, like sinfo, do not:

root@slurm_master / $ sinfo
sinfo: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_cons_res.so): /usr/lib64/slurm/select_cons_res.so: undefined symbol: powercap_get_cluster_current_cap
sinfo: error: Couldn't load specified plugin name for select/cons_res: Dlopen of plugin file failed
sinfo: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_cons_tres.so): /usr/lib64/slurm/select_cons_tres.so: undefined symbol: powercap_get_cluster_current_cap
sinfo: error: Couldn't load specified plugin name for select/cons_tres: Dlopen of plugin file failed
sinfo: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_cray_aries.so): /usr/lib64/slurm/select_cray_aries.so: undefined symbol: unlock_slurmctld
sinfo: error: Couldn't load specified plugin name for select/cray_aries: Dlopen of plugin file failed
sinfo: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_linear.so): /usr/lib64/slurm/select_linear.so: undefined symbol: slurm_job_preempt_mode
sinfo: error: Couldn't load specified plugin name for select/linear: Dlopen of plugin file failed
sinfo: fatal: Can't find plugin for select/cons_res

Maybe this problem is related to https://bugs.schedmd.com/show_bug.cgi?id=2443

If required, we can provide the resulting RPMs (~19 MB).

Best Regards,
Stephan Walter
Hi,

I'm updating this bug because I see the same problem after building Slurm 19.05.5 within CentOS 8.

sacct
sacct: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/accounting_storage_slurmdbd.so): /usr/lib64/slurm/accounting_storage_slurmdbd.so: undefined symbol: unlock_slurmctld
sacct: error: Couldn't load specified plugin name for accounting_storage/slurmdbd: Dlopen of plugin file failed
sacct: error: cannot create accounting_storage context for accounting_storage/slurmdbd
Slurm unable to initialize storage plugin

slurmctld -D -vv
slurmctld: debug: Log file re-opened
slurmctld: pidfile not locked, assuming no running daemon
slurmctld: error: Configured MailProg is invalid
slurmctld: slurmctld version 19.05.3-2 started on cluster vm
slurmctld: Munge credential signature plugin loaded
slurmctld: debug: Munge authentication plugin loaded
slurmctld: Cray/Aries node selection plugin loaded
slurmctld: Linear node selection plugin loaded with argument 4356
slurmctld: Consumable Resources (CR) Node Selection plugin loaded with argument 4356
slurmctld: select/cons_tres loaded with argument 4356
slurmctld: preempt/none loaded
slurmctld: debug: Checkpoint plugin loaded: checkpoint/none
slurmctld: debug: AcctGatherEnergy NONE plugin loaded
slurmctld: debug: AcctGatherProfile NONE plugin loaded
slurmctld: debug: AcctGatherInterconnect NONE plugin loaded
slurmctld: debug: AcctGatherFilesystem NONE plugin loaded
slurmctld: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/jobacct_gather_linux.so): /usr/lib64/slurm/jobacct_gather_linux.so: undefined symbol: proctrack_g_get_pids
slurmctld: error: Couldn't load specified plugin name for jobacct_gather/linux: Dlopen of plugin file failed
slurmctld: error: cannot create jobacct_gather context for jobacct_gather/linux
slurmctld: fatal: failed to initialize jobacct_gather plugin

This has a high impact.

Thanks
I have also tested version 20.02.0-0rc1, but I still see undefined symbol errors. It is even worse: slurmctld exits immediately.

slurmctld: slurmctld version 20.02.0-0rc1 started on cluster linux
slurmctld: Munge credential signature plugin loaded
slurmctld: debug: Munge authentication plugin loaded
slurmctld: Cray/Aries node selection plugin loaded
slurmctld: preempt/none loaded
slurmctld: debug: AcctGatherEnergy NONE plugin loaded
slurmctld: debug: AcctGatherProfile NONE plugin loaded
slurmctld: debug: AcctGatherInterconnect NONE plugin loaded
slurmctld: debug: AcctGatherFilesystem NONE plugin loaded
slurmctld: debug2: No acct_gather.conf file (/etc/slurm/acct_gather.conf)
slurmctld: debug: Job accounting gather NOT_INVOKED plugin loaded
slurmctld: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/prep_script.so): /usr/lib64/slurm/prep_script.so: undefined symbol: run_script
slurmctld: error: Couldn't load specified plugin name for prep/script: Dlopen of plugin file failed
slurmctld: error: prep_plugin_init: cannot create prep context for prep/script
slurmctld: fatal: failed to initialize prep plugin
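For anyone comparing builds, the failing plugins and their missing symbols can be pulled out of the log output mechanically. The following is only a rough sketch: the function name `missing_symbols` and the regular expression are my own, written against the dlopen error lines quoted in this report, and may need adjusting for other message formats.

```python
import re

# Pattern written against the "plugin_load_from_file: dlopen(...)" error
# lines quoted in this report; other Slurm messages may look different.
PATTERN = re.compile(
    r"dlopen\((?P<plugin>[^)]+)\): \S+: undefined symbol: (?P<symbol>\w+)"
)


def missing_symbols(log_text):
    """Return {plugin_path: symbol} for each undefined-symbol dlopen error."""
    return {m.group("plugin"): m.group("symbol") for m in PATTERN.finditer(log_text)}


# Two error lines copied from the comments above.
SAMPLE = (
    "sinfo: error: plugin_load_from_file: "
    "dlopen(/usr/lib64/slurm/select_cons_res.so): "
    "/usr/lib64/slurm/select_cons_res.so: "
    "undefined symbol: powercap_get_cluster_current_cap\n"
    "slurmctld: error: plugin_load_from_file: "
    "dlopen(/usr/lib64/slurm/prep_script.so): "
    "/usr/lib64/slurm/prep_script.so: "
    "undefined symbol: run_script\n"
)

if __name__ == "__main__":
    for plugin, symbol in missing_symbols(SAMPLE).items():
        print(f"{plugin}: {symbol}")
```

Running the same function over the output of a CentOS 7 build and a CentOS 8 build should make it easy to see which plugins are affected only on CentOS 8.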
Hi,

I was able to solve the problem with the explanation from https://bugs.schedmd.com/show_bug.cgi?id=2443

The cause is again the build hardening. The following patch fixed the problem for me.

309a310,313
> %undefine _hardened_build
> %global _hardened_cflags "-Wl,-z,lazy"
> %global _hardened_ldflags "-Wl,-z,lazy"
>

It would be great if this problem could be resolved without this modification.

Best Regards,
Stephan
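To check whether a given plugin was linked with immediate binding (`-z now`, which hardened builds typically enable) or lazy binding, the dynamic section can be inspected. Below is a minimal sketch, assuming `readelf` from binutils is installed. The helper names `binds_now` and `read_dynamic_section` are hypothetical, and the check relies on how `readelf -d` usually prints the BIND_NOW and FLAGS_1 entries, so treat the result as a hint rather than proof.

```python
import subprocess
import sys


def binds_now(dynamic_section):
    """Return True if `readelf -d` output indicates immediate binding."""
    for line in dynamic_section.splitlines():
        # Covers both a DT_BIND_NOW entry and "(FLAGS) BIND_NOW".
        if "BIND_NOW" in line:
            return True
        # Covers "(FLAGS_1) Flags: NOW ..." as printed by readelf.
        if "(FLAGS_1)" in line and "NOW" in line.split():
            return True
    return False


def read_dynamic_section(path):
    """Run `readelf -d` on path (assumes binutils is installed)."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 check_bind.py /usr/lib64/slurm/select_cons_res.so
    for path in sys.argv[1:]:
        mode = "now" if binds_now(read_dynamic_section(path)) else "lazy"
        print(f"{path}: {mode}")
```

If the plugins from the unpatched CentOS 8 build report "now" and those built with the spec change above report "lazy", that would be consistent with the hardening flags being the cause.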
I forgot to mention which file was modified: slurm.spec