Created attachment 7679 [details]
Log of rpmbuild command

After upgrading from 17.11.9 to 17.11.9-2 on our test cluster, whenever we run the sshare command, we get the error

sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor

Like so:

# sshare
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                          1.000000      927728      1.000000
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000
root                       root          1    0.333333      273705      0.295027
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000
normal                                   1    0.333333      654023      0.704973
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000
nn9999k                                  1    0.333333      654023      0.704973
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000
optimist                                 1    0.333333           0      0.000000
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000
nn9999o                                  1    0.333333           0      0.000000
sshare: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/priority_multifactor.so): /usr/lib64/slurm/priority_multifactor.so: undefined symbol: sort_part_tier
sshare: error: Couldn't load specified plugin name for priority/multifactor: Dlopen of plugin file failed
sshare: error: cannot create priority context for priority/multifactor
0.000000

We have the following RPMs installed (on the controller node):

# rpm -qa|grep slurm|sort
slurm-17.11.9-2.el7.x86_64
slurm-devel-17.11.9-2.el7.x86_64
slurm-libpmi-17.11.9-2.el7.x86_64
slurm-perlapi-17.11.9-2.el7.x86_64
slurm-slurmctld-17.11.9-2.el7.x86_64
slurm-slurmdbd-17.11.9-2.el7.x86_64

Are we missing some RPMs?

We build the RPMs with

MKDIR_P='mkdir -p' rpmbuild -tb --clean --rmsource --without zlib --without debug --with lua slurm-17.11.9-2.tar.bz2

I've attached a log of the output of this command. I've also attached the slurm config file for the cluster.

For what it's worth, I saw in the NEWS file of 17.11.9-2 that there was a fix regarding sorting of multi-partition jobs. In our config, we have two partitions with different PriorityTier values (normal and optimist). Could that be related?
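Editorial aside, not part of the original report: when the same failure has to be checked on several nodes, it can help to pull the plugin path and the missing symbol out of the error text automatically. The following is only a minimal sketch. The function name `parse_undefined_symbol` and the regular expression are illustrative assumptions; the only thing taken from the report is the shape of the dlopen error line quoted above.

```python
import re

# Assumption: the error line has the shape seen in the report above,
#   "... dlopen(<path>): <path>: undefined symbol: <name>"
# Other dlopen failures (missing file, wrong ELF class, ...) will not match.
_DLOPEN_UNDEFINED = re.compile(
    r"dlopen\((?P<path>[^)]+)\):\s+\S+:\s+undefined symbol:\s+(?P<symbol>\w+)"
)


def parse_undefined_symbol(line):
    """Return (plugin_path, symbol) for an 'undefined symbol' dlopen error.

    Returns None when the line does not report an undefined symbol.
    """
    match = _DLOPEN_UNDEFINED.search(line)
    if match is None:
        return None
    return match.group("path"), match.group("symbol")


if __name__ == "__main__":
    # Error line copied from the sshare output in the report.
    sample = (
        "sshare: error: plugin_load_from_file: "
        "dlopen(/usr/lib64/slurm/priority_multifactor.so): "
        "/usr/lib64/slurm/priority_multifactor.so: "
        "undefined symbol: sort_part_tier"
    )
    print(parse_undefined_symbol(sample))
```

The extracted symbol name can then be looked up in the plugin and in the calling binary with a standard tool such as `nm -D`, which lists dynamic symbols and marks undefined ones with `U`.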
Created attachment 7680 [details] Main slurm config file
Created attachment 7681 [details] Node definition file
This was fixed in 17.11.9-2:
https://github.com/SchedMD/slurm/commit/21d2ab6ed16

*** This ticket has been marked as a duplicate of ticket 5579 ***
Sorry, went too fast. You're already on 17.11.9-2. Can you try applying this patch on top of that? It will be in 17.11.10:

https://github.com/SchedMD/slurm/commit/67a82c369a7530ce7838e6294973af0082d8905b
(In reply to Alejandro Sanchez from comment #4)
> Sorry, went too fast. You're already on 17.11.9-2. Can you try with applying
> this on top of that?
>
> https://github.com/SchedMD/slurm/commit/67a82c369a7530ce7838e6294973af0082d8905b
>
> which will be in 17.11.10?

And by the way, I went so fast that I also referenced the wrong bug when marking this as a duplicate. I meant to reference this one, where the undefined symbol problem is tracked and the patch mentioned for .10 is discussed in a private comment:

https://bugs.schedmd.com/show_bug.cgi?id=5552
I tried the patch now, and it worked fine! Thanks for the quick response! :D
(In reply to Bjørn-Helge Mevik from comment #6)
> I tried the patch now, and it worked fine! Thanks for the quick response! :D

Great, thanks for the feedback.

*** This ticket has been marked as a duplicate of ticket 5552 ***