Dear Support,

I'm trying to build and install Slurm 20.02.4, but I'm facing a dependency problem with the generated rpms. Here are the steps I followed:

1) Build command

    VER=20.02.4 ; rpmbuild -tb --define "_prefix /opt/slurm/$VER" --define "_sysconfdir /etc/slurm" --define "_slurm_sysconfdir /etc/slurm" slurm-$VER.tar.bz2

2) The rpms are generated correctly, but the install fails:

    rpm -Uvh slurm-20.02.4-1.el7.x86_64.rpm
    error: Failed dependencies:
        libnvidia-ml.so.1()(64bit) is needed by slurm-20.02.4-1.el7.x86_64

Note that libnvidia-ml.so.1 is present; see the ldconfig report:

    ldconfig -p | grep libnvidia-ml
        libnvidia-ml.so.1 (libc6,x86-64) => /lib64/libnvidia-ml.so.1
        libnvidia-ml.so.1 (libc6) => /lib/libnvidia-ml.so.1
        libnvidia-ml.so (libc6,x86-64) => /lib64/libnvidia-ml.so
        libnvidia-ml.so (libc6) => /lib/libnvidia-ml.so

The build was configured to use it:

    ...
    configure:21362: checking for nvml.h
    configure:21362: result: yes
    configure:21370: checking for nvmlInit in -lnvidia-ml
    configure:21395: gcc -o conftest -DNUMA_VERSION1_COMPATIBILITY -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -Wl,-z,lazy -m64 -mtune=generic -std=gnu99 -pthread -I/usr/local/cuda/include -I/usr/cuda/include -Wl,-z,relro -Wl,-z,lazy conftest.c -lnvidia-ml -lresolv >&5
    configure:21395: $? = 0
    configure:21404: result: yes
    ...

but the install fails with the message reported above. These NVIDIA files are not part of any installed rpm (they were installed with the original NVIDIA installer), so could this be the problem?

Thank you
Marco Induni
Created attachment 15350
bug9525 workaround

Hi Marco,

That is indeed the issue. rpm checks the list of installed rpms for what they provide, and if none of them provides libnvidia-ml.so, the install fails. Since you installed the NVIDIA libraries manually, there is no rpm to look at.

I've attached a workaround patch to the spec file that excludes libnvidia-ml from the requirements, but we generally anticipate that if you are installing Slurm with rpms, CUDA would be installed that way as well. I don't think we would want to do this in general, though, since it could cause issues for sites that do expect CUDA to be installed with the RPMs.

Let me know if this works for you!

Thanks!
--Tim
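For readers without access to the attachment, here is a rough sketch of one way such an exclusion can be expressed. It is not the attached patch. It assumes the rpm on the build host honours the `__requires_exclude` macro and accepts it through `--define`. If the spec file sets that macro itself, the spec's value likely wins and the pattern has to be added there instead. The names `EXCLUDE`, `build_without_nvml_requires` and `filter_requires` are hypothetical helpers for illustration only.

```shell
#!/bin/sh
# Sketch only: keep libnvidia-ml out of the auto-generated Requires.
# Assumption: rpm honours __requires_exclude passed via --define.

VER=20.02.4
# [.] matches a literal dot and avoids backslash escaping inside rpm macros
EXCLUDE='^libnvidia-ml[.]so'

# Same build command as in the report, plus the exclusion macro.
# Defined but not called here; run it from the directory holding the tarball.
build_without_nvml_requires() {
    rpmbuild -tb \
        --define "_prefix /opt/slurm/$VER" \
        --define "_sysconfdir /etc/slurm" \
        --define "_slurm_sysconfdir /etc/slurm" \
        --define "__requires_exclude $EXCLUDE" \
        "slurm-$VER.tar.bz2"
}

# Preview of the pattern: drop every entry matching EXCLUDE from stdin.
filter_requires() {
    grep -E -v "$EXCLUDE"
}

# Sample entries: the first is taken from the error message above,
# the second is an ordinary glibc dependency added for contrast.
printf '%s\n' 'libnvidia-ml.so.1()(64bit)' 'libc.so.6()(64bit)' | filter_requires
```

After a rebuild, `rpm -qp --requires slurm-20.02.4-1.el7.x86_64.rpm | grep nvidia` should print nothing if the exclusion took effect.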
> Hi Marco,

Hi Tim,

> That is indeed the issue. rpm checks the list of installed rpms for what
> they provide, and if it doesn't find libnvidia-ml.so, it fails.

But if I understood correctly, the build doesn't check the rpms; it looks for the libraries directly, and this enables the nvidia-ml support. See the log:

    ...
    configure:21370: checking for nvmlInit in -lnvidia-ml
    ...
    configure:21404: result: yes

So the NVIDIA support is built, but the install then fails because the requested rpms are not found. I think this is a little odd: the build looks for one set of files and the install for another.

I can cope with the spec workaround anyway (thank you for the attachment), but maybe this is worth mentioning in the documentation.

Thank you
Marco
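The mismatch described here can be seen by asking both sources about the same library. The sketch below compares the linker cache (the view inspected with ldconfig in the original report) with the rpm database (the view rpm consults at install time). The function names are hypothetical; `ldconfig -p` and `rpm -q --whatprovides` are the standard commands.

```shell
#!/bin/sh
# Sketch: compare the linker-cache view and the rpm-database view
# of one shared library.

lib="libnvidia-ml.so.1"
cap="${lib}()(64bit)"   # capability string as printed in the rpm error

# Succeeds if the linker cache lists a library whose entry contains "$1"
check_linker_cache() {
    ldconfig -p 2>/dev/null | grep -q -F "$1"
}

# Succeeds if some installed package provides capability "$1"
check_rpm_provides() {
    rpm -q --whatprovides "$1" >/dev/null 2>&1
}

if check_linker_cache "$lib"; then
    echo "linker cache: $lib found"
else
    echo "linker cache: $lib not found"
fi

if check_rpm_provides "$cap"; then
    echo "rpm database: $cap is provided by an installed package"
else
    echo "rpm database: no installed package provides $cap"
fi
```

On a host where the library came from the NVIDIA installer rather than an rpm, the first check would be expected to succeed and the second to fail, which matches the behaviour reported in this ticket.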
Hey Marco,

I agree that it is a little odd, and I think we are actually doing the same operation to allow manual installation of pmix. I formalized the workaround into a patch and have it up for review. If it is decided to not use that, I will make sure the documentation is clear!

Thanks!
--Tim
Hey Marco,

We chatted about this internally and have pushed this into the spec file. It should start showing up in 20.02.6/20.11!

I'm going to close this out for now, but let me know if you have any questions!

Thanks!
--Tim
*** Bug 7919 has been marked as a duplicate of this bug. ***