Hi there,

I'm helping out the Research Platforms folks here and they're upgrading from 16.5.x to 17.02.7. As part of this I thought it would be useful to get PMIx support enabled, but it looks like that is broken.

As posted to the mailing list (in case Ralph saw it and could shed some light for me):

PMIX v1.2.2: Slurm complains and tells me it wants v2.

PMIX v2.0.1: Slurm can't find it because the header files are not where it is looking for them, and when I do a symlink hack to make PMIX detection work it then fails to compile, saying:

/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm -I../../../.. -I../../../../src/common -I/usr/include -I/usr/local/pmix/latest/include -DHAVE_PMIX_VER=2 -g -O0 -pthread -Wall -g -O0 -fno-strict-aliasing -MT mpi_pmix_v2_la-pmixp_client.lo -MD -MP -MF .deps/mpi_pmix_v2_la-pmixp_client.Tpo -c -o mpi_pmix_v2_la-pmixp_client.lo `test -f 'pmixp_client.c' || echo './'`pmixp_client.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm -I../../../.. -I../../../../src/common -I/usr/include -I/usr/local/pmix/latest/include -DHAVE_PMIX_VER=2 -g -O0 -pthread -Wall -g -O0 -fno-strict-aliasing -MT mpi_pmix_v2_la-pmixp_client.lo -MD -MP -MF .deps/mpi_pmix_v2_la-pmixp_client.Tpo -c pmixp_client.c -fPIC -DPIC -o .libs/mpi_pmix_v2_la-pmixp_client.o
pmixp_client.c: In function ‘_set_procdatas’:
pmixp_client.c:468:24: error: request for member ‘size’ in something not a structure or union
  kvp->value.data.array.size = count;
                       ^
pmixp_client.c:482:24: error: request for member ‘array’ in something not a structure or union
  kvp->value.data.array.array = (pmix_info_t *)info;
                       ^
make[4]: *** [mpi_pmix_v2_la-pmixp_client.lo] Error 1

So I'm guessing that either I'm missing something (the documentation for PMIX in Slurm seems pretty much non-existent) or this is broken.

Any ideas?

All the best,
Chris
Hi Chris,

(In reply to Chris Samuel from comment #0)
> Hi there,
>
> I'm helping out the Research Platforms folks here and they're upgrading from
> 16.5.x to 17.02.7. As part of this I thought it would be useful to get
> PMIx support enabled, but it looks like that is broken.

I've just tried to build locally and it worked for me. I've used the following components version:

pmix v1.2
slurm 17.02.7
gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406

> As posted to the mailing list (in case Ralph saw it and could shed some
> light for me):
>
> PMIX v1.2.2: Slurm complains and tells me it wants v2.

slurm 17.02.7 doesn't complain for me about that. How are you building pmix and slurm? Can I see the configure options / paths you use? Which compiler / version are you using?

> PMIX v2.0.1: Slurm can't find it because the header files are not where it
> is looking for them, and when I do a symlink hack to make PMIX detection
> work it then fails to compile, saying:

Slurm does not support pmix v2 yet. There's bug 4131 open to track this.

> So I'm guessing that either I'm missing something (the documentation for
> PMIX in Slurm seems pretty much non-existent) or this is broken.

I wrote an internal document based upon team advice on how to build pmix, slurm and optionally ompi too. I'll update the schedmd webpage somewhere in the mpi guide to transfer that internal knowledge publicly.

> Any ideas?

In the meantime, this is an excerpt of the procedure indicated in that document and worked for me. You can try to follow it yourself and see if that helps or report the steps you took and where you got stuck otherwise:

1. Install the following packages (doing so through APT worked well for me):

libevent-dev
libhwloc-dev
flex

2. Install pmix:

alex@ibiza:~/git$ git clone git@github.com:pmix/master.git pmix
alex@ibiza:~/git$ cd pmix
alex@ibiza:~/git/pmix$ git branch -a
alex@ibiza:~/git/pmix$ git checkout v1.2
alex@ibiza:~/git/pmix$ ./autogen.sh
alex@ibiza:~/git/pmix$ cd ..
alex@ibiza:~/git$ mkdir pmix_build
alex@ibiza:~/git$ cd pmix_build
alex@ibiza:~/git/pmix_build$ mkdir ../pmix_install
alex@ibiza:~/git/pmix_build$ ../pmix/configure --prefix=/home/alex/git/pmix_install
alex@ibiza:~/git/pmix_build$ make -j install >/dev/null
alex@ibiza:~/git/pmix_build$ cd ../pmix_install
alex@ibiza:~/git/pmix_install$ ls
include  lib  share
alex@ibiza:~/git/pmix_install$

3. Install slurm:

alex@ibiza:~/slurm/17.02/ibiza/slurm$ ../../slurm/configure \
 --prefix=/home/alex/slurm/17.02/ibiza --enable-multiple-slurmd \
 --enable-developer --enable-memory-leak-debug \
 --with-pmix=/home/alex/git/pmix_install
...
checking for hwloc installation... /usr
checking for pmix installation... /home/alex/git/pmix_install
...
alex@ibiza:~/slurm/17.02/ibiza/slurm$ make -j install > /dev/null
alex@ibiza:~/slurm/17.02/ibiza/slurm$ ls -l ../lib/slurm | grep pmi
-rw-r--r-- 1 alex alex 979866 sep 8 13:19 mpi_pmi2.a
-rwxr-xr-x 1 alex alex 962 sep 8 13:19 mpi_pmi2.la
-rwxr-xr-x 1 alex alex 385960 sep 8 13:19 mpi_pmi2.so
lrwxrwxrwx 1 alex alex 16 sep 8 13:19 mpi_pmix.so -> ./mpi_pmix_v1.so
-rw-r--r-- 1 alex alex 907236 sep 8 13:19 mpi_pmix_v1.a
-rwxr-xr-x 1 alex alex 1118 sep 8 13:19 mpi_pmix_v1.la
-rwxr-xr-x 1 alex alex 400256 sep 8 13:19 mpi_pmix_v1.so
alex@ibiza:~/slurm/17.02/ibiza/slurm$ cd contribs/pmi2
alex@ibiza:~/slurm/17.02/ibiza/slurm/contribs/pmi2$ make install
alex@ibiza:~/slurm/17.02/ibiza/slurm/contribs/pmi2$ cd ../../../lib
alex@ibiza:~/slurm/17.02/ibiza/lib$ ls -l | grep pmi2
-rw-r--r-- 1 alex alex 114512 sep 8 16:46 libpmi2.a
-rwxr-xr-x 1 alex alex 958 sep 8 16:46 libpmi2.la
lrwxrwxrwx 1 alex alex 16 sep 8 16:46 libpmi2.so -> libpmi2.so.0.0.0
lrwxrwxrwx 1 alex alex 16 sep 8 16:46 libpmi2.so.0 -> libpmi2.so.0.0.0
-rwxr-xr-x 1 alex alex 81976 sep 8 16:46 libpmi2.so.0.0.0
alex@ibiza:~/slurm/17.02/ibiza/lib$

> All the best,
> Chris
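
The `ls -l ../lib/slurm | grep pmi` check in step 3 can be turned into a small scripted test. This is only a sketch: the function name `check_pmix_plugin` is made up for illustration, and it assumes the plugins are named `mpi_pmix*.so` as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeed if DIR contains at least
# one Slurm PMIx plugin, i.e. a file matching mpi_pmix*.so.
check_pmix_plugin() {
    dir=$1
    for f in "$dir"/mpi_pmix*.so; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            echo "found PMIx plugin: $f"
            return 0
        fi
    done
    echo "no mpi_pmix*.so found in $dir" >&2
    return 1
}
```

With the paths used above it would be called as `check_pmix_plugin /home/alex/slurm/17.02/ibiza/lib/slurm`. A non-zero exit status suggests configure did not pick up PMIx, in which case the "checking for pmix installation..." line in the configure output is the first thing to look at.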
One note. When configuring Slurm, you don't need the following options:

--enable-multiple-slurmd
--enable-developer
--enable-memory-leak-debug
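
Leaving those out, the configure call reduces to the install prefix plus the PMIx location. A minimal sketch follows; the helper name `slurm_configure_args` is hypothetical, and only `--prefix` and `--with-pmix` are taken from the procedure in this bug.

```shell
#!/bin/sh
# Hypothetical helper: print the configure options needed for a PMIx-enabled
# build, one per line. PREFIX and PMIX_DIR are supplied by the caller.
slurm_configure_args() {
    prefix=$1
    pmix_dir=$2
    printf '%s\n' "--prefix=$prefix" "--with-pmix=$pmix_dir"
}
```

For example, `../../slurm/configure $(slurm_configure_args /home/alex/slurm/17.02/ibiza /home/alex/git/pmix_install)`. The unquoted command substitution relies on word splitting, so this only works for paths without whitespace.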
Hi there,

I've just had a chance to try again and now it works.

I suspect it might be because of fixing a problem I hit trying to get PMIx v2 installed which was the people who built this system didn't install a C++ compiler which PMIx v2 failed during configure saying that pthreads wasn't available (took a while to track that one down).

Anyway I've just done a test configure/build and it worked fine now so I think this was a local system config issue. Hey ho! Too late for their outage window so we'll stick with PMI2 for now.

Sorry to bother you all..

All the best,
Chris
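
Since the missing compiler surfaced as a misleading pthreads error, a pre-flight check before running configure may save some time. The sketch below is an assumption-laden example: `missing_tools` is a made-up name, and the tool list in the usage comment (`gcc`, `g++`, `make`) is a guess at what a typical build host needs, not something taken from the PMIx documentation.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of every listed tool that
# cannot be found on PATH. Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example usage (tool names are assumptions, adjust for your system):
#   missing=$(missing_tools gcc g++ make)
#   [ -z "$missing" ] || echo "install before configuring: $missing"
```

`command -v` is the POSIX way to ask the shell whether a command resolves, so the check should behave the same under sh and bash.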
(In reply to Chris Samuel from comment #5)
> Hi there,
>
> I've just had a chance to try again and now it works.
>
> I suspect it might be because of fixing a problem I hit trying to get PMIx
> v2 installed which was the people who built this system didn't install a C++
> compiler which PMIx v2 failed during configure saying that pthreads wasn't
> available (took a while to track that one down).
>
> Anyway I've just done a test configure/build and it worked fine now so I
> think this was a local system config issue. Hey ho! Too late for their
> outage window so we'll stick with PMI2 for now.
>
> Sorry to bother you all..
>
> All the best,
> Chris

No problem. I'll keep this open to add some guidance in our webpage on how to build Slurm with pmix support.
Documentation added in following 17.11.1+ commit:

commit 34209c471a29aeb5cf44e3521c9172c30f4b8dbb (HEAD -> slurm-17.11, origin/slurm-17.11)
Author:     Alejandro Sanchez <alex@schedmd.com>
AuthorDate: Tue Dec 19 11:53:34 2017 +0100
Commit:     Alejandro Sanchez <alex@schedmd.com>
CommitDate: Tue Dec 19 11:53:34 2017 +0100

    Docs - add Slurm/PMIx and OpenMPI build notes to the mpi_guide page.

    Bug 4222.