I've been struggling to build Slurm RPMs with support for multiple PMIx versions. I suspect I'm doing something wrong, but was hoping for some guidance. I've compiled both pmix-2.2.3 and pmix-3.1.5 and installed them into /usr/local/pmix-2.2.3 and /usr/local/pmix-3.1.5 respectively.

At first I tried to build the Slurm RPMs via an .rpmmacros file like this:

    %_with_ucx /usr/local/ucx-1.7.0
    %_with_pmix /usr/local/pmix-3.1.5
    %_with_pmix /usr/local/pmix-2.2.3

However, I received an error with this output:

    + ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/slurm --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info /usr/local/pmix-3.1.5 /usr/local/ucx-1.7.0
    configure: WARNING: you should use --build, --host, --target
    configure: WARNING: invalid host type: /usr/local/pmix-3.1.5
    configure: WARNING: you should use --build, --host, --target
    configure: WARNING: invalid host type: /usr/local/ucx-1.7.0
    checking build system type... x86_64-redhat-linux-gnu
    checking host system type... x86_64-redhat-linux-gnu
    checking target system type... /usr/local/pmix-3.1.5
    configure: error: invalid value of canonical target
    error: Bad exit status from /var/tmp/rpm-tmp.r6Buuw (%build)

    RPM build errors:
        Bad exit status from /var/tmp/rpm-tmp.r6Buuw (%build)

The issue appears to be that the rpmmacros values are appended to the configure line as bare arguments (right after --infodir) and are then treated as host/target types for some reason.
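For comparison, the command-line form that builds successfully elsewhere in this report passes the complete configure option as the macro value, not just the path. A sketch of the equivalent .rpmmacros entries follows. It assumes that rpmbuild treats macros read from ~/.rpmmacros the same way as macros given with --define, which was not verified in this report:

```
# Assumption: same values as the working --define form, moved into ~/.rpmmacros.
# Each value is the full configure flag, and the PMIx prefixes are colon-separated.
%_with_ucx  --with-ucx=/usr/local/ucx-1.7.0
%_with_pmix --with-pmix=/usr/local/pmix-2.2.3:/usr/local/pmix-3.1.5
```

With bare paths as values, the expansion leaves only the paths on the configure line, which matches the "invalid host type" warnings above.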
I also tried this method:

    rpmbuild --define '_with_ucx --with-ucx=/usr/local/ucx-1.7.0' --define '_with_pmix --with-pmix=/usr/local/pmix-2.2.3:/usr/local/pmix-3.1.5' -tb slurm-20.02.0.tar.bz2

This builds the RPMs successfully, but trying to install them causes a conflict with the pmix-2 RPM I already built and installed:

    Transaction check error:
      file /usr/local/pmix-2.2.3/lib64/libpmi.so from install of slurm-libpmi-20.02.0-1.el7.x86_64 conflicts with file from package legacy-pmix-2.2.3-1.el7.x86_64
      file /usr/local/pmix-2.2.3/lib64/libpmi2.so from install of slurm-libpmi-20.02.0-1.el7.x86_64 conflicts with file from package legacy-pmix-2.2.3-1.el7.x86_64

Maybe I'm misunderstanding, but I thought pointing to the pmix2 location in the compilation would have prevented this error. Even stranger, if I then remove the pmix2 package and install Slurm, srun/sacct/etc are all installed in /usr/local/pmix-2.2.3/ instead of /usr/bin/.

Can you help me understand how I should be building these RPMs with PMIx support? I've read through the documentation and I don't see where I'm going wrong.

Thanks!
From what I see right now, the rpmbuild support for PMIx in the spec file only takes the system-installed version into account, and it seems you cannot use another version. The code in the spec file was introduced in bug 6598, commit 35bb9afb.

I will do some tests and come back to you with the conclusions.
Created attachment 13284
workaround_with_pmix_2002.patch

This is a quick workaround that seems to generate the correct configure line. I haven't tried to install the RPMs yet, though; I need more time to do so.

I am lowering the severity of this bug to sev-4 since it doesn't have a major impact. Please see https://www.schedmd.com/support.php for a description of our severity levels.
Forgot to add: with the patch applied, compile with:

    rpmbuild --define "slurm_with_pmix /path/to/your/pmix_v1:/path/pmix_v2" ...

and note that the configure line then looks like this:

    + ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/slurm --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-pmix=/path/to/your/pmix_v1:/path/pmix_v2
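The colon-separated value can be assembled from a list of install prefixes instead of being typed by hand. A minimal sketch, where join_pmix_prefixes is a hypothetical helper name (not part of Slurm or rpmbuild) and the two prefixes are the ones from this report:

```shell
# Hypothetical helper: join PMIx install prefixes with ':'.
join_pmix_prefixes() {
    local IFS=:          # "$*" joins arguments with the first character of IFS
    printf '%s\n' "$*"
}

# Example prefixes taken from this report; adjust to your own installs.
pmix_paths=$(join_pmix_prefixes /usr/local/pmix-2.2.3 /usr/local/pmix-3.1.5)

# Print the configure option that should end up on the configure line.
printf '%s\n' "--with-pmix=${pmix_paths}"
```

The printed option can then be compared against the configure line that rpmbuild echoes at the start of the %build stage.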
I analyzed the issues further. I think you are doing it correctly: there is obviously some problem when using the rpmmacros file, but not when using a define on the command line.

This works for me:

> rpmbuild --define '_with_ucx --with-ucx=/usr/local/ucx-1.7.0' --define
> '_with_pmix --with-pmix=/usr/local/pmix-2.2.3:/usr/local/pmix-3.1.5' -tb
> slurm-20.02.0.tar.bz2

Regarding this:

> the compilation would have prevented this error. Even stranger, if I then
> remove the pmix2 package and install Slurm, srun/sacct/etc are all installed
> in /usr/local/pmix-2.2.3/ instead of /usr/bin/.

I am wondering if you're using an RPM that was built while something was still set in rpmmacros. Can you remove your rpmmacros file and show me the configure line that rpmbuild prints? I think you must have a prefix defined somewhere, maybe from the time you built pmix-2.2.3.

This is my generated file list:

    $ rpm -qpl rpmbuild/RPMS/x86_64/slurm-libpmi-20.02.0-1.fc30.x86_64.rpm
    /usr/lib/.build-id
    /usr/lib/.build-id/07/a2a7ae193db2cc26e94df57bddb22d9826bc6b
    /usr/lib/.build-id/24/1a9880a469ad681191c6cc7c9d981a90fa66fd
    /usr/lib64/libpmi.so
    /usr/lib64/libpmi.so.0
    /usr/lib64/libpmi.so.0.0.0
    /usr/lib64/libpmi2.so
    /usr/lib64/libpmi2.so.0
    /usr/lib64/libpmi2.so.0.0.0

This is the configure line shown by rpmbuild:

    + ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/slurm --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-pmix=/home/lipi/bin/pmix_v1:/home/lipi/bin/pmix

I am obsoleting the attached patch since it is not really needed.
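A quick way to spot a stray prefix before installing is to filter the package file list for paths under the PMIx install tree. A minimal sketch, where paths_under is a hypothetical helper name and the demo input is two paths quoted in this report, not real rpm output:

```shell
# Hypothetical helper: read a file list on stdin and print only the entries
# that start with the given directory prefix.
paths_under() {
    awk -v p="${1%/}/" 'index($0, p) == 1'
}

# Typical use (needs rpm and a built package; the path is a placeholder):
#   rpm -qpl path/to/slurm-libpmi.rpm | paths_under /usr/local

# Demo with two paths quoted in this report:
printf '%s\n' /usr/lib64/libpmi.so /usr/local/pmix-2.2.3/lib64/libpmi.so |
    paths_under /usr/local
```

Any output from the real rpm -qpl pipeline would suggest the package was built with a prefix pointing into the PMIx tree; empty output is what the file list above would produce.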
Hi, thanks for taking a look at this. After some more testing I can confirm you are correct, sorry for the dumb question and false alarm ;)