Ticket 4511 - slurm RPM provides libpmi, which overlaps with libraries from the pmix RPM
Summary: slurm RPM provides libpmi, which overlaps with libraries from the pmix RPM
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 17.11.0
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Tim Wickberg
Reported: 2017-12-12 22:28 MST by Kilian Cavalotti
Modified: 2019-04-30 22:09 MDT
Site: Stanford
Version Fixed: Slurm 17.11.1, PMIx commit 9d7071e534


Attachments
spec change to put libpmi into it's own rpm (1.41 KB, patch)
2017-12-13 15:00 MST, Danny Auble

Description Kilian Cavalotti 2017-12-12 22:28:32 MST
Hi!

I installed PMIx v2, with "rpmbuild --rebuild ...src.rpm" from https://github.com/pmix/pmix/releases

Then, Slurm 17.11 was compiled with 'rpmbuild -ta slurm-17.11.0.tar.bz2', on a host with the pmix-2.0.2 RPM installed. It picked up the PMIx libs and included them in the "slurm" RPM.

The issue is that the slurm RPM installs libpmi.so and libpmi2.so, which are already provided by the PMIx RPM. 

# rpm -q pmix
pmix-2.0.2-1.el7.centos.x86_64
# yum install slurm
[...]
Transaction check error:
  file /usr/lib64/libpmi.so from install of slurm-17.11.0-1.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
  file /usr/lib64/libpmi2.so from install of slurm-17.11.0-1.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
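
One way to spot this kind of overlap before running the transaction is to compare the file list of the installed pmix package with that of the freshly built slurm RPM. The following is only a minimal sketch: the helper name overlapping_files and the file names in the usage comment are illustrative assumptions, not something from this ticket or from rpm itself.

```shell
#!/bin/sh
# overlapping_files LIST_A LIST_B
# Print the paths that appear in both file lists (one path per line).
# Hypothetical helper; not part of rpm, Slurm, or PMIx.
overlapping_files() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 1; }
    # comm needs both inputs sorted the same way, so pin the locale.
    LC_ALL=C sort -u "$1" > "$a_sorted"
    LC_ALL=C sort -u "$2" > "$b_sorted"
    # -12 suppresses lines unique to either input, leaving the intersection.
    LC_ALL=C comm -12 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}

# Usage (file names are assumptions):
#   rpm -ql pmix > pmix.files
#   rpm -qlp slurm-17.11.0-1.el7.centos.x86_64.rpm > slurm.files
#   overlapping_files pmix.files slurm.files
```

Any path printed by the helper is a file that both packages would try to own, which is what the transaction check rejects.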


Of course, to use PMIx, the pmix RPM needs to be installed, as Slurm doesn't provide any libpmix.so file.


What's the correct way to compile and install Slurm and PMIx?

Thanks!
-- 
Kilian
Comment 1 Karl Kornel 2017-12-12 22:55:20 MST
FYI, this came up in Debian, with the PMIx Debian maintainer changing the path of their installed PMIx libraries.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882033

There was also some discussion about this on the OpenMPI GitHub, but it has since stalled.

https://github.com/open-mpi/ompi/issues/4072
Comment 4 Ralph Castain 2017-12-13 09:24:43 MST
Yes, we are aware of the issue in both the PMIx and OMPI communities. The problem arose because users wanted to use the PMIx backward compatibility interfaces (that support PMI and PMI-2 calls), but their codes are hard-wired to look for a "libpmi.so" or "libpmi2.so". So we agreed to create what are actually just carbon-copies of libpmix with those names.

I'm not sure how best to resolve the problem as we have conflicting needs from different users. There is no real issue so long as the install locations are kept separate and the correct paths are provided. Unfortunately, packagers put both sets of libraries in the same system directory, causing the overwrite.
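
To check whether two installs really are kept apart, it can help to list every directory that holds a copy of a given library. This is a rough sketch under stated assumptions: find_lib_copies is a hypothetical helper, and the directories in the usage comment are examples rather than paths confirmed in this ticket.

```shell
#!/bin/sh
# find_lib_copies NAME DIR...
# Print every directory from DIR... that contains an entry called NAME.
# Hypothetical helper for illustration only.
find_lib_copies() {
    name=$1
    shift
    for dir in "$@"; do
        # -e follows symlinks; -L also catches dangling ones.
        if [ -e "$dir/$name" ] || [ -L "$dir/$name" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Usage (directories are assumptions):
#   find_lib_copies libpmi.so /usr/lib64 /opt/pmix/lib
```

If more than one directory is printed, which copy an application picks up depends on the search paths given at link time and run time.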

So it isn't as much that things have "stalled" as it is that we simply don't know a solution. Suggestions are welcome!

Ralph
Comment 5 Kilian Cavalotti 2017-12-13 10:24:46 MST
Hi Ralph, 

Since Slurm depends on PMIx but also conflicts with it, I propose breaking the dependency relationship but keeping the conflict.

My naive approach would be the following:

* the pmix RPM continues to provide all the libraries it does today, in the same location (libpmi.so, libpmi2.so, libpmix.so), since users may need those outside of the scope of Slurm,

* the slurm RPM provides its own version of libpmix.so, as it does today for libpmi.so and libpmi2.so.


This way both RPMs could be installed independently (although not at the same time). Slurm would need PMIx at build time, but not to run.

Thoughts?

Cheers,
-- 
Kilian
Comment 6 Ralph Castain 2017-12-13 10:39:35 MST
This has always been a troubling point, and I'm not sure there really is a correct answer. The problem here is that apps do call PMIx directly, and so in your proposed method, the app could be linked against a different version of PMIx than SLURM.

If it were me, what I would do is simply manually delete libpmi.* and libpmi2.* from the PMIx rpm after installation. This removes the confusion and represents the minimal change.

I would recommend going that route.
Comment 7 Danny Auble 2017-12-13 11:38:14 MST
I am not sure I like the idea of manually deleting anything an RPM package installs. The way Debian solved this isn't a bad idea. If you are going to install these in the same place, perhaps we both move libpmi to its own RPM and have them conflict with each other. I think this would solve the issue for almost all sites, as I doubt they would want to support both libpmi versions.

Anyone not like this idea?
Comment 8 Artem Polyakov 2017-12-13 11:40:43 MST
Sounds good to me.
Comment 9 Ralph Castain 2017-12-13 11:59:40 MST
Fine with me, too!
Comment 10 Kilian Cavalotti 2017-12-13 12:02:51 MST
That sounds like a good solution indeed. slurm-pmi FTW :)
Comment 11 Danny Auble 2017-12-13 15:00:44 MST
Created attachment 5738
spec change to put libpmi into it's own rpm

Here is the first draft of the change to slurm.spec to put libpmi into its own RPM. It assumes the package for pmix is called libpmi-pmix.

Let me know if anything needs to be changed for this to work like we planned.
Comment 13 Tim Wickberg 2017-12-20 13:44:22 MST
Commit 5fb94be9598 will be in 17.11.1 and above, and moves the libpmi libraries to a separate "slurm-libpmi" package, which will conflict with the anticipated "pmix-libpmi" package.
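
Once such a package is installed, the declared conflict can be inspected with rpm's --conflicts query. The sketch below only wraps that output in a small check; declares_conflict is a hypothetical helper name, and the package names are the ones mentioned in this comment.

```shell
#!/bin/sh
# declares_conflict NAME
# Read "rpm -q --conflicts" style output on stdin (one capability per line,
# optionally followed by a version constraint) and succeed if NAME is listed.
# Hypothetical helper for illustration only.
declares_conflict() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage:
#   rpm -q --conflicts slurm-libpmi | declares_conflict pmix-libpmi \
#       && echo "conflict declared"
```

Matching on the first field means a versioned entry for the same name still counts as a declared conflict.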

Ralph + Artem: Are you able to split the PMIx libpmi off in the pmix.spec file, or would you prefer me to submit a patch / pull request to make this change?
Comment 14 Artem Polyakov 2017-12-20 13:52:22 MST
To be honest I'm not familiar with spec files so it will probably take more time for me to do that.
Ralph, can you address this? I don't want Tim doing that, as this is our part.
Comment 15 Ralph Castain 2017-12-20 14:19:30 MST
Just to be clear: you want libpmi and libpmi2, but not libpmix, split off - yes?

I can do it based on what you have here, I think. If not, I'll ask for advice.
Comment 16 Tim Wickberg 2017-12-20 14:24:48 MST
(In reply to Ralph Castain from comment #15)
> Just to be clear: you want libpmi and libpmi2, but not libpmix, split off -
> yes?

Correct. Slurm doesn't ship a libpmix, only libpmi.so and libpmi2.so, which are the ones in conflict. For 17.11.1 those are now isolated to the new slurm-libpmi package, which is set to conflict with your (not yet extant) pmix-libpmi package.

> I can do it based on what you have here, I think. If not, I'll ask for
> advice.

Sounds good. Let me know if you want a hand, I've spent ~30 hours over the past few months hacking on our spec file so I've gotten at least a bit used to their quirks.

- Tim
Comment 17 Ralph Castain 2017-12-21 12:44:35 MST
Hmmm...I have discovered a little problem here. We actually only distribute the SOURCE rpm, not binaries, and so there is no way I can have our spec file separate out the libraries as we don't build them. 

What I can do is pass along to the downstream packagers that they do split those off and set the necessary conflict flag. I cannot guarantee that they will do so.

Or perhaps I am missing something? I honestly don't know if the packagers are using our spec file or their own (I totally expect the latter, but have never confirmed). I see you only distribute source as well in the form of tarballs - so is your spec file for the packagers? Perhaps we need to create a separate one for building actual binary rpm's?

Meantime, I'll check with the packagers and see how they respond.
Comment 18 Tim Wickberg 2018-01-18 18:23:43 MST
(In reply to Ralph Castain from comment #17)
> Hmmm...I have discovered a little problem here. We actually only distribute
> the SOURCE rpm, not binaries, and so there is no way I can have our spec
> file separate out the libraries as we don't build them. 

I'm assuming the issue is that the spec file you ship as contrib/pmix.spec builds everything into a single package. It should be straightforward to add a second package definition isolating libpmi.so and libpmi2.so to that spec file; I can prepare a pull request if it would expedite things.

> What I can do is pass along to the downstream packagers that they do split
> those off and set the necessary conflict flag. I cannot guarantee that they
> will do so.

That would be good to know; or if theirs is significantly out of sync with yours I'd suggest looking into bringing them in sync again.

> Or perhaps I am missing something? I honestly don't know if the packagers
> are using our spec file or their own (I totally expect the latter, but have
> never confirmed). I see you only distribute source as well in the form of
> tarballs - so is your spec file for the packagers? Perhaps we need to create
> a separate one for building actual binary rpm's?

We ship (two actually) spec files that can build the tarball into RPMs.
Comment 19 Ralph Castain 2018-01-18 20:33:22 MST
(In reply to Tim Wickberg from comment #18)
> (In reply to Ralph Castain from comment #17)
> > Hmmm...I have discovered a little problem here. We actually only distribute
> > the SOURCE rpm, not binaries, and so there is no way I can have our spec
> > file separate out the libraries as we don't build them. 
> 
> I'm assuming the issue is that the spec file you ship as contrib/pmix.spec
> builds everything into a single package. It should be straightforward to add
> a second package definition isolating libpmi.so and libpmi2.so to that spec
> file; I can prepare a pull request if it would expedite things

Yes, please! We would really appreciate it - it has proven beyond my comfort zone.

> 
> > What I can do is pass along to the downstream packagers that they do split
> > those off and set the necessary conflict flag. I cannot guarantee that they
> > will do so.
> 
> That would be good to know; or if theirs is significantly out of sync with
> yours I'd suggest looking into bringing them in sync again.

They are indeed packaging things to avoid conflict - please see the following for their comments:

https://github.com/open-mpi/ompi/issues/4072

> 
> > Or perhaps I am missing something? I honestly don't know if the packagers
> > are using our spec file or their own (I totally expect the latter, but have
> > never confirmed). I see you only distribute source as well in the form of
> > tarballs - so is your spec file for the packagers? Perhaps we need to create
> > a separate one for building actual binary rpm's?
> 
> We ship (two actually) spec files that can build the tarball into RPMs.

You are welcome to do the same with ours, if you like - I defer to your expertise.
Comment 20 Tim Wickberg 2018-06-27 14:42:37 MDT
Ralph -

Is there a particular branch I should be targeting with any patches? The way OpenMPI uses their branches is a bit different than I expect, and I want to avoid any confusion with a pull request when I get one generated.
Comment 21 Artem Polyakov 2018-06-27 15:16:23 MDT
Hi, Tim.

You should target master. And once merged it will be cherry-picked into the appropriate release branches.
Note also that PMIx is a separate project, not MPI:
https://github.com/pmix/pmix
Comment 22 Kilian Cavalotti 2019-04-30 16:36:01 MDT
Hi Tim, Ralph, Artem,

Sorry to revive this old issue, but we've been hitting the same problem when trying to install PMIx 3.x here: although the libpmi{1,2}.so libs are now separately packaged by Slurm (in the slurm-libpmi RPM), PMIx still ships a monolithic RPM that provides both compatibility libs (libpmi.so and libpmi2.so) and that conflicts with slurm-libpmi.

We can't really use PMIx's compat libs because they're actually libpmi.so.1 and libpmi2.so.1, and a number of our user codes have been compiled against Slurm's libpmi.so.0 and libpmi2.so.0.
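
To see which soname an existing binary was linked against, the NEEDED entries in its dynamic section can be filtered. This is a minimal sketch: needed_pmi_sonames is a hypothetical helper, and the binary path in the usage comment is an assumption.

```shell
#!/bin/sh
# needed_pmi_sonames
# Read "readelf -d" output on stdin and print the libpmi / libpmi2 sonames
# that appear as NEEDED entries. libpmix is deliberately not matched.
# Hypothetical helper for illustration only.
needed_pmi_sonames() {
    sed -n 's/.*(NEEDED).*\[\(libpmi2\{0,1\}\.so[^]]*\)\].*/\1/p'
}

# Usage (binary path is an assumption):
#   readelf -d ./my_mpi_app | needed_pmi_sonames
```

A binary that reports libpmi.so.0 here will not be satisfied by a package that only ships libpmi.so.1.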

I was hoping the PMIx package would have been split the same way, as Tim proposed in #c18, but I didn't find anything relevant in the PMIx GitHub repository.

So I went ahead and I'm proposing a patch to the PMIx SPEC file in https://github.com/pmix/pmix/pull/1230

My ultimate goal is to be able to use PMI-1 and PMI-2 compatibility libs from Slurm (because we have apps that already use them and we don't want to break them), alongside PMIx for newer things.

Cheers,
-- 
Kilian
Comment 23 Tim Wickberg 2019-04-30 17:11:40 MDT
> I was hoping the PMIx package would have been split the same way, as Tim
> proposed in #c18, but I didn't find anything relevant in the PMIx GitHub
> repository.

I unfortunately haven't had time to look into this further since splitting our PMI libraries off.

> So I went ahead and I'm proposing a patch to the PMIx SPEC file in
> https://github.com/pmix/pmix/pull/1230
> 
> My ultimate goal is to be able to use PMI-1 and PMI-2 compatibility libs
> from Slurm (because we have apps that already use them and we don't want to
> break them), alongside PMIx for newer things.

This looks pretty good. I have a few minor suggestions, but I'll move those to the PR so the PMIx folks are all seeing them alongside the patch.

- Tim
Comment 24 Tim Wickberg 2019-04-30 22:09:31 MDT
Thanks, Kilian, for following through on this with the PMIx folks. I'm finally tagging this as resolved since they've accepted your pull request, so our RPM packaging and theirs are finally in sync.

- Tim