Ticket 7263 - Adding some PMIx calls to the corresponding plugin - pmix/v4
Summary: Adding some PMIx calls to the corresponding plugin - pmix/v4
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: PMIx (show other tickets)
Version: 21.08.x
Hardware: Linux Linux
: 5 - Enhancement
Assignee: Unassigned Developer
QA Contact:
URL:
: 10683 11489 12771 13438 (view as ticket list)
Depends on:
Blocks: 10684
  Show dependency treegraph
 
Reported: 2019-06-19 12:18 MDT by Ralph Castain
Modified: 2022-04-13 14:25 MDT (History)
11 users (show)

See Also:
Site: -Other-
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
Git diff of preview code (5.40 KB, text/plain)
2019-06-19 12:18 MDT, Ralph Castain
Details
Updated changes (10.35 KB, patch)
2019-06-19 17:03 MDT, Ralph Castain
Details | Diff
Modifications needed for pmix v4 (3.62 KB, patch)
2019-07-29 16:27 MDT, Danny Auble
Details | Diff
mpi/pmix: adds `srun` identifying `pmix_v4` pluggin (1.15 KB, patch)
2020-03-25 09:47 MDT, Boris Karasev
Details | Diff
mpi/pmix: added support `PMIx_server_setup_application` API (5.12 KB, patch)
2020-03-25 09:52 MDT, Boris Karasev
Details | Diff
0001-mpi-pmix-fixed-detection-of-version-4-PMIx-lib.patch (1.86 KB, patch)
2020-06-04 03:53 MDT, Boris Karasev
Details | Diff
0002-common-added-base64-encode-decode.patch (9.15 KB, patch)
2020-06-04 03:53 MDT, Boris Karasev
Details | Diff
0003-common-added-slurm_env_get_val_maxlen.patch (2.28 KB, patch)
2020-06-04 03:53 MDT, Boris Karasev
Details | Diff
0004-mpi-pmix-added-support-PMIx_server_setup_application.patch (36.54 KB, patch)
2020-06-04 03:54 MDT, Boris Karasev
Details | Diff
0002-common-added-base64-encode-decode.patch (9.20 KB, patch)
2020-06-15 02:13 MDT, Boris Karasev
Details | Diff
original patchset with autoreconf ran and fixed for current master (55.02 KB, patch)
2020-09-17 13:48 MDT, Danny Auble
Details | Diff
bug7263_2011_fixed_remarks.patch (71.80 KB, patch)
2020-12-01 21:56 MST, Boris Karasev
Details | Diff
bug7263.patch (64.72 KB, patch)
2021-01-25 02:52 MST, Boris Karasev
Details | Diff
v2 (64.80 KB, patch)
2021-03-10 14:53 MST, Danny Auble
Details | Diff

Note You need to log in before you can comment on or make changes to this ticket.
Description Ralph Castain 2019-06-19 12:18:51 MDT
Created attachment 10651 [details]
Git diff of preview code

I'm providing this preview of the final code contribution as I have a couple of questions. The objective of the new code is to utilize the PMIx_server_setup_application and PMIx_server_setup_local_support APIs to provide support for the "instant on" feature. My questions are:

* I can’t seem to build the mpi/pmix plugin as running “make install” in the mpi/pmix directory yields an error about the .so not being found. Indeed, it isn’t built, but darned if I can see why. The PMIx v4 linkage code is verbatim from the other versions, so I assume I’m just missing something elsewhere?

* I put the call to setup_application in the srun prelaunch hook as that appears to have the proper information and come at the right place in the launch procedure. Basically, we need to call that API once we know the process map (i.e., where each process rank is going) so PMIx subsystems can allocate subsystem-level resources (e.g., network endpts).

However, the prelaunch hook appears to only return an envar. The setup_app API returns an array of data values that I can aggregate into a single "blob" which needs to be sent along with the launch message to every participating slurmstepd. What is the best way for me to pass that "blob" along?

Thanks
Ralph
Comment 1 Ralph Castain 2019-06-19 17:02:59 MDT
I was able to get the plugin to install a .so file in addition to the .la, but had to comment out a line of the Makefile.am to do it - basically had to let libtool install the .so on its own.

I also neglected to include the .m4 modifications - please see updated attachment.

Still need advice on how to pass a blob from srun to the slurmstepd. Any thoughts would be appreciated.

Ralph
Comment 2 Ralph Castain 2019-06-19 17:03:44 MDT
Created attachment 10658 [details]
Updated changes
Comment 3 Danny Auble 2019-07-29 16:27:58 MDT
Created attachment 11047 [details]
Modifications needed for pmix v4

Ralph, I touched up and committed the logic needed to start building with v4 commits

bece959ce402
999c11f68c8e
18235239aee2

here is an updated patch of what you added to mpi_pmix.c.  It needed a little work to keep the code building with any other version of pmix, but that wasn't horrible.

Note, while this compiles and installs correctly, srun asserts when running it.

(gdb) where
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fe122952535 in __GI_abort () at abort.c:79
#2  0x00007fe122ccdf0a in __xassert_failed (expr=expr@entry=0x7fe1228702c8 "_pmixp_job_info.magic == 0xCAFE01F0", file=file@entry=0x7fe122870288 "../../../../../../slurm/src/plugins/mpi/pmix/pmixp_info.h", line=line@entry=155, func=func@entry=0x7fe122870490 <__func__.17699> "pmixp_info_namespace") at ../../../../slurm/src/common/xassert.c:57
#3  0x00007fe12284e094 in pmixp_info_namespace () at ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_info.h:166
#4  _setup_app_srun () at ../../../../../../slurm/src/plugins/mpi/pmix/mpi_pmix.c:149
#5  p_mpi_hook_client_prelaunch (job=0x55eeb8b6f900, env=0x7ffe31796088) at ../../../../../../slurm/src/plugins/mpi/pmix/mpi_pmix.c:300
#6  0x00007fe122bfc892 in mpi_hook_client_prelaunch (job=job@entry=0x55eeb8b6f900, env=env@entry=0x7ffe31796088) at ../../../../slurm/src/common/mpi.c:289
#7  0x00007fe122b9a991 in slurm_step_launch (ctx=0x55eeb8b6f360, params=params@entry=0x7ffe317963a0, callbacks=callbacks@entry=0x7ffe31796370) at ../../../../slurm/src/api/step_launch.c:240
#8  0x00007fe1228840e3 in launch_p_step_launch (job=0x55eeb8b711c0, cio_fds=<optimized out>, global_rc=<optimized out>, step_callbacks=<optimized out>, opt_local=0x55eeb75095c0 <opt>) at ../../../../../../slurm/src/plugins/launch/slurm/launch_slurm.c:825
#9  0x000055eeb74f45d5 in launch_g_step_launch (job=job@entry=0x55eeb8b711c0, cio_fds=cio_fds@entry=0x7ffe31796630, global_rc=global_rc@entry=0x55eeb7509f30 <global_rc>, step_callbacks=step_callbacks@entry=0x7ffe31796660, opt_local=opt_local@entry=0x55eeb75095c0 <opt>) at ../../../../../slurm/src/srun/libsrun/launch.c:578
#10 0x000055eeb74efb0d in _launch_one_app (data=<optimized out>) at ../../../../slurm/src/srun/srun.c:248
#11 0x000055eeb74f12f6 in _launch_app (got_alloc=true, srun_job_list=0x0, job=0x55eeb8b711c0) at ../../../../slurm/src/srun/srun.c:547
#12 srun (ac=<optimized out>, av=<optimized out>) at ../../../../slurm/src/srun/srun.c:202
#13 0x000055eeb74f1980 in main (argc=<optimized out>, argv=<optimized out>) at ../../../../slurm/src/srun/srun.wrapper.c:17


Doesn't appear _pmixp_job_info has been set up set when you set this new code.  Hopefully this gets you going now Slurm will detect v4 and build against it.

Let me know what you come up with.

Note, we typically like patches generated from

git format-patch -#OFCOMMITS --stdout > BUG#_SLURMVERSION_v#.patch

Thanks for your help on this :).
Comment 4 Ralph Castain 2019-07-30 10:41:40 MDT
Thanks! I’m on vacation right now but will get back on this when I return
Comment 5 Danny Auble 2019-12-26 05:54:10 MST
Ralph, I am guessing this may not happen?
Comment 6 Ralph Castain 2019-12-31 11:24:27 MST
I don't see a path forward as my learning curve on the required Slurm code is too steep. I can provide advice and even help a bit with coding, but would need someone more familiar with Slurm to help with the integration pieces.

I suspect Slurm will want to pursue this for competitive reasons going forward, but I understand the constraints. We can close this for now until someone has time to work with me on it.
Comment 7 Danny Auble 2019-12-31 11:33:36 MST
Sounds good Ralph. I'll leave it open but leave it unassigned.
Comment 8 Boris Karasev 2020-03-25 09:47:13 MDT
Created attachment 13475 [details]
mpi/pmix: adds `srun` identifying `pmix_v4` pluggin
Comment 9 Boris Karasev 2020-03-25 09:52:55 MDT
Created attachment 13476 [details]
mpi/pmix: added support `PMIx_server_setup_application` API
Comment 10 Boris Karasev 2020-03-25 09:59:00 MDT
Ralph please review attached patches. The second patch adds support for `PMIx_server_setup_application`
Comment 11 Ralph Castain 2020-03-26 17:48:30 MDT
Boris - my apologies for the delayed response. I'm no Slurm expert, but the PMIx logic looks correct. However, it isn't complete. While this change does indeed add support to gather the application-support data, the resulting info needs to be transported to the stepd's and passed down into their PMIx server libraries via the PMIx_server_setup_local_support function.

The two functions act as a pair. Setup_application gathers info (e.g., OMPI envars and endpt info), and setup_local_support delivers that info. This allows the remote daemons to put the info into the job-info for the procs to use, and do anything else the various plugins may require.

Hope that helps
Ralph
Comment 12 Boris Karasev 2020-06-04 03:53:12 MDT
Created attachment 14511 [details]
0001-mpi-pmix-fixed-detection-of-version-4-PMIx-lib.patch
Comment 13 Boris Karasev 2020-06-04 03:53:38 MDT
Created attachment 14512 [details]
0002-common-added-base64-encode-decode.patch
Comment 14 Boris Karasev 2020-06-04 03:53:57 MDT
Created attachment 14513 [details]
0003-common-added-slurm_env_get_val_maxlen.patch
Comment 15 Boris Karasev 2020-06-04 03:54:25 MDT
Created attachment 14514 [details]
0004-mpi-pmix-added-support-PMIx_server_setup_application.patch
Comment 16 Boris Karasev 2020-06-04 04:28:28 MDT
These 4 patches update the support of `PMIx_server_setup_application` API. 
Short description:
The srun process calls `PMIx_server_setup_application`, it generates PMIx info keys, which pack into a blob. This blob divides into several chunks (if the size of the entire blob exceeds the allowed env size) that add to multiple env variables. These envs are delivered to the slurmstepd process and then they unpack and pass to `PMIx_server_setup_local_support`.
Comment 17 Boris Karasev 2020-06-15 02:13:47 MDT
Created attachment 14666 [details]
0002-common-added-base64-encode-decode.patch

Updated `0002-common-added-base64-encode-decode.patch` to avoid compilation warnings.
Comment 18 Danny Auble 2020-09-17 13:48:34 MDT
Created attachment 15930 [details]
original patchset with autoreconf ran and fixed for current master

Boris, thanks for this, sorry it has taken so long to look at it.  It looks like there were issues putting into master (which is where this will go for 20.11).  Attached is a patch that will fix this to at least apply to master.

I am wondering if putting your base64 [en|de]code to common is a good idea.  Would it make more sense to just add it to pmix as it is the only thing presently using it (It looks like it is only ever used in pmix_client.c - though src/common/base64_encode.h is included in pmix_server.c)?  Or do you have plans in the future for it?

You will also notice in pmix_jobinfo_t now has a structure 'slurm_step_id_t step_id' that contains job_id and step_id.  Could you alter your new pmixp_srun_info_t to do the same kind of thing?  Or even better, perhaps make a structure that is also shared by pmix_jobinfo_t that contains the duplicate information as it seems pmixp_srun_info_t is really a subset of pmix_jobinfo_t where we can potentially reuse code instead of having code that looks duplicate.  Also it looks like your code will no longer compile because of this change.

If in the future instead of giving multiple patches that belong in the same patch set we now are requesting 1 file created with

git format-patch -<commit_cnt> --stdout > bug<bugid>_<slurm_release>[_<description>]_v#.patch
E.g. bug7263_2011_v3.patch

Let me know if you need anything else.  Thanks!
Comment 19 Danny Auble 2020-09-17 13:49:19 MDT
Comment on attachment 11047 [details]
Modifications needed for pmix v4

Based on the other patch I don't believe this patch is valid anymore.
Comment 20 Danny Auble 2020-11-24 15:14:29 MST
Boris, more insight here would be nice.  Thanks :).
Comment 21 Artem Polyakov 2020-11-25 13:10:24 MST
Danny,
Sorry that it fell out of my radar (I wasn't on the CC list).
I am there now and will expedite the process now.
Comment 22 Boris Karasev 2020-12-01 21:56:19 MST
Created attachment 16911 [details]
bug7263_2011_fixed_remarks.patch

My apologies for the delay. Attaching the updated patch with fixed Danny remarks.
Comment 23 Artem Polyakov 2020-12-02 17:56:45 MST
Danny,
While doing an extra internal review we spot few places that needs slight refactoring.
Please expect the new patch from Boris till the end of the week.
Comment 24 Artem Polyakov 2020-12-14 18:35:21 MST
Update:
Unfortunately it is taking longer than I've expected.
After completing the refactoring we faced stability issues.
Boris is working on them right now.
Comment 25 Danny Auble 2020-12-17 05:46:52 MST
Thanks for the update Artem, I look forward to see what you all will come up with.
Comment 26 Boris Karasev 2021-01-25 02:52:17 MST
Created attachment 17592 [details]
bug7263.patch

Danny, please review the updated patch.
Comment 27 Tim Wickberg 2021-02-01 13:56:52 MST
*** Ticket 10683 has been marked as a duplicate of this ticket. ***
Comment 28 Christian Goll 2021-02-23 09:42:32 MST
The patch bug7263.patch does not apply to the released slurm 20.11.4 any more.
@Boris and @Artem
When you refactor the patch, can you please make sure that before the libpmix.so with the absolute path is loaded, the loader also checks for a systemwide versioned libpmix.so.SO_VERSION?
For the SUSE slurm rpm we have following lines in mpi_pmix.c:109
#ifdef PMIXP_V1_LIBPATH
  xstrfmtcat(full_path, "%s/", PMIXP_V1_LIBPATH);
#elif defined PMIXP_V2_LIBPATH
  xstrfmtcat(full_path, "%s/", PMIXP_V2_LIBPATH);
#elif defined PMIXP_V3_LIBPATH
  xstrfmtcat(full_path, "%s/", PMIXP_V3_LIBPATH);
#define PMIX_SO_STRING ".2"
#endif
  xstrfmtcat(full_path, "libpmix.so");
#ifdef PMIX_SO_STRING
  lib_plug = dlopen("libpmix.so"PMIX_SO_STRING, RTLD_LAZY | RTLD_GLOBAL);
  if(!lib_plug)
#endif
    lib_plug = dlopen(full_path, RTLD_LAZY | RTLD_GLOBAL);
Comment 29 Artem Polyakov 2021-02-23 17:49:29 MST
Hi, 
I’m not sure this is the right approach.

If user wants system-wide libpmix, he should build Slurm against it and then this lib’s full path will be hard coded in the instance of plugin.

If we go your route the plugin will always pick the system-wide for all plugin instances. Even if the instance is not compatible.

Please correct me if I’m missing something.

The option to work against the system-wide thing would be to distribute the plugin as a separate package that is built against the PMIx that is also shipped by the distro
Comment 30 Christian Goll 2021-02-24 01:31:08 MST
(In reply to Artem Polyakov from comment #29)
> Hi, 
> I’m not sure this is the right approach.
> 
> If user wants system-wide libpmix, he should build Slurm against it and then
> this lib’s full path will be hard coded in the instance of plugin.
> 
But the plain libpmix.so resides on all distributions in the -dev[el] package, so this package would have to be installed on all the nodes, pulling a lot of big and not needed other devel packages.
> If we go your route the plugin will always pick the system-wide for all
> plugin instances. Even if the instance is not compatible.
> 
So that's exactly the reason to load a versioned libpmix.so.$APIVERS, so that only the compatible libpmix.so.$APIVERS is loaded. This is the only way that two different API versions can coexist.
> Please correct me if I’m missing something.
> 
> The option to work against the system-wide thing would be to distribute the
> plugin as a separate package that is built against the PMIx that is also
> shipped by the distro
This is what we are doing right now. The problem here would be that for example
libpmix4-devel.rpm and libpmix3-devel.rpm would both carry libpmix.so and the package manager (zypper/dnf/apt/whatever) could not find which version of libpmix.so is needed, in contrast when loading libpmix.so.$APIVERS, the package manager knows which version to install.
Comment 31 Artem Polyakov 2021-02-24 10:14:55 MST
Oh, I see now what is the problem

This is the essence of the request, right:
...
#define PMIX_SO_STRING ".2"
#endif
  xstrfmtcat(full_path, "libpmix.so");
#ifdef PMIX_SO_STRING
  lib_plug = dlopen("libpmix.so"PMIX_SO_STRING, RTLD_LAZY | RTLD_GLOBAL);
  if(!lib_plug)
#endif

We want to append .2 for PMIx v3 and the proper value for v4.

Sure, we will do that. Thank you for the feedback. I'm having a knowledge gap in the area of versioning the libraries. Thanks for the education!
Comment 32 Christian Goll 2021-02-25 11:17:21 MST
(In reply to Artem Polyakov from comment #31)
> Oh, I see now what is the problem
> 
> This is the essence of the request, right:
> ...
> #define PMIX_SO_STRING ".2"
> #endif
>   xstrfmtcat(full_path, "libpmix.so");
> #ifdef PMIX_SO_STRING
>   lib_plug = dlopen("libpmix.so"PMIX_SO_STRING, RTLD_LAZY | RTLD_GLOBAL);
>   if(!lib_plug)
> #endif
> 
> We want to append .2 for PMIx v3 and the proper value for v4.
>
These values do unfortunately not correspond to the library version in an easy way and also keep me confused all the time. 
> Sure, we will do that. Thank you for the feedback. I'm having a knowledge
> gap in the area of versioning the libraries. Thanks for the education!
I also had to read through the man pages, as normally the dynamic library loader does the job for you and you have not to worry about the plain libfoo.so.$VERS files.
The whole absolute path in loading the library is quite uncommon and I would guess it is here in the code, as in the previous slurm version the m4/autoconf/configure code was different and did not create environment variables which could be easily consumed, 
With the actual version you could run for example into the problem, when you compile the code in your home, these absolute path is in the binaries and on a parallel job execution N-jobs from N-nodes access your home.
Comment 33 Danny Auble 2021-03-10 14:36:26 MST
Artem, were you going to give me another patch on this?
Comment 34 Artem Polyakov 2021-03-10 14:45:32 MST
Not on this as long as it applies ok.
If not we will rebase.

The issue raised by Christian should be a separate issue IMO
Comment 35 Danny Auble 2021-03-10 14:53:17 MST
Created attachment 18349 [details]
v2

Thanks for clarifying, since it was discussed here I figured it was also to be solved here.  I am totally fine having it in a separate issue.

Attached is a non-reviewed patch that does go in cleanly to 20.11.
Comment 36 Danny Auble 2021-03-10 14:58:45 MST
As I am looking at this I would feel much better if we pushed this to 21.08 instead of 20.11 as the changes are a little more involved than I want to put into a branch that is fairly old now.  Will that be ok with you?
Comment 37 Artem Polyakov 2021-03-10 22:30:31 MST
PMIx v4 is out. Delaying it till August might restrict users.
Comment 38 Danny Auble 2021-03-11 08:49:12 MST
I completely understand delaying it's inclusion will delay it's use.  My main (massive) concern is it may impact those not using v4 yet.

For instance, this current patch against v3 stalls the job indefinitely.

A simple (pmix_v[2|3])

srun hostname

hangs with no progress.  The logs from the stepd are

[Mar 11 08:33:31.442283 3687952 slurmstepd   0x7f1ee5eafbc0] [3768.0] debug:  mpi/pmix_v3: p_mpi_hook_slurmstepd_prefork: (null) [0]: mpi_pmix.c:162: start
[Mar 11 08:33:31.442326 3687952 slurmstepd   0x7f1ee5eafbc0] [3768.0] debug:  mpi/pmix_v3: pmixp_info_nspace_usock: setup sockets
[Mar 11 08:33:31.444260 3687952 slurmstepd   0x7f1ee4d6d640] [3768.0] debug:  mpi/pmix_v3: _errhandler_reg_callbk: snowflake7 [0]: pmixp_client_v2.c:82: Error handler registration callback is called with status=0, ref=0

At this point the stepd seems to have disappeared with no core file.  I have not done any investigation on the matter past running this multiple times and looking at the log file.

I would hate for this to happen to someone running happily with a current 20.11.

If this patchset were completely v4 specific I could live with it going into 20.11, but as it affect other versions it makes me extremely nervous in a midstream release to include it.
Comment 39 Artem Polyakov 2021-03-11 09:19:19 MST
I see, fair enough.

Let us have a look into what is happening.
I would like to reproduce the problem that you observe so we can add some error detection logic to fail with some meaningful info.

We will try to reproduce it locally. But because Boris tested it with all PMIx versions, I suspect it might be an exotic combination of stale/new PMIx installation or something. Given that, can I ask you to keep this failing combination so in case we can't reproduce locally, we can have a screen-sharing session to investigate (if you'll have the time of course).
Comment 40 Artem Polyakov 2021-03-11 09:20:28 MST
Regarding 21.08 - I agree that we probably should delay it given your current observations and given that I'm not sure when Boris can revisit this.
Comment 41 Danny Auble 2021-03-11 09:27:16 MST
Of course.  If it matters the pmix v2 I have is 2.1.3 commit 9535456d5125f and v3 is 3.2 commit 742187e65c99b3c.  I will not touch those until this is figured out.

For me it was reproducible 100%.  The moment I reverted the patchset here all worked as expected again.

Thank you for understanding the delay.  As said I was okay with it in 20.11 if it was completely isolated, but I am erring on the side of caution.
Comment 42 Christian Goll 2021-03-15 03:10:48 MDT
(In reply to Artem Polyakov from comment #34)
> Not on this as long as it applies ok.
> If not we will rebase.
> 
> The issue raised by Christian should be a separate issue IMO

Sure, I will open a seperate issue. when ist clear which pmix version goes in which slurm version.
Comment 43 Danny Auble 2021-03-23 14:43:40 MDT
Artem/Boris, please add me back when you have a v3.

Thanks!
Comment 44 Felip Moll 2021-04-29 10:26:42 MDT
*** Ticket 11489 has been marked as a duplicate of this ticket. ***
Comment 45 Isaias Compres 2021-08-26 07:27:36 MDT
Greetings everyone,

I will be working with future releases Open PMIx, and need Slurm's build system to allow versions 5.0 and later.  I think having specific code for each version of Open PMIx is high in maintenance effort, and makes it a bit difficult to test with pre-releases.

I have produced some changes to remove the version checks and have a single mpi_pmix.so plugin produced, regardless of the Open PMIx version provided (although incompatible with 1.x releases).

I have a couple of questions:
1.- Is this change OK with everyone?
2.- Should I upload a patch here, or would you prefer that I create a separate bug?

Once this is sorted, I will pick up this bug and try to update the patches as necessary (Artem and Boris are aware of this).
Comment 46 Artem Polyakov 2021-08-26 08:18:55 MDT
Here are my thoughts:

1. From my perspective, it’s ok to drop v1.x compatibility. However, I still see customers looking for PMI2 support for their own reasons, so wouldn’t be surprised if someone wants to stick to PMIx v1.x for some reason. But I’ll let SchedMD to advise here. They may have more data.

2. Regarding multiple versions: I’ve seen people running multiple versions on their systems and thought that ability to produce multiple plugins automatically is a good feature. The same can be achieved by having multiple builds and manual renaming of the plugins, but it requires more effort.
I also don’t see too much maintenance burden so far. What we can have is an additional option for pre-releases maybe.
Comment 47 Isaias Compres 2021-08-26 09:18:17 MDT
(In reply to Artem Polyakov from comment #46)
> Here are my thoughts:
> 
> 1. From my perspective, it’s ok to drop v1.x compatibility. However, I still
> see customers looking for PMI2 support for their own reasons, so wouldn’t be
> surprised if someone wants to stick to PMIx v1.x for some reason. But I’ll
> let SchedMD to advise here. They may have more data.

Hi Artem, do you mean PMI2, or PMIx version 2?
Comment 48 Tim Wickberg 2021-08-26 13:35:06 MDT
(In reply to Artem Polyakov from comment #46)
> Here are my thoughts:
> 
> 1. From my perspective, it’s ok to drop v1.x compatibility. However, I still
> see customers looking for PMI2 support for their own reasons, so wouldn’t be
> surprised if someone wants to stick to PMIx v1.x for some reason. But I’ll
> let SchedMD to advise here. They may have more data.

PMI2 is certainly not going away, but that's off in a separate plugin.

I assume what you're talking about is the PMI2-compatibility mode that I believe existed in PMIx v1.x? I don't see a problem dropping that.

> 2. Regarding multiple versions: I’ve seen people running multiple versions
> on their systems and thought that ability to produce multiple plugins
> automatically is a good feature. The same can be achieved by having multiple
> builds and manual renaming of the plugins, but it requires more effort.
> I also don’t see too much maintenance burden so far. What we can have is an
> additional option for pre-releases maybe.

Artem covered this nicely - I would prefer to maintain the parallel plugin infrastructure at this point.

Especially if/when more MPI stacks move to adopt PMIx, I expect we'll continue to see a mix of version requirements on production systems to support them, and at the moment the way we handle this seems best suited to address that.
Comment 49 Ralph Castain 2021-08-29 12:07:25 MDT
I think there is some fundamental confusion in this thread, probably because it has been around a long time and people have forgotten/confused some things over the years. Let me try to clarify a few things.

First, PMIx makes great effort to maintain cross-version compatibility. Thus, there is no requirement or reason for the PMIx server library (linked to the Slurm daemon) to be the same version as the PMIx client library used by the application. The two libraries perform a handshake to ensure they select the necessary communication compatibility options.

The initial rationale for multiple PMIx Slurm plugins therefore had nothing to do with what the application was doing. Rather, it reflected an architectural decision based on how to deal with API additions between PMIx versions - i.e., it was focused solely on the Slurm integration question. The current approach allowed the plugins to be individually dedicated to a particular PMIx major release series.

Thus, the pmix1 plugin supports only the PMIx v1.x major release series. The APIs and server integration functions used in that plugin are those available in v1. Similarly, the pmix2 plugin supports all the v1 APIs, but adds some support for APIs introduced in PMIx v2. The pmix3 plugin includes all the support from the v2 plugin, and adds some support for APIs introduced in PMIx v3. And so on for the pmix4 plugin.

In other words, the plugins have a growing degree of duplication across them. The only differences lie in any additional support provided for new APIs introduced in the associated PMIx major release. So over time, we wind up with each plugin becoming a superset of all preceding plugins, plus any additions.

What Isaias is proposing is simply a different approach to handling the introduction of APIs across major releases. He would replace the code duplication by using "#if" statements as required to protect APIs across versions. So a sys admin would configure the Slurm support against whatever PMIx release they installed, the configure logic would detect the API version, and the code would ensure that only the appropriate version level APIs were compiled.

There are a couple of advantages to this approach. Obviously, it eliminates some code duplication, but I think the biggest advantage lies in reducing user confusion. There is no need for a user to specify (via the "-mpi=" option) which version of PMIx Slurm should use - they only need ask that PMIx support be used. As I noted earlier, the PMIx server and client libraries will automatically resolve the cross-version question.

Having multiple options can, therefore, actually cause confusion, especially for container users and those who build Open MPI using its embedded support libraries. In both cases, it is highly unlikely that the PMIx version will be the same as the sys admin used for Slurm. This is why PMIx takes such pain to continuously test cross-version support. Requiring the user to specify what PMIx version to use on the Slurm end only implies that the PMIx libraries are not compatible, and can cause users to think that their app must use the same PMIx version that they are telling Slurm to use (which is something we do see on the mailing lists).

I understand the potential negatives of disturbing existing code, so I realize this isn't a "slam dunk" decision. However, I think the advantages may outweigh the risks here.
Comment 50 Artem Polyakov 2021-08-29 13:23:03 MDT
> First, PMIx makes great effort to maintain cross-version compatibility. Thus, there is no requirement or reason for the PMIx server library (linked to the Slurm daemon) to be the same version as the PMIx client library used by the application. The two libraries perform a handshake to ensure they select the necessary communication compatibility options.

I still se advantage of being able to build Slurm against multiple PMIx versions for the sake of transitioning between the versions. Also running mismatch configuration where the RM-side PMIx is older may result in the transparent loss of functionality/performance.

> Thus, the pmix1 plugin supports only the PMIx v1.x major release series. The APIs and server integration functions used in that plugin are those available in v1. Similarly, the pmix2 plugin supports all the v1 APIs, but adds some support for APIs introduced in PMIx v2. The pmix3 plugin includes all the support from the v2 plugin, and adds some support for APIs introduced in PMIx v3. And so on for the pmix4 plugin.

> In other words, the plugins have a growing degree of duplication across them. The only differences lie in any additional support provided for new APIs introduced in the associated PMIx major release. So over time, we wind up with each plugin becoming a superset of all preceding plugins, plus any additions.

> What Isaias is proposing is simply a different approach to handling the introduction of APIs across major releases. He would replace the code duplication by using "#if" statements as required to protect APIs across versions. 

I think this is what we do today. See the v2 file below that accomodates both v2 & v3 and (in this patchset) also accomodates v4.
https://github.com/artpol84/slurm/blob/master/src/plugins/mpi/pmix/pmixp_client_v2.c
So there is no code duplication. What we do is we just build multiple instances against multiple PMIx versions when requested.
Am I missing something?
Comment 51 Ralph Castain 2021-08-29 19:48:11 MDT
> Also running mismatch configuration where the RM-side PMIx is older may result in the transparent loss of functionality/performance.

I'm not sure I understand this comment. The Standard specifically requires that any inability to perform a request return a "not supported" error. Certainly, the PMIx server library is setup to do that, so I don't see how any functional loss can be transparent.

> So there is no code duplication. What we do is we just build multiple instances against multiple PMIx versions when requested.

I think this is where we don't agree. I don't believe building multiple instances really gains anything, and it does seem to generate at least some user confusion. However, I have no strong objection to you continuing to do so.

What I propose is a simple compromise. Leave the current plugins as-is and allow Isaias to add a new "generic" plugin that is activated with "-mpi=pmix" - i.e., no specification of version. We can work the configuration logic so that sys admins get the generic plugin in addition to anything they specifically build, and Isaias can ensure it builds and supports all versions within the range of support.

Seems to me this gets everyone what they want. You can continue to support the build of multiple version-specific PMIx plugins that are activated by "-mpi=pmixN". Isaias can create one generic plugin that is more easily extended to meet the needs of his projects and is activated by "-mpi=pmix".

Make sense?
Comment 52 Isaias Compres 2021-08-30 02:31:52 MDT
> 
> What I propose is a simple compromise. Leave the current plugins as-is and
> allow Isaias to add a new "generic" plugin that is activated with
> "-mpi=pmix" - i.e., no specification of version. We can work the
> configuration logic so that sys admins get the generic plugin in addition to
> anything they specifically build, and Isaias can ensure it builds and
> supports all versions within the range of support.
> 
> Seems to me this gets everyone what they want. You can continue to support
> the build of multiple version-specific PMIx plugins that are activated by
> "-mpi=pmixN". Isaias can create one generic plugin that is more easily
> extended to meet the needs of his projects and is activated by "-mpi=pmix".
> 
> Make sense?

I think this is a good compromise.  For me this is very good, since we need to work with the bleeding edge at head.  Every version bump would require an update to the build system, with the current solution.

Let us preserve all functionality by Artem and Boris, and have a latest version case: 
If a list of available Open PMIx libraries is passed, 1 plugin per version available is produced appended with _V<major>, and the _latest_ version will be available both _with and without_ the _V<latest>.

What does everyone think about this?
Comment 53 Artem Polyakov 2021-08-30 08:14:11 MDT
Well,
We actually create a symbiotic link to the PMIx with the highest version and call it just pmix. I think it does what you want to achieve.
Comment 54 Artem Polyakov 2021-08-30 08:23:46 MDT
I mean even right now you should get a plugin with name “pmix” in the list of available.
Comment 55 Artem Polyakov 2021-08-30 08:27:57 MDT
So the question that remains is: how do we enable building non-released versions of PMIx with Slurm.
In the past it was complicated as PMIx is changing its server-side API and a the plugin wasn’t able to keep up with it.
This issue demonstrates it.

Isaias, how do you see the plugin working with the bleeding edge PMIx then? It’s a matter of luck, whether or not the server-side API will remain in a new major release.
Comment 56 Ralph Castain 2021-08-30 08:49:04 MDT
Confused - by "the server-side API" you mean the server-provided function pointers? We add to those, but I cannot recall changing them. Nobody is suggesting that SchedMD release a plugin that interfaces to the raw PMIx master branch.

All I am trying to advocate is:

(a) I'd like to move towards getting rid of these version-specific libraries as they are causing confusion amongst packagers and users. At some point, it would be nice to no longer have to answer cross-version compatibility questions and explain why a packager doesn't need the extra versions. I realize this is a long-term debate, so I don't expect to "win" it today. Hopefully, word is spreading and it will resolve itself.

(b) I'd like to allow Isaias to approach the plugin with his own "fresh" eyes, based on what he wants to achieve for the Euro projects he is supporting. I see no harm in taking a fresh cut at it so long as we provide ways to distinguish them.

HTH
Comment 57 Artem Polyakov 2021-08-30 08:55:32 MDT
Regarding server-side API.
In my understanding PMIx has 2 apis: client- and server-side.
RMs are using the server-side. This is how PMIx plugin interface PMIx library.

If the server-side API would be compatible we wouldn’t have this issue open and holding support for PNIx v4. Correct me if I’m wrong, but few new things are required to support v4 properly.

Regarding (b), I don’t mind Isaias active participation as I already told you. What is important for me is to avoid revolutionary changes disturbing existing users.
Comment 58 Isaias Compres 2021-08-30 09:06:46 MDT
(In reply to Artem Polyakov from comment #55)
> So the question that remains is: how do we enable building non-released
> versions of PMIx with Slurm.
> In the past it was complicated as PMIx is changing its server-side API and a
> the plugin wasn’t able to keep up with it.
> This issue demonstrates it.
> 
> Isaias, how do you see the plugin working with the bleeding edge PMIx then?
> It’s a matter of luck, whether or not the server-side API will remain in a
> new major release.

I Artem, I have only tested MPI launchs without dynamic processes.  That is, no Spawn, connect/accep or join MPI operations.

It works fine in my test setup with around 10 nodes.  That is head Slurm and head Open PMIx with both MPICH and Open MPI tested.
Comment 59 Isaias Compres 2021-08-30 09:13:15 MDT
Hi Artem,

just for clarity: 

I have tested first without the patches of this PR.  Basically Slurm unmodified at HEAD with only small changes related to the build (basically the version check removed so that the mpi/pmix plugin is generated with HEAD Open PMIx).

My understanding is that this PR request adds further integration.
Comment 60 Isaias Compres 2021-08-30 09:29:45 MDT
(In reply to Artem Polyakov from comment #57)
> Regarding server-side API.
> In my understanding PMIx has 2 apis: client- and server-side.
> RMs are using the server-side. This is how PMIx plugin interface PMIx
> library.
> 

There is also the "tool" API, in addition to the server and client, that in the case of Slurm would be used by the srun command, among other cli commands.


> If the server-side API would be compatible we wouldn’t have this issue open
> and holding support for PNIx v4. Correct me if I’m wrong, but few new things
> are required to support v4 properly.

I defer to Ralph and you on this.  

My current tests are passing.  Maybe I should make MPI_Init and MPI_Session_init to be more thorough.


> 
> Regarding (b), I don’t mind Isaias active participation as I already told
> you. What is important for me is to avoid revolutionary changes disturbing
> existing users.

I attest all of you have been very gracious and provided a lot of info.
Comment 61 Artem Polyakov 2021-08-30 09:47:08 MDT
I see.
I don’t mind having a minimal change commit that allows PMIx v4 support then.
I think we should open a separate issue for it to avoid confusion. This one we will bring up to speed once the other one is merged.
Comment 62 Ralph Castain 2021-08-30 09:53:41 MDT
(In reply to Artem Polyakov from comment #57)
> Regarding server-side API.
> In my understanding PMIx has 2 apis: client- and server-side.
> RMs are using the server-side. This is how PMIx plugin interface PMIx
> library.
> 
> If the server-side API would be compatible we wouldn’t have this issue open
> and holding support for PNIx v4. Correct me if I’m wrong, but few new things
> are required to support v4 properly.
> 

I believe you are a little confused, not wrong :-)

The server-side API is compatible and hasn't changed. What this PR was intended to do was _add_ support for _new_ APIs that would enhance what Slurm currently does. As Isaias has noted, there is no problem using older or newer versions of PMIx because the APIs haven't changed.

What Isaias wants to do is further extend Slurm's use of the PMIx APIs to continue to enhance what it offers. This includes the "instant on" integration in this PR, but also the tools interfaces and other features introduced in PMIx v4 and coming in PMIx v5.

Again, nothing has changed or will be changing. We will simply be adding to the capabilities.
Comment 63 Isaias Compres 2021-08-30 10:59:52 MDT
(In reply to Ralph Castain from comment #62)
> 
> What Isaias wants to do is further extend Slurm's use of the PMIx APIs to
> continue to enhance what it offers. This includes the "instant on"
> integration in this PR, but also the tools interfaces and other features
> introduced in PMIx v4 and coming in PMIx v5.
> 
> Again, nothing has changed or will be changing. We will simply be adding to
> the capabilities.

It's likely that more folks from the EuroHPC projects will get involved in the future :) 

There is currently a rather diverse amount of research and Slurm has been selected as a common workload manager, and Open PMIx as an interface for tools and runtime systems.

Of course, these efforts are mainly research, so I don't believe we will be pushing all results back to Slurm upstream.  Only high quality validated changes should go into a production ready project like Slurm.  So I understand and support Artem's (and SchedMD) concerns.
Comment 64 Artem Polyakov 2021-08-30 12:42:36 MDT
I’ve got the point about an API, and as I’ve said above we can have a separate PR to just enable PMIx v4 without new functionality.

I’m looking forward seeing more integration work from Isaias and his team.
Comment 65 Tim Wickberg 2021-08-30 12:51:48 MDT
Is there any document that clearly outlines the versioning strategy for PMIx, as well as how deprecated / removed / behaviorally APIs will be handled in the future?

Correct me if I'm wrong, but it sounds like v2/v3/v4 all expand the API but have not yet removed or drastically changed support for prior calls. At which point I can understand why the repetition has been annoying.

But, long-term, I'm assuming something will need to break in a way that is not perfectly cross-compatible, and I believe this assumption is what is driving the version-specific approach taken so far in Slurm.
Comment 66 Ralph Castain 2021-08-30 13:45:46 MDT
(In reply to Tim Wickberg from comment #65)
> Is there any document that clearly outlines the versioning strategy for
> PMIx, as well as how deprecated / removed / behaviorally APIs will be
> handled in the future?
> 
> Correct me if I'm wrong, but it sounds like v2/v3/v4 all expand the API but
> have not yet removed or drastically changed support for prior calls. At
> which point I can understand why the repetition has been annoying.
> 
> But, long-term, I'm assuming something will need to break in a way that is
> not perfectly cross-compatible, and I believe this assumption is what is
> driving the version-specific approach taken so far in Slurm.

The PMIx Standard goes into the versioning strategy at some length. Basically, it mandates a rather laborious process for deprecation and removal of any API.

Beyond that, the PMIx library (OpenPMIx) has stated (publicly in presentations and on the web site) that it will maintain backward compatibility across all versions - i.e., it will retain definitions and support for APIs deprecated/removed from the Standard to avoid breaking applications and RMs. We have done so to date for that very reason.

All that said, nothing in life is forever. So it is indeed possible that someday in the future, something will get removed from the library (I see no motivation for somebody to just change something, nor does the Standard allow that - it has to be removed and then replaced with something else as opposed to just changed). Probably somebody trimming the deprecated list by removing things as I suspect that would be the most likely motivation.

Should that happen someday, it could possibly force a change in the plugin. However, if we think that thru, it becomes a question of Slurm-PMIx release compatibility at time of plugin compile. It isn't an issue for the application. Thus, it becomes a simple note in the Slurm release (checked in PMIx plugin config) regarding supported releases. Pretty much what you have done today. If something gets removed from a later release, we'd still need to modify the plugin to protect against it going forward - and the sys admin can drop down to an earlier PMIx release once he/she sees that the plugin fails to compile on their older version of Slurm.

FWIW: in PRRTE, we simply protect APIs with "#ifdef" statements. So if an API goes away some day, that code block is just ignored. We don't expect that to happen - we use it to allow compiles against earlier versions of PMIx. Still, the principle is the same. One could easily apply it to the plugin for the same purposes and avoid the removal issue.

My concern with the version-specific approach as currently implemented is that the various version-specific plugins are getting exposed to the user. This creates an impression that the system cannot deal with cross-version situations, which is simply not the case. We shouldn't be asking the user to specify the PMIx support level in the server. If the server or Slurm cannot support a request, then they will return "not supported". It doesn't matter what version the user is using, nor what version the server is using.

Which is why we keep getting questions from the Slurm community and packagers (e.g., EasyBuild) about why they have to rebuild their app against the PMIx version Slurm is using, even though we claim to support cross-version operations. Sigh.

HTH
Comment 67 Isaias Compres 2021-08-31 04:51:51 MDT
Following Artem's advice, I have created a separate bug here:
https://bugs.schedmd.com/show_bug.cgi?id=12396

It's a small patch that only touches the build system for mpi/pmix, so hopefully easy to review.
Comment 68 Danny Auble 2021-10-28 10:51:34 MDT
Isaias, could you please add me as the Assignee here when you have something more?

As of commit fbbcddcb9b v4 no longer will register as a valid option in configure.  This was primarily done to avoid people using the package given from a standard repo.  We found today debian/ubuntu were shipping version 4.1.0.  Based on bug 12396 comment 3 v4 has always had issues compiling with this error...

../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function ‘pmixp_lib_init’:
../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:252:9: error: implicit declaration of function ‘PMIx_Register_event_handler’ [-Werror=implicit-function-declaration]
  252 |         PMIx_Register_event_handler(NULL, 0, NULL, 0, _errhandler,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function ‘pmixp_lib_finalize’:
../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:262:9: error: implicit declaration of function ‘PMIx_Deregister_event_handler’ [-Werror=implicit-function-declaration]
  262 |         PMIx_Deregister_event_handler(0, _op_callbk, NULL);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Until v4 can officially be supported we felt it better to just remove any feeling that it actually was suppose to work.
Comment 69 Ralph Castain 2021-10-28 11:17:53 MDT
(In reply to Danny Auble from comment #68)
> Isaias, could you please add me as the Assignee here when you have something
> more?
> 
> As of commit fbbcddcb9b v4 no longer will register as a valid option in
> configure.  This was primarily done to avoid people using the package given
> from a standard repo.  We found today debian/ubuntu were shipping version
> 4.1.0.  Based on bug 12396 comment 3 v4 has always had issues compiling with
> this error...
> 
> ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function
> ‘pmixp_lib_init’:
> ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:252:9: error:
> implicit declaration of function ‘PMIx_Register_event_handler’
> [-Werror=implicit-function-declaration]
>   252 |         PMIx_Register_event_handler(NULL, 0, NULL, 0, _errhandler,
>       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function
> ‘pmixp_lib_finalize’:
> ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:262:9: error:
> implicit declaration of function ‘PMIx_Deregister_event_handler’
> [-Werror=implicit-function-declaration]
>   262 |         PMIx_Deregister_event_handler(0, _op_callbk, NULL);
>       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Until v4 can officially be supported we felt it better to just remove any
> feeling that it actually was suppose to work.

That is a bug in your plugin - probably failing to include <pmix.h>.
Comment 70 Carlos Tripiana Montes 2021-10-29 00:21:44 MDT
*** Ticket 12771 has been marked as a duplicate of this ticket. ***
Comment 71 Isaias Compres 2021-10-29 02:06:38 MDT
(In reply to Ralph Castain from comment #69)
> (In reply to Danny Auble from comment #68)
> > Isaias, could you please add me as the Assignee here when you have something
> > more?
> > 
> > As of commit fbbcddcb9b v4 no longer will register as a valid option in
> > configure.  This was primarily done to avoid people using the package given
> > from a standard repo.  We found today debian/ubuntu were shipping version
> > 4.1.0.  Based on bug 12396 comment 3 v4 has always had issues compiling with
> > this error...
> > 
> > ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function
> > ‘pmixp_lib_init’:
> > ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:252:9: error:
> > implicit declaration of function ‘PMIx_Register_event_handler’
> > [-Werror=implicit-function-declaration]
> >   252 |         PMIx_Register_event_handler(NULL, 0, NULL, 0, _errhandler,
> >       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> > ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c: In function
> > ‘pmixp_lib_finalize’:
> > ../../../../../../slurm/src/plugins/mpi/pmix/pmixp_client_v2.c:262:9: error:
> > implicit declaration of function ‘PMIx_Deregister_event_handler’
> > [-Werror=implicit-function-declaration]
> >   262 |         PMIx_Deregister_event_handler(0, _op_callbk, NULL);
> >       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > Until v4 can officially be supported we felt it better to just remove any
> > feeling that it actually was suppose to work.
> 
> That is a bug in your plugin - probably failing to include <pmix.h>.

Ralph beat me to it.  

The client header is missing. This precedes the patches added in Bug 12396. I can fix it there, to keep this patch set independent. 

I will follow your lead in testing with -Werror from now on.
Comment 72 Ralph Castain 2021-11-30 09:14:13 MST
Just FYI: we are continuing this work (including rebasing much of this patch) over here (https://github.com/slurm-pmix/slurm/)
Comment 73 Michael Hinton 2022-02-15 14:13:21 MST
*** Ticket 13438 has been marked as a duplicate of this ticket. ***