Ticket 13621

Summary: PMI2_Job_Spawn with info does not work with current MPICH
Product: Slurm Reporter: Hui Zhou <hzhou321>
Component: PMIxAssignee: Tim Wickberg <tim>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: bart
Version: 22.05.x   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: https://github.com/pmodels/mpich/pull/5909 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Hui Zhou 2022-03-15 11:34:46 MDT
The current PMI2 API exposes an MPICH internal structure -- MPID_Info. It is used in PMI2_Job_Spawn, PMI2_Nameserv_publish, and PMI2_Nameserv_lookup. The usage of info in the latter two functions (name service) is not essential and are likely ignored anyway. However, the spawn function often requires info for key parameters, e.g. hosts for where to launch. The current definition of MPID_Info must worked with MPICH at one point, but it is broken when MPICH changed it internally quite a few years ago.

Since it was a mistake to expose the MPID_Info in PMI2, moving forward, we believe the right fix is to define a public PMI2_keyval_t as

    typedef struct PMI2_keyval_t {
        char *key;
        char *val;
    } PMI2_keyval_t;

I believe this is what PMIx's PMI2 wrapper uses as well (before they removed PMI1/2 support).

This fix is not in MPICH yet but is planned for the next release. Slurm need make corresponding changes in order to work with MPICH for the next release -- it doesn't work with the current release anyway.

Cheers,

Hui Zhou
Principal Software Development Specialist
Mathematics and Computer Science
Argonne National Laboratory
TEL: 630-252-3430
Comment 1 Tim Wickberg 2022-04-08 19:54:22 MDT
I saw that this landed upstream. Can you want propose a patch to Slurm's implementation to correct this, ideally one that's been tested against upstream MPICH?

I can see there's a gap in both of our test harnesses here since this has obviously not worked right for a while. I'll be prodding our QA group on adding this (alongside some other PMI testing) at some point, but it'd be good to address before our next release in May if possible.

- Tim
Comment 3 Hui Zhou 2022-04-08 22:24:44 MDT
We are currently making a patch in MPICH, https://github.com/pmodels/mpich/pull/5909/commits/5864dcd0d1aeb27cca8b18756be127a0b51dc4c2, to detect Slurm and make mpich work with Slurm pmi2. So it is good for now. Of course, we would like to get rid of the patch at some point. But rather than patch PMI2, we think a better future plan is to unify the functionality of PMI1 and PMI2 into a backward compatible new PMI interface, at which point we can simply deprecate PMI2.

A note to Slurm: current Slurm documentation and option interface only refers to `pmi2`, but in fact it works with both the original `pmi` and `pmi2`. In fact, it appears Slurm's pmi-1 support is more stable than its pmi2 support. The latter will spew some mysterious debug messages. On our side, pmi2 never gets sufficiently tested and it is not feature-complete compared to pmi-1. For example, it does not support singleton-init. Thus, we would recommend users to use pmi-1 with Slurm. If Slurm would amend its documentation to at least say it supports both pmi1 and pmi2, that would avoid some confusion. And the option `--mpi=pmi2` is better to be renamed as `--mpi=pmi`. This will be better aligned with the direction of unifying the PMI interface.

-- 
Hui