I am trying to install Slurm 22.05.8 or 23.02.1 on CentOS 7, but for some reason PMIx is not being loaded.

From slurmctld.log:

    [2023-04-10T17:16:19.794] debug3: Trying to load plugin /usr/lib64/slurm/mpi_pmix_v4.so
    [2023-04-10T17:16:19.795] debug3: plugin_load_from_file->_verify_syms: found Slurm plugin name:PMIx plugin type:mpi/pmix_v4 version:0x170201
    [2023-04-10T17:16:19.795] error: mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:197: pmi/pmix: can not load PMIx library
    [2023-04-10T17:16:19.795] error: Couldn't load specified plugin name for mpi/pmix_v4: Plugin init() callback failed
    [2023-04-10T17:16:19.795] error: MPI: Cannot create context for mpi/pmix_v4

I have built both PMIx and Slurm:

    wget https://github.com/openpmix/openpmix/releases/download/v4.1.2/pmix-4.1.2.tar.gz
    tar -xzvf pmix-4.1.2.tar.gz
    cd pmix-4.1.2
    mkdir build
    cd build
    ../configure --prefix /opt/pmix/pmix-4.1.2
    make all
    make install

    wget https://download.schedmd.com/slurm/slurm-23.02.1.tar.bz2
    rpmbuild -ta slurm-23.02.1.tar.bz2 --with mysql --with slurmrestd --with jwt --with lua --with hdf5 --with hwloc --with numa --define '_with_pmix --with-pmix=/opt/pmix/pmix-4.1.2'
Additional information about the file locations:

    [ubuntu@juju-788468-0 slurm]$ pwd
    /usr/lib64/slurm
    [ubuntu@juju-788468-0 slurm]$ ls
    accounting_storage_none.so  cli_filter_syslog.so  job_container_cncu.so  openapi_dbv0_0_38.so  select_cons_res.so
    accounting_storage_slurmdbd.so  cli_filter_user_defaults.so  job_container_none.so  openapi_dbv0_0_39.so  select_cons_tres.so
    acct_gather_energy_gpu.so  core_spec_cray_aries.so  job_container_tmpfs.so  openapi_v0_0_37.so  select_cray_aries.so
    acct_gather_energy_ibmaem.so  core_spec_none.so  job_submit_all_partitions.so  openapi_v0_0_38.so  select_linear.so
    acct_gather_energy_ipmi.so  cred_munge.so  job_submit_cray_aries.so  openapi_v0_0_39.so  serializer_json.so
    acct_gather_energy_none.so  data_parser_v0_0_39.so  job_submit_lua.so  power_cray_aries.so  serializer_url_encoded.so
    acct_gather_energy_pm_counters.so  ext_sensors_none.so  job_submit_require_timelimit.so  power_none.so  serializer_yaml.so
    acct_gather_energy_rapl.so  ext_sensors_rrd.so  job_submit_throttle.so  preempt_none.so  site_factor_none.so
    acct_gather_energy_xcc.so  gpu_generic.so  libslurmfull.so  preempt_partition_prio.so  src
    acct_gather_filesystem_lustre.so  gres_gpu.so  libslurm_pmi.so  preempt_qos.so  switch_cray_aries.so
    acct_gather_filesystem_none.so  gres_mps.so  mcs_account.so  prep_script.so  switch_none.so
    acct_gather_interconnect_none.so  gres_nic.so  mcs_group.so  priority_basic.so  task_affinity.so
    acct_gather_interconnect_sysfs.so  gres_shard.so  mcs_none.so  priority_multifactor.so  task_cgroup.so
    acct_gather_profile_hdf5.so  hash_k12.so  mcs_user.so  proctrack_cgroup.so  task_cray_aries.so
    acct_gather_profile_influxdb.so  jobacct_gather_cgroup.so  mpi_cray_shasta.so  proctrack_cray_aries.so  task_none.so
    acct_gather_profile_none.so  jobacct_gather_linux.so  mpi_none.so  proctrack_linuxproc.so  topology_3d_torus.so
    auth_jwt.so  jobacct_gather_none.so  mpi_pmi2.so  proctrack_pgid.so  topology_hypercube.so
    auth_munge.so  jobcomp_elasticsearch.so  mpi_pmix.so  rest_auth_jwt.so  topology_none.so
    burst_buffer_datawarp.so  jobcomp_filetxt.so  mpi_pmix_v4.so  rest_auth_local.so  topology_tree.so
    burst_buffer_lua.so  jobcomp_lua.so  node_features_helpers.so  route_default.so
    cgroup_v1.so  jobcomp_mysql.so  node_features_knl_cray.so  route_topology.so
    cli_filter_lua.so  jobcomp_none.so  node_features_knl_generic.so  sched_backfill.so
    cli_filter_none.so  jobcomp_script.so  openapi_dbv0_0_37.so  sched_builtin.so
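The log shows the plugin file itself being found and the failure happening in its init step, with the message "can not load PMIx library". One thing worth ruling out is whether a `libpmix.so` actually exists under the PMIx install prefix on the machine where the daemon starts. A minimal sketch, assuming the library lands in `lib` or `lib64` under the `--prefix` passed to PMIx's configure; the helper name `find_pmix_lib` is hypothetical and is not part of Slurm or PMIx:

```shell
#!/bin/sh
# find_pmix_lib PREFIX
# Hypothetical helper: print the first libpmix.so found under PREFIX/lib or
# PREFIX/lib64, or return 1 if neither location contains it.
find_pmix_lib() {
    prefix=$1
    for dir in "$prefix/lib" "$prefix/lib64"; do
        if [ -e "$dir/libpmix.so" ]; then
            printf '%s\n' "$dir/libpmix.so"
            return 0
        fi
    done
    return 1
}
```

For the build in this post it could be run as `find_pmix_lib /opt/pmix/pmix-4.1.2 || echo "libpmix.so not found under prefix"` on each host that runs slurmctld or slurmd. If the file is present, `ldconfig -p | grep libpmix` shows whether the dynamic loader's cache also knows about it; whether that matters for this particular error is an assumption to verify, not something the log confirms.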
Hi @jaime,

I am having similar problems integrating PMIx with slurm-23.11.3. I tried compiling two PMIx versions:

    3.2.2
    4.2.8

Both compile fine, but when I try to integrate them into Slurm via RPM macros, the PMIx library does not show up in the libpmi RPM.

This is the command I am using to build the RPMs:

    rpmbuild --define '--with-pmix=/SOFTWARE/MPIx/openpmix-3.2.2_master/lib:/SOFTWARE/MPIx/openpmix-4.2.8_master/lib' -tb slurm-23.11.3.tar.bz2 --with mysql --with ofed 2>&1 | tee build.log

These are the PMI libraries I found inside the generated RPM; PMIx is nowhere to be found:

    rpm -qpl /root/rpmbuild/RPMS/x86_64/slurm-libpmi-23.11.3-1.el8.x86_64.rpm
    /usr/lib/.build-id
    /usr/lib/.build-id/e1/e445b724c722b88e13f39fcb27b7d30d32b8e1
    /usr/lib/.build-id/ea/01a8978a2b93754a7e71cc9048b75b71084513
    /usr/lib64/libpmi.so
    /usr/lib64/libpmi.so.0
    /usr/lib64/libpmi.so.0.0.0
    /usr/lib64/libpmi2.so
    /usr/lib64/libpmi2.so.0
    /usr/lib64/libpmi2.so.0.0.0
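The directory listing earlier in the thread shows the PMIx pieces installed as plugin files (`/usr/lib64/slurm/mpi_pmix.so` and `mpi_pmix_v4.so`), not as `libpmi` libraries, so it may be worth searching the file lists of every generated RPM, not only `slurm-libpmi`. A minimal sketch; the function names `pmix_plugins` and `scan_rpms` are hypothetical, and which package ends up holding the plugins is not confirmed anywhere in this thread:

```shell
#!/bin/sh
# Hypothetical filter: from a file list on stdin, keep only paths that look
# like Slurm PMIx MPI plugins (mpi_pmix.so, mpi_pmix_v3.so, mpi_pmix_v4.so, ...).
pmix_plugins() {
    grep -E '/mpi_pmix(_v[0-9]+)?\.so$'
}

# scan_rpms DIR
# Print every *.rpm directly under DIR whose file list contains a PMIx plugin.
scan_rpms() {
    for rpm_file in "$1"/*.rpm; do
        # Skip the unexpanded pattern when DIR holds no RPMs.
        [ -e "$rpm_file" ] || continue
        if rpm -qpl "$rpm_file" 2>/dev/null | pmix_plugins >/dev/null; then
            printf '%s\n' "$rpm_file"
        fi
    done
}
```

With the paths from this post, that would be `scan_rpms /root/rpmbuild/RPMS/x86_64`. If nothing is printed, the build may have been configured without PMIx at all. In that case, note that the first post passes the option as `--define '_with_pmix --with-pmix=/opt/pmix/pmix-4.1.2'`, with a macro name before the configure flag and an install prefix rather than a `lib` directory, whereas the command above has neither. Whether that difference is the cause is an assumption to check against `build.log`.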
I am also having issues with PMIx: slurmd couldn't load the specified plugin name for mpi/pmix_v4. Please see the output of systemctl status slurmd:

    slurmd: slurmd version 23.11.6 started
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:193: pmi/pmix: can not load PMIx library
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: Couldn't load specified plugin name for mpi/pmix_v4: Plugin init() callback failed
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: MPI: Cannot create context for mpi/pmix_v4
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:193: pmi/pmix: can not load PMIx library
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: Couldn't load specified plugin name for mpi/pmix: Plugin init() callback failed
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: error: MPI: Cannot create context for mpi/pmix
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: slurmd started on Mon, 09 Sep 2024 12:18:00 -0400
    Sep 09 12:18:00 c-22 systemd[1]: Started Slurm node daemon.
    Sep 09 12:18:00 c-22 slurmd[151268]: slurmd: CPUs=128 Boards=1 Sockets=2 Cores=32 Threads=2 Memory=515614 TmpDisk=374387 Uptime=780974 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)