Using Intel MPI 2021 update 6 with Slurm and seeing:

In: PMI_Abort(2664079, Fatal error in PMPI_Init_thread: Other MPI error, error stack:
7 MPIR_Init_thread(138).........:
6 MPID_Init(1117)...............:
5 MPIDI_SHMI_mpi_init_hook(29)..:
4 MPIDI_POSIX_mpi_init_hook(141):
3 MPIDI_POSIX_eager_init(2268)..:
2 MPIDU_shm_seg_commit(296).....: unable to allocate shared memory)
(In reply to Erin Boland from comment #0)
> Using Intel MPI 2021 update 6 with slurm and seeing:
>
> In: PMI_Abort(2664079, Fatal error in PMPI_Init_thread: Other MPI error,
> error stack:
> 7 MPIR_Init_thread(138).........:
> 6 MPID_Init(1117)...............:
> 5 MPIDI_SHMI_mpi_init_hook(29)..:
> 4 MPIDI_POSIX_mpi_init_hook(141):
> 3 MPIDI_POSIX_eager_init(2268)..:
> 2 MPIDU_shm_seg_commit(296).....: unable to allocate shared memory)

Can you please set this environment variable before running the job and try again?

export I_MPI_PMI_LIBRARY=/path_to_slurm/lib/libpmi2.so
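For anyone following along, here is a minimal sketch of how that setting might be applied a little more defensively in a job script. It only exports the variable if the library is actually readable at the expected location. The `SLURM_PREFIX` variable and the `check_pmi_lib` helper are hypothetical names introduced for illustration; `/path_to_slurm` is the placeholder from the comment above and has to be replaced with the real Slurm install prefix on your system.

```shell
# SLURM_PREFIX is a hypothetical helper variable; /path_to_slurm is the
# placeholder used in this thread, not a real path.
SLURM_PREFIX="${SLURM_PREFIX:-/path_to_slurm}"
PMI_LIB="$SLURM_PREFIX/lib/libpmi2.so"

# Hypothetical helper: succeeds only if the given path is a readable file.
check_pmi_lib() {
    [ -f "$1" ] && [ -r "$1" ]
}

if check_pmi_lib "$PMI_LIB"; then
    # Point Intel MPI at Slurm's PMI2 library, as suggested above.
    export I_MPI_PMI_LIBRARY="$PMI_LIB"
    echo "I_MPI_PMI_LIBRARY=$I_MPI_PMI_LIBRARY"
else
    echo "libpmi2.so not found at $PMI_LIB" >&2
fi
```

This only guards against a wrong or unset library path. It does not address the shared memory allocation failure itself, whose cause is not stated in this thread.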
Hi Felip,

I already had that environment variable set for this run.

Erin
We got a fix - going to close the bug.
(In reply to Erin Boland from comment #4)
> We got a fix - going to close the bug.

Hi Erin,

Can you explain what exactly the fix was? That could be useful in the future.
I am marking the bug as infogiven. If possible, I would appreciate some information about how you fixed the issue; it would be a great help to us for future responses and diagnostics. Thanks!