Hi, that is a good finding, let me run tests and investigate.

bugs@schedmd.com wrote:

Comment # 12 on bug 459 from
Hello David,

sorry for the long delay on this one. Here are the results of the tests you
asked for.

The tests were run on 20 nodes with 240 CPUs in total.

Version                  Average MPI_Init (sec)
Intel srun (libpmi)  :      2.82 
Intel mpirun         :      0.28
OpenMPI srun (libpmi):      2.72
OpenMPI mpirun       :      1.64


So, to answer your question: the tests indeed show that the degradation with
libpmi occurs with both Intel MPI and OpenMPI. BullxMPI is compiled with pmi2
by default, so it should not be used in the comparison.

By the way, a colleague at BULL has run tests measuring the time of Intel
srun with libpmi on a larger cluster, using different values of the PMI_TIME
variable, and it seems that lowering this variable improves the time
significantly:

Nodes  Ntasks  PMI_TIME=500 (sec)  PMI_TIME=10 (sec)
   10      40        4.485             3.696477
   20      80        5.320             3.396676
  100     400       11.966             2.217788
  400    1600      111.757             9.523558
  900    3600      551.599            30.90825

We are starting to use this variable to work around the delays.
The default value of PMI_TIME in slurm/src/api/slurm_pmi.c is 500. Do you
think we should drop this to a lower value, or should we instead work on
optimizing the logic in the _delay_rpc() function, which makes use of
PMI_TIME?
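For reference, the workaround we are applying is simply to override PMI_TIME
in the job environment before launching. A minimal sketch (the node/task
counts match our test setup; the application name is made up, and the srun
line is commented out here):

```shell
# Override PMI_TIME before launching with srun; default is 500.
export PMI_TIME=10
# srun -N 20 -n 240 ./mpi_app
echo "PMI_TIME=$PMI_TIME"
```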

Thanks
Yiannis

