Ticket 7355

Summary: excessive slurmctld starting time
Product: Slurm Reporter: IDRIS System Team <gensyshpe>
Component: slurmctld Assignee: Marcin Stolarek <cinek>
Status: RESOLVED INVALID QA Contact:
Severity: 3 - Medium Impact    
Priority: --- CC: cinek, nate
Version: 18.08.7   
Hardware: Linux   
OS: Linux   
Site: IDRIS

Description IDRIS System Team 2019-07-04 04:08:40 MDT
Hi,
We are installing Slurm 18.08.7 on a large configuration (about 1800 compute nodes) and we noticed that slurmctld takes quite a long time (around 30 minutes) to start or to re-read the configuration file slurm.conf.
Is this the startup time we should expect, or could it be the result of a problem?
What can we do to make slurmctld start faster?

Regards,

Philippe Collinet
Comment 1 Marcin Stolarek 2019-07-04 07:48:29 MDT
Could you please share the slurmctld log from startup, captured with debug2 enabled?

cheers,
Marcin
Comment 4 IDRIS System Team 2019-07-08 05:24:13 MDT
Hello,

We set up debug mode and it helped us.
We found that the address resolution of the node running slurmctld was not correct, and it had to wait for the second address for the resolution to succeed.

Once this issue was corrected, the startup time was back to normal.

Thanks for your help. We can close the bug.

Best regards,

Philippe Collinet
Comment 5 Marcin Stolarek 2019-07-08 05:35:30 MDT
Thanks for the update. I'm closing this bug report as invalid.

cheers,
Marcin
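
For anyone landing on this ticket with a similar symptom: the root cause here was slow or incorrect name resolution for the slurmctld host. A quick way to spot that kind of delay outside of Slurm is to time the resolver directly. The following is only a minimal sketch, not something taken from this ticket: the function name time_resolution and the host list are hypothetical, and port 6817 is assumed to be the slurmctld port (adjust it to your SlurmctldPort setting).

```python
import socket
import time


def time_resolution(hostname, port=6817):
    """Resolve hostname and return (elapsed_seconds, addresses, error).

    port=6817 is an assumption (the usual slurmctld port); it does not
    affect the lookup time, only the returned socket addresses.
    """
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the resolved address string.
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        addresses = []
        error = str(exc)
    elapsed = time.perf_counter() - start
    return elapsed, addresses, error


if __name__ == "__main__":
    # Hypothetical host list: replace with the controller and a few
    # compute node names from your slurm.conf.
    hosts = [socket.gethostname(), "localhost"]
    for host in hosts:
        elapsed, addresses, error = time_resolution(host)
        status = ", ".join(addresses) if error is None else f"FAILED ({error})"
        print(f"{host}: {elapsed:.3f}s -> {status}")
```

A lookup that takes seconds rather than milliseconds, or that fails outright, suggests checking /etc/hosts, /etc/resolv.conf and the DNS servers before looking at Slurm itself.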