I am working with Jump Trading, and they reported that when using "systemctl reload slurmctld" after adding, removing, or modifying a node, slurmctld exits. A SIGHUP is sent and slurmctld terminates. I have reproduced this as well. If the reload is done when no node changes have been made, slurmctld does not exit. They are now aware that they need to do a restart instead, but having the daemon terminate itself after a SIGHUP was not expected behavior.

[2021-04-01T12:36:20.299] Reconfigure signal (SIGHUP) received
[2021-04-01T12:36:20.300] debug:  Reading slurm.conf file: /opt/slurm/fpip1_testing_default/etc/slurm.conf
[2021-04-01T12:36:20.300] debug:  NodeNames=fpip1-compute0002 setting Sockets=35 based on CPUs(35)/(CoresPerSocket(1)/ThreadsPerCore(1))
[2021-04-01T12:36:20.300] debug:  NodeNames=fpip1-compute0004 setting Sockets=35 based on CPUs(35)/(CoresPerSocket(1)/ThreadsPerCore(1))
[2021-04-01T12:36:20.300] debug:  NodeNames=fpip1-compute0005 setting Sockets=35 based on CPUs(35)/(CoresPerSocket(1)/ThreadsPerCore(1))
[2021-04-01T12:36:20.300] debug:  NodeNames=fpip1-compute0006 setting Sockets=35 based on CPUs(35)/(CoresPerSocket(1)/ThreadsPerCore(1))
[2021-04-01T12:36:20.300] debug:  NodeNames=fpip1-login0001 setting Sockets=35 based on CPUs(35)/(CoresPerSocket(1)/ThreadsPerCore(1))
[2021-04-01T12:36:20.300] debug:  Reading cgroup.conf file /opt/slurm/fpip1_testing_default/etc/cgroup.conf
[2021-04-01T12:36:20.301] error: _compare_hostnames: node count has changed before reconfiguration from 4 to 5. You have to restart slurmctld.
[2021-04-01T12:36:20.301] fatal: read_slurm_conf: hostnames inconsistency detected
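For reference, a minimal sketch of the workaround described above (command sequence assumed from the report; not runnable outside a host with systemd and this Slurm deployment):

```shell
# After editing slurm.conf to add, remove, or modify a NodeName entry,
# a plain reload delivers SIGHUP, and slurmctld aborts with
# "fatal: read_slurm_conf: hostnames inconsistency detected":
#   systemctl reload slurmctld     # do NOT use after node changes

# Instead, fully restart the controller so it rebuilds node state:
systemctl restart slurmctld
```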
Marking as a duplicate of bug 10597. *** This ticket has been marked as a duplicate of ticket 10597 ***