Ticket 14983

Summary: Question about compute node hostname changes
Product: Slurm
Reporter: David Gloe <david.gloe>
Component: Configuration
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 22.05.3   
Hardware: Linux   
OS: Linux   
Site: CRAY
Cray Sites: Cray Internal

Description David Gloe 2022-09-16 08:37:36 MDT
We're looking into a new procedure to fill gaps in the compute node hostname assignments.
For example, if we have compute nodes nid000001, nid000002, nid000004, and nid000005, the procedure would rename nid000004 to nid000003 and nid000005 to nid000004.
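
A minimal sketch of how such a gap-filling rename map could be computed, for illustration only. It assumes hostnames are an alphabetic prefix followed by a zero-padded number and that numbering should stay contiguous from the lowest existing number; the function name gap_fill_renames is hypothetical and not part of Slurm or of our tooling.

```python
import re


def gap_fill_renames(hostnames):
    """Return {old_name: new_name} for the nodes that must be renamed.

    Assumption: every name is letters followed by zero-padded digits
    (e.g. nid000004), and numbering stays contiguous starting from the
    lowest number already in use.
    """
    parsed = []
    for name in hostnames:
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", name)
        if match is None:
            raise ValueError(f"unexpected hostname format: {name}")
        parsed.append((match.group(1), match.group(2)))

    # Order nodes by their numeric suffix so gaps close from the left.
    parsed.sort(key=lambda item: int(item[1]))
    first_id = int(parsed[0][1]) if parsed else 0

    renames = {}
    for new_id, (prefix, digits) in enumerate(parsed, start=first_id):
        old_name = prefix + digits
        # Keep the original zero-padding width.
        new_name = f"{prefix}{new_id:0{len(digits)}d}"
        if new_name != old_name:
            renames[old_name] = new_name
    return renames


if __name__ == "__main__":
    nodes = ["nid000001", "nid000002", "nid000004", "nid000005"]
    for old_name, new_name in gap_fill_renames(nodes).items():
        print(f"{old_name} -> {new_name}")
```

For the four nodes above this yields nid000004 -> nid000003 and nid000005 -> nid000004, matching the example.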

We're wondering what would need to be done in Slurm to handle this change.
Do you just need to update slurm.conf and restart slurmctld? Or is there state in the spool directory that needs to be cleared manually?
Comment 1 Marcin Stolarek 2022-09-19 23:02:25 MDT
David,

All slurmd daemons should definitely be restarted after the change as well. In general, I'd recommend following our two FAQ answers:
1) What process should I follow to remove nodes from Slurm?[1]
2) What process should I follow to add nodes to Slurm?[2]

First remove the nodes that are going to be renamed, and then add those nodes back under their new names.

What command do you use to start slurmctld? I'm interested in the command-line options used in the systemd unit file (or its equivalent).

cheers,
Marcin
[1]https://slurm.schedmd.com/faq.html#rem_nodes
[2]https://slurm.schedmd.com/faq.html#add_nodes
Comment 2 David Gloe 2022-09-22 09:36:40 MDT
The compute nodes are all rebooted during this change, so the slurmds will be restarted.

We start slurmctld with /usr/sbin/slurmctld -D
Comment 3 Marcin Stolarek 2022-09-26 05:55:07 MDT
This should work just fine. However, I'd recommend doing this in two steps, as in our FAQ: in the first step, remove all nodes that are going to be renamed, and in the second step, add those nodes back under their new names.
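
As a rough sketch of what the two steps mean for the node list (not an official tool; the helper name two_stage_node_lists is hypothetical), the following assumes you already have a mapping of old names to new names and derives the set of nodes that would be configured after each step. How those sets are written into slurm.conf is left out.

```python
def two_stage_node_lists(current_nodes, renames):
    """Return (stage1, stage2) node lists for a remove-then-add rename.

    stage1: nodes left configured after removing every node that is
            going to be renamed (the keys of `renames`).
    stage2: stage1 plus the renamed nodes under their new names.
    """
    to_remove = set(renames)
    stage1 = sorted(node for node in current_nodes if node not in to_remove)
    stage2 = sorted(set(stage1) | set(renames.values()))
    return stage1, stage2


if __name__ == "__main__":
    current = ["nid000001", "nid000002", "nid000004", "nid000005"]
    # Old name -> new name, as in the example from the description.
    renames = {"nid000004": "nid000003", "nid000005": "nid000004"}

    stage1, stage2 = two_stage_node_lists(current, renames)
    print("step 1 (renamed nodes removed):", ",".join(stage1))
    print("step 2 (added under new names):", ",".join(stage2))
```

Note that nid000004 is both an old name and a new name in this example, which is one reason the remove step is done for all renamed nodes before any are added back.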

Let me know if the procedure is clear to you.

cheers,
Marcin
Comment 4 David Gloe 2022-09-26 06:19:25 MDT
Yes, that's clear, thank you.