| Summary: | scontrol write config - output not valid slurm.conf syntax | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Michael Hammond <Michael.Hammond> |
| Component: | Configuration | Assignee: | Jacob Jenson <jacob> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | 6 - No support contract | ||
| Priority: | --- | CC: | sts |
| Version: | 20.11.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=3435 | ||
| Site: | -Other- | Slinky Site: | --- |

Overview:

scontrol write config does not always produce output that can be used as a valid configuration file. The settings SlurmctldHost, CpuFreqDef, SlurmctldSyslogDebug, and SlurmdSyslogDebug were found to be affected, but there may be other instances.

Case 1: For SlurmctldHost, the syntax in slurm.conf is:

    SlurmctldHost={hostname1}{optional ({ip addr1})}
    SlurmctldHost={hostname2}{optional ({ip addr2})}

Multiple entries are allowed and order is significant. scontrol write config produces:

    SlurmctldHost[0]={hostname1}({ip addr1})
    SlurmctldHost[1]={hostname2}({ip addr2})

Case 2: If CpuFreqDef, SlurmctldSyslogDebug, or SlurmdSyslogDebug are undefined, they are printed in the output of scontrol write config as:

    CpuFreqDef=Unknown
    SlurmctldSyslogDebug=unknown
    SlurmdSyslogDebug=unknown

None of these values is accepted by slurmd.

Steps to reproduce Case 1:

1. Create slurm.conf with multiple slurmctld hosts (snippet below):

    SlurmctldHost=primary.cluster(192.168.7.1)
    SlurmctldHost=secondary.cluster(192.168.7.2)

2. Start slurmctld with this config.
3. Run "scontrol write config". The output will contain:

    SlurmctldHost[0]=primary.cluster(192.168.7.1)
    SlurmctldHost[1]=secondary.cluster(192.168.7.2)

4. Copy the output from step 3 to slurm.conf.
5. Run slurmd or slurmctld with the new slurm.conf.

Alternate reproduction steps:

1, 2, 3. As steps 1, 2, 3 above.
4. Run SLURM_CONF=slurm.conf-{DATE} scontrol show config

    root@primary:/etc/slurm# SLURM_CONF=slurm.conf-update-20211012 scontrol show config
    scontrol: error: Parse error in file slurm.conf-update-20211012 line 219: "SlurmctldHost[0]=primary.cluster(192.168.7.1)"
    scontrol: error: Parse error in file slurm.conf-update-20211012 line 221: "SlurmctldHost[1]=secondary.cluster(192.168.7.2))"
    scontrol: error: No SlurmctldHost defined.
    scontrol: fatal: Unable to process configuration file

Case 2 reproduction steps:

1. Remove any definitions of CpuFreqDef, SlurmctldSyslogDebug, and SlurmdSyslogDebug in slurm.conf.
2. Rerun SLURM_CONF=slurm.conf-update-20211012 scontrol show config

    scontrol: error: cpu_freq_verify_def: CpuFreqDef=Unknown invalid
    scontrol: error: Ignoring invalid CpuFreqDef: Unknown
    scontrol: error: Invalid SlurmctldSyslogDebug unknown
    scontrol: fatal: Unable to process configuration file
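
Until the written file parses cleanly, the two cases above could be patched up with a small post-processing step. The following is only a sketch of one possible workaround, not part of Slurm: the function name `sanitize_written_config` is hypothetical, and the code assumes the written file uses plain `Key=value` lines as shown in this report. It strips the `[N]` index from SlurmctldHost lines (keeping their order) and drops the three options when their value is the placeholder "unknown".

```python
import re

# Options this report observed being written with a placeholder value
# when they were never defined in the original slurm.conf.
PLACEHOLDER_KEYS = {"CpuFreqDef", "SlurmctldSyslogDebug", "SlurmdSyslogDebug"}

# Matches an indexed key such as "SlurmctldHost[0]=" at the start of a line.
INDEXED_HOST = re.compile(r"^SlurmctldHost\[\d+\]=")


def sanitize_written_config(text: str) -> str:
    """Return config text with the two problems from the report patched up.

    Hypothetical helper: it only handles the cases described in this report.
    """
    cleaned = []
    for line in text.splitlines():
        # Case 1: drop the "[N]" index so the key is plain SlurmctldHost again.
        # Line order is preserved, so the primary/backup order is unchanged.
        line = INDEXED_HOST.sub("SlurmctldHost=", line)

        # Case 2: skip placeholder values so the option stays undefined.
        key, sep, value = line.partition("=")
        if sep and key.strip() in PLACEHOLDER_KEYS and value.strip().lower() == "unknown":
            continue

        cleaned.append(line)
    return "\n".join(cleaned) + "\n"
```

In use, the text of the written file would be read, passed through the function, and saved under a new name before pointing SLURM_CONF at it. Other options may be affected in ways this sketch does not cover, so the result should still be checked with scontrol show config.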