| Summary: | slurmd: default spool dir used in _read_config() | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Jeff Frey <frey> |
| Component: | slurmd | Assignee: | Jacob Jenson <jacob> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | 6 - No support contract | ||
| Priority: | --- | CC: | alex, bart, tim |
| Version: | 17.11.0 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=4487 | ||
| Site: | Yale | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | | Version Fixed: | |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
In src/slurmd/slurmd/slurmd.c:832, the _update_logging() function is called from inside _read_config(). In 17.02 releases, _update_logging() handled the file reopening itself (with the required conf fields filled in by _read_config() before _update_logging() was called) and then returned. In 17.11, _update_logging() additionally attempts to contact any existing step daemons. It does this using the value of conf->spooldir. But _read_config() has not yet filled in conf->spooldir when _update_logging() is called (it does so later, at line 937). So _update_logging() always tries the default spool directory (e.g. /var/spool/slurmd), producing the following red-herring error message in slurmd.log:

    debug: Log file re-opened
    error: Domain socket directory /var/spool/slurmd: No such file or directory
    debug2: hwloc_topology_init
    debug2: hwloc_topology_load
    debug: CPUs:8 Boards:1 Sockets:2 CoresPerSocket:4 ThreadsPerCore:1
    debug4: CPU map[0]=>0 S:C:T 0:0:0
    debug4: CPU map[1]=>1 S:C:T 0:1:0
    :

This does not appear to adversely impact startup of slurmd; it just produces an unnecessary error message.

Solution: the _read_config() function should be rearranged so that all conf field preconditions for _update_logging() are satisfied before it is called.
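The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the slurmd source: `struct mock_conf`, `mock_update_logging()`, `read_config_reported()`, `read_config_reordered()`, and the `/tmp/slurmd.log` path are all hypothetical stand-ins invented for illustration. It only models the sequence described in this report, in which the logging update reads the spool directory before the configured value has been stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEFAULT_SPOOLDIR "/var/spool/slurmd"

/* Hypothetical stand-in for the slurmd conf structure (not the real type). */
struct mock_conf {
    const char *spooldir;   /* starts out at the compiled-in default */
    const char *logfile;
};

/* Records which directory the logging update looked in for step daemons. */
static char probed_dir[256];

/* Hypothetical stand-in for _update_logging(): the log reopen is elided,
 * and the step daemon lookup is reduced to remembering the directory. */
static void mock_update_logging(const struct mock_conf *conf)
{
    assert(conf != NULL && conf->spooldir != NULL);
    snprintf(probed_dir, sizeof(probed_dir), "%s", conf->spooldir);
}

/* Ordering as described in the report: logging is updated before the
 * configured spool directory is copied into conf. */
static const char *read_config_reported(struct mock_conf *conf,
                                        const char *configured_spooldir)
{
    conf->spooldir = DEFAULT_SPOOLDIR;
    conf->logfile = "/tmp/slurmd.log";      /* assumed path */
    mock_update_logging(conf);              /* still sees the default */
    conf->spooldir = configured_spooldir;   /* filled in too late */
    return probed_dir;
}

/* Proposed ordering: every field the logging update depends on is set
 * before it is called. */
static const char *read_config_reordered(struct mock_conf *conf,
                                         const char *configured_spooldir)
{
    conf->spooldir = DEFAULT_SPOOLDIR;
    conf->logfile = "/tmp/slurmd.log";      /* assumed path */
    conf->spooldir = configured_spooldir;
    mock_update_logging(conf);              /* sees the configured value */
    return probed_dir;
}
```

Calling `read_config_reported()` with a non-default spool directory returns the default path, which corresponds to the spurious "Domain socket directory" error on sites that do not use /var/spool/slurmd; `read_config_reordered()` returns the configured path.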