Ticket 4027 - KNL related log spam on non-knl nodes
Summary: KNL related log spam on non-knl nodes
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmd
Version: 16.05.10
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Tim Wickberg
Duplicates: 3825
Reported: 2017-07-25 11:58 MDT by john.blaas
Modified: 2017-12-07 19:26 MST
Site: University of Colorado
Machine Name: Summit
Version Fixed: 17.02.4 17.11.0-pre1


Description john.blaas 2017-07-25 11:58:15 MDT
On our cluster we have a heterogeneous mixture of Haswell and KNL nodes. We have noticed that our non-KNL nodes are logging quite a bit of KNL-related spam from slurmd.

# grep /usr/bin/syscfg /var/log/messages
Jul 17 04:08:11 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 04:08:11 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 04:41:33 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 04:41:33 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 05:14:53 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 05:14:53 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 05:48:14 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 05:48:14 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 06:21:40 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 06:21:40 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 06:55:02 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 06:55:02 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 07:28:24 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 07:28:24 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 08:01:49 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 08:01:49 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 08:35:09 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 08:35:09 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 09:08:32 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 09:08:32 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 09:41:52 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 09:41:52 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 10:15:12 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 10:15:12 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 10:48:37 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 10:48:37 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 11:21:57 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 11:21:57 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 11:55:17 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 11:55:17 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 12:28:47 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 12:28:47 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 13:02:26 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 13:02:26 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 13:35:51 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 13:35:51 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 14:09:13 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 14:09:13 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 14:42:34 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 14:42:34 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 15:16:02 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
Jul 17 15:16:02 shas0101 slurmd[34065]: error: _run_script: /usr/bin/syscfg can not be executed: No such file or directory
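
To gauge how much of this noise a node produces, the excerpt above could be tallied per host and per script path. The following is only a rough sketch: it assumes syslog lines in exactly the format shown above, and the names SYSCFG_ERROR and count_syscfg_errors are hypothetical helpers, not part of Slurm.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt above:
# "<Mon> <day> <HH:MM:SS> <host> slurmd[<pid>]: error: _run_script: <path> can not be executed: ..."
SYSCFG_ERROR = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+(?P<host>\S+)\s+slurmd\[\d+\]:\s+"
    r"error: _run_script: (?P<path>\S+) can not be executed"
)


def count_syscfg_errors(lines):
    """Count matching slurmd errors, keyed by (host, script path)."""
    counts = Counter()
    for line in lines:
        match = SYSCFG_ERROR.match(line)
        if match:
            counts[(match.group("host"), match.group("path"))] += 1
    return counts
```

Passing an open file handle for /var/log/messages as `lines` yields one counter entry per host and path, which makes it easy to see which nodes are affected and which path slurmd is trying to run.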

The thing is, these nodes aren't set up with the KNL feature, and even the nodes that do have the KNL feature have a knl_generic.conf file set up with the following:

# cat knl_generic.conf 
# Managed by Puppet
SyscfgPath=/opt/dell/toolkit/bin/syscfg
DefaultNUMA=hemi         # NUMA=all2all
AllowNUMA=a2a,snc2,hemi
DefaultMCDRAM=cache     # MCDRAM=cache

So it is unclear how slurmd is even pulling up a path of /usr/bin/syscfg.

Any advice on how to rid us of this log spam would be greatly appreciated.
Comment 1 Tim Wickberg 2017-07-25 15:29:33 MDT
It's safe to ignore this, although it does admittedly generate a lot of noise.

This was fixed in 17.02.4 / 17.11.0-pre1 by commit ea2a0d25d11. If you're able to upgrade to the 17.02 branch at some point there are a lot of other assorted small fixes to the knl_generic plugin that you'll probably want as well.

- Tim
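
As an illustration of the version guidance above, a site could check whether a node's Slurm version already includes the fix. This is a minimal sketch, not an official tool: the helper names are hypothetical, it assumes a version string such as "slurm 16.05.10" (for example as reported by `slurmd -V`), and it assumes every release numbered 17.02.4 or later carries the fix, as the "Version Fixed" field suggests.

```python
import re

# First 17.02 release carrying the fix, per the comment above.
FIXED_IN = (17, 2, 4)


def parse_slurm_version(text):
    """Extract (major, minor, micro) from a string such as 'slurm 16.05.10'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_syscfg_fix(version_text):
    """Return True if the version is 17.02.4 or newer (assumed to include the fix)."""
    return parse_slurm_version(version_text) >= FIXED_IN
```

For the version in this ticket, `has_syscfg_fix("slurm 16.05.10")` evaluates to False, so the messages are expected until the nodes are upgraded.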
Comment 2 Tim Wickberg 2017-07-25 15:30:00 MDT
*** Ticket 3825 has been marked as a duplicate of this ticket. ***