Danny, note that we added a bunch of nodes today, and that is when we started seeing these issues. The clusters that were added are Natgas, cumulus, explorer and musigny. See the slurm.conf file for the node names.


[2014-04-22T10:26:24.371] topology tree plugin loaded
[2014-04-22T10:26:24.516] Warning: Note very large processing time from slurm_topo_build_config: usec=145546 began=10:26:24.371
[2014-04-22T10:26:24.517] Gathering cpu frequency information for 12 cpus
[2014-04-22T10:26:24.517] task NONE plugin loaded
[2014-04-22T10:26:24.517] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2014-04-22T10:26:24.517] Munge cryptographic signature plugin loaded
[2014-04-22T10:26:24.534] Warning: Core limit is only 0 KB
[2014-04-22T10:26:24.534] slurmd version 2.6.4 started
[2014-04-22T10:26:24.535] Job accounting gather LINUX plugin loaded
[2014-04-22T10:26:24.535] switch NONE plugin loaded
[2014-04-22T10:26:24.535] slurmd started on Tue, 22 Apr 2014 10:26:24 -0700
[2014-04-22T10:26:24.535] CPUs=12 Boards=1 Sockets=2 Cores=6 Threads=1 Memory=96869 TmpDisk=30042 Uptime=1292
[2014-04-22T10:26:24.535] AcctGatherEnergy NONE plugin loaded
[2014-04-22T10:26:24.535] AcctGatherProfile NONE plugin loaded
[2014-04-22T10:26:24.535] AcctGatherInfiniband NONE plugin loaded
[2014-04-22T10:26:24.536] AcctGatherFilesystem NONE plugin loaded
[2014-04-22T10:41:18.472] error: forward_thread to n0008.baldur0: No route to host
[2014-04-22T15:25:51.182] error: forward_thread to n0008.baldur0: No route to host
[2014-04-22T15:47:01.128] error: forward_thread to n0008.baldur0: No route to host
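
In case it is useful for going through the logs on the other nodes, here is a minimal Python sketch that pulls the forward_thread errors out of slurmd log lines and counts them per target host. It assumes the line format shown in the excerpt above; the names forward_errors and count_by_host are made up for illustration and are not part of Slurm.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt above:
# [2014-04-22T10:41:18.472] error: forward_thread to n0008.baldur0: No route to host
FORWARD_ERR = re.compile(
    r"^\[(?P<ts>[^\]]+)\] error: forward_thread to (?P<host>[^:\s]+): (?P<reason>.+)$"
)


def forward_errors(lines):
    """Return (timestamp, host, reason) tuples for forward_thread error lines."""
    found = []
    for line in lines:
        match = FORWARD_ERR.match(line.strip())
        if match:
            found.append((match.group("ts"), match.group("host"), match.group("reason")))
    return found


def count_by_host(errors):
    """Count how many forward_thread errors were logged per target host."""
    return Counter(host for _, host, _ in errors)


# Sample input: lines copied from the log excerpt above.
sample = [
    "[2014-04-22T10:26:24.536] AcctGatherFilesystem NONE plugin loaded",
    "[2014-04-22T10:41:18.472] error: forward_thread to n0008.baldur0: No route to host",
    "[2014-04-22T15:25:51.182] error: forward_thread to n0008.baldur0: No route to host",
    "[2014-04-22T15:47:01.128] error: forward_thread to n0008.baldur0: No route to host",
]

for host, count in count_by_host(forward_errors(sample)).items():
    print(host, count)
```

To run it against a real log, pass an open file object to forward_errors in place of sample.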


On Tue, Apr 22, 2014 at 3:57 PM, <bugs@schedmd.com> wrote:

Comment # 1 on bug 740 from Danny Auble
Jackie, could you send the slurmd log during this time for one of the nodes
(n0000.cumulus0)?

