Ticket 5473

Summary: slurmctld segfaulted in validate_jobs_on_node
Product: Slurm
Reporter: Steve Ford <fordste5>
Component: slurmctld
Assignee: Jason Booth <jbooth>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 2 - High Impact    
Priority: ---    
Version: 17.11.7   
Hardware: Linux   
OS: Linux   
Site: MSU
Attachments: gdb thread apply all bt
Slurmctld log

Description Steve Ford 2018-07-24 11:05:34 MDT
Created attachment 7390
gdb thread apply all bt

Our slurmctld daemon has segfaulted twice today, with a backtrace that shows a call to __assert_fail from within validate_jobs_on_node.
Comment 1 Steve Ford 2018-07-24 11:05:59 MDT
Created attachment 7391
Slurmctld log
Comment 3 Jason Booth 2018-07-30 17:12:39 MDT
Marking as a duplicate of ticket 5457.

*** This ticket has been marked as a duplicate of ticket 5457 ***