Ticket 5472

Summary: Slurmctld Segfaults in _purge_missing_jobs
Product: Slurm Reporter: Steve Ford <fordste5>
Component: slurmctld Assignee: Jason Booth <jbooth>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 2 - High Impact    
Priority: --- CC: bart
Version: 17.11.7   
Hardware: Linux   
OS: Linux   
Site: MSU
Attachments: Thread apply all bt
Slurmctld log

Description Steve Ford 2018-07-24 11:00:05 MDT
Created attachment 7388 [details]
Thread apply all bt

Our slurmctld daemon has segfaulted twice today with a backtrace that shows a call to __assert_fail from within _purge_missing_jobs.
Comment 1 Steve Ford 2018-07-24 11:02:34 MDT
Created attachment 7389 [details]
Slurmctld log
Comment 2 Jason Booth 2018-07-24 11:30:25 MDT
Hi Steve

I will follow up with you in ticket 5457 for both this issue and ticket 5473. They all look related, and I believe I see the issue.

-Jason
Comment 3 Jason Booth 2018-07-30 17:12:09 MDT
Marking as a duplicate of ticket 5457.

*** This ticket has been marked as a duplicate of ticket 5457 ***