Ticket 5472 - Slurmctld Segfaults in _purge_missing_jobs
Summary: Slurmctld Segfaults in _purge_missing_jobs
Status: RESOLVED DUPLICATE of ticket 5457
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 17.11.7
Hardware: Linux Linux
Severity: 2 - High Impact
Assignee: Jason Booth
Reported: 2018-07-24 11:00 MDT by Steve Ford
Modified: 2018-07-30 17:12 MDT
Site: MSU


Attachments
Thead apply all bt (372.20 KB, text/plain)
2018-07-24 11:00 MDT, Steve Ford
Slurmctld log (1.35 MB, text/plain)
2018-07-24 11:02 MDT, Steve Ford

Description Steve Ford 2018-07-24 11:00:05 MDT
Created attachment 7388
Thead apply all bt

Our slurmctld daemon has segfaulted twice today. The backtrace shows a call to __assert_fail from within _purge_missing_jobs.
Comment 1 Steve Ford 2018-07-24 11:02:34 MDT
Created attachment 7389
Slurmctld log
Comment 2 Jason Booth 2018-07-24 11:30:25 MDT
Hi Steve,

I will follow up with you in ticket 5457 on both this issue and ticket 5473. They all look related, and I believe I see the issue.

-Jason
Comment 3 Jason Booth 2018-07-30 17:12:09 MDT
Marking as a duplicate of ticket 5457.

*** This ticket has been marked as a duplicate of ticket 5457 ***