Ticket 2643

Summary: Remove delay for shutdown when waiting for health check
Product: Slurm
Reporter: Thomas HAMEL <hmlth>
Component: slurmd
Assignee: Moe Jette <jette>
Status: RESOLVED FIXED
QA Contact:
Severity: 5 - Enhancement    
Priority: ---    
Version: 16.05.x   
Hardware: Linux   
OS: Linux   
Site: EDF - Electricite de France
Version Fixed: 16.05.0-pre3
Attachments: [PATCH] Separate health check from shutdown check

Description Thomas HAMEL 2016-04-15 02:04:57 MDT
Created attachment 3003
[PATCH] Separate health check from shutdown check

While deploying the patch discussed here (https://bugs.schedmd.com/show_bug.cgi?id=2504), we ran into an issue: slurmd shutdown could be delayed by up to 10 seconds while waiting for nhc. This caused issues during runlevel changes with sysvrc (and was also a bit annoying).

The attached patch removes this delay by checking every 10 ms whether a shutdown has been requested. A similar patch against 15.08 was deployed in production; this one has been tested on a small 16.05 git test install.
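
For readers without the attachment at hand, the idea can be sketched roughly as follows. This is not the committed patch (slurmd is written in C); it is a minimal Python illustration of waiting in short slices and checking a shutdown flag between slices. The function name `wait_unless_shutdown`, the `threading.Event` flag, and the parameter names are all hypothetical; only the 10 ms polling period and the 10 second worst case come from the description above.

```python
import threading
import time

POLL_INTERVAL_S = 0.010  # 10 ms polling period, as described in the ticket


def wait_unless_shutdown(shutdown_requested, total_wait_s, poll_s=POLL_INTERVAL_S):
    """Wait up to total_wait_s seconds, checking the shutdown flag every poll_s.

    shutdown_requested is a threading.Event standing in for the daemon's
    shutdown flag (hypothetical; not Slurm's real variable).
    Returns True if shutdown was requested, False if the full wait elapsed.
    """
    deadline = time.monotonic() + total_wait_s
    while not shutdown_requested.is_set():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # waited the whole interval without a shutdown request
        # Sleep one short slice instead of the whole interval, so a shutdown
        # request is noticed within roughly poll_s rather than total_wait_s.
        time.sleep(min(poll_s, remaining))
    return True
```

With a single long sleep, a shutdown request arriving just after the sleep starts would be noticed only when it ends (up to 10 seconds later in the case reported here). With the sliced wait, the worst-case delay is about one polling period, at the cost of waking up every 10 ms.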

Thanks again for including the previous patch.
Comment 2 Moe Jette 2016-04-15 05:27:25 MDT
Thank you for your contribution. Your patch is committed here:
https://github.com/SchedMD/slurm/commit/988edf1227855e4c578476be3d431a144b1b628f