Ticket 450 - Select Plugin Omits an important option to the nodehealth checker binary
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Cray ALPS
Version: 14.03.x
Hardware: Linux Linux
Importance: 3 - Medium Impact
Assignee: Danny Auble
Reported: 2013-10-09 09:17 MDT by Jason Sollom
Modified: 2013-10-09 09:54 MDT

See Also:
Site: CRAY


Description Jason Sollom 2013-10-09 09:17:18 MDT
When the select_cray plugin calls xtcleanup_after, it omits the '-m' option. This option tells NHC whether it is running at the end of the job step or at the end of the job. The -m option takes a string argument, either "application" or "reservation".

For the end of the job step, the option should be

-m application

For the end of the job, the option should be

-m reservation

Please upgrade the select_cray plugin to use the -m option.
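
As a rough illustration of the request above, the following C sketch builds an argument vector that includes the -m option. It is not the select_cray code: the names nhc_mode, nhc_mode_str and build_nhc_argv are hypothetical, the path to xtcleanup_after is passed in by the caller, and the other options the plugin passes are left out because their arguments are not described in this ticket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum: the point at which NHC is being run. */
enum nhc_mode {
    NHC_MODE_APPLICATION,   /* end of the job step */
    NHC_MODE_RESERVATION    /* end of the job */
};

/* Map the mode to the string argument that -m expects. */
static const char *nhc_mode_str(enum nhc_mode mode)
{
    return mode == NHC_MODE_APPLICATION ? "application" : "reservation";
}

/*
 * Fill argv with: <path> -m <mode> NULL, and return the argument count.
 * 'path' is whatever location xtcleanup_after has on the system (assumed
 * to be supplied by the caller). Other options are intentionally omitted.
 */
static int build_nhc_argv(const char *path, enum nhc_mode mode,
                          const char *argv[], size_t cap)
{
    int argc = 0;

    assert(cap >= 4);   /* room for three arguments plus the NULL terminator */
    argv[argc++] = path;
    argv[argc++] = "-m";
    argv[argc++] = nhc_mode_str(mode);
    argv[argc] = NULL;
    return argc;
}
```

A caller would pick NHC_MODE_APPLICATION when a job step finishes and NHC_MODE_RESERVATION when the whole job finishes, then hand the resulting vector to whatever exec-style call launches the binary.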

Until this change is made, NHC will not be invoked properly at the end of the job.

This will hold up the testing that is currently in progress.
Comment 1 Danny Auble 2013-10-09 09:26:13 MDT
Thanks for dropping knowledge on us Jason ;).

I apparently thought (foolishly) that the -a or -r options would have been sufficient.

This is fixed in 2aa193ef2c366adf39fafeb9c1284d5367cf2118
Comment 2 Danny Auble 2013-10-09 09:54:56 MDT
Do not use the previous plugin; the xfree calls were not updated to reflect the change in position of the allocated memory.

I am waiting on a response to bug 451 to commit a fix since everything changes again.