Ticket 3484

Summary: Squeue Time left INVALID
Product: Slurm Reporter: paull
Component: User Commands Assignee: Tim Wickberg <tim>
Status: RESOLVED INVALID QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: phils
Version: 16.05.7   
Hardware: Linux   
OS: Linux   
Site: DownUnder GeoSolutions

Description paull 2017-02-21 02:16:21 MST
Hi Support,

The output below occurs when a job runs longer than its requested time. We usually see a negative value, but are now getting "INVALID".

$SQUEUE_FORMAT                                                                                                                                     
%11P %10Q %20j %.8u %.2t %.10M %.9L %.6D %15R %i


idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:46   INVALID      1 pnod0777        23714656_6620
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:46   INVALID      1 pnod0778        23714656_6621
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:43   INVALID      1 pnod0786        23714656_6622
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:43   INVALID      1 pnod0779        23714656_6623
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:43   INVALID      1 pnod0780        23714656_6624
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:43   INVALID      1 pnod0781        23714656_6625
idle        200        170_05_Demeter_2ndPa  tzuyenw  R      22:43   INVALID      1 pnod0782        23714656_6626

Please advise.

Thanks,
Paul
Comment 1 Tim Wickberg 2017-02-21 10:50:54 MST
Which version of the 'squeue' client command were you using previously? And were there any local patches applied?

I'm starting to wonder, based on this and bug 3483, whether you may have been running a locally-modified 14.11 release?

Looking back at the code, I don't think %L has ever printed a negative number; it has always used INVALID to denote these cases.
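
For illustration only, the behaviour described here can be sketched roughly as follows. This is not Slurm's actual source: the function name `format_time_left` and its parameters are hypothetical, and the sketch assumes the time limit is given in minutes and the elapsed time in seconds. It shows the time-left column switching to INVALID once a job has run past its limit, rather than printing a negative value.

```python
def format_time_left(time_limit_min, elapsed_sec):
    """Hypothetical helper: render remaining time in a squeue-like style.

    Assumption: the limit is in minutes and the elapsed time in seconds.
    """
    left = time_limit_min * 60 - elapsed_sec
    if left < 0:
        # The job has overrun its limit: report INVALID, not a negative time.
        return "INVALID"
    days, rem = divmod(left, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    if days:
        return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


# A job that has run 22:46 against an assumed 20-minute limit has overrun.
print(format_time_left(20, 22 * 60 + 46))  # INVALID
# The same elapsed time against an assumed 30-minute limit leaves 7:14.
print(format_time_left(30, 22 * 60 + 46))  # 7:14
```

The 20- and 30-minute limits are made-up values; the ticket does not state the actual time limit of the jobs shown.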
Comment 2 paull 2017-02-21 16:51:41 MST
Hi Tim,

I will look through our patches and see if this was a patch given to us. 

We are running the proper version:

[root@cluster init.d]# scontrol --version
slurm 16.05.7

This is a branch created directly from the slurm-16-05-07-1 branch.

Thanks,
Paul
Comment 4 Phil Schwan 2017-03-27 04:54:42 MDT
This was due to a local patch, and I apologise for wasting your time.  I'll make sure that our procedure gets changed to avoid bugs like this getting filed in future.