2067 – Bad job records

Ticket 2067 - Bad job records

Summary: Bad job records

Status:	RESOLVED FIXED

Alias:	None

Product:	Slurm
Classification:	Unclassified
Component:	Accounting (show other tickets)
Version:	15.08.1
Hardware:	Linux Linux

Severity:	3 - Medium Impact
Assignee:	Danny Auble
QA Contact:

URL:

Depends on:
Blocks:

Reported:	2015-10-27 00:49 MDT by Josko Plazonic
Modified:	2015-11-02 02:02 MST (History)
CC List:	4 users (show)

See Also:
Site:	Princeton (PICSciE)
Slinky Site:	---
Alineos Sites:	---
Atos/Eviden Sites:	---
Confidential Site:	---
Coreweave sites:	---
Cray Sites:	---
DS9 clusters:	---
Google sites:	---
HPCnow Sites:	---
HPE Sites:	---
IBM Sites:	---
NOAA SIte:	---
NoveTech Sites:	---
Nvidia HWinf-CS Sites:	---
OCF Sites:	---
Recursion Pharma Sites:	---
SFW Sites:	---
SNIC sites:	---
Tzag Elita Sites:	---
Linux Distro:	---
Machine Name:
CLE Version:
Version Fixed:	15.08.3 16.05.0-pre1
Target Release:	---
DevPrio:	---
Emory-Cloud Sites:	---

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this ticket.

Description Josko Plazonic 2015-10-27 00:49:37 MDT

Hi, 

we are seeing an increasing number of bad jobs that are missing ReqTRES & alloc_nodes, plus start of the job is frequently equal to end of job (for main part, not for batch):

 sacct -j 5891006 -o 'jobidraw,jobid,start,end,ntasks,ncpus,nnodes,reqmem,reqtres%20,cputime,usercpu,cputimeraw%20,elapsed,node'
    JobIDRaw        JobID               Start                 End   NTasks      NCPUS   NNodes     ReqMem              ReqTRES    CPUTime    UserCPU           CPUTimeRAW    Elapsed        NodeList 
------------ ------------ ------------------- ------------------- -------- ---------- -------- ---------- -------------------- ---------- ---------- -------------------- ---------- --------------- 
5891006      0_0          2015-10-25T00:13:00 2015-10-25T00:13:00                            0         0n                        00:00:00  18:46.762                    0   00:00:00    della-r1c4n5 
5891006.bat+ 0_0.batch    2015-10-24T23:54:04 2015-10-25T00:13:00        1                   1         0n                         INVALID  18:46.762 18446744073709550480   00:18:56    della-r1c4n5 

on 10/24 we had over 4000 of jobs with such bad records.  While we are at it - can we talk about CPUTime and CPUTimeRAW?  Seems like that's always invalid/huge for batch, e.g. for a job that does have ncpus/nnodes/reqtres:

 sacct -j 5892006 -o 'jobidraw,jobid,start,end,ntasks,ncpus,nnodes,reqmem,reqtres%20,cputime,usercpu,cputimeraw%20,elapsed,node'
    JobIDRaw        JobID               Start                 End   NTasks      NCPUS   NNodes     ReqMem              ReqTRES    CPUTime    UserCPU           CPUTimeRAW    Elapsed        NodeList 
------------ ------------ ------------------- ------------------- -------- ---------- -------- ---------- -------------------- ---------- ---------- -------------------- ---------- --------------- 
5892006      5892006      2015-10-25T09:42:58 2015-10-25T09:46:38                   1        1     4000Mn cpu=1,mem=4000,node+   00:03:40  03:35.443                  220   00:03:40   della-r3c4n10 
5892006.bat+ 5892006.bat+ 2015-10-25T09:42:58 2015-10-25T09:46:38        1                   1     4000Mn                         INVALID  03:35.443 18446744073709551396   00:03:40   della-r3c4n10 

Thanks!

Comment 1 David Bigagli 2015-10-27 01:54:24 MDT

Hi,
   I am working on this. I see the job that don't have tres have 
a huge cputimeraw  and invalid cputime.

David

Comment 2 Josko Plazonic 2015-10-27 02:08:15 MDT

Well, not entirely - it looks like the INVALID time is happening *always* for batch portion of the job, e.g.:

5817297      5817297      2015-10-26T17:33:24 2015-10-27T00:48:45                  80        4     4000Mc cpu=80,mem=320000,n+ 24-04:28:00 15-05:00:+              2089680   07:15:21 della-r3c1n9,d+ 
5817297.bat+ 5817297.bat+ 2015-10-26T17:33:24 2015-10-27T00:48:45        1                   1     4000Mc                         INVALID  00:30.412 18446744073709525495   07:15:21    della-r3c1n9 
5817297.0    5817297.0    2015-10-26T17:33:24 2015-10-26T17:35:49       80         80        4     4000Mc                        03:13:20   01:24:19                11600   00:02:25 della-r3c1n9,d+ 
5817297.1    5817297.1    2015-10-26T17:38:26 2015-10-27T00:48:45       80         80        4     4000Mc                      23-21:45:20 15-03:35:+              2065520   07:10:19 della-r3c1n9,d+ 

So looks like we have two things happening - batch portion is not accounted for correctly AND some jobs are missing reqtres and other info.

Comment 3 David Bigagli 2015-10-27 02:18:13 MDT

yes I was looking at the batch part only without steps

Comment 4 Josko Plazonic 2015-10-27 02:30:44 MDT

Actually it seems like there is an obvious culprit (linked to our slurmdbd trouble - ticket #2066):

[2015-10-24T23:53:11.301] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-24T23:55:13.666] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-24T23:57:15.464] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-24T23:59:16.118] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:01:18.142] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:03:19.003] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:05:20.096] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:07:21.510] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:09:23.047] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:11:24.628] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:13:26.195] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-25T00:15:00.941] error: slurmdbd: response is not type DBD_RC: DBD_GOT_MULT_MSG(1475)
[2015-10-25T00:15:02.390] error: SlurmDBD connection experienced an error
[2015-10-25T00:15:02.390] error: slurmdbd: Getting response to message type 1472
[2015-10-25T00:15:02.390] error: slurmdbd: DBD_SEND_MULT_JOB_START failure: No error
[2015-10-25T00:15:31.126] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW

which covers full period from start to end of job.  That probably explains 5891006 bad records (though, clearly, not missing batch data).

Comment 5 Danny Auble 2015-10-27 02:39:29 MDT

Josko, could you send us the dbd.messages file in your state directory?  Not being able to get the dbd to read the messages is most likely the problem in all the tickets submitted today. 

On October 27, 2015 7:30:44 AM PDT, bugs@schedmd.com wrote:
>http://bugs.schedmd.com/show_bug.cgi?id=2067
>
>--- Comment #4 from Josko Plazonic <plazonic@princeton.edu> ---
>Actually it seems like there is an obvious culprit (linked to our
>slurmdbd
>trouble - ticket #2066):
>
>[2015-10-24T23:53:11.301] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-24T23:55:13.666] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-24T23:57:15.464] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-24T23:59:16.118] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:01:18.142] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:03:19.003] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:05:20.096] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:07:21.510] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:09:23.047] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:11:24.628] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:13:26.195] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>[2015-10-25T00:15:00.941] error: slurmdbd: response is not type DBD_RC:
>DBD_GOT_MULT_MSG(1475)
>[2015-10-25T00:15:02.390] error: SlurmDBD connection experienced an
>error
>[2015-10-25T00:15:02.390] error: slurmdbd: Getting response to message
>type
>1472
>[2015-10-25T00:15:02.390] error: slurmdbd: DBD_SEND_MULT_JOB_START
>failure: No
>error
>[2015-10-25T00:15:31.126] error: slurmdbd: agent queue filling, RESTART
>SLURMDBD NOW
>
>which covers full period from start to end of job.  That probably
>explains
>5891006 bad records (though, clearly, not missing batch data).
>
>-- 
>You are receiving this mail because:
>You are on the CC list for the bug.

Comment 6 Josko Plazonic 2015-10-27 02:54:43 MDT

I am afraid that that file is empty right now - I did restart slurmdbd when I (finally... working on adding notification for further trouble) noticed it was not functioning correctly.

So probably missing fields could be attributed to db trouble but missing batch data is present even now after restart and apparent correct functioning of slurmdbd:

    JobIDRaw        JobID               Start                 End   NTasks      NCPUS   NNodes     ReqMem              ReqTRES    CPUTime    UserCPU           CPUTimeRAW    Elapsed        NodeList                 End 
------------ ------------ ------------------- ------------------- -------- ---------- -------- ---------- -------------------- ---------- ---------- -------------------- ---------- --------------- ------------------- 
5936393      5936393      2015-10-27T10:50:59 2015-10-27T10:51:59                   6        1     4000Mc cpu=6,mem=24000,nod+   00:06:00  05:59.899                  360   00:01:00    della-r2c3n2 2015-10-27T10:51:59 
5936393.bat+ 5936393.bat+ 2015-10-27T10:50:59 2015-10-27T10:51:59        1                   1     4000Mc                         INVALID  05:59.899 18446744073709551556   00:01:00    della-r2c3n2 2015-10-27T10:51:59 

Freshly ran (6 core stress job) - still INVALID.

Also the other ticket I reported (2068) - those messages were appearing before slurmdbd trouble and still appearing after restart.

Comment 7 David Bigagli 2015-10-27 02:58:56 MDT

I can reproduce the missing and invalid data and I did not have any dbd 
trouble. I am investigating where the problem is.

David

Comment 8 Danny Auble 2015-10-27 02:59:10 MDT

Could you stop the slurmctld and grab it if there is something there?  
Then restart the slurmctld.  I am guessing this is the main problem 
though.  Since the slurmctld isn't getting anything to the dbd all the 
data will be incomplete until it gets there.

On 10/27/15 07:54, bugs@schedmd.com wrote:
>
> *Comment # 6 <http://bugs.schedmd.com/show_bug.cgi?id=2067#c6> on bug 
> 2067 <http://bugs.schedmd.com/show_bug.cgi?id=2067> from Josko 
> Plazonic <mailto:plazonic@princeton.edu> *
> I am afraid that that file is empty right now - I did restart slurmdbd when I
> (finally... working on adding notification for further trouble) noticed it was
> not functioning correctly.
>
> So probably missing fields could be attributed to db trouble but missing batch
> data is present even now after restart and apparent correct functioning of
> slurmdbd:
>
>      JobIDRaw        JobID               Start                 End   NTasks
> NCPUS   NNodes     ReqMem              ReqTRES    CPUTime    UserCPU
> CPUTimeRAW    Elapsed        NodeList                 End
> ------------ ------------ ------------------- ------------------- --------
> ---------- -------- ---------- -------------------- ---------- ----------
> -------------------- ---------- --------------- -------------------
> 5936393      5936393      2015-10-27T10:50:59 2015-10-27T10:51:59
>      6        1     4000Mc cpu=6,mem=24000,nod+   00:06:00  05:59.899
>        360   00:01:00    della-r2c3n2 2015-10-27T10:51:59
> 5936393.bat+ 5936393.bat+ 2015-10-27T10:50:59 2015-10-27T10:51:59        1
>               1     4000Mc                         INVALID  05:59.899
> 18446744073709551556   00:01:00    della-r2c3n2 2015-10-27T10:51:59
>
> Freshly ran (6 core stress job) - still INVALID.
>
> Also the other ticket I reported (2068) - those messages were appearing before
> slurmdbd trouble and still appearing after restart.
> ------------------------------------------------------------------------
> You are receiving this mail because:
>
>   * You are on the CC list for the bug.
>

Comment 9 Josko Plazonic 2015-10-27 03:06:46 MDT

stopped slurmctld, sadly dbd.messages still had zero length.

Any way to increase how much data slurmctld is willing to cache in case of slurmdbd communication trouble - i.e. if there a fixed limit or is it willing to cache for as long as necessary?

Comment 10 David Bigagli 2015-10-27 03:43:30 MDT

Is the problem still happening? From the comment 6 it seems that newly
submitted jobs are being entered in the database, beside the missing
or apparently incorrect fields. Is this correct?

David

Comment 11 Josko Plazonic 2015-10-27 03:50:53 MDT

So far, after restart, things look good and it appears that reqtres field is present - of course not that many jobs started and finished since then so can't be 100% sure. We will definitely try to keep a closer eye on nightly accounting processings - that's where anomalies usually surface.

Comment 12 David Bigagli 2015-10-27 04:14:47 MDT

Good to hear. What exactly do you run in your nightly accounting processing?

David

Comment 13 Tim Wickberg 2015-10-27 04:55:01 MDT

*** Ticket 2068 has been marked as a duplicate of this ticket. ***

Comment 14 Danny Auble 2015-10-27 07:06:25 MDT

Jasko, in comment 4, are these all the errors in the log dealing with the DBD grep "slurmdbd"?  Just wondering when they started and ended and if the communication failed multiple times or was constant.

Also was the DBD down during this period or just not responsive for one reason or another.

Comment 15 Josko Plazonic 2015-10-27 07:13:31 MDT

No, we had numerous errors but they did not start at the same time that slurmdbd started spewing those logs (that was on the morning of 19th).  First time della's slurmctld started complaining about slurmdbd was on 23rd:

[2015-10-23T16:40:26.769] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:42:28.806] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:44:29.239] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:46:32.133] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:48:34.525] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:50:35.246] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:52:37.215] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:54:42.626] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:56:43.101] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T16:58:45.299] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:00:47.694] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:02:49.673] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:04:52.098] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 440 TRES cpu grp_used_tres_run_secs underflow, tried to remove 6508 seconds when only 6448 remained.
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 440 TRES mem grp_used_tres_run_secs underflow, tried to remove 26032000 seconds when only 25792000 remained.
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 440 TRES node grp_used_tres_run_secs underflow, tried to remove 6508 seconds when only 6448 remained.
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 24 TRES cpu grp_used_tres_run_secs underflow, tried to remove 6508 seconds when only 6448 remained.
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 24 TRES mem grp_used_tres_run_secs underflow, tried to remove 26032000 seconds when only 25792000 remained.
[2015-10-23T17:06:01.823] error: _handle_assoc_tres_run_secs: job 5872667: assoc 24 TRES node grp_used_tres_run_secs underflow, tried to remove 6508 seconds when only 6448 remained.
[2015-10-23T17:06:53.853] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:08:55.333] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:11:03.838] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:13:10.542] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:15:11.921] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:17:12.001] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:19:15.024] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:21:20.434] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:23:30.263] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:25:32.807] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-23T17:27:46.640] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-24T06:35:36.291] error: _handle_assoc_tres_run_secs: job 5873777: assoc 354 TRES cpu grp_used_tres_run_secs underflow, tried to remove 78327 seconds when only 78267 remained.
[2015-10-24T06:35:36.291] error: _handle_assoc_tres_run_secs: job 5873777: assoc 354 TRES mem grp_used_tres_run_secs underflow, tried to remove 313308000 seconds when only 313068000 remained.

e.g. you can see there that it was constant for a while, then it stops for many hours) and it resumes complaining at:

[2015-10-24T21:06:15.434] error: slurm_receive_msg: Zero Bytes were transmitted or received
[2015-10-24T21:12:03.186] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
[2015-10-24T21:14:06.173] error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW

You can find the full della slurmctld log at:
http://tigress-web.princeton.edu/~plazonic/della-slurmctld.logs.txt

Comment 16 Danny Auble 2015-10-27 07:25:44 MDT

Ok, the error message that points to lost data (which is missing) is

slurmdbd: agent queue is full, discarding request

If you grep for slurmdbd you will see the size in the form of messages like

slurmdbd: agent queue size $NUMBER

They are info's instead of errors.  As long as that number doesn't keep growing you should be ok.  If the database is busy for some reason you may see these messages depending on the activity on the slurmctld.  So it isn't strange to see them come and go.

Do you have anything in the slurmdbd log that points to around [2015-10-23T16:40:26.769]?

Comment 17 Danny Auble 2015-10-27 09:56:47 MDT

Josko, this appears to be fixed in commits

78098296f35
3132d34a9f1
fd12a0910ad
0804545c345
02db31c29d4

The good news is this appears to only be related to the batch step.  The problem is only cosmetic as well.  Applying the patches above will fix the problem completely.  If you are just looking for a bandaid you only really need commit 0804545c345.

It doesn't appear you have lost any messages as noted in Comment 16.  

Let us know if this fixes your issues or not.

Comment 18 Danny Auble 2015-10-27 10:12:39 MDT

You will also want 14a39998c48f it fixes a memory leak in fd12a0910adf75.

Comment 19 Josko Plazonic 2015-10-27 12:35:43 MDT

(In reply to Danny Auble from comment #16)
> Ok, the error message that points to lost data (which is missing) is
> 
> slurmdbd: agent queue is full, discarding request

Which we didn't seem to have in this case.

> If you grep for slurmdbd you will see the size in the form of messages like
> 
> slurmdbd: agent queue size $NUMBER

We also didn't have this message in logs - I assume you mean in slurmctld logs.

> They are info's instead of errors.  As long as that number doesn't keep
> growing you should be ok.  If the database is busy for some reason you may
> see these messages depending on the activity on the slurmctld.  So it isn't
> strange to see them come and go.

Aha, maybe our logging level is too low for these to show up? Though I'd hope that discarding requests messages should be an error rather then info.

> Do you have anything in the slurmdbd log that points to around
> [2015-10-23T16:40:26.769]?

Sadly we lost all slurmdbd logs after morning of 2015-10-19 :( but if we were supposed to get "discarding requests" messages and we didn't doesn't that mean we really didn't loose any data (from this cluster)?

I am going to apply those patches soon - it looks like that only slurmctld and slurmdbd's servers need to be updated so unless you tell me differently that's where I'll apply updates first.

Comment 20 Danny Auble 2015-10-27 14:44:22 MDT

(In reply to Josko Plazonic from comment #19)
> (In reply to Danny Auble from comment #16)
> > Ok, the error message that points to lost data (which is missing) is
> > 
> > slurmdbd: agent queue is full, discarding request
> 
> Which we didn't seem to have in this case.

That is good :).

> 
> > If you grep for slurmdbd you will see the size in the form of messages like
> > 
> > slurmdbd: agent queue size $NUMBER
> 
> We also didn't have this message in logs - I assume you mean in slurmctld
> logs.

Yes, the slurmctld logs.  info is the level of debug you would need.  If you have the debug to only error you wouldn't see these.  They are really just for reference to tell you how many messages you have in the system to be sent.

> 
> > They are info's instead of errors.  As long as that number doesn't keep
> > growing you should be ok.  If the database is busy for some reason you may
> > see these messages depending on the activity on the slurmctld.  So it isn't
> > strange to see them come and go.
> 
> Aha, maybe our logging level is too low for these to show up? Though I'd
> hope that discarding requests messages should be an error rather then info.

Yes, the discarding request message is an error message, so you would see them if you had them.

> 
> > Do you have anything in the slurmdbd log that points to around
> > [2015-10-23T16:40:26.769]?
> 
> Sadly we lost all slurmdbd logs after morning of 2015-10-19 :( but if we
> were supposed to get "discarding requests" messages and we didn't doesn't
> that mean we really didn't loose any data (from this cluster)?

I don't believe you lost any data for this cluster during this time period, at least there isn't any evidence that supports it here.

> 
> I am going to apply those patches soon - it looks like that only slurmctld
> and slurmdbd's servers need to be updated so unless you tell me differently
> that's where I'll apply updates first.

The patches that really matter are for the slurmctld.  You would only need the slurmdbd patch if you were writing directly to mysql (which you are not ;)).  The other patches are for sacct which will correct the view you have for older jobs.

Comment 21 Danny Auble 2015-10-30 10:34:32 MDT

Josko, how are things going?  Can we close this?

Comment 22 Josko Plazonic 2015-11-02 01:18:45 MST

Yup, for now everything seems to be behaving and I can't seem to find newly finished jobs without reqtres and other fields.  If they re-appear we'll re-open.  Thanks!

Comment 23 Danny Auble 2015-11-02 02:02:10 MST

No problem, glad things worked out.