Hi,

These messages began repeating at very regular intervals, creating some hefty log files:

[2021-01-29T16:17:05.288] error: _remove_accrue_time_internal: QOS normal accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: QOS normal acct pi-sfrietze accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: QOS normal user 340507 accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 688(pi-sfrietze/arrichma/(null)) accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 137(pi-sfrietze/(null)/(null)) accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 1(root/(null)/(null)) accrue_cnt underflow

I have tried a few things, searched bugs and mail archives, and Google, without any luck finding anything.

At the same time, I am seeing these repeated in slurmdbd.log:

[2021-01-29T16:00:06.527] error: We have more time than is possible (30337200+738000+0)(31075200) > 30337200 for cluster vacc(8427) from 2021-01-29T15:00:00 - 2021-01-29T16:00:00 tres 5
[2021-01-29T16:00:06.648] Warning: Note very large processing time from hourly_rollup for vacc: usec=5982917 began=16:00:00.665

Suggestions welcome, and thank you,
Andy
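For anyone reading the slurmdbd.log line above, the numbers in it can be checked mechanically. The following is a minimal sketch, not anything taken from Slurm itself: the regular expression, the function name `parse_rollup_error`, and the dictionary keys are all hypothetical, and the log does not label the three summands, so the sketch treats them as opaque values. Reading 8427 as a per-cluster count multiplied by a 3600-second window is an assumption based only on the arithmetic working out.

```python
import re

# Sample line copied from the slurmdbd.log excerpt in this ticket.
LINE = (
    "[2021-01-29T16:00:06.527] error: We have more time than is possible "
    "(30337200+738000+0)(31075200) > 30337200 for cluster vacc(8427) "
    "from 2021-01-29T15:00:00 - 2021-01-29T16:00:00 tres 5"
)

# Hypothetical pattern written for this one message shape.
PATTERN = re.compile(
    r"\((\d+)\+(\d+)\+(\d+)\)\((\d+)\) > (\d+) "
    r"for cluster ([^\s(]+)\((\d+)\) from (\S+) - (\S+) tres (\d+)"
)


def parse_rollup_error(line):
    """Return the numeric fields of the message, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    a, b, c, used, possible, cluster, count, start, end, tres = m.groups()
    parts = [int(a), int(b), int(c)]
    return {
        "parts": parts,                      # the three unlabeled summands
        "parts_sum": sum(parts),
        "used": int(used),                   # total reported in the second parentheses
        "possible": int(possible),           # right-hand side of the comparison
        "excess": int(used) - int(possible),
        "cluster": cluster,
        "count": int(count),                 # number shown after the cluster name
        "window": (start, end),
        "tres": int(tres),
    }


info = parse_rollup_error(LINE)
print(info["parts_sum"] == info["used"])         # summands add up to the reported total
print(info["excess"], info["excess"] / 3600)     # overshoot in seconds and in hours
print(info["possible"] == info["count"] * 3600)  # assumption: count x one-hour window
```

For this line the overshoot equals the second summand, so whatever that term represents accounts for the whole excess in the 15:00 to 16:00 window.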
While this bug says "Importance: --- 6 - No support contract", I have been told that we did renew our support contract in Aug 2020.
Andy,

Please set the Site field to "U of Vermont" on future tickets to have the Severity properly logged. I have updated this ticket and routed it to the support team.

Thank you,
Jacob
This is a duplicate of bug 7375. That bug has been delayed and I don't have an estimate of when that will be done, but I want to get it done in the next couple months.

*** This ticket has been marked as a duplicate of ticket 7375 ***
(In reply to Jacob Jenson from comment #2)
> Andy,
>
> Please set the Site field to "U of Vermont" on future tickets to have the
> Severity properly logged. I have updated this ticket and routed it to the
> support team.
>
> Thank you,
> Jacob

Sorry, I missed the "U of" section and couldn't find us. Thank you for this.

Andy
(In reply to Marshall Garey from comment #3)
> This is a duplicate of bug 7375. That bug has been delayed and I don't have
> an estimate of when that will be done, but I want to get it done in the next
> couple months.
>
> *** This bug has been marked as a duplicate of bug 7375 ***

The only commonality I could find in the users and jobs generating the error, perhaps, is the use of Dependency=afterok.

Some jobs include multiple comma-delimited afterok/jobid combinations. The jobs are submitted to just one partition and the normal QOS.

Just trying to fill in the blanks, thank you for your effort,
Andy
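To make it easier to see which QOS, account, user, and association entries are involved, the underflow messages can be tallied per entity. This is a minimal sketch only: the pattern and the function name `tally_underflows` are hypothetical, and it relies solely on the message format shown in the log excerpt in this ticket.

```python
import re
from collections import Counter

# Sample lines copied from the slurmctld log excerpt in this ticket.
SAMPLE = """\
[2021-01-29T16:17:05.288] error: _remove_accrue_time_internal: QOS normal accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: QOS normal acct pi-sfrietze accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: QOS normal user 340507 accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 688(pi-sfrietze/arrichma/(null)) accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 137(pi-sfrietze/(null)/(null)) accrue_cnt underflow
[2021-01-29T16:17:05.289] error: _remove_accrue_time_internal: assoc_id 1(root/(null)/(null)) accrue_cnt underflow
"""

# Hypothetical pattern: capture whatever sits between the function name
# and the trailing "accrue_cnt underflow".
UNDERFLOW = re.compile(
    r"error: _remove_accrue_time_internal: (.+) accrue_cnt underflow"
)


def tally_underflows(lines):
    """Count underflow messages per entity (QOS, account, user, association)."""
    counts = Counter()
    for line in lines:
        m = UNDERFLOW.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


counts = tally_underflows(SAMPLE.splitlines())
for entity, n in counts.most_common():
    print(n, entity)
```

Against a real log, the same function can be fed an open file object instead of `SAMPLE.splitlines()`. The resulting user and account names can then be compared with the jobs that were submitted with Dependency=afterok.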
(In reply to Andy Evans from comment #5)
> (In reply to Marshall Garey from comment #3)
> > This is a duplicate of bug 7375. That bug has been delayed and I don't have
> > an estimate of when that will be done, but I want to get it done in the next
> > couple months.
> >
> > *** This bug has been marked as a duplicate of bug 7375 ***
>
> The only commonality I could find in the users and jobs generating the
> error, perhaps, is the use of Dependency=afterok.
>
> Some jobs include multiple comma-delimited afterok/jobid combinations. The
> jobs are submitted to just one partition and the normal QOS.
>
> Just trying to fill in the blanks, thank you for your effort,
> Andy

Thanks Andy, this is interesting and useful information. Can you comment with this information (and whatever else you can find to reproduce your situation) on bug 7375?