Hi Team,

No jobs are executing and it's impacting our production.

[2023-03-21T12:12:26.784] Warning: Note very large processing time from _slurm_rpc_dump_jobs: usec=5828060 began=12:12:20.956
[2023-03-21T12:12:27.745] Warning: Note very large processing time from _slurm_rpc_allocate_resources: usec=6770181 began=12:12:20.975
[2023-03-21T12:12:27.745] sched: _slurm_rpc_allocate_resources JobId=1261189 NodeList=(null) usec=6770181
[2023-03-21T12:12:27.857] Warning: Note very large processing time from _slurmctld_background: usec=6858277 began=12:12:20.999
[2023-03-21T12:12:27.857] job_signal: 9 of pending JobId=1261148 successful
[2023-03-21T12:12:28.191] Warning: Note very large processing time from dump_all_job_state: usec=4190409 began=12:12:24.001
[2023-03-21T12:12:29.925] sched: _slurm_rpc_allocate_resources JobId=1261190 NodeList=(null) usec=30493
[2023-03-21T12:12:30.224] sched: _slurm_rpc_allocate_resources JobId=1261191 NodeList=(null) usec=123222
[2023-03-21T12:12:34.593] _job_complete: JobId=1261088 WTERMSIG 126
[2023-03-21T12:12:34.593] _job_complete: JobId=1261088 cancelled by interactive user
[2023-03-21T12:12:34.594] _job_complete: JobId=1261088 done
[2023-03-21T12:12:34.594] _slurm_rpc_complete_job_allocation: JobId=1261088 error Job/step already completing or completed
[2023-03-21T12:12:35.441] _slurm_rpc_complete_job_allocation: JobId=1261088 error Job/step already completing or completed
[2023-03-21T12:12:35.466] _slurm_rpc_complete_job_allocation: JobId=1261088 error Job/step already completing or completed
[2023-03-21T12:12:36.193] _job_complete: JobId=1261025 WEXITSTATUS 0
[2023-03-21T12:12:36.193] _job_complete: JobId=1261025 done
[2023-03-21T12:12:36.238] sched: _slurm_rpc_allocate_resources JobId=1261192 NodeList=(null) usec=26488
[2023-03-21T12:12:36.373] sched: _slurm_rpc_allocate_resources JobId=1261193 NodeList=(null) usec=26556
[2023-03-21T12:12:36.611] Time limit exhausted for JobId=1174490
[2023-03-21T12:12:36.826] _slurm_rpc_complete_job_allocation: JobId=1174490 error Job/step already completing or completed
[2023-03-21T12:12:39.894] _job_complete: JobId=1261028 WTERMSIG 126
[2023-03-21T12:12:39.894] _job_complete: JobId=1261028 cancelled by interactive user
[2023-03-21T12:12:39.894] _job_complete: JobId=1261028 done
[2023-03-21T12:12:39.894] _slurm_rpc_complete_job_allocation: JobId=1261028 error Job/step already completing or completed
[2023-03-21T12:12:41.323] _job_complete: JobId=1261023 WEXITSTATUS 0
[2023-03-21T12:12:41.323] _job_complete: JobId=1261023 done

I am closing this out as a duplicate of bug#16219. Nate will follow up on that bug with some recommendations based on our call today.

*** This ticket has been marked as a duplicate of ticket 16219 ***