Hi,

This is a spin-off from #8621: the symptoms are similar, but the cause may be different, so here's a separate bug report.

Jobs submitted with the --begin option seem to keep accruing age-based priority while pending with Reason=BeginTime. It seems less than optimal to have their priority rise while they're not actively looking for an opportunity to run.

Here's an example:

# scontrol show job 62308791
JobId=62308791 JobName=gdrive-backup
   UserId=[...] GroupId=[...] MCS_label=N/A
   Priority=92094 Nice=0 Account=[...] QOS=normal
   JobState=PENDING Reason=BeginTime Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=7-00:00:00 TimeMin=N/A
   SubmitTime=2020-02-29T16:45:40 EligibleTime=2020-03-07T16:45:40
   AccrueTime=2020-02-29T22:36:37
   StartTime=2020-03-07T16:45:40 EndTime=2020-03-14T17:45:40 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-02-29T16:45:40
   Partition=[...] AllocNode:Sid=sh01-15n08:2925
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=4 NumTasks=1 CPUs/Task=4 ReqB:S:C:T=0:0:4:*
   TRES=cpu=4,mem=20G,node=1,billing=4
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=4 MinMemoryNode=20G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=[...]
   WorkDir=[...]
   StdErr=[...]
   StdIn=/dev/null
   StdOut=[...]
   Power=

This job was submitted with "sbatch --begin=now+7days"; SubmitTime and EligibleTime are consistent with that.

So right now it's pending with "BeginTime", yet its age-based priority keeps increasing over time:

# while true; do date; squeue -j 62308791 -o "%.9i %.8Q %32R"; sleep 300; done
Thu Mar 5 08:10:16 PST 2020
    JOBID PRIORITY NODELIST(REASON)
 62308791    92119 (BeginTime)
Thu Mar 5 08:15:21 PST 2020
    JOBID PRIORITY NODELIST(REASON)
 62308791    92144 (BeginTime)
Thu Mar 5 08:20:53 PST 2020
    JOBID PRIORITY NODELIST(REASON)
 62308791    92169 (BeginTime)
Thu Mar 5 08:25:02 PST 2020
    JOBID PRIORITY NODELIST(REASON)
 62308791    92193 (BeginTime)
Thu Mar 5 08:30:25 PST 2020
    JOBID PRIORITY NODELIST(REASON)
 62308791    92218 (BeginTime)

Yet trying to get details with sprio fails, as sprio seems to explicitly ignore jobs in that state:

# sprio -j 62308791
Unable to find jobs matching user/id(s) specified

So it's not clear why the priority of those jobs still increases over time, but they definitely end up at the top of the queue, and when their EligibleTime arrives, they pretty much skip the whole line.

Wouldn't it be better if age-based priority accrual were suspended until EligibleTime?

Thanks!
--
Kilian

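The rate of increase can be read off the polling output above. Here is a small Python sketch that uses only the timestamps and priorities from that loop; the variable names are illustrative. It attributes the whole change to the job as a single number and does not separate the age factor from anything else (fair-share, for instance, can also move the total).

```python
from datetime import datetime

# (time of day, priority) pairs copied from the squeue polling loop above
samples = [
    ("08:10:16", 92119),
    ("08:15:21", 92144),
    ("08:20:53", 92169),
    ("08:25:02", 92193),
    ("08:30:25", 92218),
]

def to_seconds(hms: str) -> int:
    """Convert HH:MM:SS into seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second

elapsed = to_seconds(samples[-1][0]) - to_seconds(samples[0][0])
gained = samples[-1][1] - samples[0][1]
rate_per_hour = gained / elapsed * 3600

# prints: +99 priority in 1209 s (~295 per hour)
print(f"+{gained} priority in {elapsed} s (~{rate_per_hour:.0f} per hour)")
```

At that pace, a job held back for 7 days by --begin would gain a substantial amount of priority before it is even allowed to start.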
Created attachment 13308 [details]
Reset AccrueTime on "scontrol update job=XX Start=" (v1)

Kilian,

I can't reproduce the issue. The age-based priority is simply calculated from the difference between AccrueTime (visible in scontrol show job) and "now".

What looks strange to me in the output you attached is that AccrueTime is shifted relative to SubmitTime, but doesn't match EligibleTime/StartTime either. The submit time is also not from the day you posted the comment. Is it possible that the job's StartTime was updated with scontrol while the job was pending? This is the only way I can see this happening, and it should be fixed by the attached patch. The patch hasn't passed our QA yet, but I think it's safe to apply, and as you know we appreciate users' feedback.

Just for completeness, since it doesn't sound like you'd be interested: if one wants to use "now - SubmitTime" instead of AccrueTime for the age priority factor, this can be achieved with the ACCRUE_ALWAYS flag [1].

cheers,
Marcin

[1] https://slurm.schedmd.com/slurm.conf.html#OPT_ACCRUE_ALWAYS

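As a rough illustration of the calculation Marcin describes, here is a minimal Python sketch of an age contribution derived from AccrueTime. It assumes the age factor is the time since AccrueTime, normalized by PriorityMaxAge and capped at 1.0, then scaled by the age weight. The 14-day PriorityMaxAge is an assumed value chosen for the example (the site's actual setting is not given in this thread), the weight of 100000 is the AGE weight reported later in the thread, and the function name is hypothetical, not a Slurm API.

```python
from datetime import datetime, timedelta

def age_priority(accrue_time, now, weight_age, max_age):
    """Simplified age contribution: weight * min(1, (now - accrue_time) / max_age)."""
    if accrue_time is None or now <= accrue_time:
        return 0  # no accrue time set (or it lies in the future): nothing accrued
    factor = min(1.0, (now - accrue_time) / max_age)
    return int(weight_age * factor)

accrue = datetime(2020, 2, 29, 22, 36, 37)  # AccrueTime from the first report
now = datetime(2020, 3, 5, 8, 10, 16)       # first sample in the polling loop
max_age = timedelta(days=14)                # ASSUMED PriorityMaxAge, not from the thread

print(age_priority(accrue, now, weight_age=100000, max_age=max_age))
```

The point of the sketch is that the result depends only on AccrueTime and the current time, so a job whose AccrueTime is set days before its EligibleTime accumulates age priority the whole time it waits.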
Hi Marcin,

(In reply to Marcin Stolarek from comment #1)
> I can't reproduce the issue.

Which part? The job priority increasing over time for jobs submitted with --begin? Because this is pretty straightforward to reproduce on my end. And we can rule out "scontrol update" scenarios too:

$ sbatch --begin=now+7days --wrap="sleep 1000"
Submitted batch job 62955664
$ squeue -j 62955664 -h -o "%.9i %.8Q %32R"
 62955664    62125 (BeginTime)
$ sleep 500; squeue -j 62955664 -h -o "%.9i %.8Q %32R"
 62955664    62154 (BeginTime)

That job's priority went from 62125 to 62154 over the 500-second sleep, despite the job not being eligible to start for another week.

Here are the full details of the job:

$ scontrol show job 62955664
JobId=62955664 JobName=wrap
   UserId=kilian(215845) GroupId=ruthm(32264) MCS_label=N/A
   Priority=62179 Nice=0 Account=ruthm QOS=normal
   JobState=PENDING Reason=BeginTime Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2020-03-09T12:27:29 EligibleTime=2020-03-16T12:27:28
   AccrueTime=2020-03-09T12:27:34
   StartTime=2020-03-16T12:27:28 EndTime=2020-03-16T14:27:28 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-03-09T12:27:29
   Partition=normal AllocNode:Sid=sh01-ln04:133985
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=6400M,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=6400M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=(null)
   WorkDir=/home/users/kilian
   StdErr=/home/users/kilian/slurm-62955664.out
   StdIn=/dev/null
   StdOut=/home/users/kilian/slurm-62955664.out
   Power=

Here are the priority weights in use:

# sprio -w
  JOBID PARTITION   PRIORITY       SITE        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS TRES
Weights                               1     100000     100000    5000000      50000     100000 CPU=0,Mem=0,GRES/gpu

And we also have MaxJobsAccruePU=5 and MaxJobsAccruePA=10 on the "normal" (default) QOS; maybe that explains the AccrueTime/SubmitTime discrepancy?

Cheers,
--
Kilian

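The timing relationship in that scontrol output can be checked mechanically. Here is a short Python sketch that parses the three time fields copied from the output for job 62955664 and compares them; the parsing approach is just one convenient way to do it and the variable names are illustrative.

```python
from datetime import datetime

# Time fields copied from the scontrol output for job 62955664
fields = (
    "SubmitTime=2020-03-09T12:27:29 "
    "EligibleTime=2020-03-16T12:27:28 "
    "AccrueTime=2020-03-09T12:27:34"
)

# Split "Key=Value" tokens and parse each value as an ISO 8601 timestamp
times = {
    key: datetime.fromisoformat(value)
    for key, value in (item.split("=", 1) for item in fields.split())
}

accrue_delay = times["AccrueTime"] - times["SubmitTime"]
early_accrual = times["EligibleTime"] - times["AccrueTime"]

print("AccrueTime is", accrue_delay, "after SubmitTime")
print("AccrueTime is", early_accrual, "before EligibleTime")
```

AccrueTime is set 5 seconds after submission and almost 7 days before the job becomes eligible, which matches the observation that the job accrues age priority for its entire --begin delay.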
Created attachment 13319 [details]
Fix AccrueTime handling for 20.02 (v2)

Kilian,

Yes, the accrue limits were the key (actually, I should have noticed that when I checked the code earlier).

I'm attaching the patch for Slurm 20.02. It will not work on 19.05 because it uses the accrue debug flag, which is missing there. Do you want to apply the fix on 19.05? If so, I'll prepare a patch for you, but since this isn't critical functionality I don't think it will be merged into 19.05.

cheers,
Marcin

Hi Marcin,

Excellent, thanks! We plan to go to 20.02 relatively soon, so I guess we can wait for the patch to be included there and won't need a specific backport for 19.05.

Thank you!
--
Kilian

Kilian,

The issue is fixed by the following commits:

3e1c29f1 - Don't accrue time if job begin time is in the future.
64e9e116 - Remove accrue time when updating a job start/eligible time to future.

Those were merged to the 20.02 branch.

cheers,
Marcin

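To illustrate the rule stated in the first commit summary, here is a hypothetical Python sketch. It is not Slurm's actual C code, and the function and parameter names are invented for the example: a job whose begin time is still in the future is not allowed to accrue age.

```python
from datetime import datetime

def may_accrue(begin_time, now):
    """Hypothetical check mirroring the commit summary
    "Don't accrue time if job begin time is in the future"."""
    return begin_time is None or begin_time <= now

begin = datetime(2020, 3, 16, 12, 27, 28)  # EligibleTime of job 62955664

print(may_accrue(begin, datetime(2020, 3, 10)))  # before the begin time
print(may_accrue(begin, datetime(2020, 3, 17)))  # after the begin time
print(may_accrue(None, datetime(2020, 3, 10)))   # job submitted without --begin
```

Under this rule the job from the reproduction would accrue nothing during its 7-day wait and would start aging only once its begin time has passed.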
Excellent, thanks a lot! Cheers,