| Summary: | reason=AssocMaxJobsLimit not correct | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Daniel Grimwood <daniel.grimwood> |
| Component: | Scheduling | Assignee: | Director of Support <support> |
| Status: | RESOLVED FIXED | QA Contact: | Brian Christiansen <brian> |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | CC: | arc-slurm-admins, phils, tim |
| Version: | 18.08.6 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=8012, https://bugs.schedmd.com/show_bug.cgi?id=12202 | | |
| Site: | Pawsey | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave Sites: | --- | Cray Sites: | --- |
| DS9 Clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC Sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | 19.05.5 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Hi Daniel,

Let me see if I understand correctly: you and Tim expect that only the highest-priority pending job should show `AssocMaxJo`, while all lower-priority pending jobs should show `Priority`, right? Could you attach your slurm.conf and, if possible, indicate the job arguments needed to reproduce this?

Thanks,
Michael

Hi Michael,

No, none of them should be AssocMaxJo. The association and its parents all have MaxJobs of at least 32, with only 2 jobs running.

With regards,
Daniel.

Hi Daniel,
We are still trying to reproduce this. Could you show the output of the following commands?
sacctmgr show assoc
sacctmgr show qos
Was the array job requeued? Is there anything else notable about the job?
Thanks,
Michael
I should say: were any of the array jobs requeued, and is there anything else notable about any of the array jobs?

Hi Michael,

No, the jobs haven't been requeued.

> sacct --start=2019-04-01 -j 3493679_73,3493679_74,3494357_62 --duplicates
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
3493679_73   LQE_Re60_+      workq pawsey0205         48  COMPLETED      0:0
3493679_73.+      batch            pawsey0205         24  COMPLETED      0:0
3493679_73.+     extern            pawsey0205         48  COMPLETED      0:0
3493679_73.0     python            pawsey0205         24  COMPLETED      0:0
3493679_73.1     python            pawsey0205         24  COMPLETED      0:0
3493679_74   LQE_Re60_+      workq pawsey0205         48  COMPLETED      0:0
3493679_74.+      batch            pawsey0205         24  COMPLETED      0:0
3493679_74.+     extern            pawsey0205         48  COMPLETED      0:0
3493679_74.0     python            pawsey0205         24  COMPLETED      0:0
3493679_74.1     python            pawsey0205         24  COMPLETED      0:0
3494357_62   LQE_Re110+      workq pawsey0205         48  COMPLETED      0:0
3494357_62.+      batch            pawsey0205         24  COMPLETED      0:0
3494357_62.+     extern            pawsey0205         48  COMPLETED      0:0

The user is being a bit specific about array indices in the jobscript, but apart from that it's normal...
#!/bin/bash --login
#SBATCH --array=72-77:1
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=23:58:00
#SBATCH --account=pawsey0205
#SBATCH --job-name=LQE_Re60_Iter1
#SBATCH --error=%x-%a.err
#SBATCH --output=%x-%a.out
#SBATCH --export=NONE

> sacctmgr show qos
Name Priority GraceTime Preempt PreemptMode Flags UsageThres UsageFactor GrpTRES GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit GrpWall MaxTRES MaxTRESPerNode MaxTRESMins MaxWall MaxTRESPU MaxJobsPU MaxSubmitPU MaxTRESPA MaxJobsPA MaxSubmitPA MinTRES
---------- ---------- ---------- ---------- ----------- ----------
normal 10000 00:00:00 cluster 1.000000 cpu=1
disciplin+ 0 00:00:00 cluster 1.000000 0 cpu=1
longqos 0 00:00:00 cluster PartitionTimeLimit,RequiresReservation 1.000000 12-00:00:00 cpu=1
benthos 3000 00:00:00 cluster 1.000000 cpu=1
askaprt 20000 00:00:00 cluster 1.000000
pawseyove+ 20000 00:00:00 cluster PartitionMaxNodes 1.000000
prohibit 0 00:00:00 cluster 1.000000 0
highqos 20000 00:00:00 cluster 1.000000
exempt 10000 00:00:00 cluster 1.000000

I am not keen on attaching the whole associations output; is there anything in particular you would like that isn't already in the ticket?

With regards,
Daniel.

Hi Daniel,

So it appears that when an array job is broken out into individual job records, only the reason field of the first job of the array (i.e. the meta job) gets updated in a proper and timely manner. Did you at any point have a situation where the limit _was_ hit beforehand? I was able to reproduce what you saw by temporarily setting a MaxJobs limit low and then increasing it again. The meta jobs properly reacted to the changes, while the non-meta jobs still showed the old reason of AssocMaxJobsLimit from when I set the limit low.
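The behavior Michael reproduced can be modeled with a small sketch (purely illustrative, not Slurm source code, with hypothetical names): when a limit changes, only the meta record of an array has its pending reason re-evaluated, so broken-out task records keep whatever reason was last stored.

```python
# Toy model of the pre-fix behavior described in this ticket: on a limit
# change, only the meta job record of an array is refreshed, while the
# broken-out task records keep their stale reason. All names here are
# illustrative assumptions, not Slurm internals.

def reevaluate_reasons(array_jobs, jobs_running, max_jobs, fixed=False):
    """Update pending reasons for an array's job records in place."""
    new_reason = "AssocMaxJobsLimit" if jobs_running >= max_jobs else "Priority"
    for job in array_jobs:
        # Pre-fix, only the meta record (the first task) is updated.
        if fixed or job["meta"]:
            job["reason"] = new_reason

# Array tasks 72-77, all stuck with a previously stored reason.
array = [{"task": t, "meta": t == 72, "reason": "AssocMaxJobsLimit"}
         for t in range(72, 78)]

# The limit is no longer hit (2 running, MaxJobs=32), so every task
# should show "Priority" -- but pre-fix, only the meta record does.
reevaluate_reasons(array, jobs_running=2, max_jobs=32, fixed=False)
reasons = [j["reason"] for j in array]
# Task 72 shows "Priority"; tasks 73-77 keep the stale "AssocMaxJobsLimit".
```

With `fixed=True` (modeling the eventual patch), all task records are refreshed together and the reasons agree.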
I haven't seen anything to indicate that this is anything more than a display issue that e.g. could affect scheduling. I'll continue to look into it more deeply to find the underlying cause.

Thanks,
-Michael

(In reply to Michael Hinton from comment #10)
> I haven't seen anything to indicate that this is anything more than a
> display issue that e.g. could affect scheduling.
Sorry, the wording was a little ambiguous. This issue shouldn't affect scheduling; it's only a display issue.

Thanks Michael. Yes, we're confident it's a display issue only.
We have not changed maxjobs recently. The jobs were submitted a week after we upgraded from SLURM 17.11.9 to 18.08.6, and while doing so we increased maxsubmit and introduced maxaccrue but did not change maxjobs.
The only oddity is in how the user is submitting the array elements. I've sorted the below output...
> sacct --start 2019-04-09 --end 2019-04-10 --format=jobid,state,submit -A pawsey0205 -a -X
JobID State Submit
------------ ---------- -------------------
3487484_64 COMPLETED 2019-04-08T08:25:31
3487484_65 COMPLETED 2019-04-08T08:25:31
3487484_66 COMPLETED 2019-04-08T08:25:31
3487484_67 COMPLETED 2019-04-08T08:25:31
3487484_68 COMPLETED 2019-04-08T08:25:31
3487484_69 COMPLETED 2019-04-08T08:25:31
3487484_70 COMPLETED 2019-04-08T08:25:31
3487484_71 COMPLETED 2019-04-08T08:25:31
3490247_72 CANCELLED+ 2019-04-08T22:07:20
3490247_73 CANCELLED+ 2019-04-08T22:07:20
3490247_74 CANCELLED+ 2019-04-08T22:07:20
3490247_75 CANCELLED+ 2019-04-08T22:07:20
3490247_76 CANCELLED+ 2019-04-08T22:07:20
3490247_77 CANCELLED+ 2019-04-08T22:07:20
3493642_60 CANCELLED+ 2019-04-09T09:16:06
3493642_62 CANCELLED+ 2019-04-09T09:16:06
3493642_63 CANCELLED+ 2019-04-09T09:16:06
3493642_65 CANCELLED+ 2019-04-09T09:16:06
3493679_72 COMPLETED 2019-04-09T09:23:17
3493679_73 COMPLETED 2019-04-09T09:23:17
3493679_74 COMPLETED 2019-04-09T09:23:17
3493679_75 COMPLETED 2019-04-09T09:23:17
3493679_76 COMPLETED 2019-04-09T09:23:17
3493679_77 COMPLETED 2019-04-09T09:23:17
3494357_60 COMPLETED 2019-04-09T11:13:50
3494357_62 COMPLETED 2019-04-09T11:13:50
3494357_63 COMPLETED 2019-04-09T11:13:50
3494357_65 COMPLETED 2019-04-09T11:13:50
3495628_64 COMPLETED 2019-04-09T15:10:05
3495628_65 COMPLETED 2019-04-09T15:10:05
3495628_66 COMPLETED 2019-04-09T15:10:05
3495628_67 COMPLETED 2019-04-09T15:10:05
3495628_68 COMPLETED 2019-04-09T15:10:05
3495628_69 COMPLETED 2019-04-09T15:10:05
3495628_70 COMPLETED 2019-04-09T15:10:05
3495628_71 COMPLETED 2019-04-09T15:10:05
With regards,
Daniel.
Hey Daniel,

So there are two separate issues here:

1) Why does the first job in the job array correctly show `Priority` while subsequent broken-out jobs don't?
2) Why was the reason ever set to AssocMaxJo in the first place?

For 1), the expected behavior is that the reasons for all the jobs in the job array should match (at least in the vast majority of cases). I've identified what's causing this and expect to get a patch out shortly. I will keep you updated.

For 2), based on the information you've given, it's hard to say exactly what caused it, but the fix for 1) would effectively hide it. Since it's not really affecting anything other than display, it sounds like it's not at the center of this ticket.

Thanks!
Michael

Hey Daniel,

I have a patch for issue 1) waiting for review. While we wait, would you like me to share it with you so you can try it out? At this stage, the patch will likely go into 19.05 or 20.02 rather than 18.08.

Thanks,
-Michael

Thanks Michael. We'd prefer to wait for it to hit a release rather than applying the patch. No worries about it not going into 18.08.

With regards,
Daniel.

Hi Daniel,

The wrong array task reason bug has now been fixed in commit https://github.com/SchedMD/slurm/commit/ee3d4715f0071725a4b3ea260889427512778c08.

As for the completely random array task reason showing up in the first place: a few other customers have experienced that issue as well. It is now being tracked in bug 8012, so if you see it again, please continue the discussion there. Note, however, that the commit above will probably mask that bug.

Thanks,
Michael

*** Ticket 8012 has been marked as a duplicate of this ticket. ***
*** Ticket 7966 has been marked as a duplicate of this ticket. ***
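As a quick way to spot the symptom of issue 1) on an affected cluster, one could scan `squeue -r` output for arrays whose pending tasks disagree on their reason. The helper below is an illustrative sketch (not part of the fix, and not a Slurm tool); it assumes the default-style column order shown in the squeue output in this ticket.

```python
# Illustrative helper: group pending array tasks from `squeue -r` output
# by array job ID and flag arrays whose tasks report different reasons --
# the symptom discussed in this ticket. Column positions assume output
# like: JOBID USER ACCOUNT PARTITION NAME EXEC_HOST ST REASON ...
from collections import defaultdict

def mismatched_arrays(squeue_lines):
    """Return {array_id: reasons} for arrays showing >1 distinct reason."""
    reasons = defaultdict(set)
    for line in squeue_lines:
        fields = line.split()
        jobid, state, reason = fields[0], fields[6], fields[7]
        if state != "PD" or "_" not in jobid:
            continue  # only pending array tasks are of interest
        array_id = jobid.split("_")[0]
        reasons[array_id].add(reason)
    return {a: r for a, r in reasons.items() if len(r) > 1}

# Sample rows based on the squeue output reported in this ticket.
sample = [
    "3493679_72 bjin pawsey0205 workq LQE_Re60_Iter1 n/a PD Priority",
    "3493679_73 bjin pawsey0205 workq LQE_Re60_Iter1 n/a PD AssocMaxJo",
    "3493679_74 bjin pawsey0205 workq LQE_Re60_Iter1 n/a PD AssocMaxJo",
]
bad = mismatched_arrays(sample)  # flags array 3493679
```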
Hi, just looked at this with Tim while he is here, who asked for this to be in a ticket... squeue reason is listed as AssocMaxJobsLimit but is probably actually just Priority.

With regards,
Daniel.

reaper@magnus-2:~> squeue -u bjin -r
     JOBID USER ACCOUNT    PARTITION NAME           EXEC_HOST ST REASON     START_TIME       END_TIME         TIME_LEFT NODES PRIORITY
3487484_70 bjin pawsey0205 workq     LQE_Re60_Iter0 nid01244  R  None       2019-04-09T02:32 2019-04-10T02:30 12:51:49  1     5149
3487484_69 bjin pawsey0205 workq     LQE_Re60_Iter0 nid01328  R  None       2019-04-09T02:18 2019-04-10T02:16 12:37:25  1     5148
3493679_72 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD Priority   N/A              N/A              23:58:00  1     5120
3493679_73 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5120
3493679_74 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5120
3493679_75 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5120
3493679_76 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5115
3493679_77 bjin pawsey0205 workq     LQE_Re60_Iter1 n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5115
3494357_60 bjin pawsey0205 workq     LQE_Re110_Iter n/a       PD Priority   N/A              N/A              23:58:00  1     5115
3494357_62 bjin pawsey0205 workq     LQE_Re110_Iter n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5115
3494357_63 bjin pawsey0205 workq     LQE_Re110_Iter n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5108
3494357_65 bjin pawsey0205 workq     LQE_Re110_Iter n/a       PD AssocMaxJo N/A              N/A              23:58:00  1     5108

reaper@magnus-2:~> scontrol show job 3493679_73
JobId=3493681 ArrayJobId=3493679 ArrayTaskId=73 JobName=LQE_Re60_Iter1
   UserId=bjin(24886) GroupId=bjin(24886) MCS_label=N/A
   Priority=5120 Nice=0 Account=pawsey0205 QOS=normal
   JobState=PENDING Reason=AssocMaxJobsLimit Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=23:58:00 TimeMin=N/A
   SubmitTime=2019-04-09T09:23:17 EligibleTime=2019-04-09T09:23:17
   AccrueTime=2019-04-09T09:23:17
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2019-04-09T13:38:43
   Partition=workq AllocNode:Sid=magnus-1:19079
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:1
   TRES=cpu=1,mem=60000M,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=60000M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
   Command=/scratch/pawsey0205/bjin/2019/LQE/Re060/task1/test.slurm
   WorkDir=/scratch/pawsey0205/bjin/2019/LQE/Re060/task1
   StdErr=/scratch/pawsey0205/bjin/2019/LQE/Re060/task1/LQE_Re60_Iter1-73.err
   StdIn=/dev/null
   StdOut=/scratch/pawsey0205/bjin/2019/LQE/Re060/task1/LQE_Re60_Iter1-73.out
   Power=

Snippets of scontrol show assoc...

ClusterName=magnus Account=root UserName= Partition= ID=1
   SharesRaw/Norm/Level/Factor=1/1.00/1/0.00
   UsageRaw/Norm/Efctv=44437177776.00/1.00/1.00
   ParentAccount= Lft=1 DefAssoc=No
   GrpJobs=N(366) GrpJobsAccrue=N(358) GrpSubmitJobs=N(1383) GrpWall=N(2801042.05)
   GrpTRES=cpu=N(70656),mem=N(88320000),energy=N(0),node=N(1472),billing=N(70656),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESMins=cpu=N(740619629),mem=N(922772679827),energy=N(0),node=N(15429575),billing=N(740619629),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESRunMins=cpu=N(49269803),mem=N(61587254000),energy=N(0),node=N(1026454),billing=N(49269803),fs/disk=N(0),vmem=N(0),pages=N(0)
   MaxJobs=32(366) MaxJobsAccrue= MaxSubmitJobs=128(1383) MaxWallPJ=
   MaxTRESPJ= MaxTRESPN= MaxTRESMinsPJ= MinPrioThresh=

ClusterName=magnus Account=pawsey0205 UserName= Partition= ID=6667
   SharesRaw/Norm/Level/Factor=125000/0.00/71140239/0.00
   UsageRaw/Norm/Efctv=147909552.00/0.00/0.00
   ParentAccount=root(1) Lft=3580 DefAssoc=No
   GrpJobs=N(2) GrpJobsAccrue=N(16) GrpSubmitJobs=N(12) GrpWall=N(51357.48)
   GrpTRES=cpu=N(96),mem=N(120000),energy=N(0),node=N(2),billing=N(96),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESMins=cpu=N(2465159),mem=N(3081449000),energy=N(0),node=N(51357),billing=N(2465159),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESRunMins=cpu=N(73977),mem=N(92472000),energy=N(0),node=N(1541),billing=N(73977),fs/disk=N(0),vmem=N(0),pages=N(0)
   MaxJobs=32(2) MaxJobsAccrue= MaxSubmitJobs=128(12) MaxWallPJ=
   MaxTRESPJ= MaxTRESPN= MaxTRESMinsPJ= MinPrioThresh=

ClusterName=magnus Account=pawsey0205 UserName=bjin(24886) Partition=workq ID=7401
   SharesRaw/Norm/Level/Factor=2147483647/0.00/0/0.00
   UsageRaw/Norm/Efctv=147909552.00/0.00/1.00
   ParentAccount= Lft=3587 DefAssoc=Yes
   GrpJobs=N(2) GrpJobsAccrue=N(16) GrpSubmitJobs=N(12) GrpWall=N(51357.48)
   GrpTRES=cpu=N(96),mem=N(120000),energy=N(0),node=N(2),billing=N(96),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESMins=cpu=N(2465159),mem=N(3081449000),energy=N(0),node=N(51357),billing=N(2465159),fs/disk=N(0),vmem=N(0),pages=N(0)
   GrpTRESRunMins=cpu=N(73977),mem=N(92472000),energy=N(0),node=N(1541),billing=N(73977),fs/disk=N(0),vmem=N(0),pages=N(0)
   MaxJobs=32(2) MaxJobsAccrue=16(16) MaxSubmitJobs=512(12) MaxWallPJ=
   MaxTRESPJ= MaxTRESPN= MaxTRESMinsPJ= MinPrioThresh=
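The assoc output reports each limit in "Limit(Current)" form, e.g. `MaxJobs=32(2)` means a limit of 32 with 2 jobs counted, so AssocMaxJobsLimit should not apply here. A small sketch (an illustrative check, not a Slurm tool) can parse these fields to confirm whether a limit is actually hit:

```python
# Illustrative check: parse "Limit(Current)" fields from
# `scontrol show assoc` output, e.g. "MaxJobs=32(2)", and report
# whether the limit is actually being hit. Not a Slurm utility.
import re

def limit_hit(assoc_text, field="MaxJobs"):
    """Return (limit, current, hit) for a Limit(Current) field, or None."""
    m = re.search(rf"{field}=(\d+)\((\d+)\)", assoc_text)
    if not m:
        return None  # field absent or set to unlimited
    limit, current = int(m.group(1)), int(m.group(2))
    return limit, current, current >= limit

# Fields taken from the user association shown above.
line = "MaxJobs=32(2) MaxJobsAccrue=16(16) MaxSubmitJobs=512(12)"
# MaxJobs is nowhere near its limit (2 of 32), but MaxJobsAccrue is
# exactly at its limit (16 of 16).
```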