| Summary: | issue with --switches not working in some cases | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Todd Merritt <tmerritt> |
| Component: | Scheduling | Assignee: | Dominik Bartkiewicz <bart> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | chrisreidy |
| Version: | - Unsupported Older Versions | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | U of AZ | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | slurm.conf, slurmctld.log | | |
Description
Todd Merritt
2021-02-16 13:47:08 MST
Dominik Bartkiewicz
2021-02-18

Hi
Could you send me your slurm.conf?
Without a log, or at least the output of "scontrol show job <affected job_id>", I can only speculate.
But this can be related to "max_switch_wait", which is 300 seconds by default.
man slurm.conf:
...
max_switch_wait=#
Maximum number of seconds that a job can delay execution waiting for the
specified desired switch count. The default value is 300 seconds.
...
Dominik
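For reference, max_switch_wait is set through SchedulerParameters in slurm.conf. A minimal sketch of raising it to 12 hours, assuming no other scheduler parameters are needed on that line (the value is illustrative):

SchedulerParameters=max_switch_wait=43200

After editing slurm.conf, an "scontrol reconfigure" (or a slurmctld restart) would typically be needed for the new value to take effect.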
Todd Merritt

Thanks, Dominik. I'm sure that's it. I was only looking at the man page for sbatch and saw the maximum switch wait there, but it seemed to imply that there was no limit if the default was unset. I'll bump that setting up. You can close this out.

Thanks,
Todd

Dominik Bartkiewicz

Hi
I am marking the bug as resolved. Feel free to reopen it if you have a follow-up question.
Dominik

Todd Merritt

Hi Dominik,
I've bumped the limit up to 1 hour and then to 12 hours, and that does indeed seem to be what was causing the switches requirement to be dropped. It has brought up a new, tangentially related question: these jobs are being submitted with a QOS that should preempt jobs in our windfall partition, but that does not seem to happen until the switch requirement has been removed:
root@ericidle:~ # scontrol show job 590692
JobId=590692 JobName=SCCM_Diag
UserId=mmazloff(25123) GroupId=staff(340) MCS_label=N/A
Priority=24 Nice=0 Account=jrussell QOS=user_qos_jrussell
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=06:24:51 TimeLimit=5-03:00:00 TimeMin=N/A
SubmitTime=2021-02-18T13:35:22 EligibleTime=2021-02-18T13:35:22
AccrueTime=2021-02-18T13:35:22
StartTime=2021-02-19T01:35:38 EndTime=2021-02-24T04:35:38 Deadline=N/A
PreemptEligibleTime=2021-02-19T01:35:38 PreemptTime=None
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-02-19T01:35:38
Partition=standard AllocNode:Sid=wentletrap:13272
ReqNodeList=(null) ExcNodeList=(null)
NodeList=r1u07n1,r1u12n1,r1u15n2,r1u16n1,r2u09n2,r3u03n1
BatchHost=r1u07n1
NumNodes=6 NumCPUs=576 NumTasks=564 CPUs/Task=N/A ReqB:S:C:T=0:0:*:*
TRES=cpu=576,mem=2592G,node=6,billing=576
Socks/Node=* NtasksPerN:B:S:C=94:0:*:* CoreSpec=*
MinCPUsNode=94 MinMemoryNode=432G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
Command=/xdisk/jrussell/SOSE/SO6/SO6_DiagBlng/run_so6_puma_wind.sh
WorkDir=/xdisk/jrussell/SOSE/SO6/SO6_DiagBlng
StdErr=/xdisk/jrussell/SOSE/SO6/SO6_DiagBlng/run_so6_puma_wind.sh.e590692
StdIn=/dev/null
StdOut=/xdisk/jrussell/SOSE/SO6/SO6_DiagBlng/run_so6_puma_wind.sh.o590692
Switches=1@12:00:00
Power=
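For context, the Switches=1@12:00:00 line above reflects a --switches request for one leaf switch, with the 12-hour figure being the maximum wait (capped by max_switch_wait). The actual job script was not posted, so the following is only a hypothetical submission that would match the resource values shown:

> sbatch --switches=1@12:00:00 --nodes=6 --ntasks-per-node=94 --mem=432G --time=5-03:00:00 run_so6_puma_wind.sh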
user_qos_jrussell was created with
sacctmgr -i add qos user_qos_jrussell Priority=5 Preempt=part_qos_windfall Flags=OverPartQOS GrpTRES=cpu=4442,gres/gpu:volta=0 GrpTRESMins=cpu=50457600 GrpJobs=2000 GrpSubmit=2000
Though I can't see where the overpartqos setting is represented in the scontrol output
root@ericidle:~ # scontrol show assoc_mgr flags=qos qos=user_qos_jrussell
Current Association Manager state
QOS Records
QOS=user_qos_jrussell(101)
UsageRaw=1278054214.000000
GrpJobs=2000(2) GrpJobsAccrue=N(0) GrpSubmitJobs=2000(2) GrpWall=N(172888.48)
GrpTRES=cpu=4442(1140),mem=N(5308416),energy=N(0),node=N(12),billing=N(1140),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=0(0)
GrpTRESMins=cpu=50457600(21300903),mem=N(98429041180),energy=N(0),node=N(223610),billing=N(21300903),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=N(0)
GrpTRESRunMins=cpu=N(4522354),mem=N(20887687987),energy=N(0),node=N(47217),billing=N(4522354),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=N(0)
MaxWallPJ=
MaxTRESPJ=mem=52428800
MaxTRESPN=
MaxTRESMinsPJ=
MinPrioThresh=
MinTRESPJ=
PreemptMode=OFF
Priority=5
Account Limits
jrussell
MaxJobsPA=N(2) MaxJobsAccruePA=N(0) MaxSubmitJobsPA=N(2)
MaxTRESPA=cpu=N(1140),mem=N(5308416),energy=N(0),node=N(12),billing=N(1140),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=N(0)
User Limits
25123
MaxJobsPU=N(1) MaxJobsAccruePU=N(0) MaxSubmitJobsPU=N(1)
MaxTRESPU=cpu=N(576),mem=N(2654208),energy=N(0),node=N(6),billing=N(576),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=N(0)
48946
MaxJobsPU=N(1) MaxJobsAccruePU=N(0) MaxSubmitJobsPU=N(1)
MaxTRESPU=cpu=N(564),mem=N(2654208),energy=N(0),node=N(6),billing=N(564),fs/disk=N(0),vmem=N(0),pages=N(0),gres/gpu:volta=N(0)
My questions are: is this the expected behavior, and if so, is there a way to get preemption to trigger? I'll attach our slurm config as well. Thanks!
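Regarding the OverPartQOS flag not appearing above: the QOS flags do not show up in the scontrol show assoc_mgr output, but they can be listed with sacctmgr. A hypothetical check (the format field list is illustrative):

> sacctmgr show qos user_qos_jrussell format=Name,Priority,Flags,Preempt,PreemptMode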
Created attachment 18020 [details]
slurm.conf
Dominik Bartkiewicz

Hi
Could you run your test job again, but this time with extra logging active on the controller? e.g.:
> scontrol setdebug debug3
> scontrol setdebugflags +selecttype
To revert the extra logging:
> scontrol setdebug info
> scontrol setdebugflags -selecttype
Dominik

Created attachment 18063 [details]
slurmctld.log
Todd Merritt

Thanks Dominik, I've uploaded a slurmctld log with those settings. Job id 609130 is an example of a high-QOS job that sits waiting on switches until the timeout.

Dominik Bartkiewicz
2021-02-24

Hi
Unfortunately, prior to 20.02.6 the "--switches" option was completely broken in cons_tres. This was solved by:
https://github.com/SchedMD/slurm/commit/f2eef3cd6ab
Sorry that I didn't catch this before.
Dominik

Todd Merritt

Thanks! We're planning to go to 20.11 shortly, so it looks like this will be resolved in that update. You can go ahead and close this out.

Thanks,
Todd
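After the upgrade, a quick way to confirm that cons_tres now honours --switches would be to submit a small test job and check that the switch request survives scheduling; a hypothetical check:

> sbatch --switches=1@1:00:00 --nodes=2 --wrap="hostname"
> scontrol show job <jobid> | grep Switches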