Ticket 6265 - Jobs no longer being preempted after switch to 17.11.8 from 17.02
Summary: Jobs no longer being preempted after switch to 17.11.8 from 17.02
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 17.11.8
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: Jason Booth
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-12-18 15:09 MST by ifisk
Modified: 2019-03-25 13:34 MDT

See Also:
Site: Simons Foundation & Flatiron Institute


Description ifisk 2018-12-18 15:09:07 MST
Hi,

    We recently switched from 17.02 to 17.11.8, and our preemption setup, which had worked with preempt/qos, is no longer preempting jobs. We also tried partition priority, and that doesn't work either.

     Is there a known issue with 17.11.8? I've included our slurm.conf below.

Thanks, Ian


#
# See the slurm.conf man page for more information.
#

ClusterName=SLURM_CLUSTER
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6800-6817
SlurmdPort=6818
AuthType=auth/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/cm/shared/apps/slurm/var/cm/statesave
SlurmdSpoolDir=/cm/local/apps/slurm/var/spool
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
#ProctrackType=proctrack/pgid
ProctrackType=proctrack/cgroup
PrologFlags=Contain
#PluginDir=
CacheGroups=0
#FirstJobId=
ReturnToService=2
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
TaskPlugin=task/cgroup
#TrackWCKey=no
#TreeWidth=50
#TmpFs=
#UsePAM=
#RebootProgram=/sbin/reboot
RebootProgram=/cm/shared/apps/fi/bin/fi-reboot
JobRequeue=0
#EnforcePartLimits=ALL
# Try to work around slurm bug #5452
EnforcePartLimits=ANY
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
#
# SCHEDULING
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
#PriorityWeightFairshare=100000
#PriorityWeightAge=1000
#PriorityWeightPartition=10000
#PriorityWeightJobSize=1000
#PriorityMaxAge=1-0
#
# LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd

#JobCompType=jobcomp/filetxt
#JobCompLoc=/cm/local/apps/slurm/var/spool/job_comp.log

#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
#JobAcctGatherType=jobacct_gather/cgroup
#JobAcctGatherFrequency=30
JobAcctGatherParams=NoOverMemoryKill
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageUser=slurm
AccountingStorageEnforce=limits,qos
PreemptType=preempt/partition_prio
PreemptMode=CHECKPOINT
# AccountingStorageLoc=slurm_acct_db
# AccountingStoragePass=SLURMDBD_USERPASS

# This section of this file was automatically generated by cmd. Do not edit manually!
# BEGIN AUTOGENERATED SECTION -- DO NOT REMOVE
# Scheduler
SchedulerType=sched/backfill
# Master nodes
ControlMachine=ironbcm1
ControlAddr=ironbcm1
BackupController=ironbcm2
BackupAddr=ironbcm2
AccountingStorageHost=ironbcm1
# Nodes
NodeName=workermem00  CoresPerSocket=12 RealMemory=1500000 Sockets=4
NodeName=worker[0016-0047]  CoresPerSocket=14 RealMemory=256000 Sockets=2 Feature=ib
NodeName=workergpu[00-02]  CoresPerSocket=14 RealMemory=384000 Sockets=2 Gres=gpu:2 Feature=k40
NodeName=worker[1000-1239]  CoresPerSocket=14 RealMemory=512000 Sockets=2 Feature=opa,broadwell
NodeName=workergpu[03-07]  CoresPerSocket=14 RealMemory=512000 Sockets=2 Gres=gpu:2 Feature=p100
NodeName=workergpu[13-18]  CoresPerSocket=18 RealMemory=768000 Sockets=2 Gres=gpu:4 Feature=v100,skylake
NodeName=workergpu[08-12]  CoresPerSocket=20 RealMemory=384000 Sockets=2 Gres=gpu:2 Feature=v100,skylake
NodeName=worker[3000-3119]  CoresPerSocket=20 RealMemory=768000 Sockets=2 Feature=skylake,opa
NodeName=worker[2001-2119]  CoresPerSocket=20 RealMemory=768000 Sockets=2 ThreadsPerCore=1 Feature=skylake
NodeName=worker[0000-0015]  CoresPerSocket=22 RealMemory=384000 Sockets=2 Feature=ib
# Partitions
PartitionName=scc Default=NO MinNodes=1 AllowGroups=scc PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,scc LLN=NO QoS=scc ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccb Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ccb PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccb LLN=NO QoS=ccb ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=gen Default=YES MinNodes=1 DefaultTime=1-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,inter LLN=NO QoS=inter ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=cca Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=cca PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,cca LLN=NO QoS=cca ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccq Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ccq PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccq LLN=NO QoS=ccq ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=preempt Default=NO MinNodes=1 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=0 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=CANCEL ReqResv=NO AllowAccounts=ALL AllowQos=preempt LLN=NO QoS=preempt ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ib Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ib LLN=NO QoS=ib ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[0000-0047]
PartitionName=gpu Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=18000 AllowAccounts=ALL AllowQos=gen,gpu LLN=NO QoS=gpu ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=workergpu[00-18]
PartitionName=mem Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,mem LLN=NO QoS=mem ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=workermem00
PartitionName=bnl Default=NO MinNodes=1 DefaultTime=10-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,bnl LLN=NO QoS=bnl ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[2001-2119]
PartitionName=bnlx Default=NO MinNodes=1 DefaultTime=1-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=18000 AllowAccounts=ALL AllowQos=gen,bnlx LLN=NO QoS=bnlx ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=worker[2001-2119]
PartitionName=info Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=genedata PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=ALL LLN=NO QoS=infor ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccm Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ccm PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccm LLN=NO QoS=ccm ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
# Generic resources types
GresTypes=gpu,mic
# Epilog/Prolog parameters
PrologSlurmctld=/cm/local/apps/cmd/scripts/prolog-prejob
Prolog=/cm/local/apps/cmd/scripts/prolog
Epilog=/cm/local/apps/cmd/scripts/epilog
# Fast Schedule option
FastSchedule=1
# Power Saving
SuspendTime=-1 # this disables power saving
SuspendTimeout=60
ResumeTimeout=300
SuspendProgram=/cm/local/apps/cluster-tools/wlm/scripts/slurmpoweroff
ResumeProgram=/cm/local/apps/cluster-tools/wlm/scripts/slurmpoweron
# END AUTOGENERATED SECTION   -- DO NOT REMOVE
Comment 2 Jason Booth 2018-12-18 16:22:43 MST
Hi Ian,

 I am still tracking down the commit that fixes this, but I can tell you that it is caused by OverSubscribe=EXCLUSIVE and is not an issue in later versions such as 17.11.12. You can correct this issue by upgrading to 17.11.12.

-Jason
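
For comparison, a minimal working preempt/partition_prio layout looks roughly like the following. This is a sketch only: the partition and node names here are illustrative, not taken from this ticket, and the site would keep its own names and node lists.

```
# Sketch: minimal partition-priority preemption (illustrative names).
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE                  # cluster-wide default for preempted jobs

# Low-tier partition: its running jobs may be preempted by higher tiers.
PartitionName=scavenger PriorityTier=1  PreemptMode=REQUEUE Nodes=node[01-10]

# High-tier partition: its pending jobs may preempt scavenger jobs.
PartitionName=prod      PriorityTier=10 PreemptMode=OFF     Nodes=node[01-10]
```

The posted slurm.conf follows the same pattern (the "preempt" partition sits at PriorityTier=0 with PreemptMode=CANCEL, below the PriorityTier=1 partitions), so the configuration itself is consistent with partition-priority preemption; per Comment 2, the failure came from the OverSubscribe=EXCLUSIVE interaction in 17.11.8 rather than from these settings.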
Comment 3 ifisk 2018-12-18 16:51:12 MST
We use Bright, so we will struggle to upgrade until we move to the next
version.

Ian


Comment 4 Jason Booth 2018-12-19 09:05:33 MST
Hi Ian,

 I am marking this issue as resolved but I wanted to mention one additional piece of information.

> We use Bright, so we will struggle to upgrade until we move to the next
> version.

Bright can re-roll their RPMs with a later version of Slurm. They will respond that they test against specific versions, so they cannot guarantee that updating to a later version will not cause any issues with their integration; however, a move between maintenance releases such as 17.11.8 to 17.11.12 should not cause a problem.

-Jason
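
For completeness, the QOS-based preemption the site ran under 17.02 is configured along these lines. This is a sketch: the QOS names echo those in the posted slurm.conf, but the priority values and preempt lists are illustrative assumptions, not the site's actual database contents.

```
# Sketch: QOS-based preemption (illustrative values).
# slurm.conf side:
PreemptType=preempt/qos
PreemptMode=CANCEL

# Database side, via sacctmgr: the preempting QOS gets a higher priority
# and lists the QOS it is allowed to preempt.
#   sacctmgr modify qos gen     set Priority=10 Preempt=preempt
#   sacctmgr modify qos preempt set Priority=1
```

Either model works after the fix; the key difference is that preempt/qos resolves preemption from QOS relationships in the accounting database, while preempt/partition_prio resolves it from the partitions' PriorityTier values in slurm.conf.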