Ticket 6265 - Jobs no longer being preempted after switch to 17.11.8 from 17.02
Summary: Jobs no longer being preempted after switch to 17.11.8 from 17.02
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 17.11.8
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: Jason Booth
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-12-18 15:09 MST by ifisk
Modified: 2019-03-25 13:34 MDT

See Also:
Site: Simons Foundation & Flatiron Institute


Description ifisk 2018-12-18 15:09:07 MST
Hi,

    We recently switched from 17.02 to 17.11.8, and our preemption setup, which had worked with preempt/qos, is no longer preempting jobs. We also tried partition priority, and that doesn't work either.

     Is there a known issue with 17.11.8? I've included our slurm.conf below.

Thanks, Ian


#
# See the slurm.conf man page for more information.
#

ClusterName=SLURM_CLUSTER
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6800-6817
SlurmdPort=6818
AuthType=auth/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/cm/shared/apps/slurm/var/cm/statesave
SlurmdSpoolDir=/cm/local/apps/slurm/var/spool
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
#ProctrackType=proctrack/pgid
ProctrackType=proctrack/cgroup
PrologFlags=Contain
#PluginDir=
CacheGroups=0
#FirstJobId=
ReturnToService=2
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
TaskPlugin=task/cgroup
#TrackWCKey=no
#TreeWidth=50
#TmpFs=
#UsePAM=
#RebootProgram=/sbin/reboot
RebootProgram=/cm/shared/apps/fi/bin/fi-reboot
JobRequeue=0
#EnforcePartLimits=ALL
# Try to work around slurm bug #5452
EnforcePartLimits=ANY
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
#
# SCHEDULING
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
#PriorityWeightFairshare=100000
#PriorityWeightAge=1000
#PriorityWeightPartition=10000
#PriorityWeightJobSize=1000
#PriorityMaxAge=1-0
#
# LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd

#JobCompType=jobcomp/filetxt
#JobCompLoc=/cm/local/apps/slurm/var/spool/job_comp.log

#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
#JobAcctGatherType=jobacct_gather/cgroup
#JobAcctGatherFrequency=30
JobAcctGatherParams=NoOverMemoryKill
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageUser=slurm
AccountingStorageEnforce=limits,qos
PreemptType=preempt/partition_prio
PreemptMode=CHECKPOINT
# AccountingStorageLoc=slurm_acct_db
# AccountingStoragePass=SLURMDBD_USERPASS

# This section of this file was automatically generated by cmd. Do not edit manually!
# BEGIN AUTOGENERATED SECTION -- DO NOT REMOVE
# Scheduler
SchedulerType=sched/backfill
# Master nodes
ControlMachine=ironbcm1
ControlAddr=ironbcm1
BackupController=ironbcm2
BackupAddr=ironbcm2
AccountingStorageHost=ironbcm1
# Nodes
NodeName=workermem00  CoresPerSocket=12 RealMemory=1500000 Sockets=4
NodeName=worker[0016-0047]  CoresPerSocket=14 RealMemory=256000 Sockets=2 Feature=ib
NodeName=workergpu[00-02]  CoresPerSocket=14 RealMemory=384000 Sockets=2 Gres=gpu:2 Feature=k40
NodeName=worker[1000-1239]  CoresPerSocket=14 RealMemory=512000 Sockets=2 Feature=opa,broadwell
NodeName=workergpu[03-07]  CoresPerSocket=14 RealMemory=512000 Sockets=2 Gres=gpu:2 Feature=p100
NodeName=workergpu[13-18]  CoresPerSocket=18 RealMemory=768000 Sockets=2 Gres=gpu:4 Feature=v100,skylake
NodeName=workergpu[08-12]  CoresPerSocket=20 RealMemory=384000 Sockets=2 Gres=gpu:2 Feature=v100,skylake
NodeName=worker[3000-3119]  CoresPerSocket=20 RealMemory=768000 Sockets=2 Feature=skylake,opa
NodeName=worker[2001-2119]  CoresPerSocket=20 RealMemory=768000 Sockets=2 ThreadsPerCore=1 Feature=skylake
NodeName=worker[0000-0015]  CoresPerSocket=22 RealMemory=384000 Sockets=2 Feature=ib
# Partitions
PartitionName=scc Default=NO MinNodes=1 AllowGroups=scc PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,scc LLN=NO QoS=scc ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccb Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ccb PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccb LLN=NO QoS=ccb ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=gen Default=YES MinNodes=1 DefaultTime=1-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,inter LLN=NO QoS=inter ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=cca Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=cca PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,cca LLN=NO QoS=cca ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccq Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ccq PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccq LLN=NO QoS=ccq ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=preempt Default=NO MinNodes=1 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=0 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=CANCEL ReqResv=NO AllowAccounts=ALL AllowQos=preempt LLN=NO QoS=preempt ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ib Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ib LLN=NO QoS=ib ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[0000-0047]
PartitionName=gpu Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=18000 AllowAccounts=ALL AllowQos=gen,gpu LLN=NO QoS=gpu ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=workergpu[00-18]
PartitionName=mem Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,mem LLN=NO QoS=mem ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=workermem00
PartitionName=bnl Default=NO MinNodes=1 DefaultTime=10-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,bnl LLN=NO QoS=bnl ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[2001-2119]
PartitionName=bnlx Default=NO MinNodes=1 DefaultTime=1-00:00:00 AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO DefMemPerCPU=18000 AllowAccounts=ALL AllowQos=gen,bnlx LLN=NO QoS=bnlx ExclusiveUser=NO OverSubscribe=NO OverTimeLimit=0 State=UP Nodes=worker[2001-2119]
PartitionName=info Default=NO MinNodes=1 DefaultTime=7-00:00:00 AllowGroups=genedata PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=ALL LLN=NO QoS=infor ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
PartitionName=ccm Default=NO MinNodes=1 DefaultTime=7-00:00:00 MaxTime=7-00:00:00 AllowGroups=ccm PriorityJobFactor=1 PriorityTier=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=gen,ccm LLN=NO QoS=ccm ExclusiveUser=NO OverSubscribe=EXCLUSIVE OverTimeLimit=0 State=UP Nodes=worker[1000-1239,3000-3119]
# Generic resources types
GresTypes=gpu,mic
# Epilog/Prolog parameters
PrologSlurmctld=/cm/local/apps/cmd/scripts/prolog-prejob
Prolog=/cm/local/apps/cmd/scripts/prolog
Epilog=/cm/local/apps/cmd/scripts/epilog
# Fast Schedule option
FastSchedule=1
# Power Saving
SuspendTime=-1 # this disables power saving
SuspendTimeout=60
ResumeTimeout=300
SuspendProgram=/cm/local/apps/cluster-tools/wlm/scripts/slurmpoweroff
ResumeProgram=/cm/local/apps/cluster-tools/wlm/scripts/slurmpoweron
# END AUTOGENERATED SECTION   -- DO NOT REMOVE
Comment 2 Jason Booth 2018-12-18 16:22:43 MST
Hi Ian,

 I am still tracking down the commit that fixes this, but I can tell you that it is caused by OverSubscribe=EXCLUSIVE and is not an issue in later versions such as 17.11.12. You can correct this issue by upgrading to 17.11.12.

-Jason
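
For comparison, a minimal working preempt/partition_prio layout looks roughly like the following. This is a sketch only: the partition and node names here are illustrative, not taken from this ticket, and the site would keep its own names and node lists.

```
# Sketch: minimal partition-priority preemption (illustrative names).
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE                  # cluster-wide default for preempted jobs

# Low-tier partition: its running jobs may be preempted by higher tiers.
PartitionName=scavenger PriorityTier=1  PreemptMode=REQUEUE Nodes=node[01-10]

# High-tier partition: its pending jobs may preempt scavenger jobs.
PartitionName=prod      PriorityTier=10 PreemptMode=OFF     Nodes=node[01-10]
```

The posted slurm.conf follows the same pattern (the "preempt" partition sits at PriorityTier=0 with PreemptMode=CANCEL, below the PriorityTier=1 partitions), so the configuration itself is consistent with partition-priority preemption; per Comment 2, the failure came from the OverSubscribe=EXCLUSIVE interaction in 17.11.8 rather than from these settings.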
Comment 3 ifisk 2018-12-18 16:51:12 MST
We use Bright, so we will struggle to upgrade until we move to the next
version.

Ian


Comment 4 Jason Booth 2018-12-19 09:05:33 MST
Hi Ian,

 I am marking this issue as resolved but I wanted to mention one additional piece of information.

> We use Bright, so we will struggle to upgrade until we move to the next
> version.

Bright can re-roll their RPMs with a later version of Slurm. They will respond that they test against specific versions, so they cannot guarantee that updating to a later version will not cause any issues with their integration; however, a move between maintenance releases such as 17.11.8 to 17.11.12 should not cause a problem.

-Jason
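
For completeness, the QOS-based preemption the site ran under 17.02 is configured along these lines. This is a sketch: the QOS names echo those in the posted slurm.conf, but the priority values and preempt lists are illustrative assumptions, not the site's actual database contents.

```
# Sketch: QOS-based preemption (illustrative values).
# slurm.conf side:
PreemptType=preempt/qos
PreemptMode=CANCEL

# Database side, via sacctmgr: the preempting QOS gets a higher priority
# and lists the QOS it is allowed to preempt.
#   sacctmgr modify qos gen     set Priority=10 Preempt=preempt
#   sacctmgr modify qos preempt set Priority=1
```

Either model works after the fix; the key difference is that preempt/qos resolves preemption from QOS relationships in the accounting database, while preempt/partition_prio resolves it from the partitions' PriorityTier values in slurm.conf.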