All,

I am running SLURM 18.08.9 with the elastic plug-in in the AWS cloud and ran into a strange error. Ever since it occurred, worker nodes are not being created and jobs are not being executed.

Our setup: the controller (slurmctld) runs on an AWS C4 class server and spins up a worker node when a job is submitted with the sbatch command. When the scheduled job completes, the worker node is terminated. This worked for the last 4 months but stopped working last Monday (11/16).

The error says:

"Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions"

What I am noticing is that the controller appears to be trying to allocate itself as a worker node (I could be wrong).

One more piece of information: around the time this started I noticed some AWS capacity issues, but they were resolved within 30 minutes.

Any help resolving this is appreciated.

Here is the output of scontrol show job for one of the pending jobs:

(base) [centos@ip-198-122-102-172 ~]$ scontrol show job 501
JobId=501 JobName=damocles
   UserId=centos(1000) GroupId=centos(1000) MCS_label=N/A
   Priority=4294901755 Nice=0 Account=(null) QOS=(null)
   JobState=PENDING Reason=Nodes_required_for_job_are_DOWN,_DRAINED_or_reserved_for_jobs_in_higher_priority_partitions Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A
   SubmitTime=2020-11-18T23:51:19 EligibleTime=2020-11-18T23:51:19
   AccrueTime=2020-11-18T23:51:19
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2020-11-19T18:52:03
   Partition=normal AllocNode:Sid=ip-198-122-102-172:1366
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
   Command=/efs/damocles_latest/main/data_output/damocles_30aedf_sbatch.sh
   WorkDir=/efs/damocles_latest/main
   StdErr=/efs/damocles_latest/main/data_output/slurm_logfile.501.err
   StdIn=/dev/null
   StdOut=/efs/damocles_latest/main/data_output/slurm_logfile.501.out
   Power=
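
In case it helps anyone compare several pending jobs, here is a minimal Python sketch that splits an scontrol show job record into key/value pairs and prints the fields most relevant to this error. The function name parse_scontrol_record and the SAMPLE string are my own placeholders (SAMPLE is a shortened copy of the record above), and the sketch assumes no value contains whitespace, which is true for this record but not guaranteed for every scontrol field.

```python
def parse_scontrol_record(text):
    """Split whitespace-separated Key=Value tokens into a dict.

    Assumption: no value contains whitespace. That holds for the job
    record shown above but may not hold for every scontrol field.
    """
    fields = {}
    for token in text.split():
        # Split on the first '=' only, so values such as
        # TRES=cpu=1,node=1,billing=1 keep their inner '=' signs.
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not Key=Value pairs
            fields[key] = value
    return fields


# Shortened copy of the job record from this post (hypothetical variable name).
SAMPLE = (
    "JobId=501 JobName=damocles JobState=PENDING "
    "Reason=Nodes_required_for_job_are_DOWN,_DRAINED_or_reserved_"
    "for_jobs_in_higher_priority_partitions "
    "Partition=normal ReqNodeList=(null) NodeList=(null) "
    "NumNodes=1-1 TRES=cpu=1,node=1,billing=1 Power="
)

job = parse_scontrol_record(SAMPLE)
for key in ("JobId", "JobState", "Reason", "Partition", "NodeList"):
    print(f"{key}: {job[key]}")
```

The same function could be fed the captured output of scontrol show job for each pending job ID to check whether they all report the same Reason and Partition.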