Ticket 11354

Summary: Discrepancy between partition requested by a job and node allocated to the job
Product: Slurm    Reporter: HPC Admin <hpcadmin>
Component: Scheduling    Assignee: Tim McMullan <mcmullan>
Status: RESOLVED INFOGIVEN
Severity: 3 - Medium Impact
Priority: ---    
Version: 20.02.5   
Hardware: Linux   
OS: Linux   
Site: Auburn

Description HPC Admin 2021-04-13 08:27:08 MDT
Good morning,

Scratching my head on this one... Job array 15205 was submitted with these sbatch directives:

#SBATCH -J "NoNg_bwa"          
#SBATCH -o job-%A_%a.log

#SBATCH --mail-type=ALL
#SBATCH --mail-user=xzw0070@auburn.edu

#SBATCH --nodes=1
#SBATCH --ntasks=1   
#SBATCH --cpus-per-task=48
##SBATCH --mem=5G   
#SBATCH --partition=general
#SBATCH -t 20-00:00:00    
#SBATCH --array=1-27

Here is the first job in the array:

root@c20-mgt01:utility # showjob 15205_1
JobId=15205 ArrayJobId=15205 ArrayTaskId=1 JobName=NoNg_bwa
   UserId=xzw0070(438804) GroupId=xzw0070(438804) MCS_label=N/A
   Priority=4294895255 Nice=0 Account=xzw0070_lab QOS=research
   JobState=PENDING Reason=ReqNodeNotAvail,_Reserved_for_maintenance Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=20-00:00:00 TimeMin=N/A
   SubmitTime=2021-04-13T07:33:17 EligibleTime=2021-04-13T07:33:18
   AccrueTime=2021-04-13T07:33:18
   StartTime=2021-05-08T00:00:00 EndTime=2021-05-28T00:00:00 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-04-13T09:19:33
   Partition=general AllocNode:Sid=node502:204925
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=48 NumTasks=1 CPUs/Task=48 ReqB:S:C:T=0:0:*:*
   TRES=cpu=48,node=1,billing=48
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=48 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/mmfs1/scratch/xzw0070/NoNg_hybrid/NoNg_27hybrid_SNPcalling_NvV4.sh
   WorkDir=/mmfs1/scratch/xzw0070/NoNg_hybrid
   StdErr=/mmfs1/scratch/xzw0070/NoNg_hybrid/job-15205_1.log
   StdIn=/dev/null
   StdOut=/mmfs1/scratch/xzw0070/NoNg_hybrid/job-15205_1.log
   Power=
   MailUser=xzw0070@auburn.edu MailType=BEGIN,END,FAIL,REQUEUE,STAGE_OUT

Note that the general partition was requested, but AllocNode is node502, which is not in the general partition.

root@c20-mgt01:utility # scontrol show partition general
PartitionName=general
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=YES QoS=N/A
   DefaultTime=2-00:00:00 DisableRootJobs=YES ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=90-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=node[010-036,038,040,042-064,066-069,071-077,079-132,134-138]
   PriorityJobFactor=1 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=0 PreemptMode=REQUEUE
   State=UP TotalCPUs=5856 TotalNodes=122 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

I thought perhaps the user had manually assigned a node, but when I tested that myself it didn't work:

terrykd@easley01:demo > sbatch -p general -w node400 myScript.sh 
sbatch: error: Batch job submission failed: Requested node configuration is not available

Am I reading this wrong or is something else going on here?

Thanks again.
Keenan
Comment 1 Tim McMullan 2021-04-13 09:28:26 MDT
Hi Keenan,

AllocNode represents the node that requested the allocation, not a node that has been allocated.

From that output, it appears that the job was submitted from node502, and hasn't yet been scheduled due to a maintenance reservation.

How long after the job was submitted was "scontrol show job" run?  Depending on your configuration, a field like "SchedNodeList=some_nodes" may appear in the output.  That field would tell you where the job is expected to actually run once the nodes become available again.
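As a quick sketch, one way to pull that field out of the output is a simple grep.  The sample text below is made up for illustration, not real output from this cluster; on a live system you would pipe `scontrol show job 15205_1` instead:

```shell
# Hedged sketch: extract SchedNodeList from scontrol-style output.
# "sample" is fabricated illustrative text, not output from this cluster.
sample='JobState=PENDING Reason=ReqNodeNotAvail,_Reserved_for_maintenance
SchedNodeList=node[010-012]'
printf '%s\n' "$sample" | grep -o 'SchedNodeList=[^ ]*'
# On a real cluster:
#   scontrol show job 15205_1 | grep -o 'SchedNodeList=[^ ]*'
```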

Thanks!
--Tim
Comment 2 HPC Admin 2021-04-13 12:20:44 MDT
OK. That makes sense. How do we restrict users to the login node for job submission?

Keenan
Comment 3 Tim McMullan 2021-04-14 05:11:16 MDT
You should be able to limit which nodes jobs can be submitted from by setting "AllocNodes=" on the partition to the list of your login nodes.  It is a partition-specific option, so you would need to set it on every partition you want to restrict.
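As a hedged sketch, the corresponding slurm.conf partition lines might look like the fragment below.  The login node names (login[01-02]) are assumptions for illustration, not names from this cluster:

```
# slurm.conf fragment -- hypothetical login node names
PartitionName=general Default=YES Nodes=node[010-138] AllocNodes=login[01-02]
PartitionName=gpu Nodes=gpunode[01-08] AllocNodes=login[01-02]
```

After editing slurm.conf, "scontrol reconfigure" applies the change; "scontrol update PartitionName=general AllocNodes=login[01-02]" can also change it at runtime, though that lasts only until the next restart or reconfigure.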

Thanks!
--Tim
Comment 4 HPC Admin 2021-04-14 06:45:06 MDT
Thank you for the help. You can close this ticket.

Keenan
Comment 5 Tim McMullan 2021-04-14 07:09:37 MDT
Sure thing!  Closing this now.