Good morning,

Scratching my head on this one... Job array 15205 was submitted with these sbatch directives:

#SBATCH -J "NoNg_bwa"
#SBATCH -o job-%A_%a.log
#SBATCH --mail-type=ALL
#SBATCH --mail-user=xzw0070@auburn.edu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=48
##SBATCH --mem=5G
#SBATCH --partition=general
#SBATCH -t 20-00:00:00
#SBATCH --array=1-27

Here is the first job in the array:

root@c20-mgt01:utility # showjob 15205_1
JobId=15205 ArrayJobId=15205 ArrayTaskId=1 JobName=NoNg_bwa
   UserId=xzw0070(438804) GroupId=xzw0070(438804) MCS_label=N/A
   Priority=4294895255 Nice=0 Account=xzw0070_lab QOS=research
   JobState=PENDING Reason=ReqNodeNotAvail,_Reserved_for_maintenance Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=20-00:00:00 TimeMin=N/A
   SubmitTime=2021-04-13T07:33:17 EligibleTime=2021-04-13T07:33:18
   AccrueTime=2021-04-13T07:33:18
   StartTime=2021-05-08T00:00:00 EndTime=2021-05-28T00:00:00 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-04-13T09:19:33
   Partition=general AllocNode:Sid=node502:204925
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=48 NumTasks=1 CPUs/Task=48 ReqB:S:C:T=0:0:*:*
   TRES=cpu=48,node=1,billing=48
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=48 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/mmfs1/scratch/xzw0070/NoNg_hybrid/NoNg_27hybrid_SNPcalling_NvV4.sh
   WorkDir=/mmfs1/scratch/xzw0070/NoNg_hybrid
   StdErr=/mmfs1/scratch/xzw0070/NoNg_hybrid/job-15205_1.log
   StdIn=/dev/null
   StdOut=/mmfs1/scratch/xzw0070/NoNg_hybrid/job-15205_1.log
   Power=
   MailUser=xzw0070@auburn.edu MailType=BEGIN,END,FAIL,REQUEUE,STAGE_OUT

Note that the general partition is used, but the AllocNode is node502, which is not in the general partition.
root@c20-mgt01:utility # scontrol show partition general
PartitionName=general
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=YES QoS=N/A
   DefaultTime=2-00:00:00 DisableRootJobs=YES ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=90-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=node[010-036,038,040,042-064,066-069,071-077,079-132,134-138]
   PriorityJobFactor=1 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=0 PreemptMode=REQUEUE
   State=UP TotalCPUs=5856 TotalNodes=122 SelectTypeParameters=NONE
   JobDefaults=(null)
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

Thought perhaps the user had manually assigned a node, but when I test that myself it doesn't work:

terrykd@easley01:demo > sbatch -p general -w node400 myScript.sh
sbatch: error: Batch job submission failed: Requested node configuration is not available

Am I reading this wrong, or is something else going on here? Thanks again.

Keenan
Hi Keenan,

AllocNode represents the node that requested the allocation, not a node that has been allocated. From that output, it appears that the job was submitted from node502 and hasn't yet been scheduled due to a maintenance reservation.

How long after the job was submitted was "scontrol show job" run? Depending on your configuration, something like "SchedNodeList=some_nodes" may appear in that output. That would tell you where the job is expected to actually run once the nodes become available again.

Thanks!
--Tim
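For example, once the scheduler has planned a placement, the field can be pulled out with something like the following (using the job ID from this ticket; the SchedNodeList line only appears if a placement has been planned):

```shell
# Extract the planned node list for the pending array task, if present
scontrol show job 15205_1 | grep -o 'SchedNodeList=[^ ]*'

# The same information via squeue's "schednodes" output field
squeue -j 15205 -O JobID,SchedNodes
```

These are read-only queries, so they are safe to run repeatedly while the job is pending.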
OK. That makes sense. How do we restrict users to the login node for job submission? Keenan
You should be able to limit which nodes jobs can be submitted from by setting "AllocNodes=" to a comma-separated list of your login nodes. It is a partition-specific option, so you would need to set it for every partition you want to restrict.

Thanks!
--Tim
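As a sketch (easley01 is the login node visible in this thread; substitute your actual login node names), the restriction could be applied like this:

```shell
# Restrict submissions to the "general" partition to the login node(s).
# Takes effect immediately on a running cluster, no restart needed.
scontrol update PartitionName=general AllocNodes=easley01

# To make the change survive a slurmctld restart, also add
# AllocNodes=easley01 to the PartitionName=general line in slurm.conf,
# then run "scontrol reconfigure".
```

Note that a change made only via scontrol update is lost when slurmctld restarts, so updating slurm.conf as well is the usual practice.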
Thank you for the help. You can close this ticket. Keenan
Sure thing! Closing this now.