Ticket 10120

Summary: QOS flags=partitiontimelimit does not override default MaxTime of partition
Product: Slurm Reporter: Bom <bom.singiali>
Component: Accounting Assignee: Jacob Jenson <jacob>
Status: RESOLVED FIXED QA Contact:
Severity: 6 - No support contract    
Priority: --- CC: bom.singiali
Version: 19.05.3   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: CentOS
Machine Name: CLE Version:
Version Fixed: 19.05.03 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Bom 2020-11-03 10:56:12 MST
Dear Team,

I am trying to implement the solution provided in:
https://bugs.schedmd.com/show_bug.cgi?id=4681

However, the user's job goes into the pending state with the following status:

[root@vicb-submit-01 ~]# squeue -u louis.kuemmerle
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            679046   icb_gpu  test.sh louis.ku PD       0:00      1 (QOSMaxWallDurationPerJobLimit)


==== Partition (icb_gpu) max time limit ===

# scontrol show part | grep MaxTime
   MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED

# The QOS (icb_long) max time limit is 6 days.

## Goal: allow a specific user to run a job for 20 days by using the QOS feature that overrides the partition MaxTime, without modifying the existing partition and QOS.

Please advise. Thank you for your help.

--
Best Regards
Bomay
Comment 1 Bom 2020-11-03 11:03:14 MST
Also,

The status keeps switching between:

NODELIST(REASON) 
(Priority) <->(QOSMaxWallDurationPerJobLimit)
Comment 2 Bom 2021-05-19 08:05:50 MDT
Setting the PartitionTimeLimit flag on the QOS resolved this issue.
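
For anyone landing here with the same goal, the following is a rough sketch of one way to set this up, not a record of the exact commands used on this cluster. It creates a separate QOS so the existing icb_long QOS and the icb_gpu partition stay untouched. The QOS name icb_20d is hypothetical; the user, partition, and script name come from the description above. It also assumes that QOS limits are enforced on the cluster and that the partition accepts the new QOS. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. "icb_20d" is a hypothetical QOS name; the other values are
# taken from the ticket description.
QOS=icb_20d
USER_NAME=louis.kuemmerle
PARTITION=icb_gpu
LIMIT=20-00:00:00          # 20 days, in Slurm D-HH:MM:SS notation

# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Create a dedicated QOS instead of changing icb_long.
run sacctmgr -i add qos "$QOS"

# 2. Give it a 20-day MaxWall and let it override the partition time limit.
run sacctmgr -i modify qos "$QOS" set MaxWall="$LIMIT" Flags=PartitionTimeLimit

# 3. Allow only the specific user to use the new QOS.
run sacctmgr -i modify user where name="$USER_NAME" set qos+="$QOS"

# 4. The user then requests the QOS and the longer limit at submit time.
run sbatch --partition="$PARTITION" --qos="$QOS" --time="$LIMIT" test.sh
```

If the job still pends with QOSMaxWallDurationPerJobLimit, the requested --time is probably larger than the MaxWall of the QOS the job actually ran under, so it is worth checking which QOS the job was submitted with.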