Ticket 17452

Summary: timelimit beyond 10 days for single job
Product: Slurm Reporter: RAMYA ERANNA <reranna>
Component: User Commands Assignee: Ben Roberts <ben>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: nate
Version: 22.05.2   
Hardware: Linux   
OS: Linux   
Site: SLAC

Description RAMYA ERANNA 2023-08-17 15:33:23 MDT
Hi Team,

I'm looking to increase the time limit of a single job beyond the partition time limit, which is currently set to 10 days in slurm.conf. How do I do that?

Also, I would like to understand the maximum walltime allowed for each job. Please let me know.

PartitionName=roma        Nodes=romaromes            Default=YES    Priority=50   MaxTime=10-00:00:00     DefaultTime=1-00:00:00  PreemptMode=OFF   State=UP


Thank you
Ramya
Comment 1 Nate Rini 2023-08-18 08:21:57 MDT
(In reply to RAMYA ERANNA from comment #0)
> I'm looking to increase the time limit of a single job beyond the partition
> time limit, which is currently set to 10 days in slurm.conf. How do I do that?
> 
> Also, I would like to understand the maximum walltime allowed for each
> job. Please let me know.

I suggest reviewing our documentation first:
> https://slurm.schedmd.com/slurm.conf.html#OPT_MaxTime
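
For reference, the MaxTime and DefaultTime values on the partition line above use Slurm's days-hours:minutes:seconds notation, so 10-00:00:00 is 10 days and 1-00:00:00 is 1 day. The arithmetic can be sketched in plain POSIX shell as below. The helper names slurm_time_to_minutes and strip0 are hypothetical (they are not Slurm commands), and rounding leftover seconds up to a whole minute is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Slurm): convert a Slurm time string to minutes.
# Accepted forms: minutes, minutes:seconds, hours:minutes:seconds,
# days-hours, days-hours:minutes, days-hours:minutes:seconds.

# Strip leading zeros so a field such as "08" is not read as invalid octal.
strip0() {
    n=${1#"${1%%[!0]*}"}
    echo "${n:-0}"
}

# The ( ... ) body runs in a subshell, so these variables stay local.
slurm_time_to_minutes() (
    t=$1; days=0; hours=0; mins=0; secs=0
    case $t in
        *-*)
            # A dash means the string starts with a day count.
            days=${t%%-*}
            IFS=:; set -- ${t#*-}
            hours=${1:-0}; mins=${2:-0}; secs=${3:-0}
            ;;
        *)
            # Without days, the number of fields decides their meaning.
            IFS=:; set -- $t
            case $# in
                1) mins=$1 ;;
                2) mins=$1; secs=$2 ;;
                3) hours=$1; mins=$2; secs=$3 ;;
            esac
            ;;
    esac
    days=$(strip0 "$days"); hours=$(strip0 "$hours")
    mins=$(strip0 "$mins"); secs=$(strip0 "$secs")
    # Assumption of this sketch: leftover seconds round up to a whole minute.
    echo $(( $days * 1440 + $hours * 60 + $mins + ($secs + 59) / 60 ))
)

slurm_time_to_minutes 10-00:00:00   # MaxTime of the roma partition: prints 14400
slurm_time_to_minutes 1-00:00:00    # DefaultTime of the roma partition: prints 1440
```

A job on the roma partition can therefore request at most 14400 minutes, and a job submitted without a time request gets 1440 minutes.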
Comment 3 Ben Roberts 2023-08-21 09:00:29 MDT
Hi Ramya,

As Nate alluded to, normal users are not allowed to increase the time limit of a job past the MaxTime defined for its partition.  However, an administrator can bypass this restriction.  Here's a quick example where I submit a job as 'user3', who doesn't have any administrative privileges.  The request to increase the time limit is rejected when made as this user:

user3@kitt:~$ sinfo -p short
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
short        up    1:00:00     10   idle node[01-10]

user3@kitt:~$ sbatch -pshort -n1 -t1:00:00 --wrap='srun sleep 3600'
Submitted batch job 9296

user3@kitt:~$ scontrol update jobid=9296 timelimit=2:00:00
Access/permission denied for job 9296

When I become 'user1', who does have administrative privileges, I'm able to make this change.

user1@kitt:~$ scontrol update jobid=9296 timelimit=2:00:00

user1@kitt:~$ scontrol show jobs 9296 | grep -i timelimit
   RunTime=00:02:11 TimeLimit=02:00:00 TimeMin=N/A

Let me know if you have any questions about this.

Thanks,
Ben
Comment 4 Ben Roberts 2023-09-26 13:13:53 MDT
Hi Ramya,

The information I sent should have helped, and I haven't heard any follow-up questions, so I'll close this ticket.

Thanks,
Ben