Ticket 6193

Summary: Reservation node not available because of walltime modification
Product: Slurm
Reporter: Sergi More <sergi.more>
Component: Other
Assignee: Unassigned Developer <dev-unassigned>
Status: OPEN
Severity: 5 - Enhancement    
Priority: ---    
Version: 17.11.7   
Hardware: Linux   
OS: Linux   
Site: BSC-MN4

Description Sergi More 2018-12-07 03:51:29 MST
Hello,

We are looking for a way to ensure that the nodes included in a reservation are available when the reservation starts. We know that Slurm takes care of replacing drained/down nodes, but we found a situation where this does not seem to be enough. If, after a reservation is created but before it starts, you manually increase the time limit of a job running on a node included in the reservation, Slurm does not take into account that the change can also affect the reserved nodes.

What we would like is something similar to the "REPLACE" flag, but one that does not count jobs running inside the reservation when deciding whether a node should be replaced because it is allocated. That is, if the running job is using the reservation, that is fine: do not change the nodes. Only replace them if the job is not using the reservation.
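
To make the request concrete, here is a minimal sketch of the rule described above, written as plain Python rather than Slurm code. The `Job` class, the `nodes_to_replace` function, and all field names are hypothetical and exist only for illustration; times are simple integers (for example epoch seconds).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Job:
    """Hypothetical stand-in for a running job (not a Slurm data structure)."""
    job_id: int
    nodes: frozenset                   # nodes the job is running on
    end_time: int                      # expected end, after any time-limit change
    reservation: Optional[str] = None  # reservation the job runs in, if any


def nodes_to_replace(resv_name: str, resv_nodes: Iterable[str],
                     resv_start: int, jobs: Iterable[Job]) -> List[str]:
    """Return the reserved nodes that would need replacing under the requested rule."""
    reserved: Set[str] = set(resv_nodes)
    blocked: Set[str] = set()
    for job in jobs:
        # Jobs running inside this reservation never trigger a replacement.
        if job.reservation == resv_name:
            continue
        # A job outside the reservation that is still running at the
        # reservation start blocks every reserved node it occupies.
        if job.end_time > resv_start:
            blocked |= job.nodes & reserved
    return sorted(blocked)
```

With this rule, extending the time limit of a job past the reservation start would mark its nodes for replacement only when the job itself is not part of the reservation.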

Is there already a way to get such behaviour? If not, do you think that it could be added? 

Thank you,
Sergi.
Comment 2 Nate Rini 2018-12-07 17:00:02 MST
(In reply to Sergi More from comment #0)
> Is there already a way to get such behaviour? If not, do you think that it
> could be added? 

We are reviewing your request.

--Nate
Comment 3 Nate Rini 2018-12-11 10:56:22 MST
>If, after a reservation is created but before it starts, you manually increase the time limit of a job running on a node included in the reservation, Slurm does not take into account that the change can also affect the reserved nodes.

I can confirm a reservation does not take into account an admin changing the time limit of a running job.

>Is there already a way to get such behaviour? 

A workaround may be to drain the nodes in the same administrative step, which would move the reservation when FLAG=REPLACE_DOWN is set.
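
As a rough sketch of that workaround, the helper below builds the two `scontrol update` commands an admin might run together: one extending the job's time limit and one draining the job's nodes. The function name, job ID, node names, and reason text are hypothetical, and it assumes the reservation was created with the REPLACE_DOWN flag. The commands are only built, not executed.

```python
from typing import List, Sequence


def workaround_commands(job_id: int, new_time_limit: str,
                        nodes: Sequence[str],
                        reason: str = "walltime_extended") -> List[List[str]]:
    """Build argv lists for the time-limit change and the node drain."""
    node_list = ",".join(nodes)
    return [
        # Step 1: extend the time limit of the running job.
        ["scontrol", "update", f"JobId={job_id}", f"TimeLimit={new_time_limit}"],
        # Step 2: drain the job's nodes so a REPLACE_DOWN reservation
        # can pick other nodes. scontrol needs a Reason when draining.
        ["scontrol", "update", f"NodeName={node_list}", "State=DRAIN",
         f"Reason={reason}"],
    ]


if __name__ == "__main__":
    # Hypothetical job ID and node names; print instead of running.
    for argv in workaround_commands(12345, "2-00:00:00", ["node001", "node002"]):
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a host with admin rights. The drained nodes would still need to be resumed by hand once the extended job has finished.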

> If not, do you think that it could be added?

I'm going to set this as an enhancement request to follow our normal process.

--Nate
Comment 4 Sergi More 2018-12-12 02:40:15 MST
Ok, thank you Nate. 

Sergi.