Ticket 6193 - Reservation node not available because of walltime modification
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 17.11.7
Hardware: Linux Linux
Severity: 5 - Enhancement
Assignee: Unassigned Developer
Reported: 2018-12-07 03:51 MST by Sergi More
Modified: 2018-12-12 02:40 MST
Site: BSC-MN4


Description Sergi More 2018-12-07 03:51:29 MST
Hello,

We are looking for a way to ensure that the nodes included in a reservation are available when the reservation starts. We know that Slurm takes care of replacing drained/down nodes, but we found a situation where this does not seem to be enough. If, after a reservation is created but before it starts, you manually increase the time limit of a job running on a node included in the reservation, Slurm does not take into account that the change can also affect reserved nodes.
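
To make the situation concrete, here is a minimal Python sketch of the check that appears to be missing. It is not Slurm code: the function name, node names, and times are hypothetical, and it only assumes that a job's expected end is its start time plus its time limit.

```python
from datetime import datetime, timedelta


def reserved_nodes_still_busy(job_start, time_limit, job_nodes, resv_start, resv_nodes):
    """Return the reserved nodes the job would still occupy when the reservation starts.

    Hypothetical helper for illustration only, not part of Slurm.
    """
    expected_end = job_start + time_limit
    if expected_end <= resv_start:
        # The job finishes before the reservation begins: no conflict.
        return set()
    # The job outlives the reservation start: shared nodes are in conflict.
    return set(job_nodes) & set(resv_nodes)


job_start = datetime(2018, 12, 7, 8, 0)
resv_start = datetime(2018, 12, 7, 12, 0)
job_nodes = {"node1", "node2"}    # hypothetical node names
resv_nodes = {"node2", "node3"}

# Original 3-hour limit: the job ends at 11:00, before the reservation.
before_change = reserved_nodes_still_busy(
    job_start, timedelta(hours=3), job_nodes, resv_start, resv_nodes)

# Limit manually raised to 6 hours: the job now ends at 14:00.
after_change = reserved_nodes_still_busy(
    job_start, timedelta(hours=6), job_nodes, resv_start, resv_nodes)
```

With these example values the first call finds no conflict, while the second reports node2 as still allocated when the reservation starts.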

What we would like is something similar to the "REPLACE" flag, but one that does not count jobs running inside the reservation when deciding whether an allocated node should be replaced. I.e., if the running job is using the reservation, that is fine: do not change the nodes. Replace them only if the job is not using the reservation.
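
The requested rule can be sketched as a small Python predicate. This is only an illustration of the behaviour we are asking for, not an existing Slurm flag or API; the `RunningJob` type, the reservation name "maint", and the node names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningJob:
    job_id: int
    nodes: frozenset
    reservation: Optional[str]  # reservation the job runs in, or None


def nodes_to_replace(resv_name, resv_nodes, running_jobs):
    """Reserved nodes that are allocated to jobs NOT running in this reservation."""
    blocked = set()
    for job in running_jobs:
        if job.reservation == resv_name:
            # The job is using the reservation itself: leave its nodes alone.
            continue
        blocked |= job.nodes & resv_nodes
    return blocked


jobs = [
    RunningJob(1, frozenset({"n1"}), "maint"),      # runs inside the reservation
    RunningJob(2, frozenset({"n2", "n4"}), None),   # outside job on a reserved node
]
to_replace = nodes_to_replace("maint", {"n1", "n2", "n3"}, jobs)
```

In this example only n2 would be swapped out: n1 is allocated to a job that uses the reservation, n3 is idle, and n4 is not part of the reservation.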

Is there already a way to get such behaviour? If not, do you think that it could be added? 

Thank you,
Sergi.
Comment 2 Nate Rini 2018-12-07 17:00:02 MST
(In reply to Sergi More from comment #0)
> Is there already a way to get such behaviour? If not, do you think that it
> could be added? 

We are reviewing your request.

--Nate
Comment 3 Nate Rini 2018-12-11 10:56:22 MST
> If, after a reservation is created but before it starts, you manually increase the time limit of a job running on a node included in the reservation, Slurm does not take into account that the change can also affect reserved nodes.

I can confirm a reservation does not take into account an admin changing the time limit of a running job.

>Is there already a way to get such behaviour? 

A workaround may be to drain the affected nodes in the same administrative step, which would move the reservation to other nodes when Flags=REPLACE_DOWN is set on it.
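
As a rough sketch of that workaround, the Python snippet below builds the two `scontrol update` commands an admin might issue together: one to raise the job's time limit and one to drain the affected nodes. The job ID, node name, time limit, and reason text are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def workaround_commands(job_id, new_time_limit, nodes, reason):
    """Build (but do not run) the scontrol commands for one administrative step."""
    extend = ["scontrol", "update", f"JobId={job_id}", f"TimeLimit={new_time_limit}"]
    drain = [
        "scontrol", "update",
        f"NodeName={','.join(nodes)}",
        "State=DRAIN",
        f"Reason={reason}",
    ]
    return [extend, drain]


# Hypothetical values for illustration.
cmds = workaround_commands(
    1234, "12:00:00", ["node2"], "job 1234 extended past reservation start")

for cmd in cmds:
    # shlex.join quotes arguments containing spaces so the line is shell-safe.
    print(shlex.join(cmd))
```

Whether the reservation actually moves off the drained nodes depends on it having been created with the REPLACE_DOWN flag and on idle nodes being available to take their place.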

> If not, do you think that it could be added?

I'm going to set this as an enhancement request to follow our normal process.

--Nate
Comment 4 Sergi More 2018-12-12 02:40:15 MST
Ok, thank you Nate. 

Sergi.