Ticket 2589 - preemption: possible to allow a reservation to preempt jobs?
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 15.08.8
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Tim Wickberg
Reported: 2016-03-28 02:42 MDT by Doug Jacobsen
Modified: 2016-04-07 04:38 MDT
Site: NERSC


Description Doug Jacobsen 2016-03-28 02:42:38 MDT
Hello,

In the past we've let jobs opt in to being preempted by a reservation (e.g., maintenance or system-exclusive time).  Is there any way to do this in Slurm?

We are planning to use QOS-based preemption for some things, but would also like reservations to be able to preempt jobs within that preemptable QOS.  The goal is to further increase our ability to backfill heading into system-exclusive testing or maintenance periods.

Thanks,
Doug
Comment 1 Tim Wickberg 2016-03-28 05:06:36 MDT
When you say "preempted" here, I presume you mean canceled at that time? Suspend + resume or gang scheduling obviously wouldn't work if you're bringing the machine down.

Could the jobs simply be submitted with a modest --time-min set? sched/backfill would then try to give them as much time as is available, up to the requested --time limit, while starting them as soon as possible, which seems like what you're trying to accomplish, at least to a limited extent.

You may want to combine that (possibly through a job_submit plugin) with a dedicated lower-priority partition to limit churn.
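
As a rough illustration of the --time-min suggestion above, here is a minimal Python sketch. It is a simplified model, not Slurm's actual sched/backfill logic: granted_minutes() assumes the only constraint is the window before the reservation starts, and sbatch_args() just assembles a command line using the real sbatch options --time, --time-min and --partition (times given in minutes). The function names and the partition name "preempt" are hypothetical.

```python
from typing import List, Optional


def granted_minutes(window: int, time_limit: int, time_min: int) -> Optional[int]:
    """Simplified model (an assumption, not Slurm source): the time limit a
    job might receive if it started now, with `window` minutes remaining
    before a reservation begins. Returns None if the job cannot fit."""
    if time_min > time_limit:
        raise ValueError("time_min must not exceed time_limit")
    if window >= time_limit:
        return time_limit   # the full request fits before the reservation
    if window >= time_min:
        return window       # limit lowered to fit, never below time_min
    return None             # not even time_min fits; job waits


def sbatch_args(script: str, time_limit: int, time_min: int,
                partition: Optional[str] = None) -> List[str]:
    """Build an sbatch command line; times are plain minutes."""
    args = ["sbatch", f"--time={time_limit}", f"--time-min={time_min}"]
    if partition:
        # Hypothetical dedicated lower-priority partition
        args.append(f"--partition={partition}")
    args.append(script)
    return args


if __name__ == "__main__":
    # A 4-hour job that accepts as little as 30 minutes, with 90 minutes
    # left before the maintenance reservation.
    print(granted_minutes(window=90, time_limit=240, time_min=30))
    print(" ".join(sbatch_args("job.sh", 240, 30, partition="preempt")))
```

In this model a job asking for 240 minutes with --time-min=30 can still start when only 90 minutes remain before the reservation, which is the extra backfill opportunity described above.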
Comment 2 Doug Jacobsen 2016-04-07 04:38:25 MDT
I think this is a great solution -- I'll work with our user services folks to get users to do this!

Thanks,
Doug