Ticket 2589

Summary: preemption: possible to allow a reservation to preempt jobs?
Product: Slurm
Reporter: Doug Jacobsen <dmjacobsen>
Component: slurmctld
Assignee: Tim Wickberg <tim>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
Priority: ---
Version: 15.08.8
Hardware: Linux
OS: Linux
Site: NERSC

Description Doug Jacobsen 2016-03-28 02:42:38 MDT
Hello,

In the past we've let jobs opt in to being preempted by a reservation (e.g., maintenance or system-exclusive time).  Is there any way to do this in Slurm?

We are planning on using QOS-based preemption for some things, but would also like to allow reservations to preempt jobs within that preemptable QOS.  The goal is to further increase our ability to backfill heading into system-exclusive testing or maintenance periods.

Thanks,
Doug
Comment 1 Tim Wickberg 2016-03-28 05:06:36 MDT
When you say "preempted" here, presumably you mean canceled at that time? Suspend + resume or gang scheduling obviously wouldn't work if you're bringing the machine down.

Could the jobs simply be submitted with a modest --time-min set? sched/backfill would then try to give them as much time as is available while still starting them at the earliest opportunity, which seems to be, to a limited extent, what you're trying to accomplish.
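
For illustration only, a batch script along these lines might look like the sketch below. The job name, node count, and both time values are made-up placeholders, not values from this ticket; only the --time and --time-min options themselves are standard sbatch options. The job asks for up to 12 hours but accepts as little as 1 hour, so the backfill scheduler can shrink its limit to fit the window in front of a reservation.

```shell
#!/bin/bash
#SBATCH --job-name=backfill-filler   # hypothetical job name
#SBATCH --nodes=1                    # placeholder size
#SBATCH --time=12:00:00              # preferred (maximum) wall time
#SBATCH --time-min=01:00:00          # shortest wall time the job will accept

# Placeholder workload: a real job would launch its application here.
# SLURM_JOB_ID is set by Slurm inside a running job; fall back to "unset"
# so the script can also be run by hand outside of Slurm.
msg="job ${SLURM_JOB_ID:-unset} starting"
echo "$msg"
```

Because the granted limit can be anywhere between the two values, a job submitted this way should be able to make useful progress (or checkpoint) within the --time-min window.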

You may want to combine that (possibly through a job_submit plugin) with a dedicated lower-priority partition to limit churn.
Comment 2 Doug Jacobsen 2016-04-07 04:38:25 MDT
I think this is a great solution -- I'll work with our user services folks to get users to do this!

Thanks,
Doug