Ticket 9980 - Cannot submit jobs to reservation if there is maint reservation
Summary: Cannot submit jobs to reservation if there is maint reservation
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: reservations
Version: 20.02.5
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Dominik Bartkiewicz
Reported: 2020-10-13 05:47 MDT by CSC sysadmins
Modified: 2020-10-22 09:47 MDT
Site: CSC - IT Center for Science
Version Fixed: 20.02.6
Description CSC sysadmins 2020-10-13 05:47:17 MDT
Hi,

I was updating our compute node image and had scheduled a short maintenance window for node reboots. However, users cannot submit jobs to the normal reservation, and the error message is quite hard to understand (at least from a user's point of view). I think the job should be queued and wait if it cannot finish before the maintenance reservation starts.


# scontrol create reservation=test starttime=now duration=14-00:00 users=tervotom nodes=c1170
# scontrol create reservation=test_maint StartTime=2020-10-13T15:00:00 duration=1:00:00 users=tervotom nodes=c1170 flags=maint


$ sbatch -A project_2001659 --reservation=test --nodes=1 -p medium -t 1:00:00 gpcnet_opmi_load.sh 
sbatch: error: Batch job submission failed: Requested node configuration is not available
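
To make the expected behaviour concrete, here is a minimal sketch of the decision I would expect at submit time. It is only an illustration of the request, not Slurm's actual scheduling logic: the function name `expected_decision` and the submission times are hypothetical, while the maintenance window and the one-hour time limit are taken from the commands above.

```python
from datetime import datetime, timedelta

# Maintenance window from the test_maint reservation above:
# StartTime=2020-10-13T15:00:00, duration=1:00:00
MAINT_START = datetime(2020, 10, 13, 15, 0, 0)
MAINT_END = MAINT_START + timedelta(hours=1)

# Time limit from the sbatch call above (-t 1:00:00)
TIME_LIMIT = timedelta(hours=1)


def expected_decision(now, time_limit, maint_start, maint_end):
    """Hypothetical helper: describe what the reporter expects to happen.

    Returns ("run", start_time) if the job can finish before the
    maintenance window begins (or the window is already over), and
    ("pend", earliest_start) if it would overlap the window. In neither
    case is the submission rejected.
    """
    if now >= maint_end or now + time_limit <= maint_start:
        return ("run", now)
    # The job would overlap the maintenance window, so it should wait
    # in the queue until the window has ended.
    return ("pend", maint_end)


# Assumed submission times, for illustration only.
early = datetime(2020, 10, 13, 13, 0, 0)   # fits before maintenance
late = datetime(2020, 10, 13, 14, 47, 0)   # would run into maintenance

print(expected_decision(early, TIME_LIMIT, MAINT_START, MAINT_END))
print(expected_decision(late, TIME_LIMIT, MAINT_START, MAINT_END))
```

In the second case the job would stay pending and become eligible once the maintenance reservation ends, instead of failing with "Requested node configuration is not available".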

BR,
Tommi
Comment 5 Dominik Bartkiewicz 2020-10-22 09:47:28 MDT
Hi

We've fixed this in commit https://github.com/SchedMD/slurm/commit/67116e73, which will be in 20.02.6.

I'm closing this as resolved/fixed. Let us know if you have any more issues.

Dominik