Ticket 10331

Summary: About notation of reason for pending job
Product: Slurm Reporter: issp2020support
Component: Scheduling Assignee: Marcin Stolarek <cinek>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 2 - High Impact    
Priority: --- CC: cinek
Version: 20.02.3   
Hardware: Linux   
OS: Linux   
Site: U of Tokyo Slinky Site: ---
Alineos Sites: --- Atos/Eviden Sites: ---
Confidential Site: --- Coreweave sites: ---
Cray Sites: --- DS9 clusters: ---
Google sites: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- NoveTech Sites: ---
Nvidia HWinf-CS Sites: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Tzag Elita Sites: ---
Linux Distro: --- Machine Name:
CLE Version: Version Fixed:
Target Release: --- DevPrio: ---
Emory-Cloud Sites: ---

Description issp2020support 2020-12-02 00:24:48 MST
Some pending jobs show "Reserved for maintenance" even though no reservation is in the system.

70919     F1cpu     HPhi  k014819 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
75614     F1cpu XC6_L2_D  k003900 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
75618     F1cpu XC6_L4_D  k003900 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
75677     F1cpu XC4_L10_  k003900 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
75679     F1cpu GP_XC4_L  k003900 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
Comment 1 Marcin Stolarek 2020-12-02 02:17:57 MST
Did you have any maintenance reservation in the recent past?

I think you're hitting an issue that was improved by commit 3d6902ebe9d, which was already released in Slurm 20.02.6. Upgrading should be enough to avoid it in the future.

As a workaround, you can run `scontrol release JOBID` on each of those jobs. After that, they will be displayed with the reason "(None)", which will be updated in the next scheduler cycle.
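
If many jobs are affected, the job IDs can be collected from the `squeue` output instead of being typed by hand. The following is a minimal sketch, not an official tool: it assumes the default `squeue` column layout shown in the description (job ID first, state `PD`, reason in parentheses at the end of the line), and the names `stale_job_ids` and `release_commands` are hypothetical helpers. It only builds the `scontrol release` command lines and does not execute them.

```python
import re

# Sample lines copied from the squeue output in the description.
SQUEUE_OUTPUT = """\
70919     F1cpu     HPhi  k014819 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
75614     F1cpu XC6_L2_D  k003900 PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
"""

# Reason text reported for the affected jobs in this ticket.
STALE_REASON = "Reserved for maintenance"

# Assumption: job ID is the first column, the state column is "PD",
# and the reason is the parenthesised text at the end of the line.
PENDING_LINE = re.compile(r"^\s*(\S+)\s.*\sPD\s.*\((.*)\)\s*$")


def stale_job_ids(squeue_text):
    """Return IDs of pending jobs whose reason mentions the stale text."""
    ids = []
    for line in squeue_text.splitlines():
        match = PENDING_LINE.match(line)
        if match and STALE_REASON in match.group(2):
            ids.append(match.group(1))
    return ids


def release_commands(job_ids):
    """Build one `scontrol release JOBID` argument list per job."""
    return [["scontrol", "release", job_id] for job_id in job_ids]


if __name__ == "__main__":
    # Print the commands for review; run them manually once checked.
    for cmd in release_commands(stale_job_ids(SQUEUE_OUTPUT)):
        print(" ".join(cmd))
```

Printing the commands rather than running them leaves room to review the list before releasing anything.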

Let me know if that worked. Once you confirm, I'll close the case as a duplicate of Bug 9720.

cheers,
Marcin
Comment 4 Marcin Stolarek 2020-12-03 04:10:51 MST
Did the workaround from comment 1 work for you?

cheers,
Marcin
Comment 5 issp2020support 2020-12-03 04:20:44 MST
Yes. It was fixed by this workaround.
Thanks.
Comment 6 Marcin Stolarek 2020-12-03 04:24:58 MST
Thanks for the confirmation. To avoid the issue in the future, please upgrade to Slurm 20.02.6.
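
A quick way to check whether an installation is older than 20.02.6 is to compare version numbers as integer tuples. This is only an illustrative sketch: `needs_upgrade` is a hypothetical helper, and it assumes the version string contains three dot-separated numbers, optionally preceded by other text such as a `slurm ` prefix.

```python
import re

# Release named in this ticket as containing the fix.
FIXED_IN = "20.02.6"


def parse_version(text):
    """Extract the first X.Y.Z triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(installed, fixed_in=FIXED_IN):
    """Return True if the installed version predates the fixed release."""
    return parse_version(installed) < parse_version(fixed_in)


if __name__ == "__main__":
    # 20.02.3 is the version reported in this ticket.
    print(needs_upgrade("20.02.3"))
```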

I'm closing the bug as a duplicate now.

cheers,
Marcin

*** This ticket has been marked as a duplicate of ticket 9720 ***