Ticket 7136

Summary: Jobs are held with reason JobHeldAdmin instead of JobHeldUser
Product: Slurm Reporter: Renate Dohmen <dohmen>
Component: slurmctld Assignee: Dominik Bartkiewicz <bart>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: bart
Version: 18.08.6   
Hardware: Linux   
OS: Linux   
Site: Max Planck Computing and Data Facility Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 19.05.1 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: slurmctld log file of the described scenario
jobinfo file

Description Renate Dohmen 2019-05-29 04:32:19 MDT
Created attachment 10415 [details]
slurmctld log file of the described scenario

When a user submits jobs with the hold option under a QOS that has a MaxJobsPU limit, the jobs are held with reason JobHeldAdmin instead of JobHeldUser.

This happens in the following situation:

1) The user submits more jobs in hold state than the QOS MaxJobsPU limit.
2) The user releases the held jobs; some of them start to run and the rest stay pending with reason QOSMaxJobsPerUserLimit.
3) At the same moment, the user submits new held jobs. These new jobs are held with reason JobHeldAdmin.
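
The three steps above could be scripted roughly as follows. This is only a sketch under assumptions: the QOS name passed in, the "sleep 300" payload, and the helper names are hypothetical and not taken from the attached log. It relies only on the standard sbatch (--hold, --qos, --wrap, --parsable), scontrol release, and squeue (-h, -j, -o "%r") options.

```python
import subprocess


def sbatch_hold_cmd(qos, wrap="sleep 300"):
    """Build an sbatch command that submits one job in user-hold state."""
    return ["sbatch", "--parsable", "--hold", f"--qos={qos}", f"--wrap={wrap}"]


def release_cmd(job_ids):
    """Build an scontrol command that releases a comma-separated job list."""
    return ["scontrol", "release", ",".join(job_ids)]


def reason_cmd(job_id):
    """Build an squeue command that prints only the pending reason of one job."""
    return ["squeue", "-h", "-j", job_id, "-o", "%r"]


def run_command(cmd):
    """Run a command and return its stripped stdout (raises on failure)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def reproduce(qos, limit, runner=run_command):
    """Walk through steps 1-3 and return (job_id, reason) of the late job.

    qos    -- hypothetical QOS name that carries a MaxJobsPU limit
    limit  -- the value of that MaxJobsPU limit
    runner -- callable that executes a command list and returns its output
    """
    # Step 1: submit more held jobs than the per-user limit allows to run.
    # With --parsable, sbatch prints "jobid" or "jobid;cluster".
    first = [runner(sbatch_hold_cmd(qos)).split(";")[0] for _ in range(limit + 2)]
    # Step 2: release all of them; some run, the rest wait on the QOS limit.
    runner(release_cmd(first))
    # Step 3: immediately submit one more held job and read back its reason.
    late = runner(sbatch_hold_cmd(qos)).split(";")[0]
    return late, runner(reason_cmd(late))


# Example call on a test cluster (QOS name and limit are placeholders):
#   job_id, reason = reproduce("limited", 2)
#   On an affected version the reason would be JobHeldAdmin, not JobHeldUser.
```

The runner argument is there so the command sequence can be checked without a cluster; whether the race is hit on a given run may depend on timing in slurmctld.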

In the attachments we provide the slurmctld log for this scenario and the jobinfo for one of the jobs (109015) that was held with reason JobHeldAdmin.
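
For checking which hold reason a job carries, a small helper like the one below may be useful. It is a sketch that assumes the usual "Reason=<value>" field printed by scontrol show job; the sample line in the comment is illustrative and is not copied from the attached jobinfo file.

```python
import re


def held_reason(scontrol_output):
    """Return the value of the Reason= field, or None if it is absent."""
    match = re.search(r"\bReason=(\S+)", scontrol_output)
    return match.group(1) if match else None


def is_admin_hold(scontrol_output):
    """True when the job is held with reason JobHeldAdmin."""
    return held_reason(scontrol_output) == "JobHeldAdmin"


# Illustrative input, in the layout scontrol show job normally uses:
#   JobState=PENDING Reason=JobHeldAdmin Dependency=(null)
```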
Comment 1 Renate Dohmen 2019-05-29 04:33:05 MDT
Created attachment 10416 [details]
jobinfo file
Comment 2 Dominik Bartkiewicz 2019-05-29 05:25:38 MDT
Hi

Thanks for reporting this. I can reproduce it.
I will let you know when the fix is in the repo.

Dominik
Comment 5 Dominik Bartkiewicz 2019-06-05 04:22:26 MDT
Hi

This commit fixes the issue; it will be in 19.05.1 and above.
https://github.com/SchedMD/slurm/commit/fe8226e72
I'll go ahead and close this. Thank you.

Dominik