Hi Support,

We are using Slurm 17.02.5. I understand it is out of support, and we are planning to do the update. I just have a general question to understand what the pending reason code 'QOSGrpJobsLimit' actually means. While there were still plenty of resources available on the cluster, I noticed a few hundred jobs pending with reason 'QOSGrpJobsLimit'. What does Slurm check before putting a job in pending status with reason code 'QOSGrpJobsLimit'?

Thanks,
Hui
In addition:

[gadmin@hkgslaqsdev110 17:13]$ squeue | grep PD | grep QOSGrpJobsLimit
  52420639 emergency  probejob     root PD 0:00 1 (QOSGrpJobsLimit)
  52420642     mosek  probejob     root PD 0:00 1 (QOSGrpJobsLimit)
  52420644    medium  probejob     root PD 0:00 1 (QOSGrpJobsLimit)
  52407304    medium  lynx_agg c_guilba PD 0:00 1 (QOSGrpJobsLimit)

Now most jobs have turned to R status. I see a few probejobs by root for different partitions pending with QOSGrpJobsLimit. Is there anything special about those root probejobs? Do I need to do anything to clear the status?
Hui,

Thanks for reaching out to us. I would be happy to clarify this for you. Can you send me the output of this command:

> sacctmgr list qos format=name,GrpJobs

Thanks,
- Jeff
Hi Jeff,

Here is the command output:

[root@hkgslaqsdev110 10:35]$ sacctmgr list qos format=name,GrpJobs
      Name GrpJobs
---------- -------
    normal    1000
   longjob    1000
weekendjob     500
    lowjob     500
pretestjob     500
   hugejob     500
  localjob     500
    gpujob     200
      team   10000

Thanks,
Hui
Hui,

Thanks for providing that information. If I understand correctly, you want to know why jobs are pending with the reason QOSGrpJobsLimit.

Every user is associated with a QOS, and, as your output shows, each QOS has a maximum running-jobs limit (GrpJobs). Once that limit is reached, new jobs under that QOS will be pending until a running job finishes. From the sacctmgr man page:

> NOTE: The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is
> being considered for being allocated resources. If starting a job would
> cause any of its group limit to be exceeded, that job will not be considered
> for scheduling even if that job might preempt other jobs which would release
> sufficient group resources for the pending job to be initiated.

You can increase the GrpJobs value for a QOS with this command:

> sacctmgr modify qos where name=<name> set GrpJobs=<#>

Does that answer your question?

- Jeff
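For what it's worth, the admission check described above can be sketched as a few lines of Python. This is only a simplified model for illustration, not Slurm's actual implementation; the QOS names and limits are taken from the sacctmgr output earlier in this ticket.

```python
# Simplified model of the QOS GrpJobs admission check.
# NOT Slurm source code -- just an illustration of the counting logic.

# Per-QOS GrpJobs limits, as reported by:
#   sacctmgr list qos format=name,GrpJobs
GRP_JOBS_LIMIT = {"normal": 1000, "gpujob": 200, "team": 10000}

def pending_reason(qos, running_per_qos):
    """Return None if a new job under `qos` could start,
    otherwise the pending reason string."""
    limit = GRP_JOBS_LIMIT.get(qos)
    if limit is not None and running_per_qos.get(qos, 0) >= limit:
        return "QOSGrpJobsLimit"
    return None

# gpujob has GrpJobs=200: with 200 already running, a new job pends.
print(pending_reason("gpujob", {"gpujob": 200}))  # QOSGrpJobsLimit
print(pending_reason("gpujob", {"gpujob": 199}))  # None
```

The key point the model captures: the check counts running jobs per QOS, regardless of how many nodes or CPUs are still free, which is why you can see QOSGrpJobsLimit while the cluster still has idle resources.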
Hi Jeff,

Understood. One last question on this topic: say a user submits 200 jobs in a batch to a partition where the normal QOS is set with GrpJobs=1000, and there are already 900 jobs running under that QOS. In this case, will 100 of the 200 jobs be scheduled first, or will all 200 jobs be put in pending status?

Thanks,
Hui
(In reply to hui.qiu from comment #5)
> Last question regarding this topic: e.g., a user sends 200 jobs
> in a batch to a partition where normal QOS is set with 1000 GrpJobs. There
> are already 900 jobs running in the partition. In this case, will 100 jobs
> out of 200 be scheduled first or will the entire 200 jobs be put in pending
> status?

That's a great question. From my own testing:

$ sacctmgr list qos format=name,GrpJobs
      Name GrpJobs
---------- -------
    normal
      gold       5

$ sbatch --array=0-9 -q gold --wrap="sleep 60"
Submitted batch job 296

$ squeue
     JOBID PARTITION  NAME  USER ST  TIME NODES NODELIST(REASON)
 296_[5-9]     debug  wrap  jeff PD  0:00     1 (QOSGrpJobsLimit)
     296_0     debug  wrap  jeff  R  0:03     1 linux1
     296_1     debug  wrap  jeff  R  0:03     1 linux1
     296_2     debug  wrap  jeff  R  0:03     1 linux1
     296_3     debug  wrap  jeff  R  0:03     1 linux1
     296_4     debug  wrap  jeff  R  0:03     1 linux2

So, in your scenario, 100 of those jobs would run and 100 would be put in a pending state initially.

- Jeff
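The counting behavior in the array test above can be sketched the same way (again a simplified model under the assumption that jobs start one at a time until GrpJobs is hit, not Slurm code):

```python
def schedule_batch(n_new_jobs, running, grp_jobs):
    """Model how many of `n_new_jobs` start immediately when `running`
    jobs are already active under a QOS with limit `grp_jobs`.
    Returns (started, pending). Simplified illustration only."""
    started = min(n_new_jobs, max(grp_jobs - running, 0))
    return started, n_new_jobs - started

# The test above: GrpJobs=5, array of 10 jobs, none running yet.
print(schedule_batch(10, 0, 5))        # (5, 5)

# The scenario from the question: GrpJobs=1000, 900 running, 200 new jobs.
print(schedule_batch(200, 900, 1000))  # (100, 100)
```

As running jobs finish, headroom under GrpJobs opens up and the pending jobs become eligible again, which matches the array entries moving from PD to R over time.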
Hui,

I'm going to go ahead and close out this ticket now, but feel free to reopen it if you have further questions.

Thanks,
- Jeff