We're getting complaints from our user groups that the queue waiting time for short jobs is far too long, which makes it difficult to run test and debug jobs on a very busy cluster. Backfilling is enabled, but it is of limited effectiveness because our cluster is constantly oversubscribed by a factor of 5-10. PriorityWeightAge in slurm.conf favors older jobs, but it does so irrespective of whether such jobs are short or long, so that's not a solution.

Based upon our past experience with the MAUI scheduler, the Expansion Factor (XFACTOR) gives us a great tool for prioritizing short jobs:

XFACTOR = 1 + <EFFQUEUETIME> / <WALLCLOCKLIMIT>

XFACTOR documentation for MAUI is at
http://docs.adaptivecomputing.com/maui/5.1.2priorityfactors.php
and for MOAB at
http://www.adaptivecomputing.com/blog-hpc/using-moab-job-priorities-exploring-priority-sub-components/

It seems to me that a future PriorityWeightXFACTOR flag in slurm.conf would be very similar to the existing PriorityWeightAge flag, needing only the division by WALLCLOCKLIMIT and a cap PriorityMaxXFACTOR (similar to MAUI's XFACTORCAP).

I notice the recommendations in bug 5194 and the feature reminder in bug 5202, but no solution seems to be coming any time soon.

Question to Slurm developers: Would you kindly consider the functionality described here for inclusion in Slurm 19.05? IMHO, many Slurm sites might find this very useful.

Thanks,
Ole
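To illustrate the formula above, here is a minimal Python sketch of how such a factor might behave. This is not Slurm or MAUI code: the function name xfactor and its parameters are hypothetical, and the optional cap only mimics the proposed PriorityMaxXFACTOR (MAUI's XFACTORCAP).

```python
def xfactor(queue_time_s, wallclock_limit_s, cap=None):
    """Expansion factor: 1 + queue time / wallclock limit (both in seconds).

    Hypothetical helper for illustration only; 'cap' stands in for the
    proposed PriorityMaxXFACTOR and is an assumption, not an existing option.
    """
    if wallclock_limit_s <= 0:
        raise ValueError("wallclock limit must be positive")
    value = 1.0 + queue_time_s / wallclock_limit_s
    if cap is not None:
        value = min(value, cap)
    return value


# Two jobs that have both waited one hour in the queue:
short_job = xfactor(3600, 30 * 60)       # 30-minute time limit
long_job = xfactor(3600, 24 * 3600)      # 24-hour time limit
print(short_job, long_job)
```

With equal waiting time, the 30-minute job gets a factor of 3.0 while the 24-hour job stays close to 1, which is the behaviour that makes the factor attractive for test and debug jobs.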
Created attachment 8131: slurm.conf file
Hi Ole. Please, let's centralize the discussion in bug 5202. I'm gonna mark this as a duplicate of that one.

*** This ticket has been marked as a duplicate of ticket 5202 ***