Created attachment 8154 [details]
Our slurm.conf

Hello,

I am very keen to revisit this issue if possible, please. I have raised the issue before (see https://bugs.schedmd.com/show_bug.cgi?id=5194), however at the time I didn't have the time to give this much attention.

As I noted in #5194 we're migrating from TORQUE/MOAB. Using that software we use the XFACTOR to ensure that short jobs don't get starved out by longer jobs. There is no equivalent to XFACTOR in SLURM and so we need to achieve the above by another means. Also (above and beyond this) we need to be able to run a diverse workload efficiently.

In #5194 you made a number of points -- the most important being...

"Continuing with the advice for the priority/multifactor plugin, we generally recommend ordering each of the PriorityWeight<something> factors from most to least important, then setting them each an order of magnitude apart. This should help some more jobs get scheduled. The weight values should be high enough to get a good set of significant digits since all the factors are floating point numbers from 0.0 to 1.0. Starting around 1000 or so for those factors you want to make predominant, as stated in the web documentation.

Without any specific site requirements, perhaps what makes more sense is to set the highest weight to the QOS factor and the next one to the FairShare factor. We also usually recommend to set the PriorityFlags=FAIR_TREE."

....Is this good general advice for general/diverse workloads?

On the SLURM community forum a user gave the following piece of advice..

"PriorityFavorSmall=NO
PriorityFlags=DEPTH_OBLIVIOUS,SMALL_RELATIVE_TO_TIME

PriorityFavorSmall and SMALL_RELATIVE_TO_TIME are used by us to favour both short and large jobs. So if two jobs are equal in size, the shorter of the two is favoured. Also if two jobs are equal in time, the larger is favoured. We use this as a way to get short jobs in and out of the queues quickly as well as help large jobs (typically MPI) have priority over small serial jobs."

..... What are your comments re this advice, please? It seems to make a degree of sense.

I'm keen to explore the management of jobs in the cluster, especially with respect to the treatment of small jobs (as I note above), please. I suspect a reasonable starting point is to attach my current slurm.conf so that you can make suggested amendments, please.

Best regards,
David
Hi David. Since other sites are also requesting this, let's centralize the tracking in bug 5202, where you're already CC'd. Thanks.
Tagging as duplicate. *** This ticket has been marked as a duplicate of ticket 5202 ***
Hello,

I agree that this streamlining is sensible, however are you happy to address my immediate concerns re my questions set out in bug 5964, please? The XFACTOR would be a great addition in the future, however in the meantime it would be good to have some general advice to work with, please -- re my questions raised in bug 5964. Is that OK?

Best regards,
David

________________________________
From: bugs@schedmd.com <bugs@schedmd.com>
Sent: 05 November 2018 12:27
To: Baker D.J.
Subject: [Bug 5964] Advice on the management of short jobs in SLURM

Alejandro Sanchez changed bug 5964

What        Removed      Added
Resolution  ---          DUPLICATE
Status      UNCONFIRMED  RESOLVED

Comment # 3 on bug 5964 from Alejandro Sanchez

Tagging as duplicate.

*** This bug has been marked as a duplicate of bug 5202 ***
________________________________
You are receiving this mail because:
* You reported the bug.
(In reply to David Baker from comment #4)
> Hello,
>
>
> I agree that this streamlining is sensible, however are you happy to address
> my immediate concerns re my questions set out in bug 5964, please? The
> XFACTOR would be a great addition in the future, however in the meantime it
> would be good to have some general advice to work with, please -- re my
> questions raised in bug 5964. Is that OK?

Sure, no problem.

(In reply to David Baker from comment #0)
> Created attachment 8154 [details]
> Our slurm.conf
>
> Hello,
>
> I am very keen to revisit this issue if possible, please. I have raised the
> issue before (see https://bugs.schedmd.com/show_bug.cgi?id=5194), however at
> the time I didn't have the time to give this much attention.
>
> As I noted in #5194 we're migrating from TORQUE/MOAB. Using that software we
> use the XFACTOR to ensure that short jobs don't get starved out by longer
> jobs. There is no equivalent to XFACTOR in SLURM and so we need to achieve
> the above by another means. Also (above and beyond this) we need to be able
> to efficiently run a diverse workload well.
>
> In #5194 you made a number of points -- the most important being...
>
> "Continuing with the advice for the priority/multifactor plugin, we
> generally recommend ordering each of the PriorityWeight<something> factors
> from most to least important, then setting them each an order of magnitude
> apart. This should help some more jobs get scheduled. The weight values
> should be high enough to get a good set of significant digits since all the
> factors are floating point numbers from 0.0 to 1.0. Starting around 1000 or
> so for those factors you want to make predominant, as stated in the web
> documentation.
>
> Without any specific site requirements, perhaps what makes more sense is to
> set the highest weight to the QOS factor and the next one to the FairShare
> factor. We also usually recommend to set the PriorityFlags=FAIR_TREE."
>
> ....Is this good general advice for general/diverse workloads?

As a starting point without any specific site requirements, yes, it is good advice for general/diverse workloads.

> On the SLURM community forum a user gave the following piece of advice..
>
> "PriorityFavorSmall=NO
> PriorityFlags=DEPTH_OBLIVIOUS,SMALL_RELATIVE_TO_TIME
>
> PriorityFavorSmall and SMALL_RELATIVE_TO_TIME are used by us to favour both
> short and large jobs. So if two jobs are equal in size, the shorter of the
> two is favoured. Also if two jobs are equal in time, the larger is
> favoured. We use this as a way to get short jobs in and out of the queues
> quickly as well as help large jobs (typically MPI) have priority over small
> serial jobs."
>
> ..... What are your comments re this advice, please? It seems to make a
> degree of sense.

Looking at the code here:

https://github.com/SchedMD/slurm/blob/slurm-18-08-3-1/src/plugins/priority/multifactor/priority_multifactor.c#L2060

If you only had PriorityWeightJobSize, then the higher the number of requested CPUs, the higher the JobSizeFactor.

If you also have SMALL_RELATIVE_TO_TIME, then of two jobs with the same number of requested CPUs, the one with the shorter TimeLimit will have the higher JobSizeFactor, since the factor is divided by the TimeLimit.

If you also have PriorityFavorSmall=Yes, then the previously calculated factor is reversed:

https://github.com/SchedMD/slurm/blob/slurm-18-08-3-1/src/plugins/priority/multifactor/priority_multifactor.c#L2079

    if (favor_small) {
        job_ptr->prio_factors->priority_js =
            (double) 1.0 - job_ptr->prio_factors->priority_js;
    }

Does that make sense?

> I'm keen to explore the management of jobs in the cluster especially with
> respect to the treatment of small jobs (as I note above), please. I suspect
> a reasonable starting point is to attach my current slurm.conf so that you
> can make suggested amendments, please.
Your current slurm.conf

PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=MONTHLY
PriorityWeightFairshare=100000
PriorityWeightAge=1000
PriorityWeightPartition=10000
PriorityWeightJobSize=1000
#PriorityWeightQOS=2000
PriorityMaxAge=3-0

gives more importance to the FairShare factor than to JobSize, so the JobSize factor's contribution won't be as noticeable as that of FairShare or Partition, for instance. Also, I don't see that you make use of SMALL_RELATIVE_TO_TIME. So if you want to favor jobs with a shorter TimeLimit, I would increase the JobSize weight (PriorityWeightJobSize) and add SMALL_RELATIVE_TO_TIME to PriorityFlags.

Please let me know if you have further questions. Thanks.
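
To make the two points in this comment concrete, here is a minimal Python sketch (not Slurm code). It assumes a simplified model of the JobSize factor that looks only at the requested CPU count, the time limit in minutes, and the total CPU count of the cluster; the real plugin handles more cases than this. The function names, the 1000-CPU cluster, the FairShare difference of 0.01, and the raised weight of 100000 are all hypothetical and chosen only for illustration.

```python
def job_size_factor(cpus, time_limit_min, cluster_cpus,
                    small_relative_to_time=False, favor_small=False):
    """Simplified JobSize factor in [0.0, 1.0], as described in this thread.

    Assumption: only the CPU count matters; the real plugin handles
    more cases than this sketch does.
    """
    size = float(cpus)
    if small_relative_to_time:
        # SMALL_RELATIVE_TO_TIME: divide the job size by its time limit,
        # so a shorter limit yields a larger factor for the same CPU count.
        size /= time_limit_min
    factor = size / cluster_cpus  # normalize to at most 1.0
    if favor_small:
        # PriorityFavorSmall=YES reverses the factor.
        factor = 1.0 - factor
    return factor


def contribution(weight, factor):
    """Priority points one factor adds: its weight times its 0.0-1.0 value."""
    return weight * factor


CLUSTER_CPUS = 1000  # hypothetical cluster size

# Two 40-CPU jobs: one with a 1-hour limit, one with a 10-hour limit.
short_job = job_size_factor(40, 60, CLUSTER_CPUS, small_relative_to_time=True)
long_job = job_size_factor(40, 600, CLUSTER_CPUS, small_relative_to_time=True)

# Priority gap between the two jobs with PriorityWeightJobSize=1000
# (the attached config) and with a hypothetical raised weight of 100000.
gap_current = contribution(1000, short_job) - contribution(1000, long_job)
gap_raised = contribution(100000, short_job) - contribution(100000, long_job)

# For comparison: a hypothetical FairShare difference of 0.01 between two
# users, at PriorityWeightFairshare=100000.
fairshare_gap = contribution(100000, 0.01)

print(f"short-job factor: {short_job:.6f}, long-job factor: {long_job:.6f}")
print(f"JobSize gap at weight 1000:     {gap_current:.2f}")
print(f"JobSize gap at weight 100000:   {gap_raised:.2f}")
print(f"FairShare gap at weight 100000: {fairshare_gap:.2f}")
```

In this model the shorter job always gets the larger factor once SMALL_RELATIVE_TO_TIME is set, but at a weight of 1000 the resulting gap is less than one priority point, so even a small FairShare difference swamps it. Raising the JobSize weight widens the gap proportionally; the right value depends on how the other factors are weighted at your site.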
Hi David. Is there anything else you need from us here? Thank you.
Hello,

Please feel free to close this call. Apologies for not getting back to you earlier.

Best regards,
David

________________________________
From: bugs@schedmd.com <bugs@schedmd.com>
Sent: 19 November 2018 14:40
To: Baker D.J.
Subject: [Bug 5964] Advice on the management of short jobs in SLURM

Comment # 6 on bug 5964 from Alejandro Sanchez

Hi David. Is there anything else you need from here? thank you.