We have the following situation: we have various node types (high-clock, high-core, high-mem and GPU nodes) in quite different quantities (mostly high-core, only a few high-mem and GPU nodes). We created 4 partitions for those node types, with the default being the one with the most nodes (high-core).

We now want to create 3 QOSes that specify the maximum walltime:
- short: 8 hours
- medium: 2 days
- long: 8 days

At the same time we want to allow short jobs to potentially consume 100%, jobs in the medium QOS to consume 50%, and jobs in the long QOS to consume 20% of the resources of each partition.

I checked the documentation and also tried to google this, but as far as I can tell GrpTRES et al. can only be defined in absolute numbers, not in percentages. Because the partitions have unequal numbers of resources, we cannot get by with only 3 QOSes (short, medium and long); we would need to create 3 x 4 (number of partitions) QOSes to cover our use case.

So my question is: is there any way to specify the resource limits as percentage values (50%) instead of absolute numbers (500 cores), and if not, what would be the best workaround/approach for our use case?

Thanks in advance
Uemit
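To make concrete what we are stuck with today: the absolute-valued version of the three QOSes would be created roughly like this (the CPU totals are placeholders for one particular partition; the percentage relationship between them is exactly what we cannot express directly):

```shell
# Sketch with absolute limits -- the numbers stand in for 100%/50%/20%
# of ONE partition's CPUs, and would differ for every other partition:
sacctmgr add qos short  MaxWall=08:00:00   GrpTRES=cpu=1000   # "100%"
sacctmgr add qos medium MaxWall=2-00:00:00 GrpTRES=cpu=500    # "50%"
sacctmgr add qos long   MaxWall=8-00:00:00 GrpTRES=cpu=200    # "20%"
```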
Uemit,

As far as I understand, you're looking for an option to configure global hard limits on allocated resources based on wall time. If this is the case, the most efficient way I see to achieve this would be to configure three partitions - short, medium, long - with different "MaxTime" values, and assign all of your nodes to the short partition (the default), 50% of all nodes to medium, and only 20% to long. Instead of separate partitions for high-clock, high-mem, etc., I'd recommend using the "Feature" configuration option for nodes.

If this is not the case, please elaborate a little on what you are trying to achieve. Do you want a 50% limit to be applied per user or per account instead of having it globally? Did you consider using the priority plugin to give shorter jobs a boost instead of "hard" limits? This approach may be beneficial since it provides overall higher utilization of resources.

cheers, Marcin
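A minimal slurm.conf sketch of this layout (node names, counts, and the exact overlap between partitions are illustrative placeholders):

```
# slurm.conf fragment (illustrative node names/counts)
NodeName=core[001-100] Feature=high-core ...
NodeName=mem[01-04]    Feature=high-mem ...
NodeName=gpu[01-02]    Feature=gpu ...

# All nodes in short, ~50% in medium, ~20% in long
PartitionName=short  Nodes=core[001-100],mem[01-04],gpu[01-02] MaxTime=08:00:00 Default=YES
PartitionName=medium Nodes=core[001-050],mem[01-02],gpu[01]    MaxTime=2-00:00:00
PartitionName=long   Nodes=core[001-020],mem[01]               MaxTime=8-00:00:00
```

Users would then target a node type with a constraint rather than a partition, e.g. `sbatch --constraint=high-mem ...`.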
@Marcin thanks for the reply. I don't think that your suggested approach would work for us. We want to have distinct partitions for the different node types because we want to avoid overflow onto the expensive nodes (the high-mem nodes). The user has to specifically submit to the corresponding partition if he/she wants to target the high-mem nodes, for example.

Our current workaround is to create the MxN QOSes (short, medium and long for each partition) when we create the Slurm cluster, and then have a Lua script that rewrites the user-submitted QOS (short, medium, long) to the actual QOS/partition pair (c_short, g_medium, etc.):

Service Class  Prio  TimeLimit    Resource/User       TotalQoSLimit
m_short        1000               cpu=202,mem=4109G   cpu=404,mem=8218G
g_short        1000               cpu=14,mem=173G     cpu=28,mem=346G
c_short        1000               cpu=966,mem=4132G   cpu=1932,mem=8265G
m_medium        500               cpu=80,mem=1643G    cpu=202,mem=4109G
g_medium        500               cpu=5,mem=69G       cpu=14,mem=173G
c_medium        500               cpu=386,mem=1653G   cpu=966,mem=4132G
m_long          100               cpu=40,mem=821G     cpu=80,mem=1643G
g_long          100               cpu=2,mem=34G       cpu=5,mem=69G
c_long          100               cpu=193,mem=826G    cpu=386,mem=1653G
short             0  08:00:00
medium            0  2-00:00:00
long              0  14-00:00:00

The user only uses the short, medium and long QOS plus the partition (c, m, g), and we rewrite that to the actual QOS. For example, if a user submits a job with the medium QOS to the high-mem partition (m), we rewrite it to m_medium.

This approach works; however, if we could define the resource limits as percentages we could avoid the MxN combinations of QOS/partition. There is another advantage of percentages over absolute values: our Slurm cluster might not be static, and we might dynamically add and remove nodes from it. With absolute values we always have to re-calculate/re-generate the QOSes; if we could specify the resource limits in percentages, we would not need to do that.

I hope this clarifies our use case.
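For completeness, the re-generation we have to do today whenever the node counts change can be sketched like this (the partition totals are illustrative, and the script only prints the sacctmgr calls instead of running them):

```shell
#!/bin/sh
# Sketch: regenerate per-partition QOS CPU limits from the fixed percentages.
percent_for() {
    case "$1" in
        short)  echo 100 ;;
        medium) echo 50 ;;
        long)   echo 20 ;;
    esac
}

# Print the sacctmgr commands for one partition prefix and its CPU total.
gen_qos_cmds() {
    part="$1"
    total_cpu="$2"
    for tier in short medium long; do
        pct=$(percent_for "$tier")
        cpus=$(( total_cpu * pct / 100 ))
        echo "sacctmgr -i modify qos ${part}_${tier} set GrpTRES=cpu=${cpus}"
    done
}

# Example: the c partition with 1932 CPUs (placeholder total)
gen_qos_cmds c 1932
```

In practice the totals would be read from `sinfo` output rather than hard-coded, and the echoed commands would be executed against the live cluster.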
In this case, the option you can try is to configure GrpTRES limits based on "billing", with a command like:

> sacctmgr create qos medium GrpTRES=billing=50 MaxWall=2-0:0:0

Billing is calculated based on the TRESBillingWeights[1] option defined per partition. For instance, setting TRESBillingWeights="CPU=0.5" results in a billing of 50 when 100 CPUs are in use. If you'd like to take other parameters like memory into account, you may find the PriorityFlags=MAX_TRES setting useful. The default behavior calculates billing as the sum of all parameters; with MAX_TRES, billing for each resource is calculated separately and the highest value is treated as the final result.

I believe this is very close to the percentage configuration you've been looking for. Let me know if this works for you.

cheers, Marcin

[1] https://slurm.schedmd.com/slurm.conf.html
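Putting this together, a sketch of the idea could look like the following (node lists, CPU totals and therefore the weights are illustrative; each weight is chosen as 100 / <CPUs in partition>, so a fully consumed partition bills to 100):

```
# slurm.conf fragment (illustrative node lists and weights)
PriorityFlags=MAX_TRES
PartitionName=c Nodes=core[001-100] TRESBillingWeights="CPU=0.0518"  # ~1932 CPUs -> full use bills ~100
PartitionName=m Nodes=mem[01-04]    TRESBillingWeights="CPU=0.2475"  # ~404 CPUs  -> full use bills ~100

# One QOS per tier; the billing limits now read like percentages:
sacctmgr create qos short  MaxWall=8:0:0   GrpTRES=billing=100
sacctmgr create qos medium MaxWall=2-0:0:0 GrpTRES=billing=50
sacctmgr create qos long   MaxWall=8-0:0:0 GrpTRES=billing=20
```

The weights still have to be regenerated when nodes are added or removed, but only one number per partition changes, and the QOS definitions themselves stay fixed.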
Since there were no further questions from you within a week, I'll close this ticket as "info given". Should you need any further information, please do not hesitate to reopen.

cheers, Marcin