Ticket 15438

Summary: Mixed memory nodes
Product: Slurm Reporter: NASA JSC Aerolab <JSC-DL-AEROLAB-ADMIN>
Component: Configuration    Assignee: Ben Glines <ben.glines>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue    
Priority: ---    
Version: 21.08.5   
Hardware: Linux   
OS: Linux   
Site: Johnson Space Center

Description NASA JSC Aerolab 2022-11-16 15:05:27 MST
Hello,
We have nodes in our cluster with two different memory sizes, so we are trying to figure out the best way to configure Slurm to maximize memory utilization.

We are considering either adding the weight flag, so that the nodes with less memory get jobs assigned first, or adding a feature so users can request nodes with a specific memory size.


RealMemory=126641
RealMemory=255672

What would SchedMD recommend?

Thank you.
Patrick
Comment 1 Ben Glines 2022-11-16 16:25:50 MST
Hi Patrick,

There are several options you could try that would work well. It really depends on what is best for your site's needs.

If these nodes are all on the same queue (Slurm partition):
- I would recommend weighted nodes, so that jobs flow to the low-memory nodes first. Your exact case is even documented in the description of node weights: https://slurm.schedmd.com/slurm.conf.html#OPT_Weight. Basically, nodes with more resources should have a higher weight.
- Adding a node feature such as "largemem" is also an excellent idea if you want to give users the ability to explicitly request and require these higher memory nodes for their jobs.
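As a combined sketch, the slurm.conf entries might look like the following (the node name ranges and weight values are illustrative assumptions, not your site's actual config; the RealMemory values are from your description):

```
# Hypothetical node ranges; Weight values are illustrative.
# Lower weight = considered for scheduling first.
NodeName=node[001-016] RealMemory=126641 Weight=10
NodeName=node[017-024] RealMemory=255672 Weight=20 Features=largemem
```

Users could then explicitly request the high-memory nodes with e.g. `sbatch --constraint=largemem job.sh`.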

If these nodes are in different queues (Slurm partitions):
- Specifying partition names is all that is needed, e.g. PartitionName=largemem, PartitionName=smallmem. This does mean, though, that jobs will be targeted to specific queues at submission time.
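For example, assuming hypothetical node ranges, the partition lines in slurm.conf might be:

```
# Hypothetical node ranges for illustration only
PartitionName=smallmem Nodes=node[001-016] Default=YES State=UP
PartitionName=largemem Nodes=node[017-024] State=UP
```

Users would then submit with `sbatch -p largemem ...` when they need the high-memory nodes.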

Some sites also use the job_submit plugin to inspect what a job is requesting and change its partition, add a feature request, etc.
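As a rough sketch of that approach (the 8192 MB/core threshold and the "largemem" partition name are assumptions for illustration; in 21.08 the Lua job descriptor exposes min_mem_per_cpu):

```lua
-- job_submit.lua (sketch): route jobs requesting lots of memory per
-- CPU to a hypothetical "largemem" partition.
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- min_mem_per_cpu is in MB; 8192 is an illustrative threshold
    if job_desc.min_mem_per_cpu ~= nil and
       job_desc.min_mem_per_cpu > 8192 then
        job_desc.partition = "largemem"
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```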

Let me know if you have any questions.
Comment 2 NASA JSC Aerolab 2022-11-16 16:47:56 MST
Thank you for your suggestion.

Our nodes are shared between three partitions. Most of our workload uses the default 4GB/core of memory, but there are a few workflows that require more memory per core.
Will adding weight and a feature (largemem) cause any issue?

This way Slurm will fill up the low-memory nodes first and leave the largemem nodes for workflows with higher memory requirements.

Patrick.
Comment 3 Ben Glines 2022-11-17 09:26:14 MST
(In reply to NASA JSC Aerolab from comment #2)
> Will adding weight and a feature (largemem) cause any issue?

No, you shouldn't run into any problems. I just tested this personally and all you'll need to do is run `scontrol reconfigure` after adding the weights and features.

> This way SLURM will fill up low memory nodes first and leave largemem nodes
> for workflow with more memory requirement.

Yes, this is correct. Although, just to be clear, even jobs without large memory requirements will run on the "largemem" nodes if all of the other low-memory (lower-weight) nodes are filled up.
Comment 4 NASA JSC Aerolab 2022-11-17 09:41:28 MST
Ben,
Thanks for testing. Yes, we are fine with jobs that have low or no memory requirements running on the largemem nodes once the low-memory nodes are filled up.
Comment 5 Ben Glines 2022-11-17 09:48:44 MST
Okay, sounds good! I'll close this bug now, but feel free to reopen if you have questions related to what we discussed.