Ticket 16806

Summary: MaxMemPerCPU at Node Level
Product: Slurm
Reporter: X-ISS VSupport <njit.vsupport>
Component: Scheduling
Assignee: Benjamin Witham <benjamin.witham>
Status: RESOLVED TIMEDOUT
Severity: 4 - Minor Issue
CC: benjamin.witham, gwolosh, kilian, ncarl
Version: 21.08.8
Hardware: Linux
OS: Linux
Site: NJIT

Description X-ISS VSupport 2023-05-23 13:14:14 MDT
Hello,

Is it possible to set MaxMemPerCPU at the node level?

We have a couple of different node types with different memory-per-core values. We understand this can be set globally and at the partition level, but we are looking to reduce the partition count where possible.

Thanks!
Comment 1 Benjamin Witham 2023-05-23 15:37:22 MDT
Hello, 

There is no functionality for setting MaxMemPerCPU on a node-by-node basis; we have found that most users have no need for it. As you said, MaxMemPerCPU can be set per partition and globally. You could define one partition per node type, or set a global MaxMemPerCPU at the upper bound of your nodes' memory-per-core values.
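To make the two options concrete, here is a minimal Python sketch that derives a memory-per-core value for each node type and then builds either per-partition limits or a single global upper bound. The node type names, core counts, and memory sizes are hypothetical placeholders, not values from this site, and the generated lines are only the MaxMemPerCPU part of a partition definition (a real slurm.conf entry would also need Nodes= and other settings).

```python
# Hypothetical node types: name -> (cores per node, usable memory per node in MB).
# These numbers are illustrative assumptions, not taken from the ticket.
NODE_TYPES = {
    "normal": (32, 128_000),
    "largemem": (32, 512_000),
}


def mem_per_core_mb(cores, real_memory_mb):
    """Memory available per core, rounded down to a whole MB."""
    return real_memory_mb // cores


per_type = {
    name: mem_per_core_mb(cores, mem) for name, (cores, mem) in NODE_TYPES.items()
}

# Option 1: one partition per node type, each with its own MaxMemPerCPU.
partition_lines = [
    f"PartitionName={name} MaxMemPerCPU={limit}" for name, limit in per_type.items()
]

# Option 2: a single global MaxMemPerCPU at the upper bound across node types.
global_limit = max(per_type.values())

if __name__ == "__main__":
    print("\n".join(partition_lines))
    print(f"MaxMemPerCPU={global_limit}")
```

With a single global limit, jobs on the smaller-memory node type would be held to the larger nodes' ratio, which is the trade-off against keeping separate partitions.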

What are your use cases for needing a per-node MaxMemPerCPU?
Comment 2 X-ISS VSupport 2023-05-24 09:53:42 MDT
Hello,

Current goals:
- Accounting/charging tracked via CPU hours as service units.
- If users request more memory, Slurm gives them more cores to match.
- Try to keep one purpose for SLURM parameter so easier to understand logic for users (ex: partitions vs constraints for hardware type - normal vs largemem). 
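The second goal relies on how MaxMemPerCPU behaves when a job asks for more memory per CPU than the limit: the job's CPU count per task is increased so that each CPU stays within the limit. The following Python sketch approximates that arithmetic. The function name and the example numbers are hypothetical, and the rounding is a simplification rather than a copy of Slurm's internal logic.

```python
def adjust_for_max_mem_per_cpu(cpus_per_task, mem_per_cpu_mb, max_mem_per_cpu_mb):
    """Approximate how a request is scaled to respect MaxMemPerCPU.

    Returns (cpus_per_task, mem_per_cpu_mb) after adjustment.
    This is an illustrative model, not Slurm's actual implementation.
    """
    if mem_per_cpu_mb <= max_mem_per_cpu_mb:
        # Request already fits under the limit; nothing changes.
        return cpus_per_task, mem_per_cpu_mb

    # Integer ceiling division: how many CPUs are needed per requested CPU.
    ratio = -(-mem_per_cpu_mb // max_mem_per_cpu_mb)

    # Spread the original per-CPU memory across the extra CPUs, rounding up
    # so the total memory is never less than what was requested.
    new_mem_per_cpu = -(-mem_per_cpu_mb // ratio)
    return cpus_per_task * ratio, new_mem_per_cpu


if __name__ == "__main__":
    # One CPU with 10000 MB against an assumed 4000 MB limit.
    print(adjust_for_max_mem_per_cpu(1, 10_000, 4_000))
```

Under this model, a one-core request for 10000 MB against a 4000 MB limit becomes three cores, so a job billed by CPU hours would be charged for three cores rather than one.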

Thanks
Comment 3 Benjamin Witham 2023-05-24 15:04:08 MDT
Hello, 

Could you elaborate on some of your points?

> Accounting/charging tracked via CPU hours as service units.

What are the service units? We have data outputs in AccountingStorageTRES that can be accessed through sacct. The SystemCPU or TotalCPU fields seem close to what you're hoping for.
https://slurm.schedmd.com/sacct.html#OPT_SystemCPU
https://slurm.schedmd.com/sacct.html#OPT_TotalCPU

> Try to keep one purpose for SLURM parameter so easier to understand logic for users (ex:
> partitions vs constraints for hardware type - normal vs largemem).
I'm afraid I don't quite understand what you mean by this. Are you trying to reduce the number of arguments that users need to supply?
Comment 4 Benjamin Witham 2023-06-12 11:05:11 MDT
Hello, 

Is there any further elaboration on your points? I'm afraid I don't quite understand what you mean by some of them.
Comment 5 Benjamin Witham 2023-06-20 15:17:39 MDT
I'm going to close this ticket now. Feel free to reopen it if you have clarifications on your use cases.