16806 – MaxMemPerCPU at Node Level

Status:	RESOLVED TIMEDOUT

Alias:	None

Product:	Slurm
Classification:	Unclassified
Component:	Scheduling
Version:	21.08.8
Hardware:	Linux

Severity:	4 - Minor Issue
Assignee:	Benjamin Witham
Reported:	2023-05-23 13:14 MDT by X-ISS VSupport
Modified:	2023-06-20 15:17 MDT
CC List:	4 users

See Also:
Site:	NJIT

Description X-ISS VSupport 2023-05-23 13:14:14 MDT

Hello,

Is it possible to set MaxMemPerCPU setting at a node level?

We have a couple of different node types with different memory-per-core values. We understand this can be set globally and at the partition level, but we are looking to reduce the partition count where possible.

Thanks!

Comment 1 Benjamin Witham 2023-05-23 15:37:22 MDT

Hello, 

There is no functionality for setting MaxMemPerCPU on a per-node basis; we have found that most users have no need for it. As you have said, MaxMemPerCPU can be set per partition and globally. You could either define a partition for each node type or set a global MaxMemPerCPU at the upper bound of your nodes.
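As a rough illustration of those two options, the sketch below derives a per-partition MaxMemPerCPU for each node type and a single global value at the upper bound. The node type names, core counts, memory sizes, and node name ranges are hypothetical and should be replaced with the site's own values. MaxMemPerCPU is expressed in megabytes.

```python
# Hypothetical node types: name -> (CPUs per node, usable memory in MB).
NODE_TYPES = {
    "normal": (32, 128_000),
    "largemem": (32, 512_000),
}


def mem_per_cpu_mb(cpus, real_memory_mb):
    """Memory available per CPU, rounded down to whole megabytes."""
    return real_memory_mb // cpus


def partition_lines(node_types):
    """Option 1: one partition per node type, each with its own limit.

    The node name range is a placeholder, not a real cluster layout.
    """
    return [
        f"PartitionName={name} Nodes={name}[01-04] "
        f"MaxMemPerCPU={mem_per_cpu_mb(cpus, mem)}"
        for name, (cpus, mem) in node_types.items()
    ]


def global_line(node_types):
    """Option 2: a single global limit at the upper bound of all node types."""
    upper = max(mem_per_cpu_mb(cpus, mem) for cpus, mem in node_types.values())
    return f"MaxMemPerCPU={upper}"


if __name__ == "__main__":
    for line in partition_lines(NODE_TYPES):
        print(line)
    print(global_line(NODE_TYPES))
```

The printed lines are only candidate slurm.conf entries. With the global option, the limit matches the largest-memory node type, so it is looser than necessary on the smaller nodes.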

What are your use cases for needing a per-node MaxMemPerCPU?

Comment 2 X-ISS VSupport 2023-05-24 09:53:42 MDT

Hello,

Current goals:
- Accounting/charging tracked via CPU hours as service units.
- If users request more memory it gives them more cores to match.
- Try to keep one purpose for SLURM parameter so easier to understand logic for users (ex: partitions vs constraints for hardware type - normal vs largemem). 
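The first two goals can be sketched as simple arithmetic. This is a simplified model of how a MaxMemPerCPU limit raises a job's CPU count when the memory request exceeds the limit, not Slurm's actual implementation; the function names and example numbers are hypothetical, and limits on CPU count are not modeled.

```python
import math


def cpus_for_memory(requested_mem_mb, requested_cpus, max_mem_per_cpu_mb):
    """CPUs a job ends up with once memory per CPU is capped.

    Simplified model: enough CPUs are allocated that the memory per CPU
    does not exceed the cap, and never fewer than the job asked for.
    """
    needed = math.ceil(requested_mem_mb / max_mem_per_cpu_mb)
    return max(requested_cpus, needed)


def service_units(cpus, elapsed_hours):
    """CPU hours charged: allocated CPUs multiplied by wall-clock hours."""
    return cpus * elapsed_hours


if __name__ == "__main__":
    # A 1-CPU job asking for 16000 MB under a 4000 MB per-CPU cap.
    cpus = cpus_for_memory(16_000, 1, 4_000)
    print(cpus, service_units(cpus, 2.5))
```

Under this model a memory-heavy job is charged for the cores its memory effectively occupies, which is why the cap needs to match the memory-per-core ratio of the node type the job runs on.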

Thanks

Comment 3 Benjamin Witham 2023-05-24 15:04:08 MDT

Hello, 

Could you elaborate on some of your points?

> Accounting/charging tracked via CPU hours as service units.

What are the service units? We have accounting data, configured through AccountingStorageTRES, that can be accessed through sacct. The SystemCPU and TotalCPU fields seem close to what you're hoping for.
https://slurm.schedmd.com/sacct.html#OPT_SystemCPU
https://slurm.schedmd.com/sacct.html#OPT_TotalCPU
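As one possible way to turn sacct output into CPU hours, the sketch below parses pipe-delimited text of the kind produced by `sacct --parsable2 --format=JobID,AllocCPUS,Elapsed,TotalCPU`. The sample rows are made up for illustration. Charging by AllocCPUS multiplied by Elapsed is an assumption about what "service units" means here, not something confirmed in this ticket.

```python
import csv
import io

# Made-up sample in the shape of parsable sacct output.
SAMPLE = """JobID|AllocCPUS|Elapsed|TotalCPU
1001|4|02:30:00|08:12:45
1002|16|1-00:00:00|12-03:20:10
"""


def sacct_time_to_seconds(value):
    """Convert a [DD-[HH:]]MM:SS[.fff] time string to seconds."""
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in value.split(":")]
    # Pad missing leading fields (hours, then minutes) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def allocated_cpu_hours(sacct_text):
    """Map each job ID to AllocCPUS multiplied by elapsed wall-clock hours."""
    rows = csv.DictReader(io.StringIO(sacct_text), delimiter="|")
    return {
        row["JobID"]: int(row["AllocCPUS"])
        * sacct_time_to_seconds(row["Elapsed"]) / 3600
        for row in rows
    }


if __name__ == "__main__":
    for job_id, hours in allocated_cpu_hours(SAMPLE).items():
        print(job_id, hours)
```

TotalCPU reports CPU time actually consumed, so charging on it instead would bill for usage rather than for the allocation.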

> Try to keep one purpose for SLURM parameter so easier to understand logic for users (ex:
> partitions vs constraints for hardware type - normal vs largemem).

I'm afraid I don't quite understand what you mean by this. Are you trying to reduce the number of arguments that users need to supply?

Comment 4 Benjamin Witham 2023-06-12 11:05:11 MDT

Hello, 

Is there any further elaboration on your points? I'm afraid I don't quite understand what you mean by some of them.

Comment 5 Benjamin Witham 2023-06-20 15:17:39 MDT

I'm going to close this ticket now. Feel free to reopen it if you have clarifications on your use cases.