Ticket 17585

Summary: How to limit job capacity on specific nodes
Product: Slurm
Reporter: Thu-Ha Tran <thu-ha.tran>
Component: Configuration
Assignee: Ben Roberts <ben>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 21.08.4   
Hardware: Linux   
OS: Linux   
Site: Shell

Description Thu-Ha Tran 2023-08-31 08:27:26 MDT
Hi support,

We have nodes on the cluster that are working very hard.  Is there a way to limit the capacity available to running jobs on those nodes, so that the nodes do not burn out?

Is there an easy way to configure this in Slurm (e.g., a limit expressed as a percentage)?

Thanks,
Thu-Ha
Comment 1 Ben Roberts 2023-08-31 13:31:19 MDT
Hi Thu-Ha,

There are a couple of parameters that I think could help you out.  There is a SelectTypeParameters option, CR_LLN, that tells the scheduler to place jobs on the least loaded node first, rather than packing them on nodes that are already busy.  This doesn't stop nodes from being loaded to capacity, but it may help if your cluster isn't fully occupied.  You can read more about it here:
https://slurm.schedmd.com/slurm.conf.html#OPT_CR_LLN
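
As a rough sketch of what enabling this might look like in slurm.conf (the select plugin and the CR_Core_Memory option shown here are assumptions for illustration, not values from your cluster; keep whatever consumable-resource options you already use and append CR_LLN to them):

```
# Hypothetical slurm.conf excerpt: prefer the least loaded nodes.
# SelectType and CR_Core_Memory are assumed; only CR_LLN is the
# option discussed above.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory,CR_LLN
```

A change to these parameters generally requires restarting the Slurm daemons, so it is worth checking the slurm.conf documentation for your version before applying it.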

The other option, CoreSpecCount, I think is a better fit for what you are asking for.  It lets you specify that a certain number of cores on a node are set aside for system processes rather than being scheduled for jobs.  If you reserve more cores than the system processes actually need, the extra cores will sit idle, which keeps the nodes from being maxed out.
https://slurm.schedmd.com/slurm.conf.html#OPT_CoreSpecCount
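
To illustrate how this can approximate a percentage limit, here is a minimal sketch of a node definition.  The node names and hardware layout are hypothetical; substitute your own.  With 32 cores per node and 8 of them reserved, jobs could use at most 24 cores, or 75% of each node:

```
# Hypothetical slurm.conf excerpt: 2 sockets x 16 cores = 32 cores per node.
# CoreSpecCount=8 reserves 8 cores for system use, leaving 24 (75%) for jobs.
NodeName=node[01-04] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 CoreSpecCount=8
```

Core specialization can depend on other settings (such as the select and task plugins in use), so please check the CoreSpecCount notes in the documentation for your Slurm version.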

Let me know if either of these sounds like it will work for you.

Thanks,
Ben
Comment 2 Ben Roberts 2023-09-26 13:28:31 MDT
Hi Thu-Ha,

Did either of the parameters I suggested work for what you are trying to do?  Let me know if you still need help with this ticket.

Thanks,
Ben
Comment 3 Thu-Ha Tran 2023-09-26 15:40:39 MDT
Those parameters did not seem to achieve what we expected.
Anyway, we are setting this aside for now.
You can close the ticket.

Thanks for your help!

Regards,
Thu-ha
Comment 4 Ben Roberts 2023-09-27 08:20:17 MDT
I'm sorry to hear these didn't quite get you the behavior you wanted.  If you'd like to look at this again down the road, feel free to update the ticket.

Thanks,
Ben