| Summary: | Initial setup again: partitions, qos, features | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Bill Wichser <bill> |
| Component: | Scheduling | Assignee: | David Bigagli <david> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | CC: | da |
| Version: | 14.03.0 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | Princeton (PICSciE) | | |
Description
Bill Wichser
2014-04-21 00:31:47 MDT
The more partitions that you configure, the more that resources tend to get fragmented and difficult to use. The highest system utilization tends to happen when there are one or two partitions (say, one for small/short interactive jobs and another for batch jobs).

I would suggest using the Feature option that you have mentioned as a good solution. You might also want to make use of the "Weight" configuration parameter associated with the nodes (at least, that is the configuration used successfully at NASA Goddard); a sketch combining the two appears after the resolution below. Here is the relevant passage from the slurm.conf man page:

> Weight
> The priority of the node for scheduling purposes. All things being equal, jobs will be allocated the nodes with the lowest weight which satisfies their requirements. For example, a heterogeneous collection of nodes might be placed into a single partition for greater system utilization, responsiveness and capability. It would be preferable to allocate smaller memory nodes rather than larger memory nodes if either will satisfy a job's requirements. The units of weight are arbitrary, but larger weights should be assigned to nodes with more processors, memory, disk space, higher processor speed, etc. Note that if a job allocation request can not be satisfied using the nodes with the lowest weight, the set of nodes with the next lowest weight is added to the set of nodes under consideration for use (repeat as needed for higher weight values). If you absolutely want to minimize the number of higher weight nodes allocated to a job (at a cost of higher scheduling overhead), give each node a distinct Weight value and they will be added to the pool of nodes being considered for scheduling individually. The default value is 1.

Status: resolved

All right then. Feature it is, single partition, and face the next issue. Closing.

David
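As a concrete illustration of the advice above, here is a minimal slurm.conf sketch combining a single partition with Feature tags and Weight values. The node names, counts, features, and memory sizes are hypothetical, not taken from this ticket; they only show the shape of the configuration.

```
# Hypothetical slurm.conf fragment: one partition, with Feature tags so jobs
# can target node types, and Weight values so the scheduler prefers the
# smaller nodes when either type would satisfy a request.

# Small-memory nodes: lowest Weight, so they are allocated first.
NodeName=node[001-064] CPUs=16 RealMemory=64000  Weight=1  Feature=standard
# Large-memory nodes: higher Weight, so they are held back for jobs that need them.
NodeName=node[065-096] CPUs=32 RealMemory=256000 Weight=10 Feature=standard,bigmem

# A single partition spanning all nodes, instead of one partition per node type.
PartitionName=main Nodes=node[001-096] Default=YES MaxTime=72:00:00 State=UP
```

A job that genuinely needs the large-memory nodes would then request them by feature rather than by partition, e.g. `sbatch --constraint=bigmem job.sh`, while everything else lands on the low-weight nodes first.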