| Summary: | Coming from SGE: equivalence of "Queue" is "Partition" or "QOS"? | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Hermann Schwärzler <hermann.schwaerzler> |
| Component: | Configuration | Assignee: | Ben Roberts <ben> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | | |
| Version: | 20.02.4 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | Innsbruck | | |
Description
Hermann Schwärzler
2020-09-22 05:20:24 MDT
Hi Hermann,

There are a couple of options available to you for separating the bigmem nodes from the standard nodes.

One option is to create two partitions (the equivalent of queues in SGE). By creating separate partitions for the different types of nodes, you can make sure that the bigmem nodes don't get used by jobs that don't have the additional memory requirements. You can make the normal partition the default for everyone, so that users have to explicitly request the bigmem partition when they need it.

If you're not sure that the bigmem nodes will be fully occupied and want them to be available to other jobs when the normal nodes are busy, you can instead leave all the nodes in the same partition and use a feature to differentiate them. If you also specify a Weight for the nodes (nodes with smaller Weight values are assigned first), the normal nodes will be used first and the bigmem ones only when nothing else is available. Users who need the bigmem nodes can then request them via the "bigmem" feature.

Here's an example of how this would look in your slurm.conf:

```
NodeName=node[01-50] CPUs=24 RealMemory=65536  Weight=1
NodeName=node[51-75] CPUs=24 RealMemory=524288 Weight=10 Feature=bigmem
```

You can of course adapt this to use the node names of your choice and the appropriate number of CPUs, along with anything else you want to define for the nodes. When users want to request the bigmem nodes, they can do so with the '--constraint' flag:

```
sbatch --constraint=bigmem job.sh
```

You can also use a QOS along with a partition if you so choose. They aren't replacements for each other, but complement each other. A QOS allows you to define limits such as the maximum amount of a resource that can be requested by an individual job, or by all jobs in the QOS at a given time, or limits on usage over time, etc.
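As a sketch of the two-partition option, the PartitionName entries in slurm.conf could look like the following. The partition names, node ranges, and limits here are illustrative assumptions (mirroring the node definitions above), not settings from this ticket:

```
# Hypothetical slurm.conf partition sketch: "normal" is the default partition,
# so jobs only reach the bigmem nodes when -p bigmem is explicitly requested.
PartitionName=normal Nodes=node[01-50] Default=YES State=UP MaxTime=INFINITE
PartitionName=bigmem Nodes=node[51-75] Default=NO  State=UP MaxTime=INFINITE
```

With this in place, a user would submit to the large-memory nodes with `sbatch -p bigmem job.sh`, while jobs that don't request a partition land in "normal".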
You can read more about these options on the following pages:

https://slurm.schedmd.com/qos.html
https://slurm.schedmd.com/sacctmgr.html#SECTION_SPECIFICATIONS-FOR-QOS

Please let me know if you have any additional questions about this.

Thanks,
Ben

Ben Roberts:
Hi Hermann, I wanted to follow up and see if you have any additional questions about this. Let me know if you still need help with this or if it's OK to close. Thanks, Ben

Hermann Schwärzler:
Hi Ben, thank you for your help, and sorry for not answering earlier - I was busy in other areas. As of now I don't have additional questions. I will have to ponder and test how best to configure things. :-) Thank you, Hermann