| Summary: | Q: can a job request resources from different partitions? | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Kilian Cavalotti <kilian> |
| Component: | Scheduling | Assignee: | David Bigagli <david> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 5 - Enhancement | | |
| Priority: | --- | CC: | da |
| Version: | 2.6.3 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | Stanford | | |
Description
Kilian Cavalotti
2014-07-30 11:41:26 MDT

Comment 1, David Bigagli:

Hi,

it is possible for users to submit jobs asking for multiple partitions using the -p/--partition option; however, the resources will be allocated from the first available partition rather than across partitions.

David

Comment 2, Kilian Cavalotti:

(In reply to David Bigagli from comment #1)
> the resources are going to be allocated from the first partition available rather than across partitions.

So that means they can get normal nodes OR GPU nodes, but not both, right? Thanks!

Comment 3, David Bigagli:

Yes, if you keep the resources in separate partitions. You could, however, have a partition with both CPUs and GPUs and control access to the GPUs via a job submit plugin, for example.

David

Comment 4, Kilian Cavalotti:

(In reply to David Bigagli from comment #3)
> You could have a partition with cpus and gpus however and control the access to the gpus via a submit plugin for example.

All right. Thanks a lot for the quick answer!

Comment 5, Moe Jette:

Here are two other options:

1. Create a partition that includes all nodes. You may want to restrict access to that partition (your call). Give nodes "Features" like "gpu", "big_mem", etc. Then users needing multiple resource types can do something like this:

   sbatch --partition=full_system --constraints=gpu:2,big_mem:8 ...

2. A less attractive option would be to submit separate jobs to the various partitions and then merge them into a single job, but it would not be possible to ensure that all of those jobs started at the same time (or even close to the same time).

Comment 6, Kilian Cavalotti:

Hi Moe,

(In reply to Moe Jette from comment #5)
> Then users needing multiple resource types can do something like this:
> sbatch --partition=full_system --constraints=gpu:2,big_mem:8 ...
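As a side note, a sketch of what the catch-all-partition option above might look like in practice. The partition name "full_system", the feature names "gpu" and "big_mem", and the script "job.sh" are hypothetical examples from this thread, not defaults. Also, the actual sbatch flag is --constraint (singular), and the syntax for attaching node counts to features has varied across Slurm releases, so the sbatch man page for your version is the authority:

```shell
#!/bin/sh
# Hypothetical sketch, not a tested recipe: ask for 2 nodes with the
# "gpu" feature and 8 nodes with the "big_mem" feature from a catch-all
# partition. The bracketed count syntax is version-dependent; check the
# sbatch man page before relying on it.
CMD="sbatch --partition=full_system --constraint='[gpu*2&big_mem*8]' job.sh"
echo "$CMD"
```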
Yes, we've been thinking about that approach and we may adopt it eventually. Will it work with GRES too? I mean, can a user mix constraints with GRES or core requests? Such as requesting 2 cores on 8 big_mem nodes, and 4 GPUs on 2 gpu nodes? Thanks!

Comment 7, Moe Jette:

(In reply to Kilian Cavalotti from comment #6)
> Will it work with gres too? I mean can a user mix constraints and gres or cores requests?

That is not possible today, but we have begun work on that capability. It will likely not be available in the 14.11 release, but in the following release, about one year from today. Your only option today would be to submit multiple jobs and then combine them as described here: http://slurm.schedmd.com/faq.html#job_size. The biggest problem with that is the inability to co-schedule them.
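For reference, the multi-partition submission discussed at the top of this thread can be sketched as follows. The partition names "normal" and "gpu" and the script "job.sh" are hypothetical; the point is that Slurm starts the job in whichever listed partition can run it first, and never spreads one job's allocation across partitions:

```shell
#!/bin/sh
# Sketch of the behavior described in comment #1: a comma-separated
# partition list means "run in the first of these that becomes
# available", not "combine resources from both". Names are examples.
CMD="sbatch --partition=normal,gpu job.sh"
echo "$CMD"
```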