| Summary: | Multiple jobs on a node despite OverSubscribe=EXCLUSIVE |
|---|---|
| Product: | Slurm |
| Component: | Scheduling |
| Reporter: | Luke Yeager <lyeager> |
| Assignee: | Director of Support <support> |
| Status: | RESOLVED INFOGIVEN |
| Severity: | 3 - Medium Impact |
| Version: | 19.05.4 |
| Hardware: | Linux |
| OS: | Linux |
| Site: | NVIDIA (PSLA) |
| Attachments: | sanitized slurm.conf |
Created attachment 14614 [details]
sanitized slurm.conf

On our cluster, we are observing multiple jobs scheduled on the same node, despite having OverSubscribe=EXCLUSIVE set on all partitions. Here is the smoking gun:

```
lyeager@login-01:~$ squeue -a -w node-00[59,60]
  JOBID PARTITION  NAME    USER ST    TIME NODES NODELIST(REASON)
 330337      main  bash  user_d  R   25:47     2 node-[0059-0060]
 330110  backfill  job1  user_b  R 1:40:32     1 node-0060
 330065  backfill  job2  user_a  R 1:33:48     1 node-0059
```

I've attached our [sanitized] slurm.conf. Is there any issue with our configuration that you can see? We have basically the same config on another cluster and have not seen any issues. This is a very big problem for us.

---

Ah, we just discovered that we have PreemptMode=ON when we meant to have PreemptMode=CANCEL. That's probably it. Checking now.

---

I think this is the problem:

```
PartitionName=main     Default=NO  PriorityTier=2 DefaultTime=2:00:00 MaxTime=2:00:00 PreemptMode=OFF               nodes=ALL OverSubscribe=NO
PartitionName=backfill Default=YES PriorityTier=1 DefaultTime=0:30:00 MaxTime=8:00:00 PreemptMode=ON  GraceTime=600 nodes=ALL OverSubscribe=NO QOS=backfill
```

---

OverSubscribe=NO is not the same thing as OverSubscribe=EXCLUSIVE.

---

Oh, we actually changed from EXCLUSIVE to NO just a few minutes ago based on the documentation here: https://slurm.schedmd.com/archive/slurm-19.05.4/cons_res_share.html

We didn't really expect that change to help, because we were pretty sure we wanted EXCLUSIVE, but we thought following the docs seemed pretty safe.

So we should set PreemptMode=CANCEL and go back to OverSubscribe=EXCLUSIVE, agreed?

---

Sorry, you are right. With select/linear, it should be NO. But we highly recommend using select/cons_res or select/cons_tres with OverSubscribe=EXCLUSIVE instead. That gives you the same whole-node functionality as select/linear but with added flexibility. We also do not support select/linear very much and will likely remove it in the near future.

---

(In reply to luke.yeager from comment #3)
> So we should set PreemptMode=CANCEL and go back to OverSubscribe=EXCLUSIVE, agreed?

PreemptMode=ON is an invalid setting, so yes, setting it to CANCEL or REQUEUE is the common practice. Though that is not related to the issue of multiple jobs on the same node. OverSubscribe=EXCLUSIVE is an invalid setting for select/linear. Setting it to NO should probably fix the issue.

---

Hi Luke, any updates? I'm going to go ahead and reduce the severity to 3, since configuration questions generally don't qualify as a sev 2.

---

We've resolved the issue now. Thanks for the pointers! Closing.
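For reference, the configuration this thread converges on would look roughly like the sketch below. The partition names, times, GraceTime, and QOS come from the config pasted above; the SelectType/SelectTypeParameters/PreemptType lines are illustrative assumptions, not the site's verified settings.

```conf
# slurm.conf sketch -- illustrative only, not the actual sanitized config.
# Per the discussion: OverSubscribe=EXCLUSIVE requires a cons_res/cons_tres
# select plugin (it is invalid with select/linear), and PreemptMode=ON is
# invalid, so it is replaced with CANCEL.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core      # assumption; pick the CR_* mode your site needs
PreemptType=preempt/partition_prio

PartitionName=main     Default=NO  PriorityTier=2 DefaultTime=2:00:00 MaxTime=2:00:00 PreemptMode=OFF                   nodes=ALL OverSubscribe=EXCLUSIVE
PartitionName=backfill Default=YES PriorityTier=1 DefaultTime=0:30:00 MaxTime=8:00:00 PreemptMode=CANCEL GraceTime=600  nodes=ALL OverSubscribe=EXCLUSIVE QOS=backfill
```

With OverSubscribe=EXCLUSIVE on a cons_tres partition, each job gets whole nodes to itself, which matches the behavior the reporter expected but was not getting under select/linear.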