Ticket 1501

Summary: Add default auto binding parameter
Product: Slurm Reporter: Brian Christiansen <brian>
Component: Scheduling Assignee: Brian Christiansen <brian>
Status: RESOLVED FIXED QA Contact:
Severity: 5 - Enhancement    
Priority: --- CC: brian, da
Version: 15.08.x   
Hardware: Linux   
OS: Linux   
Site: SchedMD Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 14.11.5 15.08.0pre3 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Brian Christiansen 2015-03-03 05:58:49 MST
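It would be nice to have an option in slurm.conf to set a default auto-binding for the cases where auto-binding doesn't work. This would be separate from the existing TaskPluginParam binding options and would still allow the user to override the CPU binding. For example, with three tasks on this node, auto-binding currently gives up and every task gets the full mask: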

brian@compy:~/slurm/14.11/compy$ srun -n3 ~/tools/whereami
   0 compy1     - MASK:0xff
   2 compy1     - MASK:0xff
   1 compy1     - MASK:0xff
Comment 1 Brian Christiansen 2015-03-03 08:17:34 MST
This will also help the case where auto binding doesn't work when using the --exclusive flag.

brian@compy:~/slurm/14.11/compy$ srun -n2 --exclusive ~/tools/whereami
   1 compy1     - MASK:0xff
   0 compy1     - MASK:0xff

debug:  binding tasks:2 to nodes:1 sockets:1:0 cores:4:0 threads:8
lllp_distribution jobid [73970] auto binding off: mask_cpu




brian@compy:~/slurm/14.11/compy$ srun -n2 ~/tools/whereami
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

debug:  binding tasks:2 to nodes:0 sockets:0:1 cores:1:0 threads:2
lllp_distribution jobid [73971] implicit auto binding: threads, dist 2
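
For anyone reading the masks above: assuming each MASK value is a plain hexadecimal CPU affinity bitmask where bit N stands for logical CPU N, a small helper like the one below lists the CPUs a task may run on. The function name cpus_in_mask is made up for this illustration and is not part of Slurm or the whereami tool.

```python
def cpus_in_mask(mask):
    """Return the logical CPU indices whose bits are set in a hex mask.

    Assumption: bit N of the mask corresponds to logical CPU N.
    """
    value = int(mask, 16)  # int() accepts an optional "0x" prefix in base 16
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    # Masks taken from the srun output in this ticket.
    for mask in ("0xff", "0x1", "0x10"):
        print(mask, cpus_in_mask(mask))
```

Read this way, 0xff means the task is not bound at all on this 8-CPU node, while 0x1 and 0x10 pin the two tasks to CPU 0 and CPU 4 respectively.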
Comment 2 Brian Christiansen 2015-03-05 03:50:25 MST
Added in the following commits:

14.11: Added TaskPluginParam=autobind=threads (threads only, because handling the extra bits for the other options required changing the protocol field to uint32_t).
https://github.com/SchedMD/slurm/commit/ea51f870c19b92f9e251cbdef54b1e9013da959a

15.08: Added sockets and cores to autobind option.
https://github.com/SchedMD/slurm/commit/955ce4476fab0b26669d1710dc912b412194b709



E.g.

brian@compy:~/slurm/master2/compy$ srun -n1 ~/tools/whereami | sort -h
   0 compy1     - MASK:0x11

[Mar  5 09:43:54.256386 32577 0x7fbe9bfa0700] debug:  binding tasks:1 to nodes:0 sockets:0:1 cores:1:0 threads:2
[Mar  5 09:43:54.256398 32577 0x7fbe9bfa0700] lllp_distribution jobid [5208] implicit auto binding: cores, dist 1


brian@compy:~/slurm/master2/compy$ srun -n2 ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

[Mar  5 09:43:58.229396 32577 0x7fbe9bfa0700] debug:  binding tasks:2 to nodes:0 sockets:0:1 cores:1:0 threads:2
[Mar  5 09:43:58.229403 32577 0x7fbe9bfa0700] lllp_distribution jobid [5209] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n2 --exclusive ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

[Mar  5 09:44:24.78206  32577 0x7fbe9bfa0700] debug:  binding tasks:2 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:44:24.78225  32577 0x7fbe9bfa0700] lllp_distribution jobid [5210] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n3  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2

[Mar  5 09:44:34.256716 32577 0x7fbe9bfa0700] debug:  binding tasks:3 to nodes:0 sockets:0:1 cores:2:0 threads:4
[Mar  5 09:44:34.256736 32577 0x7fbe9bfa0700] lllp_distribution jobid [5211] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n4  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20

[Mar  5 09:45:14.246323 32577 0x7fbe9bfa0700] debug:  binding tasks:4 to nodes:0 sockets:0:1 cores:2:0 threads:4
[Mar  5 09:45:14.246353 32577 0x7fbe9bfa0700] lllp_distribution jobid [5212] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n4 --exclusive  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x11
   1 compy1     - MASK:0x22
   2 compy1     - MASK:0x44
   3 compy1     - MASK:0x88

[Mar  5 09:45:21.189590 32577 0x7fbe9bfa0700] debug:  binding tasks:4 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:45:21.189600 32577 0x7fbe9bfa0700] lllp_distribution jobid [5213] implicit auto binding: cores, dist 2


brian@compy:~/slurm/master2/compy$ srun -n5  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4

[Mar  5 09:45:29.166387 32577 0x7fbe9bfa0700] debug:  binding tasks:5 to nodes:0 sockets:0:1 cores:3:0 threads:6
[Mar  5 09:45:29.166408 32577 0x7fbe9bfa0700] lllp_distribution jobid [5214] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n6  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4
   5 compy1     - MASK:0x40

[Mar  5 09:45:41.702859 32577 0x7fbe9bfa0700] debug:  binding tasks:6 to nodes:0 sockets:0:1 cores:3:0 threads:6
[Mar  5 09:45:41.702878 32577 0x7fbe9bfa0700] lllp_distribution jobid [5215] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n6 --exclusive ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4
   5 compy1     - MASK:0x40

[Mar  5 09:45:49.62847  32577 0x7fbe9bfa0700] debug:  binding tasks:6 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:45:49.62877  32577 0x7fbe9bfa0700] lllp_distribution jobid [5216] default auto binding: threads, dist 2
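
The debug lines in this ticket are consistent with a simple decision rule, sketched below. This is inferred from the log output only and is not the Slurm source. It assumes the "sockets:A:B" and "cores:A:B" fields mean whole:partial counts of allocated sockets and cores, and that the configured autobind value is used only when no implicit match is found. The function and parameter names are hypothetical.

```python
def pick_auto_binding(tasks, whole_sockets, part_sockets,
                      whole_cores, part_cores, threads, default=None):
    """Guess the binding decision that would be logged for one node.

    Assumption: the counts come from a line such as
    "binding tasks:2 to nodes:0 sockets:0:1 cores:1:0 threads:2",
    read as whole:partial for sockets and cores.
    `default` stands in for an autobind setting ("threads", "cores", ...).
    """
    if tasks == whole_sockets and part_sockets == 0:
        return "implicit auto binding: sockets"
    if tasks == whole_cores and part_cores == 0:
        return "implicit auto binding: cores"
    if tasks == threads:
        return "implicit auto binding: threads"
    if default:
        return "default auto binding: " + default
    return "auto binding off"


if __name__ == "__main__":
    # srun -n2 --exclusive, before and after setting autobind=threads
    print(pick_auto_binding(2, 1, 0, 4, 0, 8))
    print(pick_auto_binding(2, 1, 0, 4, 0, 8, default="threads"))
    # srun -n4 --exclusive: one task per whole core
    print(pick_auto_binding(4, 1, 0, 4, 0, 8, default="threads"))
```

Under these assumptions the sketch reproduces every "implicit", "default" and "auto binding off" line shown in the comments above, which is why the new default only kicks in for task counts that match neither the allocated cores nor the allocated threads.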