Ticket 1501 - Add default auto binding parameter
Summary: Add default auto binding parameter
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 15.08.x
Hardware: Linux Linux
: 5 - Enhancement
Assignee: Brian Christiansen
Reported: 2015-03-03 05:58 MST by Brian Christiansen
Modified: 2015-03-05 05:00 MST

Site: SchedMD
Version Fixed: 14.11.5 15.08.0pre3


Description Brian Christiansen 2015-03-03 05:58:49 MST
It would be nice to have an option in slurm.conf to set a default auto-binding for cases where auto-binding doesn't work. This would be separate from TaskPluginParam and would allow the user to still override the CPU binding.

brian@compy:~/slurm/14.11/compy$ srun -n3 ~/tools/whereami
   0 compy1     - MASK:0xff
   2 compy1     - MASK:0xff
   1 compy1     - MASK:0xff
Comment 1 Brian Christiansen 2015-03-03 08:17:34 MST
This will also help the case where auto binding doesn't work when using the --exclusive flag.

brian@compy:~/slurm/14.11/compy$ srun -n2 --exclusive ~/tools/whereami
   1 compy1     - MASK:0xff
   0 compy1     - MASK:0xff

debug:  binding tasks:2 to nodes:1 sockets:1:0 cores:4:0 threads:8
lllp_distribution jobid [73970] auto binding off: mask_cpu




brian@compy:~/slurm/14.11/compy$ srun -n2 ~/tools/whereami
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

debug:  binding tasks:2 to nodes:0 sockets:0:1 cores:1:0 threads:2
lllp_distribution jobid [73971] implicit auto binding: threads, dist 2
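
For anyone reading the MASK values above: a minimal Python sketch that expands a hex affinity mask into CPU indices. It assumes the usual convention that bit i of the mask stands for logical CPU i; the helper name cpus_in_mask is made up for illustration and is not part of Slurm or the whereami tool.

```python
from typing import List


def cpus_in_mask(mask_hex: str) -> List[int]:
    """Return the CPU indices whose bits are set in a hex affinity mask.

    Assumption: bit i of the mask corresponds to logical CPU i.
    """
    mask = int(mask_hex, 16)  # int() accepts the "0x" prefix when base is 16
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Masks taken from the srun output in this ticket.
    for mask in ("0xff", "0x1", "0x10"):
        print(mask, cpus_in_mask(mask))
    # 0xff [0, 1, 2, 3, 4, 5, 6, 7]  -> task may run on all 8 CPUs (no binding)
    # 0x1 [0]                         -> task bound to CPU 0
    # 0x10 [4]                        -> task bound to CPU 4
```

Read this way, 0xff in the failing cases means each task is free to run on every CPU of the node, while 0x1 and 0x10 in the working case pin each task to a single CPU.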
Comment 2 Brian Christiansen 2015-03-05 03:50:25 MST
Added in the following commits:

14.11: Added TaskPluginParam=autobind=threads (threads only, because supporting sockets and cores required changing the protocol to uint32_t to handle the extra bits).
https://github.com/SchedMD/slurm/commit/ea51f870c19b92f9e251cbdef54b1e9013da959a

15.08: Added sockets and cores to autobind option.
https://github.com/SchedMD/slurm/commit/955ce4476fab0b26669d1710dc912b412194b709



E.g.

brian@compy:~/slurm/master2/compy$ srun -n1 ~/tools/whereami | sort -h
   0 compy1     - MASK:0x11

[Mar  5 09:43:54.256386 32577 0x7fbe9bfa0700] debug:  binding tasks:1 to nodes:0 sockets:0:1 cores:1:0 threads:2
[Mar  5 09:43:54.256398 32577 0x7fbe9bfa0700] lllp_distribution jobid [5208] implicit auto binding: cores, dist 1


brian@compy:~/slurm/master2/compy$ srun -n2 ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

[Mar  5 09:43:58.229396 32577 0x7fbe9bfa0700] debug:  binding tasks:2 to nodes:0 sockets:0:1 cores:1:0 threads:2
[Mar  5 09:43:58.229403 32577 0x7fbe9bfa0700] lllp_distribution jobid [5209] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n2 --exclusive ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10

[Mar  5 09:44:24.78206  32577 0x7fbe9bfa0700] debug:  binding tasks:2 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:44:24.78225  32577 0x7fbe9bfa0700] lllp_distribution jobid [5210] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n3  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2

[Mar  5 09:44:34.256716 32577 0x7fbe9bfa0700] debug:  binding tasks:3 to nodes:0 sockets:0:1 cores:2:0 threads:4
[Mar  5 09:44:34.256736 32577 0x7fbe9bfa0700] lllp_distribution jobid [5211] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n4  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20

[Mar  5 09:45:14.246323 32577 0x7fbe9bfa0700] debug:  binding tasks:4 to nodes:0 sockets:0:1 cores:2:0 threads:4
[Mar  5 09:45:14.246353 32577 0x7fbe9bfa0700] lllp_distribution jobid [5212] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n4 --exclusive  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x11
   1 compy1     - MASK:0x22
   2 compy1     - MASK:0x44
   3 compy1     - MASK:0x88

[Mar  5 09:45:21.189590 32577 0x7fbe9bfa0700] debug:  binding tasks:4 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:45:21.189600 32577 0x7fbe9bfa0700] lllp_distribution jobid [5213] implicit auto binding: cores, dist 2


brian@compy:~/slurm/master2/compy$ srun -n5  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4

[Mar  5 09:45:29.166387 32577 0x7fbe9bfa0700] debug:  binding tasks:5 to nodes:0 sockets:0:1 cores:3:0 threads:6
[Mar  5 09:45:29.166408 32577 0x7fbe9bfa0700] lllp_distribution jobid [5214] default auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n6  ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4
   5 compy1     - MASK:0x40

[Mar  5 09:45:41.702859 32577 0x7fbe9bfa0700] debug:  binding tasks:6 to nodes:0 sockets:0:1 cores:3:0 threads:6
[Mar  5 09:45:41.702878 32577 0x7fbe9bfa0700] lllp_distribution jobid [5215] implicit auto binding: threads, dist 2


brian@compy:~/slurm/master2/compy$ srun -n6 --exclusive ~/tools/whereami | sort -h
   0 compy1     - MASK:0x1
   1 compy1     - MASK:0x10
   2 compy1     - MASK:0x2
   3 compy1     - MASK:0x20
   4 compy1     - MASK:0x4
   5 compy1     - MASK:0x40

[Mar  5 09:45:49.62847  32577 0x7fbe9bfa0700] debug:  binding tasks:6 to nodes:1 sockets:1:0 cores:4:0 threads:8
[Mar  5 09:45:49.62877  32577 0x7fbe9bfa0700] lllp_distribution jobid [5216] default auto binding: threads, dist 2
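
To compare runs like the ones above without reading each log line by hand, the lllp_distribution messages can be pulled apart with a small script. This is only a sketch: the regular expression is modeled on the three message shapes quoted in this ticket ("implicit auto binding", "default auto binding", "auto binding off"), and the names LLLP_RE and parse_lllp are hypothetical. Other Slurm versions may word these messages differently.

```python
import re
from typing import Dict, Optional

# Pattern based solely on the log lines quoted in this ticket (an assumption,
# not a documented Slurm log format).
LLLP_RE = re.compile(
    r"lllp_distribution jobid \[(?P<jobid>\d+)\] "
    r"(?:(?P<kind>implicit|default) auto binding: (?P<unit>\w+), dist (?P<dist>\d+)"
    r"|auto binding off: (?P<off>\w+))"
)


def parse_lllp(line: str) -> Optional[Dict[str, str]]:
    """Extract job id, binding kind and binding detail from one log line."""
    match = LLLP_RE.search(line)
    if match is None:
        return None  # not an lllp_distribution message we recognize
    if match.group("off") is not None:
        # Auto binding was skipped; the trailing word (e.g. mask_cpu) is kept as-is.
        return {"jobid": match.group("jobid"), "kind": "off",
                "binding": match.group("off")}
    return {"jobid": match.group("jobid"), "kind": match.group("kind"),
            "binding": match.group("unit")}


if __name__ == "__main__":
    samples = [
        "lllp_distribution jobid [73970] auto binding off: mask_cpu",
        "lllp_distribution jobid [5213] implicit auto binding: cores, dist 2",
        "lllp_distribution jobid [5216] default auto binding: threads, dist 2",
    ]
    for sample in samples:
        print(parse_lllp(sample))
```

Judging from the sessions in this ticket, "default" marks the jobs where the new autobind setting took effect (for example -n2 --exclusive, which previously logged "auto binding off"), while "implicit" marks the jobs where auto binding already found a match on its own.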