Ticket 16045

Summary: srun does not read thread binding from sbatch
Product: Slurm
Reporter: antoine.jego
Component: Scheduling
Assignee: Jacob Jenson <jacob>
Status: OPEN --- QA Contact:
Severity: 6 - No support contract    
Priority: ---    
Version: 22.05.0   
Hardware: Linux   
OS: Linux   
Site: -Other-
Attachments: Reproducing script

Description antoine.jego 2023-02-15 12:58:05 MST
Created attachment 28876
Reproducing script

When a script is submitted through `sbatch` with specific options (e.g. the number of tasks), an `srun` call inside that script does not pick up these options. They are, however, applied when they are hinted at explicitly in the `srun` call.

This can cause hwloc to produce incorrect thread bindings.

The attached script reproduces the issue. It should be adapted to the target architecture (the partition I use has 36 cores per node).
The attached script outputs the following:

srun                  
0x0000000f,0xffffffff 
0x0000000f,0xffffffff 
0x0000000f,0xffffffff 
srun hint             
0x0000000f,0xffffffff 
0x0000000f,0xffffffff 
0x0000000f,0xffffffff 
srun hint ++ 3, 12    
0x00555555            
0x0000000f,0xff000000 
0x00aaaaaa            

The only correct binding is the last one, where all hints have been given. I would expect this behaviour even when no hints are passed to `srun`.
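
To make the masks in the output easier to compare, here is a minimal sketch that decodes them into CPU indexes. It assumes the values are hwloc-style bitmap strings (comma-separated 32-bit hex words, most significant word first). The helper name `decode_hwloc_mask` is hypothetical and is not part of hwloc or Slurm.

```python
def decode_hwloc_mask(mask: str) -> list[int]:
    """Return the sorted CPU indexes set in an hwloc-style bitmap string.

    Assumption: the string is a comma-separated list of 32-bit hex words,
    with the most significant word first (e.g. "0x0000000f,0xffffffff").
    """
    value = 0
    for word in mask.strip().split(","):
        # Shift the words seen so far and append the next 32-bit word.
        value = (value << 32) | int(word, 16)
    # Collect the position of every bit that is set.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Masks copied from the output in this report, grouped by run.
runs = {
    "srun": ["0x0000000f,0xffffffff"] * 3,
    "srun hint": ["0x0000000f,0xffffffff"] * 3,
    "srun hint ++ 3, 12": ["0x00555555", "0x0000000f,0xff000000", "0x00aaaaaa"],
}

for label, masks in runs.items():
    print(label)
    for mask in masks:
        cpus = decode_hwloc_mask(mask)
        print(f"  {mask.strip():<22} {len(cpus):>2} CPUs: {cpus}")
```

Decoded this way, every task in the first two runs is bound to all 36 CPUs (0 to 35), whereas in the last run the three tasks get disjoint sets of 12 CPUs each: the even indexes 0 to 22, the range 24 to 35, and the odd indexes 1 to 23.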