Ticket 14654

Summary: --threads-per-core=1 gives cannot request more threads per core than job allocation error
Product: Slurm
Reporter: David Gloe <david.gloe>
Component: Configuration
Assignee: Skyler Malinowski <skyler>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
CC: alex
Version: 21.08.6
Hardware: Linux
OS: Linux
Site: CRAY
Cray Sites: Cray Internal
Version Fixed: 21.08.9; 22.05.4; 23.02.0pre1
Attachments: slurm.conf file

Description David Gloe 2022-08-01 10:09:44 MDT
Created attachment 26092 [details]
slurm.conf file

On an internal system, we're seeing an issue where srun by itself works fine, but specifying --threads-per-core=1 or --hint=nomultithread is failing with "Cannot request more threads per core than the job allocation". This happens both in an salloc and when running srun by itself.

dgloe@hotlum-login:~> srun hostname
x1000c0s7b1n1
dgloe@hotlum-login:~> srun --threads-per-core=1 hostname
srun: error: Unable to create step for job 6276: Cannot request more threads per core than the job allocation
dgloe@hotlum-login:~> srun --hint=nomultithread hostname
srun: error: Unable to create step for job 6277: Cannot request more threads per core than the job allocation
dgloe@hotlum-login:~> salloc --threads-per-core=1
salloc: Granted job allocation 6278
salloc: Waiting for resource configuration
salloc: Nodes x1000c0s7b1n1 are ready for job
dgloe@hotlum-login:~> srun --threads-per-core=1 hostname
srun: error: Unable to create step for job 6278: Cannot request more threads per core than the job allocation
Comment 1 Skyler Malinowski 2022-08-02 13:34:37 MDT
I can reproduce this behavior. It looks to be an issue with select/linear specifically. I will keep you posted as I know more.
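
To confirm which select plugin a cluster is actually running, the authoritative check is to query the controller with `scontrol`. A minimal sketch follows; since `scontrol` needs a live cluster, it greps a sample config fragment instead (the file path and its contents are assumptions for illustration, not taken from the attached slurm.conf):

```
# On a live cluster, the authoritative check is:
#   scontrol show config | grep -i SelectType
# For illustration, grep a sample slurm.conf fragment instead
# (path and contents below are assumed, not from this ticket).
cat > /tmp/slurm_conf_sample <<'EOF'
SelectType=select/linear
EOF
grep -i '^SelectType' /tmp/slurm_conf_sample
```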

Thanks,
Skyler
Comment 3 Skyler Malinowski 2022-08-10 16:45:42 MDT
This is a regression introduced in 21.08.6. I have created a patch that is out for review.
Comment 7 Skyler Malinowski 2022-08-16 14:06:28 MDT
Out of curiosity, why are you using `select/linear` instead of `select/cons_tres`? `select/cons_tres` can be used for whole node allocations too.


Also, I notice in your slurm.conf that you could simplify the node section with a NodeName=DEFAULT entry.

```
# slurm.conf

NodeName=DEFAULT RealMemory=512000 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2 State=idle
NodeName=x1000c0s0b0n0
NodeName=x1000c0s0b0n1
NodeName=x1000c0s0b1n0
... (omitted) ...
NodeName=x1000c7s7b1n1
```
Comment 8 David Gloe 2022-08-16 14:17:16 MDT
This is how the system was set up by the admins; I'm not sure why they used select/linear. I've recommended that they use select/cons_res, which is what we typically use.

Is there an advantage to using select/cons_tres instead of select/cons_res?
Comment 9 Skyler Malinowski 2022-08-16 14:46:40 MDT
`select/cons_tres` is a superset of `select/cons_res` and has more features than `select/linear`.


https://slurm.schedmd.com/cons_res.html#using_cons_tres

> Slurm's default select/linear plugin is using a best fit algorithm based on
> number of consecutive nodes. The same node allocation approach is used with
> select/cons_res and select/cons_tres for consistency.

> Consumable Trackable Resources (cons_tres) plugin provides all the same
> functionality provided by the Consumable Resources (cons_res) plugin. It also
> includes additional functionality specifically related to GPUs.

> The --exclusive srun option allows users to request nodes in exclusive mode
> even when consumable resources is enabled. See the srun man page for details.
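
For a site that wants to switch plugins, the change is an edit to slurm.conf followed by a daemon restart. A sketch (the SelectTypeParameters value here is an example; pick whatever matches the site's allocation policy):

```
# slurm.conf (fragment; example values)
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
# Changing SelectType requires restarting slurmctld (and slurmd on the
# nodes); 'scontrol reconfigure' alone is not sufficient for this change.
```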
Comment 12 Skyler Malinowski 2022-08-17 07:39:04 MDT
Commit c728da23f8 merged in for 21.08.9, 22.05.4, and 23.02.0pre1.

Please note that 21.08.9 does not have a planned release date and may not be released at all. Fixes are always propagated upward, so please consider 22.05.4 should 21.08.9 not be released.

Cheers,
Skyler