Ticket 14654

Summary: --threads-per-core=1 gives cannot request more threads per core than job allocation error
Product: Slurm
Reporter: David Gloe <david.gloe>
Component: Configuration
Assignee: Skyler Malinowski <skyler>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
CC: alex
Version: 21.08.6
Hardware: Linux
OS: Linux
Site: CRAY
Cray Sites: Cray Internal
Version Fixed: 21.08.9; 22.05.4; 23.02.0pre1
Attachments: slurm.conf file

Description David Gloe 2022-08-01 10:09:44 MDT
Created attachment 26092 [details]
slurm.conf file

On an internal system, we're seeing an issue where srun by itself works fine, but specifying --threads-per-core=1 or --hint=nomultithread is failing with "Cannot request more threads per core than the job allocation". This happens both in an salloc and when running srun by itself.

dgloe@hotlum-login:~> srun hostname
x1000c0s7b1n1
dgloe@hotlum-login:~> srun --threads-per-core=1 hostname
srun: error: Unable to create step for job 6276: Cannot request more threads per core than the job allocation
dgloe@hotlum-login:~> srun --hint=nomultithread hostname
srun: error: Unable to create step for job 6277: Cannot request more threads per core than the job allocation
dgloe@hotlum-login:~> salloc --threads-per-core=1
salloc: Granted job allocation 6278
salloc: Waiting for resource configuration
salloc: Nodes x1000c0s7b1n1 are ready for job
dgloe@hotlum-login:~> srun --threads-per-core=1 hostname
srun: error: Unable to create step for job 6278: Cannot request more threads per core than the job allocation
Comment 1 Skyler Malinowski 2022-08-02 13:34:37 MDT
I can reproduce this behavior. It looks to be an issue with select/linear specifically. I will keep you posted as I know more.
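
To confirm which select plugin a cluster is actually running, the authoritative check is to query the controller with `scontrol`. A minimal sketch follows; since `scontrol` needs a live cluster, it greps a sample config fragment instead (the file path and its contents are assumptions for illustration, not taken from the attached slurm.conf):

```
# On a live cluster, the authoritative check is:
#   scontrol show config | grep -i SelectType
# For illustration, grep a sample slurm.conf fragment instead
# (path and contents below are assumed, not from this ticket).
cat > /tmp/slurm_conf_sample <<'EOF'
SelectType=select/linear
EOF
grep -i '^SelectType' /tmp/slurm_conf_sample
```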

Thanks,
Skyler
Comment 3 Skyler Malinowski 2022-08-10 16:45:42 MDT
This is a regression introduced in 21.08.6. I have created a patch that is out for review.
Comment 7 Skyler Malinowski 2022-08-16 14:06:28 MDT
Out of curiosity, why are you using `select/linear` instead of `select/cons_tres`? `select/cons_tres` can be used for whole node allocations too.


Also, I notice in your slurm.conf that you could simplify the node section with a NodeName=DEFAULT entry.

```
# slurm.conf

NodeName=DEFAULT RealMemory=512000 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2 State=idle
NodeName=x1000c0s0b0n0
NodeName=x1000c0s0b0n1
NodeName=x1000c0s0b1n0
... (omitted) ...
NodeName=x1000c7s7b1n1
```
Comment 8 David Gloe 2022-08-16 14:17:16 MDT
This is how the system was set up by the admins; I'm not sure why they used select/linear. I've recommended that they use select/cons_res, which is what we typically use.

Is there an advantage to using select/cons_tres instead of select/cons_res?
Comment 9 Skyler Malinowski 2022-08-16 14:46:40 MDT
`select/cons_tres` is a superset of `select/cons_res` and has more features than `select/linear`.


https://slurm.schedmd.com/cons_res.html#using_cons_tres

> Slurm's default select/linear plugin is using a best fit algorithm based on
> number of consecutive nodes. The same node allocation approach is used with
> select/cons_res and select/cons_tres for consistency.

> Consumable Trackable Resources (cons_tres) plugin provides all the same
> functionality provided by the Consumable Resources (cons_res) plugin. It also
> includes additional functionality specifically related to GPUs.

> The --exclusive srun option allows users to request nodes in exclusive mode
> even when consumable resources is enabled. See the srun man page for details.
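
For a site that wants to switch plugins, the change is an edit to slurm.conf followed by a daemon restart. A sketch (the SelectTypeParameters value here is an example; pick whatever matches the site's allocation policy):

```
# slurm.conf (fragment; example values)
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
# Changing SelectType requires restarting slurmctld (and slurmd on the
# nodes); 'scontrol reconfigure' alone is not sufficient for this change.
```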
Comment 12 Skyler Malinowski 2022-08-17 07:39:04 MDT
Commit c728da23f8 merged in for 21.08.9, 22.05.4, and 23.02.0pre1.

Please note that 21.08.9 does not have a planned release date and may not be released at all. Fixes are always propagated upward, so please consider 22.05.4 should 21.08.9 not be released.

Cheers,
Skyler