Ticket 12613

Summary: 21.08.2: cgroup TaskAffinity=yes has gone: we had =no before
Product: Slurm Reporter: Kevin Buckley <kevin.buckley>
Component: Configuration Assignee: Marcin Stolarek <cinek>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: cinek
Version: 21.08.1   
Hardware: Cray XC   
OS: Linux   
Site: Pawsey Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: SUSE
Machine Name: magnus, galaxy, chaos CLE Version: 6 UP07
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Kevin Buckley 2021-10-06 00:06:28 MDT
Release notes for 21.08.2 say:

"There is one significant change include in this maintenance release: 
 the removal of support for the long-misunderstood TaskAffinity=yes 
 option in cgroup.conf. Please consider using "TaskPlugins=cgroup,affinity"
 in slurm.conf as an option."

and I'd like to ask a question that will surely lend more credence,
as if any were needed, to the "long-misunderstood" claim above!

So, all three of our Cray XCs (2x Prod = 20.11.8; TDS = 21.08.1) have

cgroup.conf
===========
TaskAffinity=no

slurm.conf
==========
TaskPlugin=task/cray_aries,task/affinity,task/cgroup


I can also see that the cgroup.conf man page no longer mentions
the TaskAffinity parameter, so would it be more correct to say
that the change was

"the removal of support for the long-misunderstood TaskAffinity option
 in cgroup.conf."

as in, it's gone completely, as opposed to TaskAffinity=no still being valid?

Kevin
-- 
Supercomputing Systems Administrator
Pawsey Supercomputing Centre
Comment 1 Marcin Stolarek 2021-10-06 02:47:47 MDT
Kevin,

Having TaskAffinity=no just restated the option's default value, so it had no effect compared to omitting it from cgroup.conf altogether.

To avoid a breaking change in a minor release, the option is still parsed in Slurm 21.08, and if it's set to 'no' it's accepted. If it's set to 'yes', we exit with a fatal error:
>fatal: Support for TaskAffinity=yes in cgroup.conf has been removed. Consider adding task/affinity to TaskPlugins in slurm.conf instead

The parsing is completely removed in Slurm 22.05, and as things stand today, having TaskAffinity=no there will result in a fatal error like:
>fatal: Could not open/read/parse cgroup.conf file

Let me know if that makes it clearer.

cheers,
Marcin
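
The version-dependent behaviour described in this comment can be summarised as a small check. This is only a sketch of the rules as stated in the ticket, not Slurm code: the function names, the version tuples and the returned strings are hypothetical, and the parsing is deliberately simplified (it assumes the key is matched case-insensitively and that `#` starts a comment).

```python
import re

# Hypothetical helper: models the behaviour described in this ticket only.
# Assumption: key is case-insensitive and '#' starts a comment.
_TASK_AFFINITY = re.compile(r"^\s*TaskAffinity\s*=\s*([^\s#]+)", re.IGNORECASE)


def find_task_affinity(conf_text):
    """Return the TaskAffinity value found in cgroup.conf text, or None."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = _TASK_AFFINITY.match(line)
        if match:
            return match.group(1).lower()
    return None


def expected_outcome(version, conf_text):
    """Classify what the ticket says happens for a (major, minor, micro) version."""
    value = find_task_affinity(conf_text)
    if value is None:
        return "ok"  # option absent: nothing to reject in any release
    if version >= (22, 5, 0):
        # Parsing removed entirely, so even TaskAffinity=no is rejected.
        return "fatal: cgroup.conf cannot be parsed"
    if version >= (21, 8, 2):
        # Still parsed; only 'yes' is rejected.
        if value == "yes":
            return "fatal: TaskAffinity=yes removed"
        return "ok"
    return "ok"  # before the change both values were accepted


if __name__ == "__main__":
    print(expected_outcome((21, 8, 2), "TaskAffinity=no\n"))
    print(expected_outcome((22, 5, 0), "TaskAffinity=no\n"))
```

In other words, a cgroup.conf with no TaskAffinity line at all is the only form that passes under every release discussed here.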
Comment 2 Kevin Buckley 2021-10-06 03:04:14 MDT
> Having TaskAffinity=no was just a repetition of a default value for the option,
> so it didn't have any effect compared to skipping it in cgroup.conf at all.
> 
> ...
> 
> Let me know if it's more clear for you now.

I'll ask a more direct question then, Marcin:

if we remove it from the configuration now, it'll continue to
work as though we never had it, plus we'll have nothing to do
when 22.05 comes along ?


FWIW, I can't recall whether it was something we put in,
or whether it was something that Cray's configuration generator
supplied, and so we just went with it.
Comment 3 Marcin Stolarek 2021-10-06 03:26:50 MDT
>if we remove it from the configuration now, it'll continue to
>work as though we never had it, plus we'll have nothing to do
>when 22.05 comes along ?

Yep - removing the TaskAffinity=no line from cgroup.conf is something I'd recommend.

cheers,
Marcin
Comment 4 Kevin Buckley 2021-10-06 03:29:15 MDT
On 2021/10/06 17:26, bugs@schedmd.com wrote:
> 
> Yep - removing the TaskAffinity=no line from cgroup.conf is something I'd
> recommend.

Already gone on the TDS, whose rebuild I am waiting on instead of going
home, so feel free to close this one as INFOGIVEN.

Cheers,
Kevin
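
The clean-up recommended in Comment 3 (dropping the TaskAffinity=no line from cgroup.conf) could be scripted along these lines. This is a minimal sketch under stated assumptions: `strip_task_affinity` is a hypothetical helper, the key is assumed to be case-insensitive, and the sample file contents are illustrative only. It refuses to touch any value other than "no", because silently deleting TaskAffinity=yes would change behaviour rather than preserve it.

```python
import re

# Assumption: key is case-insensitive; commented-out lines are left alone
# because they do not start with the key itself.
_TASK_AFFINITY = re.compile(r"^\s*TaskAffinity\s*=\s*([^\s#]+)", re.IGNORECASE)


def strip_task_affinity(conf_text):
    """Return cgroup.conf text with any TaskAffinity=no line removed.

    Raises ValueError for any other value, which needs a manual decision
    (for example adding task/affinity to TaskPlugin in slurm.conf).
    """
    kept = []
    for line in conf_text.splitlines(keepends=True):
        match = _TASK_AFFINITY.match(line)
        if match is None:
            kept.append(line)  # unrelated line: keep verbatim
        elif match.group(1).lower() != "no":
            raise ValueError("unexpected TaskAffinity value: " + match.group(1))
        # TaskAffinity=no: drop the line, since it only restates the default
    return "".join(kept)


if __name__ == "__main__":
    sample = "ConstrainCores=yes\nTaskAffinity=no\n"  # illustrative contents
    print(strip_task_affinity(sample), end="")
```

The function works on text rather than on a file path, so the result can be reviewed or diffed before anything on disk is replaced.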