Ticket 7339 - slurm.conf: Ability to specify 'AvailableFeatures' separate from 'ActiveFeatures' on individual nodes
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Configuration
Version: 18.08.7
Hardware: Linux Linux
Severity: 5 - Enhancement
Assignee: Unassigned Developer
 
Reported: 2019-07-02 16:20 MDT by S Senator
Modified: 2019-08-01 16:23 MDT
Site: LANL


Description S Senator 2019-07-02 16:20:09 MDT
We utilize node features to track capabilities of nodes. Usually these are set at job allocation and unset during a job's epilog. The node features in slurm.conf are interpreted as AvailableFeatures and manually cleared and then probed during job allocation.

Jobs submitted from within an allocation implicitly include the constraints of the node from which they are submitted.

We could make the job prolog much more efficient if nodes only have AvailableFeatures specified in slurm.conf and would only modify ActiveFeatures.

Please consider allowing AvailableFeatures specifications in slurm.conf which would not imply that the features are active until external mechanisms are invoked. This would be in contrast to the present behavior where Feature= tags are propagated to both AvailableFeatures and ActiveFeatures.
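
For illustration only, here is a minimal Python sketch of the behavior described above. It models a slurm.conf node line whose Feature= list seeds both AvailableFeatures and ActiveFeatures, then builds (without running) an scontrol command that changes only ActiveFeatures. The node name, the feature names, and both helper functions are hypothetical; this is not the site's actual prolog and not how slurmctld parses its configuration.

```python
import shlex
from typing import List


def parse_node_features(node_line: str) -> dict:
    """Model how a node line's Feature= list seeds both feature sets."""
    # Split the line into key=value tokens; keys are compared case-insensitively.
    fields = {}
    for token in shlex.split(node_line):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key.lower()] = value
    features = [f for f in fields.get("feature", "").split(",") if f]
    # Present behavior per this ticket: Feature= populates BOTH lists.
    return {
        "NodeName": fields.get("nodename", ""),
        "AvailableFeatures": list(features),
        "ActiveFeatures": list(features),
    }


def scontrol_set_active(node: str, active: List[str]) -> List[str]:
    """Build, but do not execute, a command that updates only ActiveFeatures."""
    return ["scontrol", "update", f"NodeName={node}",
            f"ActiveFeatures={','.join(active)}"]


# Hypothetical node definition and feature names.
state = parse_node_features("NodeName=n01 CPUs=32 Feature=bigmem,ib State=UNKNOWN")
command = scontrol_set_active(state["NodeName"], ["ib"])
```

Under the requested enhancement, the second list would start out empty (or as a configured subset) instead of mirroring the first, so a prolog would only need the equivalent of scontrol_set_active and no clearing step.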
Comment 3 Tim Wickberg 2019-08-01 16:12:45 MDT
(In reply to S Senator from comment #0)
> We utilize node features to track capabilities of nodes. Usually these are
> set at job allocation and unset during a job's epilog. The node features in
> slurm.conf are interpreted as AvailableFeatures and manually cleared and
> then probed during job allocation.
> 
> Jobs submitted from within an allocation implicitly include the constraints
> of the node from which they are submitted.
> 
> We could make the job prolog much more efficient if nodes only have
> AvailableFeatures specified in slurm.conf and would only modify
> ActiveFeatures.
> 
> Please consider allowing AvailableFeatures specifications in slurm.conf
> which would not imply that the features are active until external mechanisms
> are invoked. This would be in contrast to the present behavior where
> Feature= tags are propagated to both AvailableFeatures and ActiveFeatures.

The way to handle this today would be through constructing an additional node_features plugin to control which are active or inactive.

Note that these are presumed to change only after the node reboots, and not to be modified while the node is up and running jobs.

I don't currently plan to extend this capability, as I have not heard any use cases for this from other customers, and the platform (KNL) this whole set of capabilities was built for is a dead end.

If you have a workaround that has been sufficient, I would suggest sticking with that approach for now, as there are other development projects I would prefer we focus on with broader appeal.

- Tim
Comment 4 S Senator 2019-08-01 16:23:04 MDT
Thank you for the update. Please note that we're not using this for KNLs, and we have found features useful for a number of different use cases to simplify user selection of nodes with Constraints, rather than with GRes.

We do have a working mechanism presently, although less elegant than such a plugin.