Description
Felix Abecassis
2020-08-11 15:16:15 MDT
BTW, we are at ~550 lines of code for this generic plugin. It could be less without having to reimplement some of the boilerplate functions. KNL plugins are at 1700 / 2600 lines of code. Of course they don't behave exactly the same (for instance, it seems the KNL plugins modify GRES, but I guess a helper plugin could also do it through the Slurm CLI). We prefer to stick with Bugzilla so we can come back to this readily at a later point. I'll stick to one point in my first comment here, I'm still mulling over some other details, and I'll need to discuss some pieces further internally. > Obviously, we don't want to write ad-hoc plugins for each case. The > requirements behind each are pretty much identical: > 1. Each feature must have only one active state. > 2. We need a way to get the current feature state on a node. > 3. We need a way to change the feature before the reboot is triggered. > 4. We need Slurm to keep track of the node's state regarding each feature. > 5. We need Slurm to favor scheduling to nodes with the right state already. > > To solve 1., a simple way to handle mutually exclusive options (like NPS4 > and NPS1), is to have the features expressed in a key-value form: > $ sinfo -a -o "%16N | %16b | %f" > NODELIST | ACTIVE_FEATURES | AVAIL_FEATURES > node-[1-2] | nps=4 | nps=1,nps=2,nps=4 > This way, you can't have "nps=4" and "nps=1" at the same time. This would > have been more difficult to express without a key-value format (e.g. "nps4" > and "nps1"). I'm not sold on the key=value syntax just yet. Especially as you note here: > b) We can't handle mutually exclusive keys right now. It's definitely > possible with the current approach since a helper can handle multiple keys, > it could decide to reject invalid combinations if we were to pass all > key-values at once: > $ ./helper foo=1 bar=notfoo Which means we'd still likely need some extra configuration syntax to cover that situation anyways. So I think I'd rather consider these features as "nps1", "nps2", "nps4" instead. I don't have an immediate fix for this, but I'm thinking some configuration declaration that a given collection of features are mutually exclusive is called for. Dunno what exactly that syntax would look like at the moment though. Thanks! As a single data point, we don't have a use case for mutually exclusive keys today. The key-value approach is to get a tiny bit of structure and make it easier for users to parse and understand which values available for a feature. If we had "nps1", "nps2" and "nps4", it would not be immediately obvious (nor enforced) that they are related, and whether you can have multiple of those active at the same time. We could also envision adding types to values, to limit the impact of misbehaving helpers. For instance NPS could be "int", NVIDIA MIG could be "bool", etc. It's not possible with strings everywhere. It's also possible to merge keys and express combination in the values, for instance: bar=notfoo+2 or bar=notfoo;2 (depending on what Slurm would accept). Going back to this after many months, I would like to clarify something regarding point a) mentioned in the initial message: I'm not suggesting to remove the constraint DSL in Slurm, that would be a radical change that would break many valid use cases today (--constraint is often used with static node features declared in slurm.conf). The goal of this RFE is to start a discussion regarding the current node_features API (which is just for KNL right now). It is not mergeable in the current state since we can't deprecate the constraint DSL for an optional plugin. My suggestion is to modify the node_features API to only pass features one at a time to the plugin, having plugins reimplement the DSL parsing is error prone. And not everything is implemented on the Slurm side anyway, so you would have to match the limitations of the Slurm parser / handler. Or the API should pass the full expression to the plugin in a non-ambiguous form like an abstract syntax tree. And in some situations the plugin has no way of knowing what to do, for instance with --constraint="[nps=4|nps=1]", node_features_p_changeable_feature on the slurmctld side is called with inputs "nps=4" and then "nps=1" (which is simple to deal with); but node_features_p_node_set on the slurmd side is called with "[nps=4|nps=1]", so the plugin doesn't know what to do here. Passing an AST wouldn't solve this problem. When you have time to look at this RFE again, please let me know if/how you wish to proceed. I file more bug reports with the node_features API, we can decide whether the key-value format is a good thing for this plugin, we can discuss whether we want to express that some features are mutually exclusive, etc. Created attachment 17109 [details]
node_features/helpers v2
Submitted v2 of the patch that uses a stopgap measure to restore the DSL when possible. In node_features_p_job_valid, it used to consider the constraint to be invalid whenever it contained one of those operators: []()|* Now, it also checks whether the expression contains one of the key/value handled by the plugin, but since it doesn't parse the expression it just uses strstr(3), e.g. looking for "nps=" in the string. If it does, then the constraint is considered invalid. As a result, node_features_p_job_xlate was also modified. If one of the operators mentioned above was used, we return an empty string. This prevents Slurm from calling node_features_p_node_set with this constraint and attempting to reboot the node. Created attachment 19223 [details]
node_features/helpers v3
Attaching an updated patch for the node features helpers plugin.
This includes everything discussed and agreed upon in the design document, including: removing the reliance on <ATTR>=<VALUE> syntax, adding the optional MutuallyExclusive setting, formatting, etc. Looking forward to feedback!
Created attachment 19224 [details]
Design Document
Attaching design document for reference.
Hi Doug - I've been working on reviewing the patch this afternoon, and this looks pretty good so far. Functionally things appear to be in pretty good shape, and the bulk of my first pass is focused on style quirks instead. I've pushed a new branch to our github to aid in review - bug9567 - which includes a number of trivial style changes, as well as a handful of FIXME comments that I'd like you to take a look at. If you don't have any problems with what's there those style commits will all be squashed into your initial commit on the branch; if you notice any issue or have questions on them please ask. As for the FIXMEs, the main notes I have and would appreciate some following patches for are: - Please avoid using ListIterator. It's quite slow compared to list_for_each / list_find_first, and we've been trying to avoid it in newer code. - Please look at using run_command() from our common library rather than your local reimplementation. Also, I did rename the config file to "helpers.conf" to match the plugin name. If you have some other preference please let me know. - Tim Thanks Tim, I'll start work on addressing those FIXMEs tomorrow. Created attachment 19604 [details]
Patch
Tim,
The attached patch addresses review feedback and FIXME items:
- Use standard run_command()
- Remove use of ListIterator
- Strict error handling of conf parsing failures (previously ignored duplicate entries)
Let me know if you have any issues or concerns with this latest update.
Hey Doug - This is on master now, and will be included in 21.08 when released next month. I did squash all the fixup work and some additional style tweaks into this single commit, but functionally this is unchanged from the original submission. I'm going to move ahead and mark this resolved. I'll split off test development (provided by SchedMD per the contract) into a new issue to track that, but I don't expect any issues there. - Tim commit f8b2c358b2baffceddfb59c97c0f56bd647427a6 Author: Douglas Wightman <dwightman@nvidia.com> AuthorDate: Thu Apr 29 09:05:38 2021 -0600 Add node_features/helpers plugin. A new node-features plugin for managing dynamic features through the use of helper binaries. These binaries are configured via the helpers.conf file, along with other plugin-specific options such as MutuallyExclusive. Bug 9567. Co-authored-by: Felix Abecassis <fabecassis@nvidia.com> Created attachment 20965 [details]
ExecTime Patch
Hi Tim, I'm attaching a small patch with 2 changes. (1) Allow admins to configure ExecTime, a time limit for the helpers, and (2) a simple README that explains how to configure and use the plugin. If there are no concerns with these changes please merge them at your earliest convenience, perhaps 21.08.1? Thanks!
Thanks Doug. I've added the configuration option with the following commit, it's in ahead of 21.08.0. READMEs aren't how we like to document things at this point in time. I'll ask Ben to take a look into getting some proper documentation put together based off that patch though. - Tim commit 0ee9df12503baeedde255256de14785bd4d1cd63 Author: Douglas Wightman <dwightman@nvidia.com> AuthorDate: Fri Aug 20 23:45:28 2021 -0600 node_features/helpers - add ExecTime option. Bug 9567. |