Ticket 15715 - Fix constraint expression validation
Summary: Fix constraint expression validation
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld (show other tickets)
Version: 23.02.x
Hardware: Linux Linux
: 4 - Minor Issue
Assignee: Marshall Garey
QA Contact: Brian Christiansen
URL:
: 9560 (view as ticket list)
Depends on:
Blocks:
 
Reported: 2022-12-29 15:54 MST by Marshall Garey
Modified: 2023-01-30 21:28 MST (History)
0 users

See Also:
Site: SchedMD
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed: 23.02.0rc1
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments

Note You need to log in before you can comment on or make changes to this ticket.
Description Marshall Garey 2022-12-29 15:54:56 MST
_valid_feature_list can be collapsed into _feature_string2list. Also, _valid_feature_list doesn't parse the bracket correctly, and can incorrectly reject a valid constraint expression:

```
$ srun -C '[(a|b)&c]' whereami
srun: error: Unable to allocate resources: Invalid feature specification

slurmctld log:
[2022-12-29T14:47:47.259] Job specs has invalid feature list: (a|[b)&c]
```

Notice how it parsed the paren before the bracket: "**(a|[b)&c]**".

This happens because paren is higher priority, so the operator for "a" is OR, not XOR. `_valid_feature_list()` first looks at "a", sees that it is in a paren and is an OR operator, then puts "(a|" in the string `buf`. Then `_valid_feature_list()` looks at "b" and sees that its operator is an AND inside of a bracket (XAND), so it appends "[b&" to the string `buf`.

I think it's best to collapse the verification logic into the initial parsing and get rid of `_valid_feature_list()`.
Comment 1 Marshall Garey 2023-01-10 11:39:17 MST
In addition, there is a bug where square brackets and parentheses can be misplaced and it is accepted. I need to fix this.

https://bugs.schedmd.com/show_bug.cgi?id=12286#c22

(In reply to Marshall Garey from comment #22)
> (In reply to Felix Abecassis from comment #20)
> > Minor issue, but when I tested using a non-existing constraint rack3 in
> > expression "--constraint=rack1|rack3&nps=4" then the error message from
> > slurmctld shows an weird expression:
> > slurmctld: Job specs has invalid feature list: (rack1&[nps=4)|(rack3&nps=4)]
> > 
> > It looks like Slurm accepts having a closing ')' before a closing ']' in an
> > like "(rack1&[nps=4)", but that's an highly unusual form for an expression,
> > so please consider fixing it as the current form is hard to understand when
> > debugging (I did not figure out immediately that rack3 did not exist, I
> > thought the new code was generating invalid constraints expressions).
> 
> There are two bugs here. The first bug is actually just a cosmetic bug in
> the logs. Slurm first parses the user's string to build a job feature list,
> then attempts to reconstruct the string from the job feature list while
> doing some validation. The bug is in reconstructing the string, where you
> see the misplaced square bracket. I found this while working on this ticket,
> and I don't see any reason why the validation can't be done while
> constructing the job features list. I also see no reason why the string has
> to be reconstructed when we already have the original string supplied by the
> user. I already have an internal bug open to fix this.
> 
> The second bug is, as you point out, Slurm accepting having a closing ')'
> before a closing ']'. Slurm also should not accept having an opening '['
> after an opening '('.
> 
> $ srun -C '(a&[b)|(c&b)]' hostname
> curiosity
> 
> I'll fix this bug as well.
Comment 2 Marshall Garey 2023-01-10 11:40:07 MST
I'm making this bug public since it was also discovered by Nvidia.
Comment 6 Marshall Garey 2023-01-16 14:28:40 MST
(In reply to Marshall Garey from comment #0)
> _valid_feature_list can be collapsed into _feature_string2list. Also,
> _valid_feature_list doesn't parse the bracket correctly, and can incorrectly
> reject a valid constraint expression:
> 
> ```
> $ srun -C '[(a|b)&c]' whereami
> srun: error: Unable to allocate resources: Invalid feature specification
> 
> slurmctld log:
> [2022-12-29T14:47:47.259] Job specs has invalid feature list: (a|[b)&c]
> ```

Correction: This is actually invalid because these expressions do not have a count. A valid expression would be:

'[(a|b)*1&c*1]'

And this is actually accepted.

$ srun -C '[(a|b)*1&c*1]' whereami
0000 n1-1 - Cpus_allowed:       00000101        Cpus_allowed_list:      0,8
0001 n1-3 - Cpus_allowed:       00000101        Cpus_allowed_list:      0,8


I'll still make sure that syntax is correct.
Comment 9 Marshall Garey 2023-01-16 15:41:30 MST
*** Ticket 9560 has been marked as a duplicate of this ticket. ***
Comment 11 Brian Christiansen 2023-01-26 14:14:32 MST
Thanks Marshall!

We:
1. updated * checking to work with just one feature but validate that it needs to be enclosed in brackets if using more than one feature.

2. punted on the last commit for feature_err. "feature_err" is passed into all the fuctions. The only place in build_feature_list() that rc could be different than feature_err is the rc from _valid_batch_features(). I think build_feature_list() could just handle the case of feature_err and not have to pass it into the functions.


*   dcfa2f2c05 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'bug15715'
|\  
| * 916377d953 NEWS for improving validation of constraint syntax
| * 4a603ea42f Add more detailed logging for constraint expression validation
| * 7f915cc63d Validate '*' in feature requests
| * baab82c575 Verify brackets and parentheses in feature requests
| * c5987d2d74 Improve verbose logging when validating features
| * 01d159e374 Return feature_err
| * 24621870fe Don't rebuild the feature string when validating
|/
Comment 14 Brian Christiansen 2023-01-30 21:27:58 MST
Looks good. Thanks! I'll go ahead and close the bug.

1b512df5f6 (HEAD -> master, origin/master, origin/HEAD) Cleanup handling errors in build_feature_list