Ticket 3063

Summary: squeue --licenses constraints match too many jobs
Product: Slurm Reporter: Doug Jacobsen <dmjacobsen>
Component: User Commands Assignee: Alejandro Sanchez <alex>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: alex
Version: 16.05.4   
Hardware: Cray XC   
OS: Linux   
Site: NERSC Slinky Site: ---
Alineos Sites: --- Atos/Eviden Sites: ---
Confidential Site: --- Coreweave sites: ---
Cray Sites: --- DS9 clusters: ---
Google sites: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- NoveTech Sites: ---
Nvidia HWinf-CS Sites: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Tzag Elita Sites: ---
Linux Distro: --- Machine Name:
CLE Version: Version Fixed: 16.05.5 17.02
Target Release: --- DevPrio: ---
Emory-Cloud Sites: ---

Description Doug Jacobsen 2016-09-07 11:15:08 MDT
Hello,

We use licenses to manage access to our filesystems.  We're about to bring back our scratch filesystems and as such I need to manipulate jobs with licenses for scratch1 and scratch2, but NOT cscratch1.

It appears that when I search for "scratch1", "cscratch1" is also matching.  I'll work around this for now by reformatting the squeue output and grepping out cscratch1 and making a second pass.

Thanks,
Doug


nid01664:/global/syscom/es/nsg/src/cmd/scrdir # squeue -L scratch1 --format="%A,%W"
JOBID,LICENSES
1949629,cscratch1:1
1969081,cscratch1:1
1969146,cscratch1:1
1949630,cscratch1:1
1969127,cscratch1:1
1970478,cscratch1:1
...
...
1970480,cscratch1:1
1970524,cscratch1:1
938362,scratch1:1
938363,scratch1:1
938388,scratch1:1
981621,scratch1:1
981622,scratch1:1
1008673,scratch1:1
1008674,scratch1:1
1008675,scratch1:1
1030912,scratch1:1
1030914,scratch1:1
1033020,scratch1:1
1033021,scratch1:1
1076369,scratch1:1
1076371,scratch1:1
1076373,scratch1:1
1076375,scratch1:1
1123304,scratch1:1
1123305,scratch1:1
1123460,scratch1:1
1123461,scratch1:1
1123462,scratch1:1
1123463,scratch1:1
1123678,scratch1:1
1379660,scratch1:1
...
...
1565780,scratch1:1
1565782,scratch1:1
1565783,scratch1:1
1565785,scratch1:1
1648039,cscratch1:1
1648040,cscratch1:1
1648042,cscratch1:1
1648043,cscratch1:1
1648046,cscratch1:1
1648047,cscratch1:1
1648048,cscratch1:1
1648049,cscratch1:1
1648052,cscratch1:1
1648063,cscratch1:1
1701394,cscratch1:1
1701395,cscratch1:1
1701396,cscratch1:1
1701397,cscratch1:1
1701398,cscratch1:1
1709394,scratch1:1
1709397,scratch1:1
1709565,scratch1:1
1709567,scratch1:1
1709568,scratch1:1
...


and so on.
Comment 1 Alejandro Sanchez 2016-09-08 03:52:43 MDT
Hi Doug, we're looking into this and will come back to you.
Comment 2 Alejandro Sanchez 2016-09-08 04:15:03 MDT
I can reproduce this problem and am working on a fix now. Most likely it's an incorrect substring comparison.
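
To illustrate the kind of bug suspected here, the following is a minimal sketch, not the Slurm source. The function name license_matches_substring is hypothetical, and it assumes the filter tests whether the requested name occurs anywhere inside the job's license string:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical filter (an assumption, NOT the actual Slurm code):
 * reports a match whenever `wanted` occurs anywhere inside the
 * job's license string. */
bool license_matches_substring(const char *job_licenses, const char *wanted)
{
    return strstr(job_licenses, wanted) != NULL;
}
```

With a check of this shape, a request for "scratch1" also matches a job holding "cscratch1:1", because "scratch1" is a substring of "cscratch1". That is the behavior shown in the squeue output in the description.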
Comment 4 Alejandro Sanchez 2016-09-09 08:56:14 MDT
Hi Doug. The following patch should solve the filtering issues on licenses:

https://github.com/SchedMD/slurm/commit/1ec2a4ae07357d

It will be included in 16.05.5. If this is urgent for you, please apply it before the release date and let me know whether we can close the ticket or whether you find any issues. Otherwise, I'll mark the ticket as resolved/fixed.
Comment 13 Alejandro Sanchez 2016-09-09 09:15:56 MDT
Doug, please do not apply the patch yet. We think it would fail if a job requested more than one license, for instance scratch1:2. xstrcmp tries to match the exact token (which prevents cscratch1 from matching as a substring), but it would then fail to match scratch1:2. We are going to amend this patch and will come back to you.
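
The concern can be sketched as follows. This is an illustration only: license_token_equals is a hypothetical name, plain strcmp stands in for xstrcmp, and the token strings are taken from the examples in this ticket.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical exact-token check (an assumption, not the patch itself);
 * strcmp stands in for xstrcmp. */
bool license_token_equals(const char *token, const char *wanted)
{
    return strcmp(token, wanted) == 0;
}
```

An exact comparison correctly rejects "cscratch1" when "scratch1" is requested, but it also rejects "scratch1:2", since the count suffix is part of the compared string.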
Comment 14 Doug Jacobsen 2016-09-09 09:31:38 MDT
Yeah, sorry, I was reviewing and was going to suggest a variant of strncmp
with the colon delimiter.

Thanks for continuing to look at this,
Doug
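
A minimal sketch of the colon-delimited comparison suggested above, under stated assumptions: job_has_license is a hypothetical helper and not the code from the eventual commit, and the license string is assumed to be a comma-separated list of "name" or "name:count" tokens, as the squeue output in this ticket suggests.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, NOT the Slurm implementation. Returns true if
 * `wanted` names one of the licenses in `job_licenses`, assumed to be a
 * comma-separated list of "name" or "name:count" tokens. */
bool job_has_license(const char *job_licenses, const char *wanted)
{
    size_t wanted_len = strlen(wanted);
    const char *tok = job_licenses;

    while (*tok != '\0') {
        /* The token ends at the next comma or at the end of the string. */
        size_t tok_len = strcspn(tok, ",");
        /* The license name ends at the first colon inside the token, if any. */
        const char *colon = memchr(tok, ':', tok_len);
        size_t name_len = colon ? (size_t)(colon - tok) : tok_len;

        /* Require equal length and equal bytes, so "scratch1" matches
         * neither "cscratch1" nor "scratch10". */
        if (name_len == wanted_len && strncmp(tok, wanted, name_len) == 0)
            return true;

        /* Advance past this token and its trailing comma, if present. */
        tok += tok_len;
        if (*tok == ',')
            tok++;
    }
    return false;
}
```

Comparing only the name portion, with a length check, lets "scratch1" match "scratch1:1" and "scratch1:2" while still rejecting "cscratch1:1".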

Comment 20 Alejandro Sanchez 2016-09-22 02:33:05 MDT
Doug, the following commit should fix this bug (included in Slurm 16.05.5):

https://github.com/SchedMD/slurm/commit/cbd1ffadd5492

Please let me know if we can close the ticket.
Comment 21 Alejandro Sanchez 2016-10-11 08:43:56 MDT
Doug - can we close this bug? Thanks.
Comment 22 Alejandro Sanchez 2016-10-20 03:00:10 MDT
Closing as resolved/fixed. Please reopen if you encounter further issues.
Comment 23 Doug Jacobsen 2016-10-20 03:02:24 MDT
Great, thank you!

