Ticket 433

Summary: Incorrect depth for AcceleratorAllocation in parser_basil_5.1.c
Product: Slurm
Reporter: David Gloe <david.gloe>
Component: Cray ALPS
Assignee: Moe Jette <jette>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 2 - High Impact
Priority: ---
CC: da
Version: 2.6.x
Hardware: Linux
OS: Linux
Site: CRAY
Slinky Site: ---
Attachments: Patch which should fix the issue

Description David Gloe 2013-09-30 06:17:49 MDT
Ian Ryan at Cray reported this issue:
"I was trying to build/install a hybrid alps/slurm, with the new slurm 2.6.2.
I can get one job to run but the server crashes after the job returns. This is the message in the server log:

[2013-09-30T11:39:49.003] fatal: Tag 'AcceleratorAllocation' appeared at depth 7 instead of 5

If I search for 'AcceleratorAllocation' in the source I find it in an alps plugin file. Any idea why I’m getting this error?"

It seems that in parser_basil_5.1.c, AcceleratorAllocation is listed at depth 5 rather than depth 7. Depth 7 is correct, since by the BASIL 1.2 specification it should appear at
ResponseData->Inventory->NodeArray->Node->AcceleratorArray->Accelerator->AcceleratorAllocation.

The depth is correct in parser_basil_4.0.c. I've attached a patch which should fix this issue.
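
As a quick sanity check on the depth quoted above, here is a minimal sketch that counts the elements in the element path, assuming ResponseData sits at depth 1. The function name path_depth is hypothetical and is not part of Slurm or ALPS.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not from Slurm): return the depth of the last
 * element in an "A->B->C" style path, assuming the first element is at
 * depth 1. An empty path has depth 0.
 */
static int path_depth(const char *path)
{
    assert(path != NULL);
    if (*path == '\0')
        return 0;

    int depth = 1;
    const char *p = path;

    /* Every "->" separator descends one level. */
    while ((p = strstr(p, "->")) != NULL) {
        depth++;
        p += 2;
    }
    return depth;
}
```

Applied to the path from the description, which has seven elements and six separators, this gives 7, matching the depth at which the tag appeared in the fatal log message.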
Comment 1 David Gloe 2013-09-30 06:18:54 MDT
Created attachment 420
Patch which should fix the issue
Comment 2 Danny Auble 2013-09-30 06:23:31 MDT
It would appear I guessed wrong when dealing with this in the original bug on the matter. I'll change it back.

Thanks for reporting

*** This ticket has been marked as a duplicate of ticket 395 ***
Comment 3 Danny Auble 2013-09-30 06:26:36 MDT
David, could you send me the document you refer to here for the BASIL 1.2 spec? I feel things changed in 1.3. If you happen to have the spec file for BASIL 1.3, that would be helpful on the topic.
Comment 4 Danny Auble 2013-09-30 06:30:32 MDT
What version of CLE is this person running? You reference BASIL 1.2, but anything running CLE 5.1 should allow BASIL 1.3 to work, and that is what we are using.
Comment 5 David Gloe 2013-09-30 06:34:22 MDT
As far as I know, the location of AcceleratorAllocation hasn't changed between BASIL 1.2 and 1.3. I mentioned 1.2 because that's where the AcceleratorAllocation element was added; it is still supported in BASIL 1.3.

It looks like your original patch moved AcceleratorArray to 5 and Accelerator to 6, but neglected to move AcceleratorAllocation to a child of Accelerator at 7.
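
The inconsistency described here can be sketched as a parent/child check: each tag's depth should be its parent's depth plus one. The struct and function names below are hypothetical and do not reflect the actual layout of the table in parser_basil_5.1.c; only the tag names and depths come from this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry for illustration; not Slurm's actual struct. */
struct tag_depth {
    const char *tag;
    const char *parent;   /* NULL for the topmost tag in this sketch */
    int depth;
};

/* Look up a tag by name, or return NULL if it is not in the table. */
static const struct tag_depth *find_tag(const struct tag_depth *t, size_t n,
                                        const char *tag)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(t[i].tag, tag) == 0)
            return &t[i];
    return NULL;
}

/*
 * Return the first tag whose depth is not its parent's depth + 1,
 * or NULL if the whole table is consistent.
 */
static const char *first_inconsistent(const struct tag_depth *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (t[i].parent == NULL)
            continue;
        const struct tag_depth *p = find_tag(t, n, t[i].parent);
        if (p == NULL || t[i].depth != p->depth + 1)
            return t[i].tag;
    }
    return NULL;
}

/* Depths as described in this comment, before the fix. */
static const struct tag_depth before_fix[] = {
    { "Node",                  NULL,               4 },
    { "AcceleratorArray",      "Node",             5 },
    { "Accelerator",           "AcceleratorArray", 6 },
    { "AcceleratorAllocation", "Accelerator",      5 },
};

/* Same table with AcceleratorAllocation moved to depth 7. */
static const struct tag_depth after_fix[] = {
    { "Node",                  NULL,               4 },
    { "AcceleratorArray",      "Node",             5 },
    { "Accelerator",           "AcceleratorArray", 6 },
    { "AcceleratorAllocation", "Accelerator",      7 },
};
```

With these tables, first_inconsistent flags AcceleratorAllocation in before_fix and reports nothing for after_fix.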

I'll send the 1.2 and 1.3 documentation I have to you in an email.
Comment 6 Danny Auble 2013-09-30 06:40:59 MDT
The original patch moved AcceleratorAllocation from 7 to 5, but it was never tested. I just guessed it was that way since the other Accelerator depths changed.

I am moving future comments to 395 so we can just deal with one bug.