Ticket 12897

Summary: improve the slurm rest api openapi specification
Product: Slurm Reporter: Markus Kötter <koetter>
Component: slurmrestdAssignee: Nate Rini <nate>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: cinek, nate
Version: 20.11.8   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=13990
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 22.05pre1 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Ticket Depends on:    
Ticket Blocks: 12982, 13195, 13518, 13522, 13524    
Attachments: patch
patch for master (v2)
patch for master (v3)

Description Markus Kötter 2021-11-21 05:46:24 MST
Created attachment 22353 [details]
patch

The openapi.json spec published may be valid,
but can not be used for the slurmrestd service.
slurmrestd does not conform to the protocol described by the openapi.json
I want to use the openapi.json to use the the slurmrestd api.
This is what I got so far.
To my understanding there are some OpenAPI incompatible constructs
in the protocol which can not be fixed by the openapi.json.
(e.g. job_resources/allocated_nodes using dynamic dict keys)
Sometimes status=500 is a valid response,
often the arguments/responce is incomplete,
the _error schema is used with every permuation of error errno error_number

While I can only test with 0.0.36 I provide patches for 37 & 38 as well.

Project to improve the API is here: https://github.com/commonism/slurmrest
Comment 3 Nate Rini 2022-02-23 15:28:56 MST
Markus

Thanks for your patchset. I'm currently in the process of reviewing and chopping it up into multiple patches.

This change being correct in bug#12982
> 2166 @@ -1743,11 +1767,21 @@⏎
> 2167            },⏎
> 2168            "sockets": {⏎
> 2169              "type": "object",⏎                                                                     
> 2170 -            "description": "assignment status of each socket by socket id"⏎                        
> 2171 +            "description": "FIXME",⏎                                                               
> 2172 +            "properties": {⏎                                                                       
> 2173 +              "0": {⏎                                                                              
> 2174 +                "type": "string"⏎                                                                  
> 2175 +              }⏎
> 2176 +            }⏎ 
> 2177            },⏎
> 2178            "cores": {⏎                                                                              
> 2179              "type": "object",⏎                                                                     
> 2180 -            "description": "assignment status of each core by core id"⏎                            
> 2181 +            "description": "FIXME",⏎                                                               
> 2182 +            "properties": {⏎                                                                       
> 2183 +              "0": {⏎
> 2184 +                "type": "string"⏎
> 2185 +              }⏎                                                                                   
> 2186 +            }⏎
> 2187            }⏎
> 2188          }⏎ 
> 2189        },

--Nate
Comment 6 Nate Rini 2022-02-25 16:42:35 MST
Created attachment 23642 [details]
patch for master (v2)

> The openapi.json spec published may be valid,
> but can not be used for the slurmrestd service.

Thanks for submitting your patchset. In the future, please make sure to do one change per patch otherwise we need to break them up before we can review and push them upstream. This makes it easier for us to review and faster overall.

> To my understanding there are some OpenAPI incompatible constructs
> in the protocol which can not be fixed by the openapi.json.
> (e.g. job_resources/allocated_nodes using dynamic dict keys)
This is getting handled in bug#12982. The dumping of the core/nodes fell out of sync with the upstream API.

> Sometimes status=500 is a valid response,
> often the arguments/responce is incomplete,
> the _error schema is used with every permuation of error errno error_number

Translating internal Slurm error codes in the HTTP error codes is an ongoing process. Instead opted to provide the error output as default since anything other than 200 is a failure of one type or another.

> While I can only test with 0.0.36 I provide patches for 37 & 38 as well.
We only do security patches for back levels of the plugins. The attached is only targeted at the v38 plugins that will be included with slurm-22.05 release.

Please take a look at the patchset and see if anything was missed.
Comment 8 Nate Rini 2022-02-25 16:56:53 MST
Created attachment 23643 [details]
patch for master (v3)

rebased to master (3d69f0c)
Comment 11 Nate Rini 2022-03-04 18:50:19 MST
Comment on attachment 23643 [details]
patch for master (v3)

Thank you for your contribution. The changes are now upstream in the master branch and will be included in the slurm-22.05 release.

> https://github.com/SchedMD/slurm/compare/19fb02bd16d...82455a3fa637d85
Comment 12 Markus Kötter 2022-03-07 04:56:24 MST
(In reply to Nate Rini from comment #3)

> This change being correct in bug#12982
> > 2166 @@ -1743,11 +1767,21 @@⏎
> > 2167            },⏎
> > 2168            "sockets": {⏎
> > 2169              "type": "object",⏎                                                                     
> > 2170 -            "description": "assignment status of each socket by socket id"⏎                        
> > 2171 +            "description": "FIXME",⏎                                                               
> > 2172 +            "properties": {⏎                                                                       
> > 2173 +              "0": {⏎                                                                              
> > 2174 +                "type": "string"⏎                                                                  
> > 2175 +              }⏎
> > 2176 +            }⏎ 
> > 2177            },⏎
> > 2178            "cores": {⏎                                                                              
> > 2179              "type": "object",⏎                                                                     
> > 2180 -            "description": "assignment status of each core by core id"⏎                            
> > 2181 +            "description": "FIXME",⏎                                                               
> > 2182 +            "properties": {⏎                                                                       
> > 2183 +              "0": {⏎
> > 2184 +                "type": "string"⏎
> > 2185 +              }⏎                                                                                   
> > 2186 +            }⏎
> > 2187            }⏎
> > 2188          }⏎ 
> > 2189        },


The patch is … let's say as correct as possible. 

To my understanding the data structure chosen can not be described by OpenAPI due to the use of dynamic property names (depending on the number of cores/sockets).
core/sockets should be an array of objects, the identifier a key in this object.
Comment 13 Nate Rini 2022-03-07 07:52:27 MST
(In reply to Markus Kötter from comment #12)
> (In reply to Nate Rini from comment #3)
> > This change being correct in bug#12982

It will be corrected in bug#12982. The entire parsing of the CPU usage was incorrect.

> To my understanding the data structure chosen can not be described by
> OpenAPI due to the use of dynamic property names (depending on the number of
> cores/sockets).
> core/sockets should be an array of objects, the identifier a key in this
> object.

Which openapi generator are you using? Yes, the dynamic property names are going to get removed and replaced by an array.
Comment 17 Nate Rini 2022-03-07 10:09:02 MST
Closing this ticket as the patches are now upstream. Please reply if you have any more questions.

Thanks,
--Nate
Comment 18 Nate Rini 2022-03-07 11:42:31 MST
*** Ticket 13522 has been marked as a duplicate of this ticket. ***
Comment 19 Nate Rini 2022-03-07 12:10:19 MST
*** Ticket 13524 has been marked as a duplicate of this ticket. ***
Comment 20 Markus Kötter 2022-05-03 00:30:35 MDT
(In reply to Nate Rini from comment #13)
> Which openapi generator are you using?

https://github.com/commonism/aiopenapi3
https://github.com/commonism/slurmrest
Comment 21 Nate Rini 2022-05-03 12:06:52 MDT
Thanks for sharing the openapi parser for python.
Comment 22 Nate Rini 2022-07-29 10:08:57 MDT
*** Ticket 11955 has been marked as a duplicate of this ticket. ***
Comment 23 Markus Kötter 2022-07-31 23:21:43 MDT
(In reply to Markus Kötter from comment #0)
> To my understanding there are some OpenAPI incompatible constructs
> in the protocol which can not be fixed by the openapi.json.
> (e.g. job_resources/allocated_nodes using dynamic dict keys)

This is not correct.
It is possible and part of the specification via "additionalProperties".

https://swagger.io/docs/specification/data-models/dictionaries/

And will be useful for SLURM for Environment & others as well.

I would not combine additionalProperties with static properties due to lacking client side support.
Comment 24 Nate Rini 2022-08-05 12:10:08 MDT
(In reply to Markus Kötter from comment #23)
> I would not combine additionalProperties with static properties due to
> lacking client side support.

This has been an ongoing issue with client generators. We have implemented many things in the standard that turned out to not be implemented by any of the clients. With any luck, the generators will improve and we will be able to use more of these features.

Are there any more questions?
Comment 25 Nate Rini 2022-08-25 07:38:59 MDT
Closing out the ticket as there appear to be no more questions.