Ticket 13519

Summary: Slurmrestd Associations Issue
Product: Slurm
Reporter: Matt Ezell <ezellma>
Component: slurmrestd
Assignee: Nate Rini <nate>
Status: RESOLVED TIMEDOUT
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
CC: barlowat
Version: 21.08.5
Hardware: Linux
OS: Linux
Site: ORNL-OLCF
Attachments: patch for 21.08 (test only)

Description Matt Ezell 2022-02-25 13:34:34 MST
Hi, 

This is a follow-up from https://bugs.schedmd.com/show_bug.cgi?id=13254.

Our development team encountered an issue setting "MaxJobs", "MaxSubmit", "MaxJobsAccrue", and "qos" through the associations API endpoint with OpenAPI DB version 0.0.38 on Slurm 22.05.0-0pre1 (master branch as of around 3pm EST on February 24th). We previously encountered the same issue on Slurm versions 20.11.8, 21-8-4-1, and 21-8-5-1 with OpenAPI DB versions 0.0.36 and 0.0.37.

Other fields also appear not to be set through the API, although I do not currently use them.

Bug Reproducer

---

API Request
Given an existing cluster, account, and QOS, each named "test1":
POST request to /slurmdb/v0.0.38/associations
Body:
{
  "associations": [
    {
      "cluster": "test1",
      "account": "test1",
      "priority": 1,
      "qos": [
        "test1"
      ],
      "max": {
        "jobs": {
          "per": {
            "count": 2,
            "accruing": 3,
            "submitted": 4,
            "wall_clock": 5
          }
        },
        "per": {
          "account": {
            "wall_clock": 6
          }
        }
      },
      "limits": {
        "max": {
          "active_jobs": {
            "accruing": 7,
            "count": 8
          },
          "wall_clock": {
            "per": {
              "qos": 9,
              "job": 10
            }
          },
          "jobs": {
            "per": {
              "account": 11,
              "user": 12
            }
          },
          "accruing": {
            "per": {
              "account": 13,
              "user": 14
            }
          }
        }
      }
    }
  ]
}
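
For anyone reproducing this without a GUI client, the request above could be sketched in Python using only the standard library. The host and port (localhost:6250) and the X-SLURM-USER-NAME / X-SLURM-USER-TOKEN header names are taken from the slurmrestd log further down; the plain-HTTP scheme, the function names, and the trimmed payload are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: slurmrestd is reachable over plain HTTP on localhost:6250,
# matching the Host header seen in the log in this ticket.
BASE_URL = "http://localhost:6250"


def build_association_request(user, token, associations, base_url=BASE_URL):
    """Build (but do not send) the POST to the associations endpoint."""
    body = json.dumps({"associations": associations}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/slurmdb/v0.0.38/associations",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-SLURM-USER-NAME": user,
            "X-SLURM-USER-TOKEN": token,
        },
    )


def send(request):
    """Send the request; requires a running slurmrestd and a valid JWT."""
    with urllib.request.urlopen(request) as resp:
        return resp.status, resp.read().decode("utf-8")


# Trimmed version of the body above; only some of the fields are included.
example = [{
    "cluster": "test1",
    "account": "test1",
    "qos": ["test1"],
    "max": {"jobs": {"per": {"count": 2, "accruing": 3, "submitted": 4}}},
}]
```

Calling send(build_association_request("vagrant", token, example)) with a real token would issue the same kind of POST as the reproducer; json.dumps does not append a trailing newline to the body.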


Response
{
  "meta": {
    "plugin": {
      "type": "openapi/dbv0.0.38",
      "name": "Slurm OpenAPI DB v0.0.38"
    },
    "Slurm": {
      "version": {
        "major": 22,
        "micro": 0,
        "minor": 5
      },
      "release": "22.05.0-0pre1"
    }
  },
  "errors": []
}
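
Note that the reply carries an empty "errors" list even though the limits were not saved, so a client cannot rely on it alone. A minimal sketch of pulling the plugin type, Slurm release, and errors out of such a reply (the function name is hypothetical; the key names are the ones shown in the response above):

```python
import json


def summarize_response(text):
    """Return (plugin type, Slurm release, errors) from a slurmrestd reply."""
    reply = json.loads(text)
    meta = reply.get("meta", {})
    plugin = meta.get("plugin", {}).get("type")
    release = meta.get("Slurm", {}).get("release")
    return plugin, release, reply.get("errors", [])


# The reply from this ticket, condensed.
reply_text = """
{
  "meta": {
    "plugin": {"type": "openapi/dbv0.0.38", "name": "Slurm OpenAPI DB v0.0.38"},
    "Slurm": {"version": {"major": 22, "micro": 0, "minor": 5},
              "release": "22.05.0-0pre1"}
  },
  "errors": []
}
"""

plugin, release, errors = summarize_response(reply_text)
print(plugin, release, "errors:", errors)
```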


Actual Saved Fields
[vagrant@rats-slurm ~]$ sacctmgr list associations format=MaxJobs,MaxSubmit,MaxJobsAccrue,qos
MaxJobs MaxSubmit MaxJobsAccrue                  QOS
------- --------- ------------- --------------------
                                              normal
                                              normal
                                              normal
                                              normal
                                              normal
                                              normal
[vagrant@rats-slurm ~]$
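
To check the saved fields from a script rather than by eye, the same query can be run with sacctmgr's --parsable2 and --noheader options, which print pipe-delimited rows. The sketch below parses such output and lists the limits that are still empty; the function names and the sample text are illustrative assumptions, not output captured from this system.

```python
# Column order must match the format= list passed to sacctmgr, e.g.:
#   sacctmgr --parsable2 --noheader list associations \
#       format=MaxJobs,MaxSubmit,MaxJobsAccrue,qos
FIELDS = ["MaxJobs", "MaxSubmit", "MaxJobsAccrue", "QOS"]


def parse_associations(parsable_text, fields=FIELDS):
    """Turn pipe-delimited sacctmgr rows into a list of dicts."""
    rows = []
    for line in parsable_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(dict(zip(fields, line.split("|"))))
    return rows


def unset_limits(row):
    """Names of the job limits that are empty in one association row."""
    return [name for name in ("MaxJobs", "MaxSubmit", "MaxJobsAccrue")
            if not row.get(name)]


# Hypothetical sample: one row as seen in this ticket, one as it would
# look if the limits had been saved.
sample = "|||normal\n2|4|3|test1\n"
for row in parse_associations(sample):
    print(row["QOS"], "unset:", unset_limits(row))
```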


Relevant Slurmrestd Log
-- Logs begin at Thu 2022-02-24 17:53:42 UTC. --
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug2: _on_body: [[localhost]:45580] received 48 bytes for HTTP body length 49/48 bytes
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/clusters
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug:  rest_auth/local: slurm_rest_auth_p_authenticate: slurm_rest_auth_p_authenticate: [[localhost]:45580] socket authentication only supported on UNIX sockets
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] accepts */* with q=1.000000
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] found accepts */*=application/json with q=1.000000
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug3: _resolve_mime: [[localhost]:45580] mime read: application/json write: application/json
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug:  accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: parse_http: [[localhost]:45580] parsed 434/434 bytes
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  parse_http: [[localhost]:45580] Accepted HTTP connection
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  _on_url: [[localhost]:45580] url path: /slurmdb/v0.0.38/associations query: (null)
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/associations
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Host Value: localhost:6250
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: User-Agent Value: insomnia/2021.7.2
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Cookie Value: __profilin=p%3Dt
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: X-SLURM-USER-NAME Value: vagrant
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: X-SLURM-USER-TOKEN Value: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2NTA5MTE5MTksImlhdCI6MTY0NTcyNzkxOSwic3VuIjoidmFncmFudCJ9.NC0QrX0Ugic1pL-x-wYCfplvJOQTCDCVtQbUOxR0lyI
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Content-Type Value: application/json
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Accept Value: */*
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Content-Length Value: 695
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug3: _on_headers_complete: [[localhost]:45580] HTTP/1.1 connection
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_body: [[localhost]:45580] received 695 bytes for HTTP body length 696/695 bytes
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/associations
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  rest_auth/local: slurm_rest_auth_p_authenticate: slurm_rest_auth_p_authenticate: [[localhost]:45580] socket authentication only supported on UNIX sockets
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] accepts */* with q=1.000000
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] found accepts */*=application/json with q=1.000000
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug3: _resolve_mime: [[localhost]:45580] mime read: application/json write: application/json
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg
Feb 25 00:54:33 rats-slurm slurmrestd[16036]: debug4: parse_http: [[localhost]:45580] parsed 1086/1086 bytes
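
The log above repeatedly shows "WARNING: Extra 1 characters after JSON string detected" along with body lengths such as 696/695. Nothing in this ticket establishes whether that is related to the unsaved fields. As a client-side check only, the sketch below uses Python's json module (not slurmrestd's parser) to report anything that follows the JSON document in a request body; the function name is hypothetical.

```python
import json


def trailing_after_json(body: bytes) -> bytes:
    """Return whatever follows the first JSON document in body."""
    text = body.decode("utf-8")
    # raw_decode does not skip leading whitespace, so start past it.
    start = len(text) - len(text.lstrip())
    _, end = json.JSONDecoder().raw_decode(text, start)
    return text[end:].encode("utf-8")


print(trailing_after_json(b'{"associations": []}\n'))
print(trailing_after_json(b'{"associations": []}'))
```

A non-empty result means the client is sending extra bytes (for example a trailing newline) after the JSON document.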

Comment 1 Nate Rini 2022-03-01 16:28:46 MST
Created attachment 23680
patch for 21.08 (test only)

This may be a duplicate of bug#13047. If possible, please apply this patch and see if it corrects the issue.

Comment 2 Nate Rini 2022-03-29 08:39:31 MDT
(In reply to Nate Rini from comment #1)
> Created attachment 23680
> patch for 21.08 (test only)
> 
> This may be a duplicate of bug#13047. If possible, please apply this patch
> and see if it corrects the issue.

Reducing severity while waiting for feedback.

Comment 3 Nate Rini 2022-04-21 09:15:15 MDT
Any updates?

Comment 4 Nate Rini 2022-04-29 16:17:03 MDT
Matt,

I'm going to time this ticket out. Please respond when convenient and we can continue debugging.

Thanks,
--Nate