Ticket 13519 - Slurmrestd Associations Issue
Summary: Slurmrestd Associations Issue
Status: RESOLVED TIMEDOUT
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmrestd
Version: 21.08.5
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Nate Rini
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2022-02-25 13:34 MST by Matt Ezell
Modified: 2022-05-02 16:54 MDT

See Also:
Site: ORNL-OLCF


Attachments
patch for 21.08 (test only) (9.88 KB, patch)
2022-03-01 16:28 MST, Nate Rini

Description Matt Ezell 2022-02-25 13:34:34 MST
Hi, 

This is a follow-up from https://bugs.schedmd.com/show_bug.cgi?id=13254.

Our development team encountered an issue setting “MaxJobs”, “MaxSubmit”, “MaxJobsAccrue”, and “qos” through the associations API endpoint with OpenAPI DB version 0.0.38 on Slurm 22.05.0-0pre1 (master branch as of around 3pm EST, February 24th). We previously encountered the same issue in Slurm versions 20.11.8, 21-8-4-1, and 21-8-5-1 with OpenAPI DB versions 0.0.36 and 0.0.37.

Other fields also appear not to be set through the API, although I do not currently use those fields.

Bug Reproducer

---

API Request
Given an existing cluster, account, and qos named "test1"
POST request to /slurmdb/v0.0.38/associations
Body:
{
  "associations": [
    {
      "cluster": "test1",
      "account": "test1",
      "priority": 1,
      "qos": [
        "test1"
      ],
      "max": {
        "jobs": {
          "per": {
            "count": 2,
            "accruing": 3,
            "submitted": 4,
            "wall_clock": 5
          }
        },
        "per": {
          "account": {
            "wall_clock": 6
          }
        }
      },
      "limits": {
        "max": {
          "active_jobs": {
            "accruing": 7,
            "count": 8
          },
          "wall_clock": {
            "per": {
              "qos": 9,
              "job": 10
            }
          },
          "jobs": {
            "per": {
              "account": 11,
              "user": 12
            }
          },
          "accruing": {
            "per": {
              "account": 13,
              "user": 14
            }
          }
        }
      }
    }
  ]
}
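
For anyone re-running this reproducer outside an HTTP client GUI, the request above could be scripted roughly as follows. This is only a sketch: the `http://localhost:6250` base URL is assumed from the Host header in the log below (the scheme is a guess), the helper names `build_association_request` and `send` are hypothetical, the token is a placeholder, and the payload is abbreviated from the full body above.

```python
import json
import urllib.request


def build_association_request(base_url, user_name, token, associations):
    """Build (but do not send) the POST request described above."""
    body = json.dumps({"associations": associations}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/slurmdb/v0.0.38/associations",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Header names as they appear in the slurmrestd log.
            "X-SLURM-USER-NAME": user_name,
            "X-SLURM-USER-TOKEN": token,
        },
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Abbreviated payload; base URL and token are assumptions/placeholders.
example = build_association_request(
    "http://localhost:6250",
    "vagrant",
    "<JWT token>",
    [{
        "cluster": "test1",
        "account": "test1",
        "qos": ["test1"],
        "max": {"jobs": {"per": {"count": 2, "accruing": 3, "submitted": 4}}},
    }],
)
```

Calling `send(example)` against a running slurmrestd would submit the request; it is left uncalled here so the sketch has no side effects.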


Response
{
  "meta": {
    "plugin": {
      "type": "openapi/dbv0.0.38",
      "name": "Slurm OpenAPI DB v0.0.38"
    },
    "Slurm": {
      "version": {
        "major": 22,
        "micro": 0,
        "minor": 5
      },
      "release": "22.05.0-0pre1"
    }
  },
  "errors": []
}
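
Note that the response carries an empty "errors" list even though the fields were not saved, so a client cannot rely on it alone. A minimal sketch of pulling the relevant fields out of such a response (the helper name `summarize_response` is hypothetical):

```python
def summarize_response(response):
    """Extract plugin type, Slurm release and error count from a response dict."""
    meta = response.get("meta", {})
    return {
        "plugin": meta.get("plugin", {}).get("type"),
        "release": meta.get("Slurm", {}).get("release"),
        "error_count": len(response.get("errors", [])),
    }


# The response shown above, reduced to the fields used here.
observed = {
    "meta": {
        "plugin": {"type": "openapi/dbv0.0.38"},
        "Slurm": {"release": "22.05.0-0pre1"},
    },
    "errors": [],
}
summary = summarize_response(observed)
```

An `error_count` of 0 here only means slurmrestd reported no error; the saved values still need to be checked separately, for example with sacctmgr.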


Actual Saved Fields
[vagrant@rats-slurm ~]$ sacctmgr list associations format=MaxJobs,MaxSubmit,MaxJobsAccrue,qos
MaxJobs MaxSubmit MaxJobsAccrue                  QOS
------- --------- ------------- --------------------
                                              normal
                                              normal
                                              normal
                                              normal
                                              normal
                                              normal
[vagrant@rats-slurm ~]$
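
The check above could be automated along these lines. This sketch assumes the same command is run with sacctmgr's `--parsable2 --noheader` options so that each row is pipe-delimited; the helper names and the sample text are illustrative and mirror the empty limits shown above rather than captured output.

```python
FIELDS = ["MaxJobs", "MaxSubmit", "MaxJobsAccrue", "QOS"]


def parse_associations(text, fields=FIELDS):
    """Parse pipe-delimited sacctmgr rows into one dict per association."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(dict(zip(fields, line.split("|"))))
    return rows


def unset_fields(row):
    """Return the names of fields that came back empty for this row."""
    return [name for name, value in row.items() if value == ""]


# Illustrative sample: limits empty, QOS left at "normal".
sample = "|||normal\n|||normal\n"
rows = parse_associations(sample)
missing = [unset_fields(row) for row in rows]
```

With the sample above, every row reports MaxJobs, MaxSubmit and MaxJobsAccrue as unset, which matches the behaviour described in this ticket.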


Relevant Slurmrestd Log
-- Logs begin at Thu 2022-02-24 17:53:42 UTC. --
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug2: _on_body: [[localhost]:45580] received 48 bytes for HTTP body length 49/48 bytes
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/clusters
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug:  rest_auth/local: slurm_rest_auth_p_authenticate: slurm_rest_auth_p_authenticate: [[localhost]:45580] socket authentication only supported on UNIX sockets
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] accepts */* with q=1.000000
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] found accepts */*=application/json with q=1.000000
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug3: _resolve_mime: [[localhost]:45580] mime read: application/json write: application/json
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug:  accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg
Feb 25 00:54:26 rats-slurm slurmrestd[16036]: debug4: parse_http: [[localhost]:45580] parsed 434/434 bytes
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  parse_http: [[localhost]:45580] Accepted HTTP connection
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  _on_url: [[localhost]:45580] url path: /slurmdb/v0.0.38/associations query: (null)
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/associations
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: slurmrestd: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Host Value: localhost:6250
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: User-Agent Value: insomnia/2021.7.2
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Cookie Value: __profilin=p%3Dt
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: X-SLURM-USER-NAME Value: vagrant
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: X-SLURM-USER-TOKEN Value: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2NTA5MTE5MTksImlhdCI6MTY0NTcyNzkxOSwic3VuIjoidmFncmFudCJ9.NC0QrX0Ugic1pL-x-wYCfplvJOQTCDCVtQbUOxR0lyI
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Content-Type Value: application/json
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Accept Value: */*
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_header_value: [[localhost]:45580] Header: Content-Length Value: 695
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug3: _on_headers_complete: [[localhost]:45580] HTTP/1.1 connection
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug2: _on_body: [[localhost]:45580] received 695 bytes for HTTP body length 696/695 bytes
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: operations_router: [[localhost]:45580] POST /slurmdb/v0.0.38/associations
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  rest_auth/local: slurm_rest_auth_p_authenticate: slurm_rest_auth_p_authenticate: [[localhost]:45580] socket authentication only supported on UNIX sockets
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:45580] attempting user_name vagrant token authentication pass through
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] accepts */* with q=1.000000
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug4: _resolve_mime: [[localhost]:45580] found accepts */*=application/json with q=1.000000
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug3: _resolve_mime: [[localhost]:45580] mime read: application/json write: application/json
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: serializer/json: _try_parse: _try_parse: WARNING: Extra 1 characters after JSON string detected
Feb 25 00:54:32 rats-slurm slurmrestd[16036]: debug:  accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg
Feb 25 00:54:33 rats-slurm slurmrestd[16036]: debug4: parse_http: [[localhost]:45580] parsed 1086/1086 bytes
Comment 1 Nate Rini 2022-03-01 16:28:46 MST
Created attachment 23680
patch for 21.08 (test only)

This may be a duplicate of bug#13047. If possible, please apply this patch and see if it corrects the issue.
Comment 2 Nate Rini 2022-03-29 08:39:31 MDT
(In reply to Nate Rini from comment #1)
> Created attachment 23680
> patch for 21.08 (test only)
> 
> This may be a duplicate of bug#13047. If possible, please apply this patch
> and see if it corrects the issue.

Reducing severity while waiting for feedback.
Comment 3 Nate Rini 2022-04-21 09:15:15 MDT
Any updates?
Comment 4 Nate Rini 2022-04-29 16:17:03 MDT
Matt,

I'm going to time this ticket out. Please respond when convenient and we can continue debugging.

Thanks,
--Nate