Dear Slurm devs,

I'm trying out Slurm 25.11.0rc1 and I noticed this change in slurmrestd error handling.

With previous versions of Slurm:

$ slurmrestd -V
slurm 24.05.3

$ curl -v --header X-SLURM-USER-TOKEN:$SLURM_JWT http://localhost:6820/slurm/v0.0.41/fail
*   Trying 127.0.0.1:6820...
* Connected to localhost (127.0.0.1) port 6820 (#0)
> GET /slurm/v0.0.41/fail HTTP/1.1
> Host: localhost:6820
> User-Agent: curl/7.88.1
> Accept: */*
> X-SLURM-USER-TOKEN:<redacted>
>
< HTTP/1.1 404 NOT FOUND
< Connection: Close
< Content-Length: 69
< Content-Type: text/plain
<
* Closing connection 0
Unable find requested URL. Please view /openapi/v3 for API reference.

With Slurm 25.11.0rc1:

$ slurmrestd -V
slurm 25.11.0-0rc1

$ curl -v --header X-SLURM-USER-TOKEN:$SLURM_JWT http://localhost:6820/slurm/v0.0.41/fail
*   Trying ::1:6820...
* connect to ::1 port 6820 failed: Connection refused
*   Trying 127.0.0.1:6820...
* Connected to localhost (127.0.0.1) port 6820 (#0)
> GET /slurm/v0.0.41/fail HTTP/1.1
> Host: localhost:6820
> User-Agent: curl/7.76.1
> Accept: */*
> X-SLURM-USER-TOKEN:<redacted>
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 NOT FOUND
< Connection: Close
< HTTP/1.1 404 NOT FOUND
* Closing connection 0

The response now misses the Content-Type and Content-Length headers, and the descriptive error message is no longer part of the response.

I suspect this is a bug, considering this recent change mentioned in the changelog:
https://github.com/SchedMD/slurm/commit/c4965e98d8b553258b168ed9fa87ffc361bea1e9

Is this really the new expected slurmrestd error handling behavior? I would appreciate more details, as this has an important impact on the development of Slurm-web.
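To illustrate the impact, here is a minimal sketch (not Slurm-web's actual code, just an illustration assuming the Python requests library and a local slurmrestd) of the kind of client-side handling that relies on the previous behavior:

import requests

BASE_URL = "http://localhost:6820"  # hypothetical slurmrestd endpoint

def describe_rest_error(path: str, token: str) -> str:
    """Return a human-readable error for a failed slurmrestd request.

    With Slurm <= 24.05, error responses carry Content-Type: text/plain,
    a Content-Length and a descriptive body, so the body can be shown to
    the user. With 25.11.0rc1 the headers and body are missing, so this
    can only fall back to the bare status code.
    """
    response = requests.get(
        f"{BASE_URL}{path}", headers={"X-SLURM-USER-TOKEN": token}
    )
    if response.ok:
        return "no error"
    content_type = response.headers.get("Content-Type", "")
    if content_type.startswith("text/plain") and response.text:
        # e.g. "Unable find requested URL. Please view /openapi/v3 ..."
        return response.text.strip()
    return f"HTTP {response.status_code} {response.reason}"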
(In reply to Rémi Palancher from comment #0)
> The response now misses the Content-Type and Content-Length headers, and the
> descriptive error message is no longer part of the response.

Thank you for reporting this bug (functional regression).
This regression has now been fixed:
> https://github.com/SchedMD/slurm/commit/fd96208e9fe34ce722770562bdaf3afd480bcb70

This fix also causes a few more errors to be included in responses to rejected requests.
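For reference, a quick way to confirm the restored behavior after applying the patch (an illustrative check based on the output in comment #0, not part of the Slurm test suite; it assumes the Python requests library and SLURM_JWT in the environment):

import os
import requests

# An unknown URL should again return a plain-text 404 with a
# Content-Length header and a descriptive body.
response = requests.get(
    "http://localhost:6820/slurm/v0.0.41/fail",
    headers={"X-SLURM-USER-TOKEN": os.environ["SLURM_JWT"]},
)
assert response.status_code == 404
assert response.headers.get("Content-Type") == "text/plain"
assert "Content-Length" in response.headers
assert "Please view /openapi/v3" in response.text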
Thank you Nate! I tested your patch successfully in my environment.

I also discovered a difference coming with Slurm 25.11 when the JWT is missing from the request headers.

With previous versions of Slurm:

$ slurmrestd -V
slurm 24.05.8

$ curl -v http://localhost:6820/slurm/v0.0.41/ping
* Host localhost:6820 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6820...
* connect to ::1 port 6820 from ::1 port 42362 failed: Connection refused
*   Trying 127.0.0.1:6820...
* Connected to localhost (127.0.0.1) port 6820
* using HTTP/1.x
> GET /slurm/v0.0.41/ping HTTP/1.1
> Host: localhost:6820
> User-Agent: curl/8.14.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 401 UNAUTHORIZED
< Connection: Close
< Content-Length: 22
< Content-Type: text/plain
<
* shutting down connection #0
Authentication failure

With Slurm 25.11.0rc1 (patched):

$ slurmrestd -V
slurm 25.11.0-0rc1

$ curl -v http://localhost:6820/slurm/v0.0.41/ping
* Host localhost:6820 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6820...
* connect to ::1 port 6820 from ::1 port 32800 failed: Connection refused
*   Trying 127.0.0.1:6820...
* Established connection to localhost (127.0.0.1 port 6820) from 127.0.0.1 port 50946
* using HTTP/1.x
> GET /slurm/v0.0.41/ping HTTP/1.1
> Host: localhost:6820
> User-Agent: curl/8.17.0-rc3
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 500 INTERNAL ERROR
< Connection: Close
< Content-Length: 40
< Content-Type: text/plain
<
* Excess found writing body: excess = 117, size = 40, maxdownload = 40, bytecount = 40
* shutting down connection #0
Authentication does not apply to request

Beyond the error message, the HTTP status code changed from 401 to 500. I cannot find any mention of this change in the changelog, is this expected?
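For context, clients typically treat 401 specifically as "authenticate (again)" and 5xx as a daemon-side failure. A rough sketch of that pattern (illustrative only, assuming the Python requests library, not Slurm-web's actual code), which no longer behaves the same way when the daemon answers 500:

import os
import requests

def ping(base_url: str = "http://localhost:6820") -> None:
    """Illustrative client-side handling of slurmrestd auth errors.

    A missing or rejected token used to surface as 401 UNAUTHORIZED,
    which clients map to a (re)authentication prompt. A 500 INTERNAL
    ERROR instead looks like a server failure and is handled very
    differently (alerts, retries, etc.).
    """
    headers = {}
    token = os.environ.get("SLURM_JWT")
    if token:
        headers["X-SLURM-USER-TOKEN"] = token
    response = requests.get(f"{base_url}/slurm/v0.0.41/ping", headers=headers)
    if response.status_code == 401:
        raise PermissionError("authentication required or token expired")
    if response.status_code >= 500:
        raise RuntimeError(f"slurmrestd failure: HTTP {response.status_code}")
    response.raise_for_status()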
(In reply to Rémi Palancher from comment #5)
> Beyond the error message, the HTTP status code changed from 401 to 500. I
> cannot find any mention of this change in the changelog, is this expected?

I'm not able to replicate this. How is slurmrestd being run? Is it possible to get this output?
> ps -ef|grep slurmrestd
> systemctl status slurmrestd
(In reply to Nate Rini from comment #10)
> I'm not able to replicate this. How is slurmrestd being run? Is it possible
> to get this output?
> > ps -ef|grep slurmrestd
> > systemctl status slurmrestd

Of course!

root@admin:~# slurmrestd -V
slurm 25.11.0-0rc1

root@admin:~# ps -ef | grep slurmrestd
slurmre+   13169       1  0 16:59 ?        00:00:00 /usr/sbin/slurmrestd -a rest_auth/jwt [::]:6820
root       17792   17694  0 17:36 pts/1    00:00:00 grep slurmrestd

root@admin:~# systemctl status slurmrestd.service
● slurmrestd.service - Slurm REST daemon
     Loaded: loaded (/usr/lib/systemd/system/slurmrestd.service; enabled; preset: enabled)
    Drop-In: /etc/systemd/system/slurmrestd.service.d
             └─firehpc.conf
     Active: active (running) since Thu 2025-11-06 16:59:33 CET; 37min ago
 Invocation: b5d4a140821d479fb1e5159ba9665268
   Main PID: 13169 (slurmrestd)
      Tasks: 33 (limit: 19070)
     Memory: 13.8M (max: 3G, available: 2.9G, peak: 14.8M)
        CPU: 639ms
     CGroup: /system.slice/slurmrestd.service
             └─13169 /usr/sbin/slurmrestd -a rest_auth/jwt "[::]:6820"

Nov 06 17:35:43 admin.nova slurmrestd[13169]: operations_router: [localhost:6820(fd:25)] GET /slurm/v0.0.41/jobs
Nov 06 17:35:43 admin.nova slurmrestd[13169]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [localhost:6820(fd:25)] attempting user_name slurm token authentication pass through
Nov 06 17:36:43 admin.nova slurmrestd[13169]: [2025-11-06T17:36:43.244] operations_router: [localhost:6820(fd:29)] GET /slurm/v0.0.41/nodes
Nov 06 17:36:43 admin.nova slurmrestd[13169]: [2025-11-06T17:36:43.244] rest_auth/jwt: slurm_rest_auth_p_authenticate: [localhost:6820(fd:29)] attempting user_name slurm token authentication pass through
Nov 06 17:36:43 admin.nova slurmrestd[13169]: operations_router: [localhost:6820(fd:29)] GET /slurm/v0.0.41/nodes
Nov 06 17:36:43 admin.nova slurmrestd[13169]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [localhost:6820(fd:29)] attempting user_name slurm token authentication pass through
Nov 06 17:36:43 admin.nova slurmrestd[13169]: [2025-11-06T17:36:43.275] operations_router: [localhost:6820(fd:29)] GET /slurm/v0.0.41/jobs
Nov 06 17:36:43 admin.nova slurmrestd[13169]: [2025-11-06T17:36:43.275] rest_auth/jwt: slurm_rest_auth_p_authenticate: [localhost:6820(fd:29)] attempting user_name slurm token authentication pass through
Nov 06 17:36:43 admin.nova slurmrestd[13169]: operations_router: [localhost:6820(fd:29)] GET /slurm/v0.0.41/jobs
Nov 06 17:36:43 admin.nova slurmrestd[13169]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [localhost:6820(fd:29)] attempting user_name slurm token authentication pass through

root@admin:~# systemctl cat slurmrestd.service
# /usr/lib/systemd/system/slurmrestd.service
[Unit]
Description=Slurm REST daemon
After=network-online.target remote-fs.target slurmctld.service
Wants=network-online.target
ConditionPathExists=/etc/slurm/slurm.conf

[Service]
Type=simple
EnvironmentFile=-/etc/sysconfig/slurmrestd
EnvironmentFile=-/etc/default/slurmrestd
# slurmrestd should never run as root or the slurm user.
# Use a drop-in to change the default User and Group to site specific IDs.
User=slurmrestd
Group=slurmrestd
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS
# Enable auth/jwt be default, comment out the line to disable it for slurmrestd
Environment=SLURM_JWT=daemon
# Listen on TCP socket by default.
Environment=SLURMRESTD_LISTEN=:6820
ExecReload=/bin/kill -HUP $MAINPID
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/slurmrestd.service.d/firehpc.conf
[Service]
# Unset vendor unit ExecStart and Environment to avoid cumulative definition
ExecStart=
Environment=
Environment="SLURM_JWT=daemon"
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS -a rest_auth/jwt [::]:6820
DynamicUser=yes
User=slurmrestd
Group=slurmrestd
MemoryMax=3G
Restart=always
The issue has been replicated, and I will update once it is corrected.
(In reply to Nate Rini from comment #13)
> The issue has been replicated, and I will update once it is corrected.

Thank you for looking at this so carefully despite the absence of a support contract :)