Ticket 21049 - Directly scancel the job in the suspend state, the end time record of the job is not as expected
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 23.02.0
Hardware: Linux Linux
: 6 - No support contract
Assignee: Jacob Jenson
Reported: 2024-09-30 02:37 MDT by Hyigehaor
Modified: 2024-09-30 02:37 MDT (History)
Site: -Other-


Description Hyigehaor 2024-09-30 02:37:43 MDT
I submitted a job, then suspended it using scontrol suspend jobid, and finally canceled it using scancel.

But when I used sacct to look at the start and end times of the job, I found that the job's end time was the time it was suspended, not the time when I actually ran the scancel command.

The start and end times at the job step level, however, are displayed correctly.

The process is as follows:

[root@gv100 slurm]# date
Mon Sep 30 16:31:50 CST 2024
[root@gv100 slurm]# sbatch command2.slurm
Submitted batch job 1240
[root@gv100 slurm]# scontrol suspend 1240
You have new mail in /var/spool/mail/root
[root@gv100 slurm]# sacct -j 1240 -o jobid,state,start,end
JobID             State               Start                 End 
------------ ---------- ------------------- ------------------- 
1240          SUSPENDED 2024-09-30T16:31:53             Unknown 
1240.batch    SUSPENDED 2024-09-30T16:31:53             Unknown 
1240.extern   SUSPENDED 2024-09-30T16:31:53             Unknown 
[root@gv100 slurm]# 
[root@gv100 slurm]# 
[root@gv100 slurm]# 
[root@gv100 slurm]# scancel 1240
[root@gv100 slurm]# sacct -j 1240 -o jobid,state,start,end
JobID             State               Start                 End 
------------ ---------- ------------------- ------------------- 
1240         CANCELLED+ 2024-09-30T16:31:53 2024-09-30T16:32:02 
1240.batch    CANCELLED 2024-09-30T16:31:53 2024-09-30T16:34:34 
1240.extern   COMPLETED 2024-09-30T16:31:53 2024-09-30T16:34:34
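
For anyone trying to reproduce this, the gap between the job-level End and the step-level End can be checked mechanically. The following is only a minimal sketch: `step_end_lag` is a hypothetical helper, and it assumes pipe-delimited text as produced by `sacct -P -n -o jobid,state,start,end`. The sample rows are the values from the transcript above, retyped by hand into that form (the state column is kept as displayed).

```python
from datetime import datetime


def step_end_lag(sacct_text):
    """Return, per job step, how many seconds its End is later than the job's End.

    Expects lines of the form jobid|state|start|end (assumed sacct -P -n layout).
    """
    ends = {}
    for line in sacct_text.strip().splitlines():
        jobid, _state, _start, end = line.split("|")
        if end == "Unknown":
            continue  # no end time recorded yet (e.g. running or suspended)
        ends[jobid] = datetime.fromisoformat(end)

    lags = {}
    for jobid, end in ends.items():
        parent, dot, _step = jobid.partition(".")
        # Only step records (e.g. "1240.batch") are compared with their parent job.
        if dot and parent in ends:
            lags[jobid] = (end - ends[parent]).total_seconds()
    return lags


# Values taken from the sacct output in this ticket, converted by hand.
sample = """\
1240|CANCELLED+|2024-09-30T16:31:53|2024-09-30T16:32:02
1240.batch|CANCELLED|2024-09-30T16:31:53|2024-09-30T16:34:34
1240.extern|COMPLETED|2024-09-30T16:31:53|2024-09-30T16:34:34
"""

print(step_end_lag(sample))
```

With the values from this ticket, both steps end 152 seconds after the End recorded for the job itself.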


I'm not sure whether this is a bug or intentional behavior. Can you explain why this happens?