Ticket 13025

Summary: Recording PendingTime for analysis
Product: Slurm
Component: Accounting
Reporter: Gordon Dexter <gmdexter>
Assignee: Oriol Vilarrubi <jvilarru>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
Priority: ---
CC: sts
Version: 21.08.2
Hardware: Linux
OS: Linux
Site: Johns Hopkins Univ. HLTCOE

Description Gordon Dexter 2021-12-13 14:32:40 MST
As part of procurement planning we'd like to have some visibility into how long jobs wait in the queue before they are run.  This would help us answer questions like "How long, on average, does a job wait for a T4?"

The PendingTime attribute seems to provide this but is only available via squeue, not sacct, making it unsuited for long-term stats.

Is there some way to access the PendingTime via sacct?  Or if not, is there some way to add a field in sacct that a prolog script can write to?
Comment 1 Oriol Vilarrubi 2021-12-14 03:44:32 MST
Hi Gordon,

Pending time is just the calculated difference between the submit time and the current time while the job is pending, or between the submit time and the start time once the job is running.
If you want to get that value from sacct you can do it very easily:
  
I've submitted a job that waited in the queue for 81 seconds:
[jvilarru@centos ~]$ squeue -O JobId,PendingTime
JOBID               PENDING_TIME        
43                  81                  

Then with sacct you can get the submit time and the start time and subtract one from the other.

Let's take this job 43 for example:

sacct -X -j 43 -o Submit,Start
             Submit               Start 
------------------- ------------------- 
2021-12-14T11:11:27 2021-12-14T11:12:48 

This output can be converted to epoch seconds so that the two values are easy to subtract:

SLURM_TIME_FORMAT="%s" sacct --noheader -P -X -j 43 -o Submit,Start
1639476687|1639476768

And then you can do it for example with awk:
SLURM_TIME_FORMAT="%s" sacct --noheader -P -X -j 43 -o Submit,Start | awk -F'|' '{ print $2-$1}'
81

You can do the same for multiple jobs by passing them as a comma-separated list to -j, or you can select jobs with other filters such as job state or a start and end time.
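
For the averaging question in the description, the same pipeline could feed a small helper like the one below. This is only a sketch: the avg_wait function name and the dates in the usage comment are made up for illustration, and it assumes that a job which never started shows a non-numeric Start value, which the helper simply skips.

```shell
# avg_wait (hypothetical helper): read "submit|start" epoch pairs on stdin,
# print the number of usable jobs and their mean wait in seconds.
avg_wait() {
  awk -F'|' '
    # Keep only rows where both fields are plain integers and start >= submit
    $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $2 >= $1 { total += $2 - $1; n++ }
    END {
      if (n > 0) printf "jobs=%d avg_wait_s=%.1f\n", n, total / n
      else print "jobs=0"
    }'
}

# Possible usage on a cluster (example dates, adjust the filters as needed):
# SLURM_TIME_FORMAT="%s" sacct --noheader -P -X -a -S 2021-12-01 -E 2021-12-14 -o Submit,Start | avg_wait
```

A mean can hide a few very long waits, so it may be worth looking at the individual values as well before drawing conclusions.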

Greetings
Comment 2 Gordon Dexter 2021-12-14 13:02:01 MST
Thank you.  This should do the trick.