Ticket 13843

Summary: Equivalent of "squeue -j $JOB -O TimeUsed" in JSON output?
Product: Slurm Reporter: Chris Samuel (NERSC) <csamuel>
Component: User Commands Assignee: Jason Booth <jbooth>
Status: RESOLVED TIMEDOUT QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: nate
Version: 21.08.6   
Hardware: Linux   
OS: Linux   
Site: NERSC Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Chris Samuel (NERSC) 2022-04-13 22:33:54 MDT
Hi there,

Following up on a query from a staff user (who noticed that TimeUsed and TimeLimit in squeue and Elapsed & Timelimit in sacct are not affected by SLURM_TIME_FORMAT) I pointed them to the `--json/--yaml` options for squeue/sacct and then noticed that for squeue there didn't seem to be a field for TimeUsed.

sacct seems to have both `elapsed` and `limit`, and squeue has `time_limit`, but am I missing something when trying to find a `TimeUsed` equivalent?

All the best,
Chris
Comment 1 Jason Booth 2022-04-14 14:08:22 MDT
Chris,

I have a few questions here just to make sure I understand what you are after. 

1. You are looking for "time used" in the --json / --yaml output?
2. Your reference to SLURM_TIME_FORMAT not affecting Elapsed & Timelimit was just an observation, correct?

> (who noticed that TimeUsed and TimeLimit in squeue and Elapsed & Timelimit in sacct are not affected by SLURM_TIME_FORMAT)

If not, can you elaborate on what you and the user expect for the output of Elapsed & Timelimit when SLURM_TIME_FORMAT is set?

squeue has a format option "%M". Is this what the user is looking for?

https://slurm.schedmd.com/squeue.html#OPT_%M

> Time used by the job or job step in days-hours:minutes:seconds. The days and hours are printed only as needed. 

Having SLURM_TIME_FORMAT affect these fields does not make much sense to me.

Is the user expecting SLURM_TIME_FORMAT to convert the run time into the format they specify? For example, if the user specified %T, is the expectation that the time used would be shown as DD:HH:MM:SS?


It is not the best solution, but for now the elapsed time can still be obtained with "date +%s - start_time", that is, the current epoch time minus the job's start_time. This does look like something we could improve on.
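
For anyone scripting against the JSON output, the "date +%s - start_time" workaround might look roughly like the Python sketch below. It assumes the job's start_time has already been read as an integer number of epoch seconds (for example from a job entry in `squeue --json`; the exact shape of that field may differ between Slurm versions). The function name `elapsed_seconds` is hypothetical and not part of Slurm.

```python
import time


def elapsed_seconds(start_time, now=None):
    """Seconds elapsed since start_time (epoch seconds).

    Assumption: a start_time of 0 (or None) means the job has not started,
    so 0 is returned in that case.
    """
    if not start_time:
        return 0
    if now is None:
        now = int(time.time())  # same value as `date +%s`
    # Clamp at 0 in case of clock skew between hosts.
    return max(0, int(now) - int(start_time))
```

Passing `now` explicitly keeps the helper easy to test; in normal use it can be omitted. This only covers a running job that has never been suspended.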
Comment 2 Chris Samuel (NERSC) 2022-04-14 22:44:25 MDT
Hi Jason,

1. Yes, trying to find where the equivalent value to "sque
Comment 3 Chris Samuel (NERSC) 2022-04-14 22:49:34 MDT
oops - too tired to drive a keyboard sorry, let me try that again..

Hi Jason,

1. Yes, trying to find where the equivalent value to the -o/-O formats %M/TimeUsed value is in the json/yaml output.

2. This was because our staff person was hoping there was a way to use SLURM_TIME_FORMAT to make both the %M/TimeUsed and %l/TimeLimit appear in seconds instead of HH:MM:SS format and I was hoping the JSON/YAML format could do that.
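
In the meantime, one way to get seconds from the existing text output would be to post-process the %M/%l strings. The sketch below is only an illustration: it assumes the value has the "days-hours:minutes:seconds" shape described in the squeue man page (days and hours printed only as needed), and the helper name `duration_to_seconds` is made up for this example. Non-numeric values such as UNLIMITED are not handled and raise ValueError.

```python
def duration_to_seconds(text):
    """Convert 'M:SS', 'H:MM:SS' or 'D-HH:MM:SS' to a number of seconds."""
    text = text.strip()
    days = 0
    if "-" in text:
        # Optional leading "days-" component.
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    if len(parts) > 3:
        raise ValueError("unexpected duration format")
    # Left-pad so that "M:SS" becomes [0, M, SS].
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For example, `duration_to_seconds("1-02:03:04")` gives 93784 and `duration_to_seconds("1:30")` gives 90.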

All the best,
Chris
Comment 5 Jason Booth 2022-06-01 13:48:24 MDT
Chris,

I appreciate your patience while we reviewed this.

As I am sure you are aware, TimeUsed doesn't exist in the json/yaml output since it's not a "real" field, but something calculated within squeue.

For running jobs, it's (now - start). It would be trivial to recreate in your script (the entire logic is below).

If you want it added directly to the output, we could look at adding it; however, the earliest Slurm release that could include it would be 23.02. Let me know if the example below satisfies your request, or if more is needed.


>        if ((job_ptr->start_time == 0) || IS_JOB_PENDING(job_ptr))
>                return 0L;
>
>        if (IS_JOB_SUSPENDED(job_ptr))
>                return (long) job_ptr->pre_sus_time;
>
>        if (IS_JOB_RUNNING(job_ptr) || (job_ptr->end_time == 0))
>                end_time = time(NULL);
>        else
>                end_time = job_ptr->end_time;
>
>        if (job_ptr->suspend_time)
>                return (long) (difftime(end_time, job_ptr->suspend_time)
>                                + job_ptr->pre_sus_time);
>        return (long) (difftime(end_time, job_ptr->start_time));
>}
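
A script-side version of the quoted logic could be sketched in Python as follows. This is a rough port under stated assumptions, not Slurm code: it assumes each job is a dict whose keys mirror the C struct members (`start_time`, `end_time`, `suspend_time`, `pre_sus_time`, all epoch seconds or second counts) and whose `job_state` is a plain string such as "RUNNING". The key names and the string form of the state should be checked against the actual JSON produced by the installed Slurm version.

```python
import time


def job_time_used(job, now=None):
    """Port of the quoted squeue time-used logic for a job given as a dict."""
    state = job.get("job_state", "")
    start = job.get("start_time", 0)
    end = job.get("end_time", 0)
    suspend = job.get("suspend_time", 0)
    pre_sus = job.get("pre_sus_time", 0)

    # Not started yet: nothing used.
    if start == 0 or state == "PENDING":
        return 0
    # Currently suspended: only the run time accumulated before suspension.
    if state == "SUSPENDED":
        return int(pre_sus)
    # Still running (or no end recorded): measure up to the current time.
    if state == "RUNNING" or end == 0:
        end = int(time.time()) if now is None else now
    # Suspended at some point: time since the recorded suspend_time
    # plus the run time accumulated before that.
    if suspend:
        return int(end - suspend + pre_sus)
    return int(end - start)
```

A running job that started at epoch 1000 and is inspected at epoch 1600 would report 600 seconds; a finished job uses its recorded end_time instead of the current time.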
Comment 6 Jason Booth 2022-06-07 11:30:34 MDT
Resolving as timed out. Feel free to re-open if you would like to follow up.