Ticket 1005

Summary: sacct cputime documentation
Product: Slurm Reporter: Ryan Cox <ryan_cox>
Component: Accounting Assignee: David Bigagli <david>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: da
Version: 14.03.6   
Hardware: Linux   
OS: Linux   
Site: BYU - Brigham Young University Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 14.03.7 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: sacct_cputime_clarification.diff

Description Ryan Cox 2014-07-31 09:15:12 MDT
Created attachment 1099
sacct_cputime_clarification.diff

The sacct manpage reads:
              cputime   Formatted number of cpu seconds a process was allocated.

              cputimeraw
                        How much cpu time process was allocated in second format, not formatted like above.



but the code says:
                case PRINT_CPU_TIME:
                        switch(type) {
                        case JOB:
                                tmp_uint64 = (uint64_t)job->elapsed
                                        * (uint64_t)job->alloc_cpus;
                                break;
                        case JOBSTEP:
                                tmp_uint64 = (uint64_t)step->elapsed
                                        * (uint64_t)step->ncpus;
                                break;



Note that "allocated" is confusing, since the sacct code uses elapsed time.  The priority code at https://github.com/SchedMD/slurm/blob/f8025c1484838ecbe3e690fa565452d990123361/src/plugins/priority/multifactor/priority_multifactor.c#L1041 calculates raw usage as the entire cputime up to the timelimit, whether it is used or not (we like that behavior, by the way).

I propose that the sacct manpage be clarified through the patch I am submitting.  I chose not to say "cputime" since it isn't clear whether that means the cputime Linux might report (i.e. actual CPU usage * time) or simply CPUs * time.
Comment 1 Ryan Cox 2014-08-01 05:25:15 MDT
Okay, so it appears I messed up in my understanding of the code in one way.  I said that the charge is (timelimit * cpus).  It is actually (elapsed * cpus).  Some of the similarly named variables (cpu_run_delta vs run_delta vs run_decay) all in close proximity seem to have gotten combined in my brain...

However, that doesn't affect the need for the manpage clarification.  The patch is still valid.
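
For illustration, here is a minimal sketch of the (elapsed * cpus) calculation discussed above. It is not the Slurm source: the function names cputime_raw and format_cputime are hypothetical, and the [D-]HH:MM:SS layout is only an assumption about how a formatted value might look.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: CPU seconds as elapsed wall-clock seconds
 * multiplied by the number of CPUs held, regardless of how busy
 * those CPUs actually were. */
uint64_t cputime_raw(uint32_t elapsed_secs, uint32_t alloc_cpus)
{
    /* Widen before multiplying so large jobs do not overflow 32 bits. */
    return (uint64_t)elapsed_secs * (uint64_t)alloc_cpus;
}

/* Hypothetical formatter (not Slurm's own): writes [D-]HH:MM:SS. */
void format_cputime(uint64_t secs, char *buf, size_t len)
{
    unsigned long long days = secs / 86400;
    unsigned hours = (unsigned)((secs % 86400) / 3600);
    unsigned mins = (unsigned)((secs % 3600) / 60);
    unsigned rem = (unsigned)(secs % 60);

    if (days > 0)
        snprintf(buf, len, "%llu-%02u:%02u:%02u", days, hours, mins, rem);
    else
        snprintf(buf, len, "%02u:%02u:%02u", hours, mins, rem);
}
```

With these definitions, a job that ran for 300 seconds on 4 CPUs gives cputime_raw(300, 4) == 1200, which the formatter renders as 00:20:00.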
Comment 2 David Bigagli 2014-08-01 07:01:12 MDT
In Slurm, CPU time always refers to elapsed time. I think that "allocated"
is correct because it is the cputime that has been allocated to the job/step
times the number of cpus. This is also true for GrpCPURunMins and GrpCPUMins,
which always refer to elapsed time. I think this concept comes from parallel job computing, where times and speedup are always considered as elapsed/wall-clock times.

We should explain what the concept is... somewhere.

David
Comment 3 Ryan Cox 2014-08-01 07:15:14 MDT
Okay.  It's just that "elapsed" and "allocated" are different in my mind.  You may be allocated 4 CPUs but only use 1 (not threading properly).  Likewise, you may be allocated 1 hour (timelimit) but only run for 5 minutes.

If you don't think it needs changing that's fine too.  It just seemed ambiguous.
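
To make the distinction concrete, here is a small sketch that separates the three quantities being conflated. The struct and function names are hypothetical, and modelling "consumed" time as elapsed * busy CPUs is a deliberate simplification, not how Linux or Slurm measures it.

```c
#include <stdint.h>

/* Hypothetical description of one job, for illustration only. */
struct job_sample {
    uint32_t alloc_cpus;      /* CPUs allocated to the job */
    uint32_t timelimit_secs;  /* requested time limit */
    uint32_t elapsed_secs;    /* wall-clock time the job actually ran */
    uint32_t busy_cpus;       /* CPUs the job kept busy (simplified) */
};

/* What the sacct code computes: elapsed * allocated CPUs. */
uint64_t elapsed_cpu_secs(const struct job_sample *j)
{
    return (uint64_t)j->elapsed_secs * j->alloc_cpus;
}

/* What "allocated" could be read as: timelimit * allocated CPUs. */
uint64_t timelimit_cpu_secs(const struct job_sample *j)
{
    return (uint64_t)j->timelimit_secs * j->alloc_cpus;
}

/* What the OS would report as consumed CPU time (assumed model). */
uint64_t consumed_cpu_secs(const struct job_sample *j)
{
    return (uint64_t)j->elapsed_secs * j->busy_cpus;
}
```

For a job allocated 4 CPUs with a 1 hour timelimit that runs for 5 minutes and keeps only 1 CPU busy, these give 1200, 14400, and 300 CPU seconds respectively.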
Comment 4 David Bigagli 2014-08-01 08:36:48 MDT
I agree with you that the concept is not explained well.

The Slurm way of thinking is that you have been allocated x resources, and whatever you do on them, run or not, is your business, so every second that passes Slurm counts as cpu time/allocated time.

In my opinion those parameters should not be called cputime and cputimeraw; they should be called elapsed/wall-clock time, or time since the allocation started, to match the universally accepted meaning of cputime as the time consumed by actually using the CPU.

I think that your patch clarifies things a bit. Committed d1b0dfd6d5 with a minor
change.

David