Created attachment 1099: sacct_cputime_clarification.diff

The sacct manpage reads:

    cputime
        Formatted number of cpu seconds a process was allocated.
    cputimeraw
        How much cpu time process was allocated in second format, not formatted like above.

but the code says:

    case PRINT_CPU_TIME:
        switch(type) {
        case JOB:
            tmp_uint64 = (uint64_t)job->elapsed * (uint64_t)job->alloc_cpus;
            break;
        case JOBSTEP:
            tmp_uint64 = (uint64_t)step->elapsed * (uint64_t)step->ncpus;
            break;

Note that "allocated" is confusing. The sacct code says elapsed. The actual code at https://github.com/SchedMD/slurm/blob/f8025c1484838ecbe3e690fa565452d990123361/src/plugins/priority/multifactor/priority_multifactor.c#L1041 calculates raw usage as the entire cputime up to the timelimit, whether it is used or not (we like that behavior, by the way).

I propose that the sacct manpage be clarified through the patch I am submitting. I chose not to say "cputime" because it isn't clear whether that means the cputime Linux might report (i.e. actual CPU usage * time) or simply CPUs * time.
Okay, so it appears I messed up in my understanding of the code in one way. I said that the charge is (timelimit * cpus). It is actually (elapsed * cpus). Some of the similarly named variables (cpu_run_delta vs run_delta vs run_decay) all in close proximity seem to have gotten combined in my brain... However, that doesn't affect the need for the manpage clarification. The patch is still valid.
In Slurm, CPU time always refers to elapsed time. I think "allocated" is correct because it is the time that has been allocated to the job/step multiplied by the number of CPUs. The same is true for GrpCPURunMins and GrpCPUMins, which always refer to elapsed time. I think this concept comes from parallel job computing, where times and speedup are always measured as elapsed/wall-clock time. We should explain what the concept is... somewhere.

David
Okay. It's just that "elapsed" and "allocated" are different in my mind. You may be allocated 4 CPUs but only use 1 (not threading properly). Likewise, you may be allocated 1 hour (timelimit) but only run for 5 minutes. If you don't think it needs changing that's fine too. It just seemed ambiguous.
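To put numbers on the distinction, here is a minimal sketch in C. The function names are hypothetical and not part of Slurm; the first one only mirrors the elapsed * alloc_cpus expression quoted from sacct above, and the second shows what the figure would be if the time limit were charged instead.

```c
#include <stdint.h>

/* Hypothetical helper, not Slurm code: mirrors the sacct expression
 * (uint64_t)elapsed * (uint64_t)alloc_cpus quoted in this report. */
uint64_t cputime_raw(uint32_t elapsed_secs, uint32_t alloc_cpus)
{
	return (uint64_t)elapsed_secs * (uint64_t)alloc_cpus;
}

/* Hypothetical helper, for contrast only: CPU-seconds if the whole
 * time limit were counted rather than the elapsed time. */
uint64_t timelimit_cpu_secs(uint32_t timelimit_secs, uint32_t alloc_cpus)
{
	return (uint64_t)timelimit_secs * (uint64_t)alloc_cpus;
}
```

For a job allocated 4 CPUs with a 1 hour time limit that runs for 5 minutes, cputime_raw(300, 4) gives 1200 CPU-seconds, while timelimit_cpu_secs(3600, 4) gives 14400. If only one of the four CPUs was actually busy, the CPU time Linux would report is at most about 300 seconds. Three different numbers could all be read as "allocated", which is why the wording seemed ambiguous.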
I agree with you that the concept is not explained well. The Slurm way of thinking is that you have been allocated x resources, and whatever you do on them, run or not, is your business, so every second that passes is counted by Slurm as cpu time/allocated time. In my opinion those parameters should not be called cputime and cputimeraw. They should be called elapsed/wall-clock time, or time since the allocation started, to match the universally accepted meaning of cputime as the time consumed by actually using the CPU.

I think that your patch clarifies things a bit. Committed d1b0dfd6d5 with minor change.

David