I have a simple test script which does a bit of memory allocation just to generate some user and system CPU usage -

#!/bin/bash
#SBATCH -N 3
#SBATCH -n 40
#SBATCH -J mytest
#SBATCH -A aaa
#SBATCH -t 100
#SBATCH --mem=200G
hostname
sleep 2
srun -n3 ./memalloc
srun -n3 ./memalloc
srun -n3 ./memalloc

After the script is run, I use sacct to check the CPU usage stats -

[jlong@iforgehn2 testing]$ sacct -j 175 --format jobid,state,user,NNodes,partition,TotalCPU,SystemCPU,UserCPU
       JobID      State      User   NNodes  Partition   TotalCPU  SystemCPU    UserCPU
------------ ---------- --------- -------- ---------- ---------- ---------- ----------
175           COMPLETED     jlong        3     normal  00:52.695  00:43.278  00:09.417
175.batch     COMPLETED                  1             00:00.098  00:00.056  00:00.042
175.0         COMPLETED                  3             00:17.540  00:14.363  00:03.177
175.1         COMPLETED                  3             00:17.527  00:14.452  00:03.075
175.2         COMPLETED                  3             00:17.527  00:14.407  00:03.120

The usage for the job allocation gets reported as the total for all of the job steps, as expected. However, when I add the -X argument to see just the job allocation stats, the CPU usage stats suddenly get reported as zero -

[jlong@iforgehn2 testing]$ sacct -j 175 -X --format jobid,state,user,NNodes,partition,TotalCPU,SystemCPU,UserCPU
       JobID      State      User   NNodes  Partition   TotalCPU  SystemCPU    UserCPU
------------ ---------- --------- -------- ---------- ---------- ---------- ----------
175           COMPLETED     jlong        3     normal   00:00:00   00:00:00   00:00:00
Jim, Thanks for bringing this to our attention. I'm looking into it now and will update you with any progress. - Jeff
Tim,

From the sacct man page:

> -X, --allocations
>     Only show statistics relevant to the job allocation itself, not taking steps into consideration.

Allocations/jobs don't actually run anything on the node; steps do. You can't get any CPU utilization stats without steps. This is functioning as intended.
I'm a little confused here then. Why do I get CPU usage numbers on the first non-header line when the -X argument is not used? Isn't that reflecting the stats for the job/allocation? How/why is that different than what is reported with -X? I guess what I am asking is - why doesn't the first non-header line match whether or not the -X argument is used?
Jim,

(In reply to Jim Long from comment #3)
> I guess what I am asking is - why doesn't the first non-header line match
> whether or not the -X argument is used?

Yes, it is a bit confusing. It has to do with the way the code queries the database. After it gets all of the steps for a job, it aggregates all of the information and reports that as the total for the job. With the -X option, no steps are pulled from the database, so it has nothing to aggregate.

> -X, --allocations
>     Only show statistics relevant to the job allocation itself, not taking steps into consideration.

I thought I was taking "not taking steps into consideration" too literally by assuming it would entirely ignore stats from steps, but that's exactly what the intention is. This code can be found in the file src/sacct/options.c:get_data() if you want to see it.

Does that all make sense?

- Jeff
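The aggregation described above can be approximated on the client side. What follows is a minimal sketch, not the code in src/sacct/options.c: it reads "JobID|TotalCPU" rows (the layout produced by sacct with --parsable2 and --noheader), sums the step rows, and prints one total in seconds per job. The function name sum_step_cpu is made up for illustration, and the time parsing assumes values of the form [DD-][HH:]MM:SS[.mmm], which is what the output in this report shows.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): sum per-step TotalCPU values
# into one figure per job allocation.
# Input on stdin: one "JobID|TotalCPU" row per line.
sum_step_cpu() {
    awk -F'|' '
    # Convert [DD-][HH:]MM:SS[.mmm] to seconds (assumed format).
    function to_sec(t,    days, n, p, s, i) {
        days = 0
        if (index(t, "-") > 0) {
            split(t, p, "-")
            days = p[1]
            t = p[2]
        }
        n = split(t, p, ":")
        s = 0
        for (i = 1; i <= n; i++)
            s = s * 60 + p[i]
        return days * 86400 + s
    }
    NF >= 2 {
        # Step rows carry a dot in the JobID (175.batch, 175.0, ...).
        split($1, id, ".")
        job = id[1]
        if (index($1, ".") > 0)
            total[job] += to_sec($2)   # step row: add to the job total
        else
            total[job] += 0            # allocation row: register the job only
    }
    END {
        # One line per job: "<jobid> <seconds>"; order is unspecified.
        for (job in total)
            printf "%s %.3f\n", job, total[job]
    }
    '
}
```

Usage would look like `sacct -j 175 --parsable2 --noheader --format jobid,TotalCPU | sum_step_cpu`. If the input contains only the allocation row, as with -X, there are no step rows to add up and the total stays at zero, which mirrors the behaviour discussed here. Because the sum is taken over the rounded per-step values, it can differ from the total sacct prints by a few milliseconds.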
It makes sense, but it's not very useful. There's no way to get total CPU stats without displaying all of the steps too. Perhaps there should be an aggregate option. From a billing perspective I'm probably not interested in the steps, but might be interested in total resource usage.
Jim, We have documented this special case and it will appear on the man page for future releases. I will go ahead and close this now. Don't hesitate to reach out if you have any questions. - Jeff