Ticket 22319 - tres_usage_in mapping
Summary: tres_usage_in mapping
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 24.11.3
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Ethan Simmons
 
Reported: 2025-03-11 14:04 MDT by Michael DiDomenico
Modified: 2025-03-13 10:31 MDT

Site: IDACCR


Description Michael DiDomenico 2025-03-11 14:04:02 MDT
in the contributed seff module this line of perl code exists

    if (exists $step->{'stats'} && exists $step->{'stats'}{'tres_usage_in_tot'}) {

i'm trying to build a tool for job efficiency which kind of does what seff does, but using the --json output from sacct.  however, i cannot figure out how the tres_usage_in data is produced.

when i look at the sacct json for a job step, i see something like (sorry can't cut/paste)

{
  "job_id": 1,
  "steps": [
    {
      "tres": {
        "requested": {
          "max": [],
          "min": [],
          "average": [],
          "total": []
        },
        "consumed": {
          "max": [],
          "min": [],
          "average": [],
          "total": []
        },
        "allocated": []
      }
    }
  ]
}
    
under each of the max/min/avg/tot there are subs for cpu/disk/mem/energy/pages
but it seems inconsistent. i naively would think i could look in the consumed section of a step and compare that to the allocated/requested part and see how efficient a job was based on the resources asked for

however, in my steps the consumed part only lists energy/disk, no cpu/mem/nodes/pages/etc
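
the comparison described above could be sketched roughly like this. this is only a sketch under assumptions: it assumes each tres entry is an object with "type", "name" and "count" keys, the function names are made up for illustration, and the sample values are invented. it also assumes consumed and allocated counts share a unit, which may not hold for every tres type.

```python
def tres_counts(entries):
    """Map a list of TRES entries to {"type" or "type/name": count}.

    Assumption: each entry looks like {"type": ..., "name": ..., "count": ...}.
    """
    counts = {}
    for entry in entries:
        key = entry["type"]
        if entry.get("name"):
            key = f'{key}/{entry["name"]}'
        counts[key] = entry.get("count")
    return counts


def compare_step(step):
    """Return (ratios, missing) for one step dict.

    ratios:  consumed total / allocated, per TRES present in both lists.
    missing: TRES that are allocated but absent from consumed.total.
    The ratio is only meaningful if both counts use the same unit.
    """
    tres = step.get("tres", {})
    allocated = tres_counts(tres.get("allocated", []))
    consumed = tres_counts(tres.get("consumed", {}).get("total", []))
    ratios = {}
    missing = []
    for key, alloc in allocated.items():
        used = consumed.get(key)
        if used is None:
            missing.append(key)
        elif alloc:
            ratios[key] = used / alloc
    return ratios, sorted(missing)


# Hypothetical step with invented values: consumed only carries energy and
# fs/disk, which mirrors the situation described in this ticket.
step = {
    "tres": {
        "allocated": [
            {"type": "cpu", "name": "", "count": 4},
            {"type": "mem", "name": "", "count": 8192},
            {"type": "node", "name": "", "count": 1},
        ],
        "consumed": {
            "total": [
                {"type": "energy", "name": "", "count": 120},
                {"type": "fs", "name": "disk", "count": 2048},
            ]
        },
    }
}

ratios, missing = compare_step(step)
print("ratios:", ratios)
print("allocated but not in consumed:", missing)
```

with data shaped like mine, nothing can be compared and cpu/mem/node all end up in the "missing" list.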

so my question is, is something missing from the json output from sacct?  if not can you tell me where/how the calculations for 'tres_usage' are done
Comment 1 Ethan Simmons 2025-03-13 10:04:59 MDT
Please upload your slurm.conf, as this contains information about what information is tracked.
Comment 2 Michael DiDomenico 2025-03-13 10:31:09 MDT
(In reply to Ethan Simmons from comment #1)
> Please upload your slurm.conf, as this contains information about what
> information is tracked.

i can't do that, the environment is on a protected network.  however, i believe what you're looking for is

accountingstoragetres = cpu,mem,energy,node,billing,fs/disk,vmem,pages,gres/gpu,gres/gpumem,gres/gpuutil
jobacctgatherfrequency=task=50,energy=60,network=60,filesystem=60
jobacctgathertype=jobacct_gather/cgroup

i did poke at this a little more after i opened the ticket.  it seems like what's actually missing from the JSON is the TRESUsage fields.  when i run

sacct --helpformat | grep TRES

i can see a list of TRESUsageIn/Out fields, and when i run something like 'sacct -o jobid,tresusageinmax' i do see values.  however, those fields don't seem to end up natively in the --json output for sacct, or at least i don't understand how to calculate them from it
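
as a stopgap, the text output could be parsed instead of the json. a minimal sketch, assuming the rows come from something like 'sacct -P -n -o jobid,tresusageinmax' (pipe-delimited, no header) and that each TRES field is a comma-separated list of name=value pairs. the function names and the sample rows are made up for illustration, and values are kept as raw strings because the units differ per TRES.

```python
def parse_tres_string(text):
    """Split 'cpu=00:00:01,mem=4K' into {'cpu': '00:00:01', 'mem': '4K'}."""
    result = {}
    for item in text.split(","):
        if not item:
            continue  # empty field, e.g. a step with no usage recorded
        name, _, value = item.partition("=")
        result[name.strip()] = value.strip()
    return result


def parse_sacct_lines(lines):
    """Parse 'JobID|TRESUsageInMax' rows into {jobid: {tres_name: value}}."""
    usage = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        jobid, _, tres = line.partition("|")
        usage[jobid] = parse_tres_string(tres)
    return usage


# Invented sample rows, not real sacct output.
sample = [
    "1.batch|cpu=00:00:02,energy=0,fs/disk=2048,mem=4096K",
    "1.0|",
]

usage = parse_sacct_lines(sample)
for jobid, fields in usage.items():
    print(jobid, fields)
```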

ideally, every field listed in --helpformat for sacct should be in the --json output