Ticket 20208

Summary: Empty ConsumedEnergyRaw values from sacct
Product: Slurm Reporter: Ole.H.Nielsen <Ole.H.Nielsen>
Component: Accounting Assignee: Felip Moll <felip.moll>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: felip.moll
Version: 23.11.8   
Hardware: Linux   
OS: Linux   
Site: DTU Physics Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- NoveTech Sites: ---
Nvidia HWinf-CS Sites: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Ole.H.Nielsen@fysik.dtu.dk 2024-06-20 02:52:18 MDT
This issue may be related to bug 20207.

We have enabled AcctGatherEnergyType=acct_gather_energy/ipmi in slurm.conf and I started to test some accounting with sacct printing the ConsumedEnergyRaw values.  The ConsumedEnergyRaw mostly looks sensible, but we have found some empty values like in this example:

/usr/bin/sacct --user <omitted> --partition a100 -np -X -S 061424 -E 061924 -o JobID,Group,Partition,AllocNodes,AllocCPUS,Submit,Eligible,Start,End,CPUTimeRAW,State,Nodelist,ConsumedEnergyRaw  -s to
7393882|catvip|a100|1|32|14-Jun-2024_15:02|14-Jun-2024_15:02|14-Jun-2024_15:02|14-Jun-2024_15:14|23104|TIMEOUT|sd651||
7393903|catvip|a100|1|32|14-Jun-2024_15:16|14-Jun-2024_15:16|14-Jun-2024_15:16|14-Jun-2024_19:16|461088|TIMEOUT|sd651||
7403327|catvip|a100|1|32|17-Jun-2024_12:47|17-Jun-2024_12:47|17-Jun-2024_13:03|18-Jun-2024_13:04|2765216|TIMEOUT|sd652||

This was found both with 23.11.7 and after upgrading to 23.11.8 at this time:

# rpm -qi slurm-slurmd | grep Install
Install Date: Mon 17 Jun 2024 11:37:04 AM CEST

Other jobs on these nodes report non-empty values for ConsumedEnergyRaw.  I can't see any reason offhand for this behavior.
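
One way to list the affected jobs is to post-process the parsable sacct output. The following is a minimal Python sketch, not part of the original report: it assumes the field order of the `-o` list used above and the `-p` behavior of terminating every field with `|`. The names `FIELDS`, `parse_sacct_line` and `jobs_missing_energy` are hypothetical helpers introduced for illustration.

```python
# Field order is an assumption: it mirrors the -o list of the sacct command above.
FIELDS = [
    "JobID", "Group", "Partition", "AllocNodes", "AllocCPUS", "Submit",
    "Eligible", "Start", "End", "CPUTimeRAW", "State", "Nodelist",
    "ConsumedEnergyRaw",
]


def parse_sacct_line(line):
    """Turn one line of `sacct -np` output into a dict keyed by field name."""
    line = line.rstrip("\n")
    # With -p every field is followed by '|', so remove exactly one trailing
    # delimiter. rstrip('|') would also swallow an empty last field.
    if line.endswith("|"):
        line = line[:-1]
    values = line.split("|")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def jobs_missing_energy(lines):
    """Return (JobID, Nodelist) for every job whose ConsumedEnergyRaw is empty."""
    missing = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        job = parse_sacct_line(line)
        if job["ConsumedEnergyRaw"] == "":
            missing.append((job["JobID"], job["Nodelist"]))
    return missing


# The three lines quoted in the description above.
SAMPLE = [
    "7393882|catvip|a100|1|32|14-Jun-2024_15:02|14-Jun-2024_15:02|14-Jun-2024_15:02|14-Jun-2024_15:14|23104|TIMEOUT|sd651||",
    "7393903|catvip|a100|1|32|14-Jun-2024_15:16|14-Jun-2024_15:16|14-Jun-2024_15:16|14-Jun-2024_19:16|461088|TIMEOUT|sd651||",
    "7403327|catvip|a100|1|32|17-Jun-2024_12:47|17-Jun-2024_12:47|17-Jun-2024_13:03|18-Jun-2024_13:04|2765216|TIMEOUT|sd652||",
]

if __name__ == "__main__":
    for job_id, node in jobs_missing_energy(SAMPLE):
        print(job_id, node)
```

Feeding the sacct output to `jobs_missing_energy` line by line gives the job IDs and nodes to compare against jobs on the same nodes that do report a value.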
Comment 1 Felip Moll 2024-06-21 03:04:43 MDT
Hi Ole,

If you don't mind, let's work on ticket 20207, which seems to be the same issue.
I am closing this one and will respond in the other ticket.

*** This ticket has been marked as a duplicate of ticket 20207 ***
Comment 2 Ole.H.Nielsen@fysik.dtu.dk 2024-06-21 03:05:00 MDT
I'm out of the office, back on June 24.

Best regards,
Ole Holm Nielsen