Ticket 9956

Summary: RAPL plugin: incorrect *Watts and ConsumedEnergy values
Product: Slurm
Reporter: Alexey Kozlov <alexey.kozlov>
Component: Accounting
Assignee: Oriol Vilarrubi <jvilarru>
Status: OPEN
QA Contact: Tim Wickberg <tim>
Severity: 4 - Minor Issue
Priority: ---
CC: mahendra.paipuri, markus.hilger, sts, tim, uemit.seren
Version: 21.08.x
Hardware: Linux
OS: Linux
Site: -Other-
Attachments: proposed patch

Description Alexey Kozlov 2020-10-07 15:57:18 MDT
The AcctGatherEnergy RAPL plugin uses the same energy unit for all CPU packages and DRAM domains:

https://github.com/SchedMD/slurm/blob/master/src/plugins/acct_gather_energy/rapl/acct_gather_energy_rapl.c#L326

However, on many modern server architectures (Haswell, Skylake X/SP, CascadeLake SP), the DRAM energy unit is distinct from the package energy unit stored in the MSR_RAPL_POWER_UNIT register. Instead, it has a fixed value of 15.3 µJ per count (the constant 15300 in the Linux driver linked below).

The (gloomy) situation becomes clear when looking at the Linux powercap driver code, which gives correct measurements:    

https://github.com/torvalds/linux/blob/master/drivers/powercap/intel_rapl_common.c#L964

https://github.com/torvalds/linux/blob/master/drivers/powercap/intel_rapl_common.c#L1017

So apparently, the only viable solution is to check the CPU model and set the DRAM energy unit accordingly.
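A minimal sketch of that model check, for illustration only and not taken from the attached patch. The function names are hypothetical, and the model list (0x3f for Haswell server, 0x55 for Skylake X/SP and CascadeLake SP) is an assumption that covers only the architectures named in this ticket; a real fix would need the full list from the Linux driver. The package unit is decoded from bits 12:8 of MSR_RAPL_POWER_UNIT as 1/2^ESU joules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed DRAM energy unit on the affected server parts: 15.3 uJ per count. */
#define DRAM_FIXED_UNIT_J 15.3e-6

/* Joules per count from a raw MSR_RAPL_POWER_UNIT value.
 * Bits 12:8 hold the Energy Status Units field (ESU); unit = 1 / 2^ESU J. */
static double pkg_energy_unit(uint64_t power_unit_msr)
{
    unsigned esu = (unsigned)((power_unit_msr >> 8) & 0x1f);
    return 1.0 / (double)(1ULL << esu);
}

/* Hypothetical helper: true for CPU models assumed to use the fixed
 * DRAM unit. Illustrative and NOT exhaustive. */
static bool dram_uses_fixed_unit(unsigned family, unsigned model)
{
    if (family != 6)
        return false;
    switch (model) {
    case 0x3f: /* Haswell server (assumed entry) */
    case 0x55: /* Skylake X/SP, CascadeLake SP (assumed entry) */
        return true;
    default:
        return false;
    }
}

/* Pick the DRAM energy unit: fixed on listed models, package unit otherwise. */
static double dram_energy_unit(unsigned family, unsigned model,
                               uint64_t power_unit_msr)
{
    return dram_uses_fixed_unit(family, model)
        ? DRAM_FIXED_UNIT_J
        : pkg_energy_unit(power_unit_msr);
}

/* Convert a raw energy counter delta to joules with the chosen unit. */
static double counts_to_joules(uint32_t counts, double unit_j)
{
    assert(unit_j > 0.0); /* a zero unit would silently report no energy */
    return (double)counts * unit_j;
}
```

The DRAM unit is kept separate from the package unit so that each domain's counter is scaled by its own unit.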

As a result of this bug, AcctGatherEnergy reports incorrect power and energy values; in my experiments they were usually inflated by 30%-50%.
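To see where an inflation of this size can come from, here is a rough sketch under stated assumptions: the package unit is taken as 1/2^14 J (a common but not universal value), the true DRAM unit as 15.3 µJ, and the wattages in the usage note are made-up inputs, not measurements. The function names are hypothetical.

```c
#include <assert.h>

/* Total power a reader would report if DRAM counts were scaled with the
 * package unit instead of the DRAM unit. All inputs are true values. */
static double reported_total_watts(double pkg_w, double dram_w,
                                   double pkg_unit_j, double dram_unit_j)
{
    assert(dram_unit_j > 0.0);
    /* DRAM power is overstated by the ratio of the two units. */
    return pkg_w + dram_w * (pkg_unit_j / dram_unit_j);
}

/* Relative overestimate of the total, in percent. */
static double inflation_pct(double pkg_w, double dram_w,
                            double pkg_unit_j, double dram_unit_j)
{
    double truth = pkg_w + dram_w;
    assert(truth > 0.0);
    return 100.0 * (reported_total_watts(pkg_w, dram_w, pkg_unit_j,
                                         dram_unit_j) - truth) / truth;
}
```

With these assumed units the DRAM share is overstated by a factor of about 4, so a hypothetical node drawing 100 W in the packages and 15 W in DRAM would be reported roughly 39% too high: `inflation_pct(100.0, 15.0, 1.0 / 16384.0, 15.3e-6)`.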
Comment 3 Alexey Kozlov 2020-10-12 12:55:22 MDT
Created attachment 16196 [details]
proposed patch

This patch fixes multiple bugs/issues in power computation:

- CurrentWatts: using the CPU package energy unit for the DRAM domain produced wrong values on many systems (Intel Haswell/Skylake/CascadeLake)

- CurrentWatts: the same energy unit was used for all packages -> this might work for now, but could break at any time

- AveWatts: incorrect value due to missing normalization by the polling interval

- AveWatts: inaccurate value due to using an integer type to compute the running average (at some point the contribution of the current measurement becomes <1.0 -> AveWatts is frozen)
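The two AveWatts issues can be illustrated with a generic sketch. This is not the plugin's code or the patch; the function names are hypothetical, and the incremental-mean formula is an assumption about how such a running average is typically computed.

```c
#include <assert.h>
#include <stdint.h>

/* Power over one polling interval: energy delta divided by elapsed
 * seconds. Skipping the division yields joules per poll, not watts. */
static double interval_watts(double joules_prev, double joules_now,
                             double interval_s)
{
    assert(interval_s > 0.0);
    return (joules_now - joules_prev) / interval_s;
}

/* Running mean in integer arithmetic. Once n is large, a higher sample
 * adds less than 1 to the mean and integer division truncates it away. */
static uint32_t ave_update_int(uint32_t ave, uint32_t n, uint32_t sample)
{
    return (uint32_t)(((uint64_t)ave * n + sample) / ((uint64_t)n + 1));
}

/* The same running mean in floating point keeps the fractional part. */
static double ave_update_double(double ave, uint32_t n, double sample)
{
    return ave + (sample - ave) / (double)(n + 1);
}
```

For example, after 1000 readings averaging 100 W, a new 150 W sample leaves the integer mean at 100, while the floating-point mean moves slightly above 100.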
Comment 4 Mahendra Paipuri 2024-07-15 04:04:48 MDT
Hello,

Is there any reason why this issue never got attention? The bug still exists in the RAPL plugin, and because of it the energy consumption reported by SLURM is significantly overestimated compared to the actual values. Here is a little [report](https://gist.github.com/mahendrapaipuri/bcd357747d32073e3cb4622940db408b) on the bug.
Comment 6 Oriol Vilarrubi 2024-07-29 07:11:02 MDT
Hello Mahendra,

I am looking at how best to integrate this patch into the current Slurm version. Your report is very useful, many thanks.