Ticket 7489 - Extend acct_gather_energy_rapl to support AMD Zen
Summary: Extend acct_gather_energy_rapl to support AMD Zen
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 21.08.8
Hardware: Linux Linux
Severity: 5 - Enhancement
Assignee: Dominik Bartkiewicz
Reported: 2019-07-31 06:00 MDT by Jurij Pečar
Modified: 2024-01-15 05:18 MST
CC List: 6 users

See Also:
Site: EMBL


Attachments
amd epyc support for acct_gather_energy_rapl (9.21 KB, patch)
2023-11-17 05:07 MST, Oliver Smith

Description Jurij Pečar 2019-07-31 06:00:59 MDT
Hi,

we started exploring energy accounting options and noticed that our Epyc nodes show n/a for currentwatts, lowestjoules and consumedjoules. I took a look at how this is implemented and saw that you read values directly from /dev/cpu/*/msr.

Maybe this project can provide some hints for extending that to work on Zen as well:
https://github.com/djselbeck/rapl-read-ryzen

Thanks.

PS. Hope that GPU power will be included in this number in the not too distant future too :)
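
For reference, the MSR-based approach mentioned in the description could be sketched roughly as follows. This is a minimal illustration, not the plugin's code: the register addresses are the AMD family 17h ones used by the linked rapl-read-ryzen tool, the helper names are hypothetical, and the 32-bit counter width is an assumption that may not hold on every model.

```python
import os
import struct

# AMD family 17h (Zen) RAPL registers, as used by rapl-read-ryzen.
MSR_AMD_RAPL_POWER_UNIT = 0xC0010299
MSR_AMD_CORE_ENERGY_STATUS = 0xC001029A
MSR_AMD_PKG_ENERGY_STATUS = 0xC001029B


def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR via /dev/cpu/<cpu>/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_raw: int) -> float:
    """Bits 12:8 hold the energy status unit (ESU); one count is 1/2**ESU joules."""
    esu = (power_unit_raw >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(before: int, after: int, unit: float,
                        counter_bits: int = 32) -> float:
    """Joules between two raw readings, tolerating a single counter wrap.

    counter_bits=32 is an assumption; adjust it if the hardware differs.
    """
    mask = (1 << counter_bits) - 1
    return ((after - before) & mask) * unit
```

On a real node one would read MSR_AMD_RAPL_POWER_UNIT once, then sample MSR_AMD_PKG_ENERGY_STATUS at two points in time and pass both raw values to energy_delta_joules. Dividing the result by the elapsed seconds gives average watts.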
Comment 1 Jason Booth 2019-07-31 09:37:03 MDT
Hi Jurij, and thank you for pointing this out. We will evaluate this and see what is possible.
Comment 4 Alan Sill 2021-05-08 09:33:04 MDT
I believe you'll need a kernel of version 5.8 or higher to read AMD Zen power through RAPL. Important fixes were also introduced in version 5.11 of the kernel.

The patches were originally introduced by Google and eventually merged:

https://lore.kernel.org/lkml/20200515215733.20647-1-eranian@google.com/#r
https://lore.kernel.org/lkml/20200601155437.GA1042527@gmail.com/
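
A quick way to check a node against the 5.8 threshold mentioned above is to parse the running kernel's release string. This is only a heuristic sketch with hypothetical helper names: distribution kernels can backport or drop features regardless of their version number, so a version comparison says nothing definitive about what a given distro kernel actually ships.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Return (major, minor) from a release string such as '5.11.0-27-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum_kernel(release: str, minimum: tuple = (5, 8)) -> bool:
    """True if the release is at least `minimum` (default taken from the comment above)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # Check the kernel this script is running on.
    print(platform.release(), meets_minimum_kernel(platform.release()))
```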
Comment 6 Alan Sill 2021-05-08 09:38:09 MDT
Ah - bad news: It looks like support for AMD via RAPL was removed as of version 5.13:

https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.13-AMD-Energy-Removed
Comment 7 Jurij Pečar 2021-05-08 11:10:01 MDT
Yeah I'm following this saga. Now we need to see how the major distros will decide - I'd hate to maintain external patches for each kernel upgrade just to get the energy info ...
Comment 8 Jurij Pečar 2022-11-25 06:01:20 MST
Revisiting this topic. Looks like the amd_energy kernel module has been available in el8 since 8.4, and I can confirm that the "sensors" command shows me per-core energy usage in kJ. However, Slurm (21.08.8) still shows me n/a for watts and joules associated with our AMD nodes. What's missing?
Comment 9 Jurij Pečar 2022-11-28 04:29:08 MST
Looks like due to CVE-2020-12912 something like

chmod 444 /sys/devices/platform/amd_energy.0/hwmon/hwmon*/energy*

is needed. I guess it's up to each admin to determine whether this is an acceptable risk for their systems.

Doing this makes 'sensors' output the per-core kJ numbers to unprivileged users as well. Now to see if Slurm can make use of that...
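
Once the files are readable, the counters behind the 'sensors' output can be read directly from sysfs. The sketch below is an illustration only, not how Slurm does it. It assumes the amd_energy driver follows the usual hwmon layout (energyN_input in microjoules, paired with energyN_label) and that socket-level counters carry labels starting with "Esocket". The label prefix and function names are assumptions to verify on the target system.

```python
import glob
import os

# Directory pattern taken from the chmod command above.
HWMON_GLOB = "/sys/devices/platform/amd_energy.0/hwmon/hwmon*"


def read_energy_joules(hwmon_dir: str, label_prefix: str = "Esocket") -> dict:
    """Map label -> cumulative joules for counters whose label starts with label_prefix."""
    readings = {}
    for input_path in sorted(glob.glob(os.path.join(hwmon_dir, "energy*_input"))):
        label_path = input_path[: -len("_input")] + "_label"
        try:
            with open(label_path) as f:
                label = f.read().strip()
        except OSError:
            # No label file: fall back to the file name, which will not match
            # the default prefix and is therefore skipped.
            label = os.path.basename(input_path)
        if not label.startswith(label_prefix):
            continue
        with open(input_path) as f:
            readings[label] = int(f.read()) / 1e6  # microjoules -> joules
    return readings


def total_energy_joules(pattern: str = HWMON_GLOB) -> float:
    """Sum the selected counters over every matching hwmon directory."""
    return sum(sum(read_energy_joules(d).values()) for d in glob.glob(pattern))
```

Sampling total_energy_joules() twice and dividing the difference by the elapsed time gives an average power figure that can be compared with what Slurm reports.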
Comment 10 Oliver Smith 2023-11-17 05:07:21 MST
Created attachment 33361
amd epyc support for acct_gather_energy_rapl

Hi,

I've written a patch to add support for AMD Epyc CPUs to acct_gather_energy_rapl in Slurm 23.02. Honestly, I'm not a developer, so I'm sure it's rough in some places, but hopefully it can serve as a starting point for adding support officially.
It's working on our test cluster and seems to report reasonable stats for CPU power use.

Thanks
Comment 11 Jurij Pečar 2024-01-15 05:18:10 MST
Patch applies fine to 23.11.1 too and it appears to function correctly.

However, I'm not 100% sure about the values it collects. Energy numbers on zen2 seem to be much lower than on zen3 and zen4. Not sure what to make of that ...

Do we have a reference job that uses a known amount of energy? Maybe some simple HPL run?
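
One way to sanity-check the reported numbers against such a reference run is plain arithmetic: energy in joules equals average power in watts times runtime in seconds. The sketch below assumes an independent power reading is available for comparison (for example from a PDU or BMC, which is hypothetical here). The function names and the 20% tolerance are illustrative choices. Keep in mind that CPU package counters do not cover the whole node, so a wall-power reference would be expected to read higher.

```python
def average_watts(joules: float, seconds: float) -> float:
    """Average power over a run, from consumed energy and elapsed time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return joules / seconds


def within_tolerance(reported_joules: float, reference_watts: float,
                     seconds: float, rel_tol: float = 0.2) -> bool:
    """True if the reported energy is within rel_tol of reference_watts * seconds."""
    expected = reference_watts * seconds
    return abs(reported_joules - expected) <= rel_tol * expected
```

Running the same fixed-length job on zen2, zen3 and zen4 nodes and comparing average_watts() for each would show whether the zen2 figures are implausibly low or just reflect different hardware.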