Bug 9956 - RAPL plugin: incorrect *Watts and ConsumedEnergy values
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 21.08.x
Hardware: Linux Linux
Importance: --- C - Contributions
Assignee: Tim Wickberg
Reported: 2020-10-07 15:57 MDT by Alexey Kozlov
Modified: 2020-10-12 12:55 MDT

Attachments
proposed patch (7.71 KB, patch)
2020-10-12 12:55 MDT, Alexey Kozlov

Description Alexey Kozlov 2020-10-07 15:57:18 MDT
The AcctGatherEnergy RAPL plugin uses the same energy unit for the CPU and DRAM domains of all packages:

https://github.com/SchedMD/slurm/blob/master/src/plugins/acct_gather_energy/rapl/acct_gather_energy_rapl.c#L326

However, on many modern server architectures (Haswell, Skylake X/SP, CascadeLake SP), the DRAM energy unit is distinct from the package energy unit stored in the MSR_RAPL_POWER_UNIT register. Instead, it has a fixed value of 15.3 µJ (written as 15300 pJ in the Linux driver).
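To make the size of the mismatch concrete, here is a minimal sketch (not the plugin's code) that decodes the package energy unit from a raw MSR_RAPL_POWER_UNIT value and compares it with the fixed DRAM unit. The function names are hypothetical, and the sample register value 0xA0E03 mentioned afterwards is only an illustrative reading, not one taken from this report.

```c
#include <stdint.h>

/* Fixed DRAM energy unit used by the Linux powercap driver on the
 * affected server parts: 15300 pJ = 15.3 uJ. */
#define DRAM_ENERGY_UNIT_FIXED_J 15.3e-6

/* Decode the Energy Status Units field (bits 12:8) of a raw
 * MSR_RAPL_POWER_UNIT value. The unit is 1 / 2^ESU joules. */
double rapl_pkg_energy_unit(uint64_t msr_raw)
{
	unsigned esu = (unsigned)((msr_raw >> 8) & 0x1f);

	return 1.0 / (double)(1ULL << esu);
}

/* Hypothetical helper: factor by which DRAM energy is overstated when
 * the package unit is applied to DRAM counter ticks. */
double dram_overstatement(uint64_t msr_raw)
{
	return rapl_pkg_energy_unit(msr_raw) / DRAM_ENERGY_UNIT_FIXED_J;
}
```

With a raw value of 0xA0E03 the ESU field is 14, so the package unit is 1/16384 J (about 61 µJ), and DRAM readings scaled by it would come out roughly four times too large. How much that inflates the node total depends on the DRAM share of the overall power.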

The (gloomy) situation becomes clear when looking at the Linux powercap driver code, which gives correct measurements:

https://github.com/torvalds/linux/blob/master/drivers/powercap/intel_rapl_common.c#L964

https://github.com/torvalds/linux/blob/master/drivers/powercap/intel_rapl_common.c#L1017

So apparently, the only viable solution would be to check the CPU model and set the DRAM energy unit accordingly.
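A model check of this kind could look roughly like the sketch below. It is an assumption about the shape of such a fix, not the attached patch: the function name is hypothetical, and the model list is an illustrative subset (Haswell server 0x3F, and 0x55, which both Skylake X/SP and CascadeLake SP report), not an exhaustive table.

```c
#include <stdint.h>

/* CPU family 6 model numbers. Illustrative subset only, NOT a
 * complete list of parts with a fixed DRAM energy unit. */
#define MODEL_HASWELL_X 0x3F
#define MODEL_SKYLAKE_X 0x55 /* also reported by CascadeLake SP */

/* Fixed DRAM energy unit on those parts: 15300 pJ = 15.3 uJ. */
#define DRAM_ENERGY_UNIT_FIXED_J 15.3e-6

/* Hypothetical helper: pick the energy unit (in joules) for the DRAM
 * domain. pkg_unit_j is the unit already decoded from
 * MSR_RAPL_POWER_UNIT for this package. */
double select_dram_energy_unit(unsigned family, unsigned model,
			       double pkg_unit_j)
{
	if (family == 6) {
		switch (model) {
		case MODEL_HASWELL_X:
		case MODEL_SKYLAKE_X:
			return DRAM_ENERGY_UNIT_FIXED_J;
		default:
			break;
		}
	}
	/* Everything else: DRAM shares the package unit. */
	return pkg_unit_j;
}
```

Keeping the unit per package and per domain, rather than in one global, also means nothing relies on all packages reporting the same unit.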

As a result of this bug, AcctGatherEnergy reports incorrect power and energy values; in my experiments they were typically inflated by 30%-50%.
Comment 3 Alexey Kozlov 2020-10-12 12:55:22 MDT
Created attachment 16196
proposed patch

This patch fixes multiple bugs/issues in power computation:

- CurrentWatts: using the CPU energy unit for the DRAM domain resulted in wrong values on many systems (Intel Haswell/Skylake/CascadeLake)

- CurrentWatts: the same energy unit was used for all packages -> might work for now, but could break anytime

- AveWatts: incorrect value due to missing normalization by the polling interval

- AveWatts: inaccurate value because the running average was computed with an integer type (at some point the contribution of the current measurement becomes <1.0 -> AveWatts is frozen)
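The two AveWatts points can be illustrated with a small sketch. It is not taken from the patch: the struct and function names are hypothetical, and the integer variant is only one plausible form of an integer running average, shown for contrast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accumulator for the average power of a node. */
typedef struct {
	double ave_watts;  /* running mean, kept in floating point */
	uint64_t samples;  /* number of polling intervals seen */
} ave_power_t;

/* Fold one polling interval into the running mean. The energy delta is
 * first normalized by the interval length to obtain watts. */
void ave_power_update(ave_power_t *a, double joules_delta,
		      double interval_sec)
{
	double watts;

	assert(interval_sec > 0.0);
	watts = joules_delta / interval_sec;
	a->samples++;
	a->ave_watts += (watts - a->ave_watts) / (double)a->samples;
}

/* Integer running average, for contrast: once n is large, the
 * division truncates away the contribution of the new sample. */
uint64_t ave_update_int(uint64_t ave, uint64_t watts, uint64_t n)
{
	return (ave * n + watts) / (n + 1);
}
```

For example, with an average of 100 W over 1000 samples, a new 150 W reading leaves the integer average at 100, while the floating-point mean moves to about 100.05. The equal-weight mean assumes a roughly constant polling interval; dividing total joules by total elapsed seconds would be an alternative if intervals vary.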