Summary: | RAPL plugin: incorrect *Watts and ConsumedEnergy values | ||
---|---|---|---|
Product: | Slurm | Reporter: | Alexey Kozlov <alexey.kozlov> |
Component: | Accounting | Assignee: | Tim Wickberg <tim> |
Status: | OPEN --- | QA Contact: | |
Severity: | C - Contributions | ||
Priority: | --- | CC: | sts, uemit.seren |
Version: | 21.08.x | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | -Other- | Alineos Sites: | --- |
Attachments: | proposed patch |
Description
Alexey Kozlov
2020-10-07 15:57:18 MDT
Created attachment 16196: proposed patch
This patch fixes several bugs in the power computation:
- CurrentWatts: the CPU energy unit was also applied to the DRAM domain, which produced wrong values on many systems (Intel Haswell/Skylake/Cascade Lake).
- CurrentWatts: the same energy unit was used for all packages. This might work for now, but could break at any time.
- AveWatts: the value was incorrect because it was not normalized by the polling interval.
- AveWatts: the value was inaccurate because the running average was computed with an integer type. Once the contribution of the current measurement drops below 1.0, AveWatts is frozen.
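To illustrate the issues listed above, here is a minimal sketch in C. It is not the code from the attached patch: all function names and parameters (`raw_to_joules`, `interval_watts`, `ave_update_int`, `ave_update_double`, `joules_per_count`) are hypothetical and chosen only for this example. It assumes the energy unit is looked up per domain and per package by the caller, and that the running average is updated once per polling interval.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a raw energy counter delta to joules.
 * The caller passes the energy unit that belongs to this specific
 * domain (package or DRAM) and this specific package, instead of
 * reusing one global CPU unit for everything. */
static double raw_to_joules(uint64_t raw_delta, double joules_per_count)
{
	return (double)raw_delta * joules_per_count;
}

/* Power over one polling interval: energy divided by elapsed seconds.
 * Skipping this division gives joules per poll, not watts. */
static double interval_watts(double joules, double interval_sec)
{
	assert(interval_sec > 0.0);
	return joules / interval_sec;
}

/* Running average with integer arithmetic. Once |sample - ave| < n,
 * the integer division yields 0 and the average stops moving. */
static int32_t ave_update_int(int32_t ave, int32_t sample, int32_t n)
{
	assert(n > 0);
	return ave + (sample - ave) / n;
}

/* Same update in floating point: small contributions are kept. */
static double ave_update_double(double ave, double sample, uint32_t n)
{
	assert(n > 0);
	return ave + (sample - ave) / (double)n;
}
```

For example, with a current average of 100 W, a new sample of 150 W and n = 1000 samples, `ave_update_int` returns 100 again (the 0.05 W contribution is truncated), while `ave_update_double` returns about 100.05.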