Power tracks stop after a few hours on Intel Macs
Categories
(Core :: Gecko Profiler, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox113 | --- | fixed |
People
(Reporter: florian, Assigned: florian)
References
Details
Attachments
(1 file)
Example profile: https://share.firefox.dev/3JDrclg
The track that stops first is always the "CPU package" one, which contains the higher value. The "CPU cores" track stops later. And after leaving the profiling running an entire week-end, I only have the "iGPU" track left.
This behavior seems to indicate we have a value that exceeds the maximum possible value of a variable.
The check at https://searchfox.org/mozilla-central/rev/47aea2f603cc18144afcedbd604a418f11e90f9b/tools/profiler/core/PowerCounters-mac-amd64.cpp#184 ensures we only capture a new sample if it's value is higher than the previous one, so that likely explains why the track stops, if somehow the counter went back to zero.
I'm not sure where things overflow. We have mSample
that is a double
at https://searchfox.org/mozilla-central/rev/47aea2f603cc18144afcedbd604a418f11e90f9b/tools/profiler/core/PowerCounters-mac-amd64.cpp#191 and the raw counters are uint64_t at https://searchfox.org/mozilla-central/source/tools/profiler/core/PowerCounters-mac-amd64.cpp#97-112
We seem to overflow after using a bit more than 3W for about 24h.
Assignee | ||
Comment 1•1 year ago
|
||
I got a long profile last night from a 2018 macbook pro: https://share.firefox.dev/3zdeDXR where I added debug logging.
There are 3 interesting events in the profile:
- after about 13,000s, the "CPU package" track stops updating. Here is what I have in the log:
pkg: 4294944464, pp0: 2225911504, pp1: 1486256680, ddr: 505918927
pkg: 78, pp0: 2225914945, pp1: 1486273631, ddr: 505921229
AddSample failed: mSample: 262142,606445, aSample: 0,004761
- about 25,000s (~7h) later, there's a little glitch on the "CPU package" track:
pkg: 4294935371, pp0: 3794278915, pp1: 2887882738, ddr: 1602795378
pkg: 4294956320, pp0: 3794282243, pp1: 2887897686, ddr: 1602797550
pkg: 7661, pp0: 3794285788, pp1: 2887910164, ddr: 1602799880
This explains the little glitch as 4294956320 > 4294944464 (which was the last value recorded 7h before).
- The "Power: iGPU" track stops after about 38,200s:
pkg: 36554547, pp0: 3030930415, pp1: 4294963938, ddr: 1083658241
pkg: 36578830, pp0: 3030934437, pp1: 14209, ddr: 1083660720
AddSample failed: mSample: 262143,795044, aSample: 0,867249
4294944464 = 0b11111111111111111010011011010000
4294956320 = 0b11111111111111111101010100100000
4294963938 = 0b11111111111111111111001011100010
So it seems despite being stored in uint64_t
fields of the pkg_energy_statistics_t
structure, the pkg_energy
and pp1_energy
fields are uint32_t
values.
Looking at the source code of the kernel (https://github.com/apple/darwin-xnu/blob/2ff845c2e033bd0ff64b5b6aa6063a1f8f65aa32/osfmk/i386/Diagnostics.c#L252-L260) it seems two 32 bit values are read and put together into a 64bit value, so I'm not sure why the high part of the 64 bit value is always zeros for us.
Assignee | ||
Comment 2•1 year ago
|
||
Updated•1 year ago
|
Assignee | ||
Comment 3•1 year ago
|
||
(In reply to Florian Quèze [:florian] from comment #1)
I got a long profile last night from a 2018 macbook pro: https://share.firefox.dev/3zdeDXR where I added debug logging.
Here's an even longer profile (more than 21h) from the same machine with the patch applied: https://share.firefox.dev/3zjewtH, showing that it does fix the bug.
Pushed by fqueze@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e6edaff6f405 Handle wraparounds in RAPL power counters on Intel Macs, r=dlrobertson.
Comment 5•1 year ago
|
||
bugherder |
Description
•