MemoryTelemetry::GatherReports should not run in idle processes
Categories
(Core :: XPCOM, defect, P2)
Tracking
()
People
(Reporter: florian, Assigned: pbone)
References
(Blocks 3 open bugs, Regression)
Details
(Keywords: perf:resource-use, regression)
Attachments
(5 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
|
Details | Review |
Here is a 10 minute subset of a power profile I took over the night on my Macbook: https://share.firefox.dev/3z94fDY (If you are curious the full profile is at https://share.firefox.dev/4dPX96F ; I was testing my patches for bug 1914862).
In the profile, we see that every content process (including the preallocated ones that should be completely inactive):
- starts every minute a MemoryTelemetry::GatherReports runnable (off a 59,999ms timer that we can see on the timer thread),
- the main thread MemoryTelemetry::GatherReports runnable spawns a new StreamTrans thread, which also runs a MemoryTelemetry::GatherReports runnable.
- both the main thread and the StreamTrans MemoryTelemetry::GatherReports runnables save data to Telemetry Histograms.
- saving any data in Telemetry starts a timer, which triggers an IPC to the parent process after 2 seconds (also causing a wake-up of the parent process)
- after 30s of idleness, the StreamTrans thread is shutdown (also waking up the main thread).
So that's at least 3 (more likely significantly more) process wake-ups per minute per content process.
In the profile, my Firefox had a single tab, but 6 content processes (1 for my tab, 1 webextension, 1 priviledged about, 3 preallocated), so these MemoryTelemetry::GatherReports runnables caused at least 18 process wake-ups per minute. The cost of these wake-ups is easily visible in the shape of the power tracks for child processes (ignore the power track in the parent, the excessive power use there is due to bug 1735408).
Paul, from reading bug 1894582 I'm not sure if this more frequent memory telemetry collection was meant to be temporary (if so, given it has already been in for multiple cycles, maybe we already have the data we need?) or permanent. If it was meant to be permanent, I think it should really be based on activity instead of on timers. From my reading of bug 1894582 it seems using timers was meant to avoid over-reporting for busy processes and avoid under-reporting of low-activity processes. I think both of these could be achieved without timer:
- To avoid over-reporting, you could save a timestamp of the last report, and ignore new opportunities to make a report if the saved timestamp is recent enough.
- To avoid under-reporting, you could save the same memory report multiple times at once (ie. call TelemetryHistogram.Add multiple times in a row) if you notice the last report is old.
Comment 2•1 year ago
|
||
Set release status flags based on info from the regressing bug 1894582
Comment 3•1 year ago
|
||
:pbone, since you are the author of the regressor, bug 1894582, could you take a look? Also, could you set the severity field?
For more information, please visit BugBot documentation.
Assignee | ||
Comment 4•1 year ago
|
||
Thanks Florian, I will definitly look at this.
Updated•1 year ago
|
Updated•1 year ago
|
Comment 5•11 months ago
|
||
The severity field is not set for this bug.
:nika, could you have a look please?
For more information, please visit BugBot documentation.
Updated•11 months ago
|
Updated•11 months ago
|
Assignee | ||
Comment 6•11 months ago
|
||
Assignee | ||
Comment 7•11 months ago
|
||
Assignee | ||
Comment 8•11 months ago
|
||
Assignee | ||
Comment 9•11 months ago
|
||
Comment 10•9 months ago
|
||
Hi Paul,
Is there any progress on this bug? This issue seems to be a significant contributor for why we consume 2x more power and cpu time vs chrome during the background-resource test.. Here is another profile of this specific test case showing a lot of cycles spent in GatherReports
: https://share.firefox.dev/3VxnVJW
Comment 11•8 months ago
|
||
Here are results from disabling the memory reporter. We see a 50% improvement in the background CPU time.
Assignee | ||
Comment 12•8 months ago
|
||
Okay. I'm sorry I dropped this. It's now my priority.
Comment 13•8 months ago
|
||
We see a 50% improvement in the background CPU time.
This could be a significant improvement on Android background tab kills, so rising to S2.
Updated•8 months ago
|
Updated•8 months ago
|
Assignee | ||
Comment 14•8 months ago
|
||
Assignee | ||
Comment 15•8 months ago
|
||
This patch adds a sliding window of recent calls to Poke() that is used to
determine if the process is idle. This avoids running the memory telemetry
when the process should be idle.
Updated•8 months ago
|
Updated•8 months ago
|
Comment 16•8 months ago
|
||
Comment 17•8 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/7cefcb3f6e31
https://hg.mozilla.org/mozilla-central/rev/cfe4a3109ecf
https://hg.mozilla.org/mozilla-central/rev/a2b1914cd7be
https://hg.mozilla.org/mozilla-central/rev/3d05cf9721c0
https://hg.mozilla.org/mozilla-central/rev/25d2eab62bf9
Updated•8 months ago
|
Comment 18•8 months ago
|
||
The patch landed in nightly and beta is affected.
:pbone, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox135
towontfix
.
For more information, please visit BugBot documentation.
Assignee | ||
Comment 19•8 months ago
|
||
Comment on attachment 9444497 [details]
Bug 1917184 - Detect "idleness" more accurately r=mccr8
Beta/Release Uplift Approval Request
- User impact if declined/Reason for urgency: Firefox continues to process and send this telemetry even when idle, this wakes the CPU more frequently than necessary which can consume battery power.
- Is this code covered by automated tests?: Unknown
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: Other patches on this bug.
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This is not risky, it's a change to some telemetry code that a user won't notice.
- String changes made/needed: None.
- Is Android affected?: Yes
Comment 20•8 months ago
|
||
Can we confirm that this has led to the expected improvements on Nightly builds?
Reporter | ||
Comment 21•8 months ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #20)
Can we confirm that this has led to the expected improvements on Nightly builds?
If I compare the behavior of the privileged content process in the profile from comment 0 (https://share.firefox.dev/429Wrhj) and in a long profile captured today (https://share.firefox.dev/4habFap), there's a pretty clear improvement. In the first profile, the MemoryTelemetry::GatherReports timer fired every minute. In today's profile, it fired once in 3 hours.
Today, no timer fired in 3 hours in the preallocated content processes: https://share.firefox.dev/4237tVL In the profile from comment 0 the timers fired every minute in preallocated processes: https://share.firefox.dev/3Pv0c9D
The profile I captured today doesn't have any non-empty content process (I was profiling about:blank), and was on Windows (vs Mac for the profile from comment 0) but I think that's enough to confirm a dramatic improvement.
Comment 22•8 months ago
|
||
There is also a noticeable improvement in the background-resource test on Android: https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&replicates=0&series=mozilla-central,182982,1,15&timerange=1209600
Comment 23•8 months ago
|
||
Comment on attachment 9444497 [details]
Bug 1917184 - Detect "idleness" more accurately r=mccr8
Thanks for the extra info, glad to hear it's making such a noticeable difference already! Approved for 135.0b4.
Comment 24•8 months ago
|
||
uplift |
Updated•8 months ago
|
Description
•