Multi-second cycle collections on Twitter with LargestContentfulPaint
Categories
(Core :: DOM: Core & HTML, defect, P1)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox119 | --- | unaffected |
firefox120 | --- | unaffected |
firefox121 | + | fixed |
People
(Reporter: scineram, Assigned: sefeng)
References
(Regression)
Details
(Keywords: perf-alert, regression)
Attachments
(4 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0
Steps to reproduce:
Browsing Twitter (any of my accounts in their container).
Actual results:
Within a few minutes on the last few Nightly updates, Twitter grinds to a halt. Everything loads glacially, including liking or retweeting. After scrolling, clicking on a tweet loads the wrong tweet, presumably because the scroll hasn't caught up. This happens even after disabling uBlock Origin, even in troubleshooting mode. Other websites, however, seem to be unaffected.
Expected results:
Should be at least as fast as before. Downgrading to Nightly 2023-10-30-16-49-30 (the only old version I tried) restores usable performance.
Comment 1•11 months ago
The Bugbug bot thinks this bug should belong to the 'Core::Panning and Zooming' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•11 months ago
Does it happen in a fresh profile?
If an older Nightly doesn't have the problem, you could use mozregression (https://mozilla.github.io/mozregression/) to find out what caused it.
Or uploading a profile using https://profiler.firefox.com/ could be helpful.
I reported this and uploaded about:support from an office machine once it updated to a new version. But I first encountered the slowdown on my home computer over the weekend. Now I updated the office PC to 2023-11-07 and all the slowdown is back. Ran the profiler on Twitter: https://share.firefox.dev/47o1lGZ
Tested more downgrades:
Nightly 2023-11-02-15-48-33 is not broken. Fast, no slowdown on Twitter.
Nightly 2023-11-03-09-38-36 is broken. Twitter gets glacial after a few minutes of browsing.
Comment 5•11 months ago
Thanks for the profile. Looks like a lot of time in cycle collection.
Comment 6•11 months ago
(In reply to John from comment #4)
Tested more downgrades:
Nightly 2023-11-02-15-48-33 is not broken. Fast, no slowdown on Twitter.
Nightly 2023-11-03-09-38-36 is broken. Twitter gets glacial after a few minutes of browsing.
Thanks for testing those. That narrows down what caused the problem to this list.
If you are feeling adventurous, you can use mozregression to fetch autoland builds for revisions in the middle of that range to test.
Comment 7•11 months ago
Does switching dom.enable_largest_contentful_paint to false help?
Comment 8•11 months ago
This bug was moved into the Performance component.
:scineram, could you make sure the following information is on this bug?
- ✅ For slowness or high CPU usage, capture a profile with http://profiler.firefox.com/, upload it and share the link here.
- For memory usage issues, capture a memory dump from about:memory and attach it to this bug.
- Troubleshooting information: go to about:support, click "Copy raw data to clipboard", paste it into a file, save it, and attach the file here.
If the requested information is already in the bug, please confirm it is recent.
Thank you.
Comment 9•11 months ago
It looks like the CC is spending huge amounts of time traversing JS. That usually only happens when a page leaks, but in this case the Twitter process is only using about a dozen MB of memory (except for the CC itself, which is using about 1 GB of memory), so that doesn't make much sense. Maybe the CC CanSkip mechanism is breaking for some other reason.
Comment 10•11 months ago
There was some discussion on Matrix and we thought this might be related to bug 1863104. The regression range matches and if there's a leak caused by those changes it'd explain what we're seeing here.
Comment 11•11 months ago
I can reproduce this. You have to scroll down a bit so there's a decent amount loaded, but not an absurd amount. The multi second CCs persist even if you aren't interacting with Twitter but just letting it sit there.
Comment 12•11 months ago
Setting dom.enable_largest_contentful_paint to false does seem to fix the issue for me, though it is possible I didn't have enough loaded in with the pref set to false.
Comment 13•11 months ago
My guess here, without looking at any logs, is that we need to make CanSkip do something about the chain of strong references nsGlobalInnerWindow.mPerformance --> PerformanceMainThread::mLargestContentfulPaintEntries --> LargestContentfulPaint::mElement. If there's an element that holds a lot of stuff alive and is not currently in the DOM tree, it could cause this kind of issue. (Assuming there's no mechanism that evicts the LCP entries in that case.)
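To make the suspected ownership chain concrete, here is a minimal standalone C++ analogy (not Gecko code; Gecko uses RefPtr/nsCOMPtr and the cycle collector rather than std::shared_ptr, and the names below only loosely mirror the real members): as long as a buffered LCP entry holds a strong reference to its element, a subtree that has been removed from the document stays alive and still has to be traversed.

```cpp
#include <memory>
#include <vector>

// Standalone analogy of the suspected chain, not the actual Gecko classes.
struct Element {
  std::vector<std::shared_ptr<Element>> children;  // a detached subtree stays reachable
};

struct LargestContentfulPaintEntry {
  // Strong reference: while the entry is buffered, the element (and whatever
  // it keeps alive) cannot be freed and still has to be traversed by the CC.
  std::shared_ptr<Element> element;
};

struct PerformanceMainThread {
  std::vector<LargestContentfulPaintEntry> lcpEntries;  // buffered for the page lifetime
};

int main() {
  PerformanceMainThread perf;
  {
    auto image = std::make_shared<Element>();
    image->children.push_back(std::make_shared<Element>());
    perf.lcpEntries.push_back({image});
  }  // the page removes the image, but the buffered LCP entry still keeps it alive
  return 0;
}
```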
Comment 14•11 months ago
This is easy to reproduce for me so I don't think we need more information from the reporter. Thanks for the report!
Comment 15•11 months ago
Emilio pointed out that while the element is exposed to JS, GetElement uses nsContentUtils::GetAnElementForTiming(), and that returns null if !IsInComposedDoc(), so maybe it would be okay to make LargestContentfulPaint::mElement weak.
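A minimal sketch of what that could look like, again as a standalone analogy rather than the real patch (the actual change uses Gecko's weak-reference machinery): the entry stores only a weak pointer, and GetElement() returns null once the element is gone or disconnected, mirroring the GetAnElementForTiming() behaviour described above.

```cpp
#include <memory>

struct Element {
  bool inComposedDoc = true;  // stand-in for Element::IsInComposedDoc()
};

struct LargestContentfulPaintEntry {
  std::weak_ptr<Element> element;  // no longer keeps the element alive

  std::shared_ptr<Element> GetElement() const {
    std::shared_ptr<Element> strong = element.lock();
    if (!strong || !strong->inComposedDoc) {
      return nullptr;  // JS already never sees a disconnected element
    }
    return strong;
  }
};
```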
Comment 16•11 months ago
Set release status flags based on info from the regressing bug 1722322
Comment 17•11 months ago
[Tracking Requested - why for this release]: Big performance regression on Twitter that is pretty easy to reproduce. I've also seen complaints about at least one other site.
Reporter
Comment 18•11 months ago
While away, I did another profiling of the latest version, this time on the home machine. https://share.firefox.dev/3Svt5Fq
Reporter
Comment 19•11 months ago
Also have a memory profile just in case.
Assignee
Comment 20•11 months ago
Comment 21•10 months ago
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e2c6fa08aa82 Use weak pointers for LargestContentfulPaint to avoid keeping the element alive for unnecessary period r=emilio,mccr8
Comment 22•10 months ago
Backed out for causing multiple crashes @ nsIGlobalObject::GetAsInnerWindow
- Backout link
- Push with failures
- Failure Log
- Failure line: PROCESS-CRASH | application crashed [@ nsIGlobalObject::GetAsInnerWindow] | toolkit/components/passwordmgr/test/mochitest/test_password_length.html
Log 2: https://treeherder.mozilla.org/logviewer?job_id=435807963&repo=autoland
Log 3: https://treeherder.mozilla.org/logviewer?job_id=435806628&repo=autoland
Comment 23•10 months ago
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/24b44cfac27e Use weak pointers for LargestContentfulPaint to avoid keeping the element alive for unnecessary period r=emilio,mccr8
Comment 24•10 months ago
Backed out for crashes @ nsIGlobalObject::GetAsInnerWindow.
- Backout link
- Push with failures
- Failure Log
- Failure line: PROCESS-CRASH | application crashed [@ nsIGlobalObject::GetAsInnerWindow] | toolkit/components/passwordmgr/test/mochitest/test_password_length.html
Later edit: there are also these mochitest failures.
Comment 25•10 months ago
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cfed378ac68a Use weak pointers for LargestContentfulPaint to avoid keeping the element alive for unnecessary period r=emilio,mccr8
Comment 26•10 months ago
Backed out for causing build bustages in PerformanceMainThread.cpp.
- Backout link
- Push with failures
- Failure Log
- Failure line: /builds/worker/checkouts/gecko/dom/performance/PerformanceMainThread.cpp:757:13: error: unused variable 'innerWindow' [-Werror=unused-variable]
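For context, this is the usual warnings-as-errors situation: with -Werror=unused-variable, even a leftover local variable from an earlier revision of the patch fails the build. A hypothetical minimal reproduction (the variable name is taken from the failure line above, not from the actual patch):

```cpp
// g++ -c -Werror=unused-variable bustage.cpp
void Example() {
  int innerWindow = 0;  // error: unused variable 'innerWindow' [-Werror=unused-variable]
}
```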
Comment 27•10 months ago
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/abf27349f36e Use weak pointers for LargestContentfulPaint to avoid keeping the element alive for unnecessary period r=emilio,mccr8
Comment 28•10 months ago
bugherder
Comment 30•10 months ago
(In reply to Pulsebot from comment #27)
Pushed by sefeng@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/abf27349f36e
Use weak pointers for LargestContentfulPaint to avoid keeping the element
alive for unnecessary period r=emilio,mccr8
== Change summary for alert #40287 (as of Fri, 17 Nov 2023 13:12:11 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
10% | matrix-react-bench | windows10-64-shippable-qr | fission webrender | 25.77 -> 23.28 | Before/After |
8% | matrix-react-bench | windows10-64-shippable-qr | fission webrender | 25.24 -> 23.35 | Before/After |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=40287
Comment 31•10 months ago
(In reply to Pulsebot from comment #25)
Pushed by sefeng@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cfed378ac68a
Use weak pointers for LargestContentfulPaint to avoid keeping the element
alive for unnecessary period r=emilio,mccr8
== Change summary for alert #40390 (as of Fri, 24 Nov 2023 11:19:38 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
10% | matrix-react-bench | linux1804-64-shippable-qr | fission webrender | 26.99 -> 24.42 | Before/After |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=40390
Comment 32•10 months ago
John, can you please try and see if the issue is still reproducing on your side?
I tried reproducing on Win10 x64 using the Firefox 121.0a1 (20231030164930) build from the description, but it did not reproduce on my machine. Thank you.
Reporter
Comment 33•10 months ago
I have not seen this problem since the fix rolled out to Nightly.
Comment 34•9 months ago
== Change summary for alert #40229 (as of Tue, 14 Nov 2023 06:35:30 GMT) ==
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
10% | damp browser-toolbox.debugger-ready.DAMP | windows10-64-shippable-qr | e10s fission stylo webrender-sw | 917.95 -> 1,010.57 |
5% | damp browser-toolbox.close-process.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 45.72 -> 48.12 |
5% | damp custom.jsdebugger.pretty-print.DAMP | windows10-64-shippable-qr | e10s fission stylo webrender | 1,553.18 -> 1,626.13 |
4% | damp simple.netmonitor.close.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender | 6.97 -> 7.23 |
4% | damp custom.jsdebugger.project-search.first-search-result.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 314.04 -> 325.67 |
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
106% | reload-netmonitor:parent-process objects-with-stacks | linux1804-64-shippable-qr | | 10,737.67 -> -660.33 |
104% | reload-netmonitor:parent-process objects-with-stacks | linux1804-64-qr | | 11,000.81 -> -486.67 |
12% | damp custom.netmonitor.close.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender | 22.46 -> 19.71 |
11% | damp custom.netmonitor.close.DAMP | windows10-64-shippable-qr | e10s fission stylo webrender-sw | 26.20 -> 23.28 |
7% | damp server.protocoljs.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 1,086.86 -> 1,008.21 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=40229