Flashing button on twitter.com with Fission enabled
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
Fission Milestone | Future |
People
(Reporter: gregp, Assigned: tnikkel, NeedInfo)
References
(Depends on 1 open bug, Blocks 1 open bug, )
Details
Attachments
(2 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
Steps to reproduce:
Visit https://twitter.com/linusgsebastian/status/1450220088368328704
Scroll
Actual results:
Flashing Google Sign In button
Expected results:
no flashing Google Sign In button
Comment 1•3 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::Panning and Zooming' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 2•3 years ago
|
||
Regression range for me points to turning on webrender.
There is a div containing the button that toggles between sticky and fixed pos depending on scroll position.
Comment 3•3 years ago
|
||
I can reproduce in Nightly95.0a1 Windows10, but not inFirefox93.
Disabling Fission seems to be fixed in Nightly95.0a1.
Assignee | ||
Comment 4•3 years ago
|
||
Ah, yes you are correct. I was testing in a non-clean profile that had fission enabled. So this seems to be a regression from enabling fission.
Assignee | ||
Updated•3 years ago
|
Comment 5•3 years ago
|
||
Jim, this looks like a WebRender bug when rendering out-of-process iframes when Fission is enabled. We plan to start rolling out Fission in Firefox 94 in two weeks.
I can reproduce in Firefox 93 if I enable fission.autostart
.
Updated•3 years ago
|
Updated•3 years ago
|
Comment 6•3 years ago
|
||
I was able to reproduce this and so far it looks like Gecko is supplying incorrect display lists when the flickering occurs. With fission enabled, the "Sign up with Google" button is an iframe in the WR content display list, whereas the other sign up buttons are normal content within the main display list.
Logging out WR API calls to set_display_list
and scene building normally looks like:
!!! set_display_list PipelineId(1, 13) Epoch(5) 480.0x66.0 s=3656
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=3089 i=3089 c=185 s=382
pop_iframe 480.0x66.0
In this case, we can see that the sign up iframe has a viewport size of 480 x 66, and a payload length of 3356 bytes. Then, scene building proceeds and traverses that iframe.
However, when the flickering occurs, the logging shows:
!!! set_display_list PipelineId(1, 13) Epoch(6) 0.0x0.0 s=520
push_iframe 480.0x66.0 l=185 i=185 c=185 s=150
pop_iframe 480.0x66.0
push_iframe 480.0x66.0 l=185 i=185 c=185 s=150
pop_iframe 480.0x66.0
In these cases, the display list for the iframe have a viewport rect of 0x0, and only 520 bytes in the DL payload array (effectively an empty display list).
I'll dig a bit more in to the Gecko side, but if the analysis above is correct this will probably need someone who knows the Gecko/fission DL code better than me to diagnose further.
Comment 7•3 years ago
|
||
I confirmed that Gecko is explicitly clearing (emptying) the display list for that iframe pipeline in this scenario at [1].
This call clears out the display list for that iframe, which means it flickers off in this case since it's still referenced by the parent iframe content display list.
As a test, I commented out the call to ClearDisplayList
in that function. With this change, the main content of the iframe button stops flickering. However, the SVG image of the google icon does still flicker, which probably means it has the same lifetime and is being cleared / removed from WR as a (blob) image resource.
I don't know the code in RecvClearCachedResources
at all, but it seems likely that this shouldn't be getting called at all, since the iframe in question is still on-screen and referenced by the parent iframe when it gets cleared.
NI'ing a few people who might know more about this and what the right fix is?
[1] https://searchfox.org/mozilla-central/source/gfx/layers/wr/WebRenderBridgeParent.cpp#1893
Updated•3 years ago
|
Assignee | ||
Comment 8•3 years ago
|
||
Thanks for debugging Glenn. (I didn't intend my previous comments to imply that this was a problem inside webrender.)
(In reply to Timothy Nikkel (:tnikkel) from comment #2)
There is a div containing the button that toggles between sticky and fixed pos depending on scroll position.
Based on your description, I think what is happening is that when we change between sticky and fixedpos we destroy and re-create the subdocument frame. The destroy happens immediately I think, and the recreate happens at the next refresh driver tick (or sooner if something forces a layout flush). nsSubdocumentFrame::Destroy will call DestroyDisplayItemDataForFrames (via BeginSwapDocShellsForViews) which will call UpdateEffects
this will eventually get into BrowserChild::UpdateVisibility and then MakeHidden which will call ClearCachedResources. Then we have to wait until the next refresh driver tick for the subdocument frame to get recreated and make a display list and send it over.
Hmm, thinking of how to solve this...the important part of DestroyDisplayItemDataForFrames is to destroy the stale stored display item data. I think maybe we can delay the UpdateEffects call until either nsHideViewer::Run or the next refresh driver tick. Although I'm not completely sure of all of the things that ClearCachedResources does and if it's okay to slightly delay that here.
Assignee | ||
Comment 9•3 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #8)
Hmm, thinking of how to solve this...the important part of DestroyDisplayItemDataForFrames is to destroy the stale stored display item data. I think maybe we can delay the UpdateEffects call until either nsHideViewer::Run or the next refresh driver tick. Although I'm not completely sure of all of the things that ClearCachedResources does and if it's okay to slightly delay that here.
There are a couple other paths where we call ClearCachedResources when destroy the display item data, but with that understanding a patch like I proposed does fix the problem. I'll work my patch into something landable.
Comment 10•3 years ago
•
|
||
Hmm, I did not expected that BrowserChild::MakeHidden() is called frequently during scrolling.
As a simpler fix, we might defer ClearCachedResources() call when BrowserChild is used as iframe.
Assignee | ||
Comment 11•3 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
Hmm, I did not expected that BrowserChild::MakeHidden() is called frequently during scrolling.
It's not the scrolling that does it, the page changes the position css value of the iframe when the page moves from/to 0 scroll position, this causes us to destroy and create the layout frame containing the iframe, and destroy the layout frame calls ClearCachedResources, which we should try to avoid. We already avoid reconstructing the child document in this case, so this is a similar type of fix.
Updated•3 years ago
|
Assignee | ||
Comment 12•3 years ago
|
||
Depends on D135995
Updated•3 years ago
|
Reporter | ||
Comment 13•2 years ago
|
||
Reproduces in Firefox Nightly 114.0a1 (2023-04-14)
Updated•10 months ago
|
Updated•10 months ago
|
Description
•