[Fission] Autoscroll doesn't work with (cross-origin?) iframes
Categories
(Core :: Panning and Zooming, defect)
Tracking
()
People
(Reporter: mmis1000, Assigned: u608768)
References
(Blocks 2 open bugs)
Details
(Whiteboard: fission-soft-blocker)
Attachments
(4 files, 1 obsolete file)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0
Steps to reproduce:
- Open css snap example at https://developer.mozilla.org/en-US/docs/Web/CSS/scroll-snap-type
- middle click on one of the example, make the cursor show the scroll indicator
- drag the mouse to make it scroll
Actual results:
It didn't scroll at all.
Expected results:
It scrolls normally (in proximity example)
It scrolls to next anchor (in mandatory example)
Juts like what chrome did
![]() |
||
Comment 1•4 years ago
•
|
||
There are two problems.
#1 Middle-click scrolling works after the second click. (see Bug 1719031)
This seems to be implementation bug of Firefox. Because the #1 problem can be reproduced in Firefox 68.
#2 Middle-click scrolling does not work at all, if Fission
Nightly experiments is enabled.
I set tracking flags for the #2 problem.
Comment 2•4 years ago
|
||
It works fine both on Linux and Mac. I don't think scroll-snap-type is related.
Updated•4 years ago
|
Comment 3•4 years ago
|
||
Comment 4•4 years ago
|
||
Comment 5•4 years ago
|
||
I was able to reproduce on my end. I'll use mozregression to provide the regression range ASAP.
Best,
Clara
Comment 6•4 years ago
|
||
:mmis1000, since this bug is a regression, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.
It seems that breaks on all version as long as web fission is enabled.
So it is regressed by "web fission".
If you add fission.autostart
to the moz-regression
config ,then no build of firefox world scroll correctly on css snap
enabled container
Config used
URL: https://developer.mozilla.org/en-US/docs/Web/CSS/scroll-snap-type
Start date: 2019/07/03
End date: 2021/07/03
Prefs:
fission.autostart true
gfx.webrender.all true
according to https://wiki.mozilla.org/Project_Fission
Reporter | ||
Comment 10•4 years ago
|
||
Consider it breaks even on scroll-snap-type: none;
probably it breaks on all container?
But I can auto scroll on youtube side bar just fine, that really don't make sense.
Reporter | ||
Comment 11•4 years ago
|
||
Probably what breaks it actually is iframe?
It turns out MDN wraps every examples in an iframe.
Reporter | ||
Comment 12•4 years ago
|
||
https://output.jsbin.com/tojemet
This issue can be reproduced with a iframe to //example.com/
under any version of firefox with web fission enabled
Updated•4 years ago
|
Comment 13•4 years ago
|
||
(CCing masayuki, since he might recall a thing or two about autoscroll.)
Comment 14•4 years ago
•
|
||
I can't reproduce this. If I enable middle clicking on Windows, autoscroll works in Fission. Or is this bug only about css snap or about cross-origin iframes? I think the latter.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 15•4 years ago
|
||
According to the debugger, JSActor::SendQuery
in AutoScrollChild.startScroll
returned false
and null
. But I see the autoscroller popup. So, I guess that AutoScrollParent
does not return the result to proper child process.
Gijs: WDYT?
Comment 16•4 years ago
|
||
(In reply to Masayuki Nakano [:masayuki] (he/him)(JST, +0900) from comment #15)
According to the debugger,
JSActor::SendQuery
inAutoScrollChild.startScroll
returnedfalse
andnull
I'm confused - which part returns null
? None of the return statement in this codepath appear to ever return null.
. But I see the autoscroller popup. So, I guess that
AutoScrollParent
does not return the result to proper child process.Gijs: WDYT?
I don't really have thoughts based on this information? Is there a specific question you were hoping I was able to answer? My understanding is that the sendQuery
implementation lives in DOM/IPC land, and I would be... surprised... if messages could be returned to the wrong process / frame / actor, as I would have expected this to have been a problem before now, and discovered and fixed. But perhaps there's some kind of edge case...
Are there exceptions in the browser console when this problem occurs? Can't you also stop the browser debugger in the parent process side, ie in AutoscrollParent, and find out what is happening there?
Comment 17•4 years ago
|
||
Has this ever worked with Fission? 2019-12-29 is broken, yet it should have bug 1597765.
Testing https://output.jsbin.com/tojemet from comment 12.
Assignee | ||
Comment 18•4 years ago
•
|
||
I think there's something weird going on with preallocated processes here. Autoscrolling the frame works fine in these conditions:
- Open https://output.jsbin.com/tojemet while I have an https://example.com tab open
- Open https://output.jsbin.com/tojemet and refresh the page
Assignee | ||
Comment 19•4 years ago
•
|
||
In the broken case, when we attempt to find a target apzc, we appear to have two identical guids: one for the root and one for the subframe. The root is ahead of the subframe in the node list, so we return that, and then autoscroll the root instead of the subframe (adjust the window to be super short to see this). (edit: after debugging some more, I realized this is wrong: we grab the toplevel LayersID from the RemoteLayerTreeOwner, and use that to find the target apzc, which is a problem, but don't actually have duplicate ScrollableLayerGuids).
Now in the working case from comment #18, we just don't find an apzc, so we're not actually using apz to do the scroll, which feels like a separate problem? Either way, in both cases it sounds like there's some bugs with how we're finding the scrollable layer for out of process frames.
Comment 20•4 years ago
|
||
That actually matches what I was suspecting during skimming the front-end code. In Fission, to identify a scrollable frame we need a pair of a layersId (https://searchfox.org/mozilla-central/rev/77f0b36028b2368e342c982ea47609040b399d89/dom/interfaces/base/nsIDOMWindowUtils.idl#2215) and a scrollId, but the frond-end code doesn't use layersId at all.
Updated•4 years ago
|
Comment 21•4 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #20)
In Fission, to identify a scrollable frame we need a pair of a layersId (https://searchfox.org/mozilla-central/rev/77f0b36028b2368e342c982ea47609040b399d89/dom/interfaces/base/nsIDOMWindowUtils.idl#2215) and a scrollId, but the frond-end code doesn't use layersId at all.
Gijs, this sounds like a front-end Fission bug (front-end needs to track APZ's layersId
?). How do I flag this bug for the Front-end team to triage and prioritize?
Do you think this bug should be fixed for Fission MVP?
Comment 22•4 years ago
|
||
(In reply to Chris Peterson [:cpeterson] from comment #21)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #20)
In Fission, to identify a scrollable frame we need a pair of a layersId (https://searchfox.org/mozilla-central/rev/77f0b36028b2368e342c982ea47609040b399d89/dom/interfaces/base/nsIDOMWindowUtils.idl#2215) and a scrollId, but the frond-end code doesn't use layersId at all.
Gijs, this sounds like a front-end Fission bug (front-end needs to track APZ's
layersId
?).
I'm not an APZ expert, but (so?) I don't think this would be fixed any faster by front-end people... It looks like the API for autoscroll's layout support ( https://searchfox.org/mozilla-central/rev/a14ecd829fdb1e9780b7c100016c6a3d951cf7da/dom/interfaces/base/nsIRemoteTab.idl#94 ) doesn't take a layers ID, but takes a presshell id. Should that be replaced with the layers ID, or is this an addition? How should the implementation of that method change? All those are probably still questions for the dom/layout/apz folks, and if these APIs need to be made fission-aware, I'm not sure that's a job for the front-end folks. Passing the relevant ID around from the JS in the child to the JS in the parent on the "front-end" side (either in addition or replacing the presshell ID) is probably the trivial part of such a change. In fact, it also seems possible to me that the presshell ID should already be mappable to a layers ID on the apz implementation side? But then, that's because I don't know what a layersId
really is...
How do I flag this bug for the Front-end team to triage and prioritize?
I'm sorry but we don't have a good way of doing this - there is no one "front-end team" and autoscroll isn't actively owned by anyone.
Do you think this bug should be fixed for Fission MVP?
I don't know what the criteria are for Fission MVP, or how to find out. I also don't know how frequently autoscroll is used, and of that usage, how frequently it is used in iframes. I checked, and AFAICT we don't have any telemetry that would help us judge this, either. I also don't know what the severity of other fission MVP or other milestone bugs is and how this should be prioritized in a relative fashion. :-(
Bouncing the needinfo to let you make a call on priority.
Comment 23•4 years ago
|
||
The layersId basically identifies the child process; it's not implied by the presShellId, it needs to be provided separately.
I had a brief look, and it looks like the current flow for starting an autoscroll is:
- AutoScrollChild
- (cross child process --> parent process boundary)
- AutoScrollParent
- browser-custom-element.js
- nsIRemoteTab::StartApzAutoscroll (implemented in BrowserHost)
- BrowserParent::StartApzAutoscroll
- nsIWidget::StartAsyncAutoscroll
- (additional steps to get to APZ)
Currently, the layersId
is picked up at step 6, here. Note that this is in the parent process.
I think that may have been correct before Fission, but with Fission, I think the information about which child process in the affected tab started the autoscroll is lost by that point.
So, I think the solution here is to pick up the layersId
at step 1, using nsIDOMWindowUtils.getLayersId() and propagate it through to step 6.
Comment 24•4 years ago
|
||
And I agree with Gijs that based on the investigation so far, this issue is APZ-ish :)
My 2c on prioritization is that we should try to fix this for MVP. If the incorrect layersId is the only issue, the fix should be fairly straightforward.
Updated•4 years ago
|
Comment 25•4 years ago
•
|
||
Assigning this bug to Kashav because he says he has a patch.
Assignee | ||
Comment 26•4 years ago
|
||
And propagate it through to BrowserParent::StartApzAutoscroll.
Assignee | ||
Comment 27•4 years ago
|
||
These don't have a PBrowser actor, so we have to fetch the layersId and start
the autoscroll directly on the widget.
Depends on D120766
Updated•4 years ago
|
Updated•4 years ago
|
Comment 28•4 years ago
|
||
This bug is a soft blocker for Fission MVP. We'd like to fix it before our Release channel rollout, but we won't delay the rollout waiting for it.
Comment 29•4 years ago
|
||
Comment 30•4 years ago
|
||
Backed out for causing clang build bustages
Backout link: https://hg.mozilla.org/integration/autoland/rev/cbc19180f9005d0f8254fef315401fc52180a346
Updated•4 years ago
|
Comment 31•4 years ago
|
||
Comment 32•4 years ago
|
||
bugherder |
Description
•