Closed Bug 1698693 Opened 3 years ago Closed 3 years ago

Audit APZ-related callers of GetCrossDocParentFrame()

Categories

(Core :: Panning and Zooming, task)

task

Tracking

()

RESOLVED FIXED
91 Branch
Fission Milestone M7a
Tracking Status
firefox91 --- fixed

People

(Reporter: botond, Assigned: botond)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

In this bug, I'm going to audit the APZ-related callers of GetCrossDocParentFrame(), and migrate them to GetCrossDocParentFrameInProcess() as appropriate.

This is... quite the rabbit hole. For instance, should SetZeroMarginDisplayPortOnAsyncScrollableAncestors() set displayports on cross-process ancestors? (If so, we'd need new IPC protocols to support that.)

(In reply to Botond Ballo [:botond] from comment #1)

For instance, should SetZeroMarginDisplayPortOnAsyncScrollableAncestors() set displayports on cross-process ancestors?

Is this function even needed anymore (in Fission mode), now that every async-scrollable frame gets a minimal displayport (since bug 1675547)?

Yeah, this can really rabbit hole.

(In reply to Botond Ballo [:botond] from comment #2)

(In reply to Botond Ballo [:botond] from comment #1)

For instance, should SetZeroMarginDisplayPortOnAsyncScrollableAncestors() set displayports on cross-process ancestors?

Is this function even needed anymore (in Fission mode), now that every async-scrollable frame gets a minimal displayport (since bug 1675547)?

Not really needed in fission mode no, the minimal displayports should make sure that we don't have to activate a scrollframe after descending into it.

I purposely didn't touch SetZeroMarginDisplayPortOnAsyncScrollableAncestors when adding minimal displayports to keep the scope reasonable. I think there are still a few open questions about the minimal display port approach so I want them to bake a bit more before we refactor in this area. It seems like all the regressions we've found in apz lately get reported 3-4 months after landing, and fission is still getting rolled out to a wider audience, so I'm not ready to stamp the minimal displayport approach with the "working well" stamp yet.

Component: Layout → Panning and Zooming

Let's track this for M7a, like the bug it blocks.

Fission Milestone: --- → M7a
Blocks: apz-fission

Botond, gentle reminder about this M7a audit bug assigned to you.

Flags: needinfo?(botond)

I've spent some time analyzing the APZ-related callers of this function. Often the callers were fairly generic so I had to analyze their callers in turn, going back several levels. I've also looked at some non-APZ-related callers that were in code I was familiar with.

This analysis is not exhaustive (some "branches" of the tree remain to be investigated), but here's what I've got so far.

The comments make somewhat frequent reference to BrowserChild::mChildToParentConversionMatrix, which I'll abbreviate as the "CTP".

  • nsDOMWindowUtils.cpp
  • nsDisplayList.cpp
  • nsIFrame.cpp
    • GetTransformMatrix (and other call sites in that function): I think this is fine (transforms between processes should be handled by the CTP), but it wouldn't hurt to audit the callers of this function which pass aStopAncestor.mFrame == null.
  • nsLayoutUtils.cpp
    • IsAncestorFrameCrossDoc: This is fine, the result is compared to a frame provided as input (and therefore in-process).
    • FindNearestCommonAncestorFrame: Likewise.
    • GetDisplayRootFrame: This is fine in the sense that the "display-root frame" is by definition in-process. However, it wouldn't hurt to audit the callers of this and make sure we don't have any that check something like <is content> && <is display root> and assume that therefore they have a cross-process RCD.
    • GetEventCoordinatesRelativeTo: This is fine, transforms between processes are handled by the CTP.
    • GetCumulativeApzCallbackTransform: This should be handled by the CTP but we should double-check that the callback transforms are included in the CTP.
    • GetNearestScrollableFrameForDirection: The call sites of this function take a focused element, and want to know what directions the nearest enclosing scrollframe can be scrolled via keyboard input. So, we need to check whether an element focused inside an OOP iframe can keyboard-scroll a scrollframe enclosing the iframe. If so, these call sites may need fixing.
    • GetParentOrPlaceholderFrameCrossDoc
      • SetZeroMarginDisplayPortOnAsyncScrollableAncestors: This function has largely been obsoleted by the "minimal displayport" work, in that the scroll frames on which is sets zero-margin displayports should already have minimal displayports (in Fission mode), and changing those to zero-margin displayports should not be necessary. There is some cleanup to be done here (to not call this function at all in Fission mode), but I don't think the cleanup needs to block anything (in the intermediate state, we'll get zero-margin displayports on in-process ancestors and minimal displayports on out-of-process ancestors, which should be harmless).
      • ScrollFrameWillBuildScrollInfoLayer: This is fine in that it faithfully mimics the behaviour of ShouldBuildScrollInfoItemsForHoisting like the comment in the function says it should. (The latter operates on the display list, and therefore only on in-process content.) However, it may not hurt to review whether ShouldBuildScrollInfoItemsForHoisting behaves correctly with Fission.
      • (note: this function has a number of other non-APZ call sites which I didn't examine)
    • GetNearestScrollableOrOverflowClipFrame
  • DisplayPortUtils.cpp
  • PresShell.cpp
    • ScrollFrameRectIntoView: This has already been ported to Fission (the function uses PBrowser a few lines below to message the enclosing content process to scroll if necessary).
Flags: needinfo?(botond)

What a great diagnose and summary!

(In reply to Botond Ballo [:botond] from comment #6)

CanThrottleOverflowChangesInScrollable: This is animation-related, and it looks to me like it may need adjustment for Fission.

This was done in bug 1541705, and helper_fission_animation_styling_in_transformed_oopif.html is an automated test.

(In reply to Botond Ballo [:botond] from comment #6)

  • GetBoundingFrameRect: The callers of this are in zoom-to-focused-input code (Android-only, see above), and double-tap-to-zoom code. For the latter, we have known Fission work to do (bug 1713715) and calls to this function are one of the things we should examine.

On second thought, I don't think there's any reason this function should ever be considering scroll frames in enclosing documents. The purpose is to find the nearest enclosing scroll frame that clips the input frame. If nothing else, the current document's root scroll frame will do that, we just have to pass the right flags to make sure it's always found (SCROLLABLE_INCLUDE_HIDDEN and SCROLLABLE_FIXEDPOS_FINDS_ROOT).

(In reply to Botond Ballo [:botond] from comment #6)

  • GetCurrentItemAndPositionForElement: This could probably use a second look from someone more familiar with the code, but it looks fairly benign to me (the result is used to compute a line-height; seems unlikely that correctness depends on getting the line-height from an enclosing-process scroll frame).

I think similar reasoning to comment 8 applies here: the code just wants an nsIScrollableFrame to call GetLineScrollAmount() on. The current document's root scroll frame should work fine for that, even if it's overflow: hidden or if the starting frame is in fixed content (the two conditions under which the current call might ascend into the enclosing document to look for a scroll frame).

(In reply to Botond Ballo [:botond] from comment #6)

  • SetZeroMarginDisplayPortOnAsyncScrollableAncestors: This function has largely been obsoleted by the "minimal displayport" work, in that the scroll frames on which is sets zero-margin displayports should already have minimal displayports (in Fission mode), and changing those to zero-margin displayports should not be necessary. There is some cleanup to be done here (to not call this function at all in Fission mode), but I don't think the cleanup needs to block anything (in the intermediate state, we'll get zero-margin displayports on in-process ancestors and minimal displayports on out-of-process ancestors, which should be harmless).

Filed bug 1714717 to track this cleanup.

See Also: → 1714720

(In reply to Botond Ballo [:botond] from comment #6)

  • GetNearestScrollableFrameForDirection: The call sites of this function take a focused element, and want to know what directions the nearest enclosing scrollframe can be scrolled via keyboard input. So, we need to check whether an element focused inside an OOP iframe can keyboard-scroll a scrollframe enclosing the iframe. If so, these call sites may need fixing.

This is indeed an affected scenario; filed bug 1714720 for it.

I would say this is the first actual bug turned up by this audit.

(In reply to Botond Ballo [:botond] from comment #6)

I've verified this by checking the following STR:

  • Load https://hsivonen.fi/fission-host.html in Fission mode.
  • Pinch-zoom in, to create an offset between the visual and layout viewports.
  • Click one of the buttons inside the first iframe.

That the click correctly targets the button via the main-thread hit-test in the nested content process tells us that the callback transform (which stores the offset between the visual and layout viewports) is correctly reflected in the CTP.

This also matches my understanding of the code: the CTP originates on the compositor side in SendSubtreeTransformsToChromeMainThread, which uses GetTransformToGecko, which includes GetTransformToLastDispatchedPaint, which includes the offset between the visual and layout viewports here.

See Also: → 1714721

(In reply to Botond Ballo [:botond] from comment #6)

Spun these cases out into bug 1714721.

(In reply to Botond Ballo [:botond] from comment #6)

  • GetTransformMatrix (and other call sites in that function): I think this is fine (transforms between processes should be handled by the CTP), but it wouldn't hurt to audit the callers of this function which pass aStopAncestor.mFrame == null.

This function has quite a call tree as well, but I think I've tracked down the handful of callers that pass aStopAncestor.mFrame == null:

  • nsComboboxControlFrame::GetCSSTransformTranslation(): This seems to be fine based on bug 1525561 comment 5.
  • ScrollFrameRectIntoView, both directly and indirectly via TransformFrameRectToAncestor(): I believe these are fine as they're called in the IsTransformed() case, so the loop in GetTransformMatrix() is not going to ascend far in the frame tree.
  • This call (recursive via GetTransformToAncestor()): Not quite sure what to make of this one; has to do with popups. Could use a second opinion.
  • GetEventCoordinatesRelativeTo (via TransformRootPointToFrame and other intermediaries): This corresponds to the GetCrossDocParentFrame() call in this function, so if that's fine then so is this.

I should add that, while thinking about some of the callers of this function, I ended up playing around with Henri's Fission test page, and discovered bug 1715187. I don't know that this bug necessarily implicates GetTransformMatrix() or any of its callers, but it might; more generally, it might implicate some other functions which I've described above as being "handled by the CTP".

(In reply to Botond Ballo [:botond] from comment #6)

  • GetDisplayRootFrame: This is fine in the sense that the "display-root frame" is by definition in-process. However, it wouldn't hurt to audit the callers of this and make sure we don't have any that check something like <is content> && <is display root> and assume that therefore they have a cross-process RCD.

This funtion has a lot of callers, most of them are not APZ-related, and this bug already has a fairly large scope, so I spun this out into bug 1715190.

(In reply to Botond Ballo [:botond] from comment #14)

  • GetTransformMatrix (and other call sites in that function): I think this is fine (transforms between processes should be handled by the CTP), but it wouldn't hurt to audit the callers of this function which pass aStopAncestor.mFrame == null.

This function has quite a call tree as well, but I think I've tracked down the handful of callers that pass aStopAncestor.mFrame == null:

[...]

  • This call (recursive via GetTransformToAncestor()): Not quite sure what to make of this one; has to do with popups. Could use a second opinion.

Timothy, would you like to offer an opinion on this?

Flags: needinfo?(tnikkel)

(In reply to Botond Ballo [:botond] from comment #16)

(In reply to Botond Ballo [:botond] from comment #14)

  • GetTransformMatrix (and other call sites in that function): I think this is fine (transforms between processes should be handled by the CTP), but it wouldn't hurt to audit the callers of this function which pass aStopAncestor.mFrame == null.

This function has quite a call tree as well, but I think I've tracked down the handful of callers that pass aStopAncestor.mFrame == null:

[...]

  • This call (recursive via GetTransformToAncestor()): Not quite sure what to make of this one; has to do with popups. Could use a second opinion.

Timothy, would you like to offer an opinion on this?

That code is only used for popup's that are list control frames, this should only exist when using the old in-process implementation of comboboxes. So we don't need to fix or audit it.

Flags: needinfo?(tnikkel)
See Also: → 1715693

(In reply to Botond Ballo [:botond] from comment #6)

  • ScrollFrameWillBuildScrollInfoLayer: This is fine in that it faithfully mimics the behaviour of ShouldBuildScrollInfoItemsForHoisting like the comment in the function says it should. (The latter operates on the display list, and therefore only on in-process content.) However, it may not hurt to review whether ShouldBuildScrollInfoItemsForHoisting behaves correctly with Fission.

Scenarios that need to take the hoisting codepath are indeed broken with Fission. However:

  • With WebRender, only some edge cases are affected. Basically, you need a page with an OOP iframe inside an SVG filter which is not supported by WebRender in the compositor. (An example I found with Jeff's help was filters using the <feMorphology> filter effect.)
  • The breakage for the affected cases seems to go beyond scrolling-related issues; rather, the rendering is broken completely. Which is to say, even if ShouldBuildScrollInfoItemsForHoisting() magically returned the correct value in such cases (i.e. inside the nested content process, it was somehow aware of the filter in the enclosing content process), the rendering would still be broken.

I filed bug 1715693 with a testcase.

See rationale in bug 1698693 comment 8.

The added flags ensure we stop at the RSF even if it's overflow:hidden,
or if the starting frame is in fixed content.

See rationale in bug 1698693 comment 9.

The added flags ensure we stop at the RSF even if it's overflow:hidden,
or if the starting frame is in fixed content.

Depends on D117387

(In reply to Botond Ballo [:botond] from comment #6)

  • IsAncestorFrameCrossDoc: This is fine, the result is compared to a frame provided as input (and therefore in-process).

Theoretically we could have two frames in the same process, but two different documents that are in the same document tree, that are separated by an oop document in between them.

(In reply to Timothy Nikkel (:tnikkel) from comment #22)

(In reply to Botond Ballo [:botond] from comment #6)

  • IsAncestorFrameCrossDoc: This is fine, the result is compared to a frame provided as input (and therefore in-process).

Theoretically we could have two frames in the same process, but two different documents that are in the same document tree, that are separated by an oop document in between them.

Good catch!

That said, it does look like the callers of that function have been audited separately in bug 1523439, so I don't think we need to do anything further here.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9868ef8a028e
Make the GetNearestScrollableFrame() call in GetBoundingFrameRect() stop at the document's root scroll frame. r=tnikkel
https://hg.mozilla.org/integration/autoland/rev/68cd2e076d61
Make the GetNearestScrollableFrame() call in GetCurrentItemAndPositionForElement() stop at the root scroll frame. r=tnikkel
https://hg.mozilla.org/integration/autoland/rev/0c119e60471c
Convert audited calls to GetCrossDocParentFrame() to GetCrossDocParentFrameInProcess(). r=tnikkel
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: