Open Bug 1941024 Opened 1 month ago Updated 14 days ago

fixed-position content is incorrectly clipped, when placed in a sticky-position container with `overflow` set on ancestor

Categories

(Core :: Web Painting, defect, P3)

defect

Tracking

()

Tracking Status
firefox-esr115 --- wontfix
firefox-esr128 --- fix-optional
firefox134 --- wontfix
firefox135 --- wontfix
firefox136 --- fix-optional

People

(Reporter: dholbert, Assigned: botond)

References

(Depends on 1 open bug, Regression)

Details

(Keywords: leave-open, regression, webcompat:platform-bug)

Attachments

(6 files, 1 obsolete file)

bug 1940990 is a webcompat site-report that's essentially fixed-position content being clipped by virtue of having a sticky-position ancestor.

STR:

  1. Load either of these testcases from bug 1940990:
    testcase 2: https://bugzilla.mozilla.org/attachment.cgi?id=9446913
    testcase 3: https://bugzilla.mozilla.org/attachment.cgi?id=9446916

ACTUAL RESULTS:
The blue thing is clipped. (It's got the correct fixed-position containing block, as you can see from its positioning; but its ancestor that has nondefault overflow inadvertently clips it.)

EXPECTED RESULTS:
The blue thing should not be clipped.

Notes:

  • The regression range here is when we turned on APZ, as noted in bug 1940990 comment 6, so this seems to be a somewhat-fundamental bug in our APZ painting path.
  • WebKit has the same bug, likely because we and WebKit are both performing the same faulty optimimzation.
  • This is specific to fixed-position content with a position:sticky ancestor -- if the ancestor is position:relative instead, then the bug doesn't happen, as shown by e.g. this reference case for testcase 3 which renders properly (no clipping).
Attached file interactive testcase

Here's an interactive testcase based on https://bug1940990.bmoattachments.org/attachment.cgi?id=9446913

This starts out with a tall element that's initially display:none. There's a button you can toggle to show it, and also a button you can toggle to hide/show the scroller (to see what happens if you regenerate frames).

Observations:

  • Everything's fine on first load (when there's no tall element).
  • If you click the first button, to show the tall element, then the bug manifests (the blue thing gets clipped)
  • The bug then stays broken, even if you hide the tall element again, and even if you take the additional step of hiding/showing the scroller (which I would intuitively expect to get us back to some sort of clean slate).

So: when the scroller gets some distance to scroll, that causes some sort of persistent change in the document's APZ state that makes things remain broken, even if I reconstruct frames for the scroller (or even the whole body).

Botond, do you know if this is a known issue with APZ?

If possible, it'd be good to look into investigating fixes, since this is causing clipping on Jira, as noted in the associated webcompat site-report (bug 1940990).

Flags: needinfo?(botond)

(In reply to Daniel Holbert [:dholbert] from comment #3)

Botond, do you know if this is a known issue with APZ?

I can't rule out there being an open bug on file with the same underlying cause / a similar testcase, but I haven't come across such a bug recently.

If possible, it'd be good to look into investigating fixes, since this is causing clipping on Jira, as noted in the associated webcompat site-report (bug 1940990).

Thanks for flagging. I'll mark this as an S2 since the bug it's blocking is an S2.

Severity: -- → S2
Flags: needinfo?(botond)
Priority: -- → P2

While I haven't confirmed this, the symptoms suggest an issue in Gecko or WebRender display list building, i.e. a Web Painting issue.

Whiteboard: [apz-needsdiagnosis]

(In reply to Botond Ballo [:botond] from comment #5)

While I haven't confirmed this, the symptoms suggest an issue in Gecko or WebRender display list building, i.e. a Web Painting issue.

That was my initial guess too.

Note that the regression range points to bug 1143856 as having caused this (that was the pref-flip that enabled APZ), per bug 1940990 comment 6. That's why I filed this as an APZ bug to start. [--> making that official via regressed-by field] But yup, this could still be a Web Painting bug that just only manifests in a world with APZ.

Keywords: regression
Regressed by: apz-linux

(In reply to Daniel Holbert [:dholbert] from comment #6)

this could still be a Web Painting bug that just only manifests in a world with APZ.

Yeah, that's what I was thinking; there are display-list building codepaths that are conditional on APZ being enabled.

A notable one relevant to fixed-position content is here: if APZ is disabled, no one sets a displayport, so the DisplayPortUtils::IsFixedPosFrameInDisplayPort(this) condition will be false, and we may not build a FixedPosition display item at all.

I'm leaving this in Panning and Zooming for now. We'll do some initial diagnosis based on display list dumps, at which point we can move to Web Painting if appropriate.

Set release status flags based on info from the regressing bug 1143856

:kats, since you are the author of the regressor, bug 1143856, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(kats)

I had a look around for previous bugs concerning the rendering of a fixed element inside a sticky element, but the ones I could find (bug 1214151, bug 1288210, bug 1425565) all concerned the rendering behaviour after (async-)scrolling, whereas in this bug the "static" rendering (i.e. immediately after page load, without any scrolling) is already wrong.

(There is also bug 1791232 which we are actively working on, but that concerns a sticky element inside overflow: clip or clip-path, with no fixed element involved, and also concerns scrolling behaviour rather than static rendering.)

Attached file Gecko display list

Attached is a Gecko display list for the first testcase mentioned in the bug description (https://bugzilla.mozilla.org/attachment.cgi?id=9446913).

The FixedPosition item does not have the offending clip in its own clip chain. However, it is a descendant of the StickyPosition item whose clip chain includes the offending clip.

It's not immediately clear to me whether that alone explains the observed rendering, i.e. whether it's expected that display items are clipped by clips in the clip chains of their ancestor display items.

(In reply to Botond Ballo [:botond] from comment #10)

It's not immediately clear to me whether that alone explains the observed rendering, i.e. whether it's expected that display items are clipped by clips in the clip chains of their ancestor display items.

Talked about this with Markus; the answer is yes, it's expected that the rendering of a display item is subject to clips present in the clip chains of its ancestors.

I think that narrows down the issue to Gecko display list building: for the rendering to be different, we need the Gecko display list for this page to be different.

One potential lead suggested by Markus is to look at how we handle this for Opacity items. Opacity is a container item like StickyPosition, but it doesn't have the clip from the overflow in its own clip chain; rather, that's propagated down to the descendants of the Opacity item (and we avoid propagating it to the FixedPosition descendant). We could try handling StickyPosition like that as well and see if anything breaks.

Attached are a modified testcase that uses opacity: 0.5 rather than position:sticky -- which does not exhibit the clipped rendering bug -- and its Gecko display list.

Note the Opacity item is an ancestor of the FixedPosition item, but its clip chain its empty -- the clips are instead pushed down to its (non-fixed) descendants.

(In reply to Botond Ballo [:botond] from comment #11)

One potential lead suggested by Markus is to look at how we handle this for Opacity items. Opacity is a container item like StickyPosition, but it doesn't have the clip from the overflow in its own clip chain; rather, that's propagated down to the descendants of the Opacity item (and we avoid propagating it to the FixedPosition descendant). We could try handling StickyPosition like that as well and see if anything breaks.

A naive attempt to handle StickyPosition more like Opacity regresses position-sticky-scrolled-clip-1.html.

There may actually be some overlap with bug 1730749 here, in that giving sticky items their own ASR will likely allow modelling the scenario in position-sticky-scrolled-clip-1.html more correctly such that it continues working even with this change.

However, while Dan and I are actively working on bug 1730749, it's likely to still be a while before that will be ready to land. Given that this bug is an S2, it may be worth doing some more investigation to see if we can fix this in a more targeted way that keeps position-sticky-scrolled-clip-1.html working without relying on bug 1730749.

See Also: → 1730
See Also: 17301730749

(In reply to Botond Ballo [:botond] from comment #11)

I think that narrows down the issue to Gecko display list building: for the rendering to be different, we need the Gecko display list for this page to be different.

(This also confirms Web Painting is the right component for this bug --> moving.)

Component: Panning and Zooming → Web Painting
Whiteboard: [apz-needsdiagnosis]

(In reply to Botond Ballo [:botond] from comment #14)

Given that this bug is an S2, it may be worth doing some more investigation to see if we can fix this in a more targeted way that keeps position-sticky-scrolled-clip-1.html working without relying on bug 1730749.

I've done some additional investigation to understand how position-sticky-scrolled-clip-1.html is currently passing, and how the contemplated change to put the clip onto the descendants of the StickyPosition item rather than the item itself regresses it. Here's a summary of my understanding:

  • The sticky element in position-sticky-scrolled-clip-1.html has a "scrolled clip", i.e. a clip that is scrolled by the scroll frame that the sticky element is stuck to, even if the sticky element itself is in the "stuck" position. (This means the clip can move relative to the sticky element during compositing.)
  • In the WebRender display list, clips are associated with a spatial node. Spatial nodes are a superset of ASRs, such that every distinct ASRs gets a distinct spatial node, but there are also spatial nodes that do not correspond to any ASR.
    • Scroll frames get their own spatial node and ASR.
    • Sticky items gets their own spatial node but not their own ASR (until bug 1730749 is resolved).
  • The key to the "scrolled clip" behaving correctly is for the clip to produce a RectClip item in the WebRender display list which is associated with the spatial node of the scroll frame, not the spatial node of the sticky item.
  • When creating a RectClip item corresponding to a clip in a Gecko DisplayItemClipChain, the item's spatial node is computed in two steps.
    • First, the ASR in the DisplayItemClipChain is mapped to a spatial node. The result after this step will always be a spatial node that corresponds to an ASR (e.g. it will never be a sticky spatial node).
    • Second, an "override" is applied to potentially substitute the spatial node from the first step with another. The result after this step may be a spatial node that does not correspond to any ASR (e.g. a sticky spatial node).
  • Sticky items register an override that maps the spatial node of their enclosing scroll frame, to the spatial node of the sticky item itself. This happens in CreateWebRenderCommands(), and is in effect while processing the sticky item's descendants.
  • Clips for a display item are created in ClipManager::SwitchItem(), which is called before CreateWebRenderCommands().
  • So, when the clip is on the sticky item itself, the override has not yet been installed, and the associated spatial node remains the one for the scroll frame. As a result, the clip can correctly move relative to the sticky item.
  • However, when the clip is on the descendants on the sticky item, by that point the override has been installed, and the associated spatial node is now the sticky spatial node. This is the same spatial node as the sticky item itself, and so the clip cannot scroll relative to the item.

Given the above, I'm not sure how we can get both the testcase in this bug and the testcase in position-sticky-scrolled-clip-1.html to work, without fixing bug 1730749. I'm open to ideas/suggestions.

(In reply to Botond Ballo [:botond] from comment #16)

Given the above, I'm not sure how we can get both the testcase in this bug and the testcase in position-sticky-scrolled-clip-1.html to work, without fixing bug 1730749. I'm open to ideas/suggestions.

Discussed this with Markus. We would like to try the following resolution:

  • Make the contemplated change to fix the test case in this bug, accepting the regression to position-sticky-scrolled-clip-1.html temporarily.
  • Accelerate efforts to complete the work in bug 1730749 which would resolve the regression. (We were largely already planning to do this.)
Assignee: nobody → botond

(In reply to Botond Ballo [:botond] from comment #17)

Discussed this with Markus. We would like to try the following resolution: [...]

For what it's worth: the known webcompat impact here seems to have been mitigated on the site itself. (At least, Jeff can no longer repro the original issue he was hitting on Jira, per bug 1940990 comment 12).

That reduces the benefit of taking the short-term fix here... Depending on the anticipated likelihood of trouble from the temporary regression that you mentioned (to position-sticky-scrolled-clip-1.html), maybe at this point it'd be best to hold off on the short-term fix and just directly attack bug 1730749? (until/unless we run into other sites that are broken in the same way that Jira was)

Flags: needinfo?(botond)

(In reply to Daniel Holbert [:dholbert] from comment #19)

For what it's worth: the known webcompat impact here seems to have been mitigated on the site itself. (At least, Jeff can no longer repro the original issue he was hitting on Jira, per bug 1940990 comment 12).

That reduces the benefit of taking the short-term fix here... Depending on the anticipated likelihood of trouble from the temporary regression that you mentioned (to position-sticky-scrolled-clip-1.html), maybe at this point it'd be best to hold off on the short-term fix and just directly attack bug 1730749? (until/unless we run into other sites that are broken in the same way that Jira was)

Yeah, that sounds reasonable. Let's bump this down to an S3 and mark it as depending on bug 1730749.

I'll revise the patch to just add a failing testcase for now, which we'll green up in bug 1730749.

Severity: S2 → S3
Depends on: 1730749
Flags: needinfo?(botond)
Priority: P2 → P3
See Also: 1730749
Attachment #9460617 - Attachment is obsolete: true

That makes sense - thanks for constructing/adding the failing testcase!

Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: