Closed Bug 1280344 Opened 3 years ago Closed 3 years ago

[e10s] Inline-level "position:sticky" element has its lines stomp on each other when scrolled

Categories

(Core :: Layout, defect, P2)

48 Branch
Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
e10s + ---
firefox47 --- unaffected
firefox48 --- fixed
firefox49 --- fixed
firefox50 --- fixed

People

(Reporter: dholbert, Assigned: coyotebush)

References

Details

(Keywords: regression, Whiteboard: [gfx-noted])

Attachments

(5 files, 2 obsolete files)

Attached file testcase 1
STR:
 1. Load attached testcase.
 2. Scroll far down enough so that the text is offscreen; scroll back up.

EXPECTED RESULTS:
The lines of text ("First line", etc) should stay three separate lines, with consistent line spacing.

ACTUAL RESULTS:
* When you scroll down (for the first time at least), the lines of text all collapse into a single superimposed line.
* When you scroll back up, sometimes the lines of text end up separating to be further apart than when they started out.


I can reproduce in Nightly 50.0a1 (2016-06-15) with a fresh Firefox profile. If I disable e10s, the bug goes away. So, I think this is a buggy interaction between position:sticky and our Async Pan & Zoom code.

(This issue was originally reported in bug 1280183 comment 5, but I spun it off to a new bug, in part because that bug's about "contain:paint" whereas there's no contain:paint dependence here.)
Summary: Inline-level "position:sticky" element has its lines stomp on each other when scrolled, with e10s enabled → [e10s] Inline-level "position:sticky" element has its lines stomp on each other when scrolled
NOTE: When the page has painted something wrong, I can force it to fix itself if I do something that triggers a more thorough repaint (e.g. clicking on the page, or switching tabs away and back)
Here's a screencast of testcase 1.

One more note here -- if I change the position:sticky thing to "display:block", then the bug goes away (or becomes less noticeable at least).  So I think it's something to do with the fact that it's an inline-level thing, and we lose track of the per-line offsets when doing async-pan-zoom updates of the scroll position, or something.
I can repro on 48+ on Fennec and OS X. Definitely APZ-related.
Blocks: apz-desktop
Keywords: regression
OS: Unspecified → All
tracking-e10s: --- → ?
I see a similar result on Android while panning. Sounds like another poor interaction between bug 897105 (async sticky) and bug 904197 (continuation-frame sticky).

I guess each sticky frame gets its own layer (which is probably excessive) and then StickyScrollContainer::GetScrollRanges computes incorrect limits for continuations after the first.
Flags: needinfo?(bugs)
Blocks: 1214146
Blocks: 1253860
No longer blocks: 1214146
Thanks for tracking down that regression range! Moving to APZ component so that I don't lose track of it.
Component: Layout → Panning and Zooming
Priority: -- → P2
Whiteboard: [gfx-noted]
Version: Trunk → 48 Branch
Attached patch WIPSplinter Review
I debugged this a bit, and it looks like the normalPosition being added to the inner scroll range is the source of the problem. Removing that (as in the attached patch) fixes the problem. Maybe this normalPosition isn't needed, or should be zero for continuation frames, or something? I don't really know this code very well - Corey, do you know what the right fix here is?
Flags: needinfo?(corey)
Also for the record, bug 1253860 just exposed the pre-existing issue by repainting less frequently. However it would probably have manifested on sufficiently heavy pages even without bug 1253860. Moving this back to Layout because I think the fix needs to be applied in StickyScrollContainer's computation of the scroll ranges, as Corey already pointed out in comment 4. Sorry for the churn.
Component: Panning and Zooming → Layout
Daniel, can you help find an owner for this bug now that it's back over to Layout? I'm not sure how common it is to hit this or whether there are workarounds. But, at this point it sounds like we will ship e10s with this issue in 48 though we could still take a patch.
Flags: needinfo?(dholbert)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> Maybe this normalPosition isn't
> needed, or should be zero for continuation frames, or something?

If I use this patch but make the normalPosition non-zero (for all frames) by adding some content above the sticky element, the frames stay together but the end limits of the sticky behavior are wrong. These calculations are a bit tricky and I haven't yet figured out the right fix here.
It occurred to me that the continuation frames should get identical scroll ranges, since the ranges are relative to the layer's position at the start of the scroll. These ranges can be computed using the first continuation (because ComputeStickyLimits takes the others into account). Maybe it would even be possible to compute the ranges once and share them among layers, much as the reflow calculations for continuations are optimized with the PositionContinuations method?

The behavior with this patch looks good to me, although I do still see a 1px shift of all three lines when reflowing after scrolling from one end to the other.
Attachment #8769933 - Flags: review?(dholbert)
Assignee: nobody → corey
Status: NEW → ASSIGNED
Flags: needinfo?(corey)
(In reply to Corey Ford [:coyotebush] from comment #11)
> although I do still see a 1px
> shift of all three lines when reflowing after scrolling from one end to the
> other.

That sounds like bug 1280917.
Comment on attachment 8769933 [details] [diff] [review]
Use first continuation's normal position in computing sticky scroll ranges for APZ

Review of attachment 8769933 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good -- thanks, Corey!

If at all possible, we should include an automated test.  Could you add one to this patch before we get it landed?  (If you've still got Try access, we should give it a Try run to be sure it passes on all platforms, too.)
Attachment #8769933 - Flags: review?(dholbert) → review+
Flags: needinfo?(dholbert)
Flags: needinfo?(corey)
Flags: needinfo?(bugs)
Here's a quick possible piece of a reftest -- this reproduces the buggy rendering for me, when I load it locally.  (though it might need a bit of simplification/cleanup when being shoehorned into a reftest.  E.g. as noted in the testcase, it should use MozReftestInvalidate to trigger the scripts.  Anyway, hopefully this is a good starting point, if not a final testcase.)

(One thing I noticed: the testcase only shows the bug if I use "overflow:scroll" on the scrollable element.  It doesn't happen inside of an "overflow:hidden" element. I assume that's because we don't activate APZ for overflow:hidden elements, since they're not interactively scrollable.)
Attachment #8770017 - Attachment description: strawman automated testcase, for reftest → strawman automated testcase, for reftest (load in a profile with e10s enabled)
(In reply to Daniel Holbert [:dholbert] from comment #14)
> (One thing I noticed: the testcase only shows the bug if I use
> "overflow:scroll" on the scrollable element.  It doesn't happen inside of an
> "overflow:hidden" element. I assume that's because we don't activate APZ for
> overflow:hidden elements, since they're not interactively scrollable.)

That's correct, though you can use the reftest-async-scroll attribute to force APZ for an overflow:hidden element.

You might find the machinery we use for async scrolling reftests useful for your test (see [1], and any reftest in layout/reftests/async-scrolling for an example).

[1] https://dxr.mozilla.org/mozilla-central/rev/88bebcaca249aeaca9197382e89d35b02be8292e/layout/tools/reftest/README.txt#523
Thanks for the help. I was worried it would be hard to test this, and the async scrolling attributes turned out to be necessary -- script scrolling didn't work for me in the reftest environment.

Locally, I still get a slight mismatch in scrollbar rendering. It looks like my hg access has expired, so I'd appreciate it if someone else would run this past Try (presumably for reftest and reftest-e10s?).
Attachment #8770205 - Flags: review?(dholbert)
Attachment #8769933 - Attachment is obsolete: true
Flags: needinfo?(corey)
(In reply to Corey Ford [:coyotebush] from comment #16)
> Locally, I still get a slight mismatch in scrollbar rendering. It looks like
> my hg access has expired, so I'd appreciate it if someone else would run
> this past Try (presumably for reftest and reftest-e10s?).

Passes for me locally. Try push:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=723e2e788b62
Thanks, Corey & Botond!

> Passes for me locally.

(Same for me -- and just as important: it fails without the fix.)
Comment on attachment 8770205 [details] [diff] [review]
Use first continuation's normal position in computing sticky scroll ranges for APZ (with reftest)

Review of attachment 8770205 [details] [diff] [review]:
-----------------------------------------------------------------

Anyway -- the reftest looks good to me, if it passes Try. r=dholbert

(One possible tweak: we probably don't need "overflow:scroll" -- we could use the simpler "overflow:hidden", since the test is already using the APZ reftest machinery.  But, no need to change that at this point, unless the Try run shows scrollbar-related failures.)
Attachment #8770205 - Flags: review?(dholbert) → review+
Unsurprisingly, this test will need a skip-if(!asyncPan) annotation.
Yup, just noticed that as well. I'll trigger another Try run with that change.
Looks like there's a trivial fuzzy mismatch on Android R36, too:
REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/reftests/position-sticky/inline-4.html == http://10.0.2.2:8854/tests/layout/reftests/position-sticky/inline-4-ref.html | (LOAD ONLY), max difference: 1, number of differing pixels: 2

I'll add an annotation for that as well in the second-round try push here.  (The only relevant jobs left from the first-round try push are Win7 R-e10s; hopefully those aren't fuzzy in their own special way.)
(In reply to Daniel Holbert [:dholbert] from comment #22)
> Looks like there's a trivial fuzzy mismatch on Android R36, too:
> REFTEST TEST-UNEXPECTED-FAIL |
> http://10.0.2.2:8854/tests/layout/reftests/position-sticky/inline-4.html ==
> http://10.0.2.2:8854/tests/layout/reftests/position-sticky/inline-4-ref.html
> | (LOAD ONLY), max difference: 1, number of differing pixels: 2
> 
> I'll add an annotation for that as well in the second-round try push here. 
> (The only relevant jobs left from the first-round try push are Win7 R-e10s;
> hopefully those aren't fuzzy in their own special way.)

(I note the mismatch is in the scrollbar. In the async scrolling reftests, we usually avoid these by using overflow:hidden as you described in comment 19.)
Here's the patch again, now with "overflow:hidden" and some other now-necessary reftest APZ attributes to work around the android failure, and with skip-if(!asyncPan) added as well. Try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=53cda6efa400
Attachment #8770205 - Attachment is obsolete: true
Attachment #8770271 - Flags: review+
The modifications and the Try results look good to me.
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/599262eb62c8
Use first continuation's normal position in computing sticky scroll ranges for APZ. r=dholbert
Comment on attachment 8770271 [details] [diff] [review]
fix v3:  Use first continuation's normal position in computing sticky scroll ranges for APZ (with more apz-using reftest)

We should take this on Aurora/Beta, to avoid shipping this as a regression to users when e10s ships.  Requesting uplift.

Approval Request Comment
[Feature/regressing bug #]: This is a bug in our "position:sticky" CSS layout feature, but it's in a piece that's only activated when APZ is enabled. So, regression-wise, this will manifest as a regression "caused" by enabling e10s.

[User impact if declined]: Broken layout on sites that use "position:sticky", when we ship e10s. 

[Describe test coverage new/current, TreeHerder]: We have pretty good test coverage of our "position:sticky" code, and the patch includes a new test, to be sure this bug is fixed.

[Risks and why]: This is low-risk -- just switching which rect we're using for a piece of position:sticky layout. The worst that could happen here is that we'd break position:sticky layout in a different unforseen way (and in that unlikely scenario, the backout would be trivial).

[String/UUID change made/needed]: None.
Attachment #8770271 - Flags: approval-mozilla-beta?
Attachment #8770271 - Flags: approval-mozilla-aurora?
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/599262eb62c8
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment on attachment 8770271 [details] [diff] [review]
fix v3:  Use first continuation's normal position in computing sticky scroll ranges for APZ (with more apz-using reftest)

Review of attachment 8770271 [details] [diff] [review]:
-----------------------------------------------------------------

This patch fixes a UI regression. Take it in 48 beta 8 and aurora.
Attachment #8770271 - Flags: approval-mozilla-beta?
Attachment #8770271 - Flags: approval-mozilla-beta+
Attachment #8770271 - Flags: approval-mozilla-aurora?
Attachment #8770271 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.