Fennec displays blank page on Gitter.im content, after scrolling up/down several times

RESOLVED FIXED in Firefox 52

Status

()

Firefox for Android
Toolbar
P2
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: TeoVermesan, Assigned: kats)

Tracking

50 Branch
Firefox 52
ARM
Android
Points:
---

Firefox Tracking Flags

(firefox49 wontfix, firefox50 wontfix, firefox51 wontfix, firefox52 verified)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(2 attachments)

(Reporter)

Description

a year ago
Tested using:
Build: Firefox for Android 50 beta 1
Device: ONE A2001 (Android 5.1.1)

Steps to reproduce:
1. Go to https://gitter.im/RigsOfRods/rigs-of-rods
2. Scroll up/down several times

Actual results:
- A blank page is displayed

Expected results:
- Page should be correctly displayed

Note: 
-See also: Bug 1278390 - Fennec doesn't display Gitter.im content, after scrolling up/down
-Please see the following video: https://www.youtube.com/watch?v=txeUc2OtGJk&feature=youtu.be
And just to confirm, does this affect Aurora/Nightly as well?
status-firefox49: --- → wontfix
status-firefox50: --- → affected
status-firefox51: --- → ?
status-firefox52: --- → ?
Priority: -- → P2
(Reporter)

Comment 2

a year ago
Yes. I reproduced this on latest Nightly and Aurora, too
status-firefox51: ? → affected
(Reporter)

Updated

a year ago
status-firefox52: ? → affected
I can reproduce this. It seems to happen when you fling upwards such that you reach the top of the page and new content gets loaded, the second time. That is:

- scroll down
- fling to the top and wait for some new content to load
- scroll down
- fling to the top and wait for new content to load

With the minimap on it looks like the scroll position/viewport is at y=0 but the displayport is farther down. I think we should be requesting a new repaint at some point in this process so I'll investigate to see what's happening.
Assignee: nobody → bugmail
It looks like we're going into the codepath at [1], which doesn't trigger a repaint and so leaves the displayport at an inconsistent spot relative to where APZ thinks the scroll position is. I suspect the real problem is that we're getting an eRestore type scroll update when we shouldn't be, because the main-thread scroll position actually changed.

[1] http://searchfox.org/mozilla-central/rev/3411332d29d377bf86405fc3ea72461ea0cb0295/gfx/layers/apz/src/AsyncPanZoomController.cpp#3357
Created attachment 8797253 [details] [diff] [review]
WIP

Something like this seems to fix the problem. I want to try and write a test first though.
Try push with new test at https://treeherder.mozilla.org/#/jobs?repo=try&revision=d8d7660229a5e3834447df66c53e455b62910a2f
That's failing some tests. I'll have to try something else.
I investigated the test failure and I think I understand why it's happening. My patch assumed that mLastScrollOrigin would get cleared when the scroll origin was flushed out to the compositor, but that's not the case. It actually gets cleared only after the APZ code sends back an acknowledgement of the scroll position update. This means we cannot, on the main thread, easily distinguish between two scenarios:
a) JS sets scroll position and before the scroll update is flushed to the compositor, there is a frame reconstruction
b) JS sets scroll position, the scroll update is flushed to the compositor, but before the acknowledgement is sent/processed, there is a frame reconstruction

In case (a) we don't want to update the scroll origin to nsGkAtoms::restore but in case (b) we do. Fixing this would require adding additional state which I'm a little reluctant to do right now, even though it would fix a real bug (which the test in the try push above demonstrates).

Another fix for the user-visible issue in comment 0 is to simply trigger a repaint in the scenario in question, so that we're not left in a perma-checkerboard state. That will at least improve the user experience even though it doesn't fix the whole problem - I can defer that to a followup bug.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> Another fix for the user-visible issue in comment 0 is to simply trigger a
> repaint in the scenario in question, so that we're not left in a
> perma-checkerboard state. That will at least improve the user experience
> even though it doesn't fix the whole problem - I can defer that to a
> followup bug.

I tried this and it also requires additional state and hacking around so it's not really much better. (My fix was to set needsContentRepaint=true in the condition at [1], but that failed because the scroll-updated flag isn't set on the repaint request, causing it to get ignored at [2].

Given that my simpler solution isn't really simpler I'll go back to my first approach and try to find a way to do it properly.

[1] http://searchfox.org/mozilla-central/rev/8071f94c4d4b66833ad08db39565669faad94dfd/gfx/layers/apz/src/AsyncPanZoomController.cpp#3355
[2] http://searchfox.org/mozilla-central/rev/8071f94c4d4b66833ad08db39565669faad94dfd/gfx/layers/apz/util/APZCCallbackHelper.cpp#84
Got one approach working: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb3669947b96d5666443b0761bf7421d59c5697a but I'm going to see if I can simplify it a little.
New simplified version: https://treeherder.mozilla.org/#/jobs?repo=try&revision=99f05c409776e34c9455d85355b05c9b3f328acd
The simplified version causes a failure in test_interrupted_reflow.html. After giving it some thought I think the failure is legitimate, and the simplified version has a flaw where APZ might remain at an invalid scroll offset during an interrupted reflow. That's not good so I think we'll have to go with the un-simplified version. Will think about it a bit more and put up for review after the weekend.
Comment hidden (mozreview-request)
I wouldn't want to uplift this, it should be ok riding the trains.
status-firefox50: affected → wontfix
status-firefox51: affected → wontfix
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> I investigated the test failure and I think I understand why it's happening.
> My patch assumed that mLastScrollOrigin would get cleared when the scroll
> origin was flushed out to the compositor, but that's not the case.

Can we do something like this instead? Either clear mLastScrollOrigin or set some state on the scrollframe saying that it has been sent (but not necessarily acknowledged by apzc)? Seems like it would lead to easier to understand code.
(In reply to Timothy Nikkel (:tnikkel) from comment #15)
> (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> > I investigated the test failure and I think I understand why it's happening.
> > My patch assumed that mLastScrollOrigin would get cleared when the scroll
> > origin was flushed out to the compositor, but that's not the case.
> 
> Can we do something like this instead? Either clear mLastScrollOrigin

We can't clear mLastScrollOrigin because there might be an inflight repaint request from the APZ (which is stale, sent before the main-thread scroll position change). If we cleared mLastScrollOrigin then the APZ repaint request would clobber the main thread.

> or set
> some state on the scrollframe saying that it has been sent (but not
> necessarily acknowledged by apzc)?

That's exactly what my patch is doing. The state is "mAllowScrollOriginDowngrade" - it is generally false, but gets set to true once the the scroll origin has been sent (but not necessarily acknowledged by apzc). It gets reset to false again the next the scroll origin is set. I can call the flag something else if that would help - I went with a name specific to the use site, but I could call it something like "ScrollOriginFlushedToCompositor" which is more specific to the call site instead, if you think it would be easier to understand. Functionally it's exactly the same.

Updated

a year ago
Duplicate of this bug: 1309181
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #16)
> (In reply to Timothy Nikkel (:tnikkel) from comment #15)
> > (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> > > I investigated the test failure and I think I understand why it's happening.
> > > My patch assumed that mLastScrollOrigin would get cleared when the scroll
> > > origin was flushed out to the compositor, but that's not the case.
> > 
> > Can we do something like this instead? Either clear mLastScrollOrigin
> 
> We can't clear mLastScrollOrigin because there might be an inflight repaint
> request from the APZ (which is stale, sent before the main-thread scroll
> position change). If we cleared mLastScrollOrigin then the APZ repaint
> request would clobber the main thread.
> 
> > or set
> > some state on the scrollframe saying that it has been sent (but not
> > necessarily acknowledged by apzc)?
> 
> That's exactly what my patch is doing. The state is

I see that now.

Comment 19

a year ago
mozreview-review
Comment on attachment 8799745 [details]
Bug 1304689 - Ensure frame reconstructions don't clobber a 'stronger' scroll origin with a 'weaker' one.

https://reviewboard.mozilla.org/r/84876/#review84048
Attachment #8799745 - Flags: review?(tnikkel) → review+

Comment 20

a year ago
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/752382b58cc4
Ensure frame reconstructions don't clobber a 'stronger' scroll origin with a 'weaker' one. r=tnikkel

Comment 21

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/752382b58cc4
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox52: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
Verified as fixed in build:
 - Nightly 52.0a1 (2016-10-31)
Devices:
 - LG G4 (Android 6.0)
 - LG G4 (Android 5.1)
status-firefox52: fixed → verified
You need to log in before you can comment on or make changes to this bug.