Scroll restoration code in nsGfxScrollFrame gets confused when dealing with shortened scrollframes

RESOLVED FIXED in Firefox 48

Status

()

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: kats, Assigned: kats)

Tracking

Trunk
mozilla49
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox46 wontfix, firefox47 wontfix, firefox48 fixed, firefox49 fixed)

Details

Attachments

(2 attachments, 4 obsolete attachments)

Created attachment 8746151 [details]
Testcase

Spinoff from bug 1266833, this is a pre-existing bug that happens in release Firefox (45.0.2) and onwards. Backstory is in bug 1266833 comment 6 and comment 7, but in a nutshell:

- there's some code in nsGfxScrollFrame.cpp to save/restore the scroll position across a frame reconstruction
- the aforementioned code tries to work even in the face of incremental loading. it does this by trying to restore the scroll position repeatedly until it is successful
- the aforementioned code also does the same thing if the scrollable rect legitimately gets shorter and longer, without incremental loading happening
- therefore the aforementioned code "restores" a scroll position that it shouldn't be restoring.

The attached testcase demonstrates the problem; compare behaviour in Firefox and Chrome to see how they're different.
Note that with APZ enabled, in the testcase, the scroll position stays at the top but the displayport gets moved to the bottom, so you end up looking at a blank screen (checkerboarding) rather than the bottom of the page. It's still the same underlying bug though.
(In reply to Timothy Nikkel (:tnikkel) from bug 1266833 comment 18)
> I think there are two different uses of this scroll restoration code. (1)
> When a scroll frame gets reconstructed, and (2) when we are restoring a page
> from history and want to restore the scroll position. In (1) waiting for
> more content to load makes no sense, because all the content is already
> there and we are just reconstructing the scroll frame, we're not going to
> get anymore content by waiting. (2) seems to be what this code is designed
> for.

Thanks, this is helpful info.

> We should figure out how to distinguish these two cases and then for (1) we
> should try to restore exactly once, then give up. For (2) we should continue
> trying to restore but maybe give up after the document is finished loading
> (so that the restore pos is guaranteed to get cleared)

> I'm not sure how to do that off the top of my head, examining the stacks for
> ScrollFrameHelper::RestoreState in both cases might tell us how to find out
> (or where to add code that lets us find out).

I looked at the stacks, and AFAICT scenario (1) always goes through the highlighted line at http://mxr.mozilla.org/mozilla-central/source/layout/base/nsCSSFrameConstructor.cpp?rev=de6c6c719ef5&mark=9585-9585#9576 - according to the XXXmats comment the "aAsyncInsert" branch should be doing a restore too but doesn't. Adding a counter around that line and querying it over in nsGfxScrollFrame seems to work, I'll post a very rough WIP in a bit.
Created attachment 8746556 [details] [diff] [review]
Very rough WIP

Very rough WIP to touch a minimum of header files, sorry it's so ugly. Mostly I want to get Mats' feedback on this bug and if this seems like a viable approach to solving the problem.
Attachment #8746556 - Flags: feedback?(tnikkel)
Attachment #8746556 - Flags: feedback?(mats)
Created attachment 8746688 [details] [diff] [review]
Patch

I cleaned up the patch and turned the testcase into a mochitest.
Assignee: nobody → bugmail.mozilla
Attachment #8746556 - Attachment is obsolete: true
Attachment #8746556 - Flags: feedback?(tnikkel)
Attachment #8746556 - Flags: feedback?(mats)
Attachment #8746688 - Flags: review?(tnikkel)
Attachment #8746688 - Flags: review?(mats)
Looking at the code now it wasn't hard to find where the "history restore" comes from: they should all come from PresShell::RestoreRootScrollPosition(). This seems easier and less error prone than tracking the reconstruct case.
Hmm.. There's also calls to ScrollToRestoredPosition in nsGfxScrollFrame::ReflowFinished() which I assumed was part of the "incremental loading" case. i.e. At some point early in page load, the scroll position is restored by invoking ScrollFrameHelper::RestoreState, and then the ScrollToRestoredPosition gets called from various places including ReflowFinished() and RestoreRootScrollPosition(). Those calls will all try to make the scroll position go to the restored value, and I'd have to catch all of them to catch the "history restore" case. Or is the ReflowFinished call site something else? I saw it getting hit with a breakpoint.
Comment on attachment 8746688 [details] [diff] [review]
Patch

This looks fine to me, but tn's suggestion might be better.

A few minor nits:

>layout/base/nsCSSFrameConstructor.cpp
>+class MOZ_STACK_CLASS ReconstructionTracker {

MOZ_RAII might be better?
'{' on separate line please.

I think ReconstructionTracker is a bit too generic name since we're
only interested in frame reconstruction in RecreateFramesForContent
specifically.  RecreateFramesForContentTracker might be a bit too
long perhaps, but RecreateFramesTracker is better?

>+  static int sReconstructionDepth;

I don't think we use the natural integer types, please use uint32_t instead.

>layout/generic/nsGfxScrollFrame.cpp
>+  , mRestoreWasReconstruction(false)

Likewise, s/Reconstruction/RecreateFrames/
Perhaps mRestoreStateForRecreateFrames ?

>+      if (mRestoreWasReconstruction) {

Do we need to set it false again?

>+extern bool IsReconstructionActive();

Dead code?
Attachment #8746688 - Flags: review?(mats) → review+
Fixed all the nits you mentioned.

(In reply to Mats Palmgren (:mats) from comment #8)
> >+      if (mRestoreWasReconstruction) {
> 
> Do we need to set it false again?

I don't think we need to, since we set it unconditionally whenever we restore a value into mRestorePos, and if we ever hit this code with it true it will clear mRestorePos, so it should never hit twice. That being said, it doesn't hurt to reset it here, and might be more robust to changes elsewhere in the code, so I can do that.

I'll discuss with tn to see if there's a cleaner approach along the lines he suggested.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> Hmm.. There's also calls to ScrollToRestoredPosition in
> nsGfxScrollFrame::ReflowFinished() which I assumed was part of the
> "incremental loading" case. i.e. At some point early in page load, the
> scroll position is restored by invoking ScrollFrameHelper::RestoreState, and
> then the ScrollToRestoredPosition gets called from various places including
> ReflowFinished() and RestoreRootScrollPosition(). Those calls will all try
> to make the scroll position go to the restored value, and I'd have to catch
> all of them to catch the "history restore" case. Or is the ReflowFinished
> call site something else? I saw it getting hit with a breakpoint.

Can we just make PresShell::RestoreRootScrollPosition set a flag on the scroll frame? mIsTryingToDoHistoryRestoreScroll (that might be too short of a variable name, we should add to it if we can). We could clear the flag from PresShell::LoadComplete and when we successfully restore the scroll position. Then the scroll frame has all the state info it needs. ScrollFrameHelper::ScrollToRestoredPosition would try to restore the scroll position once, then clear the restore pos unless mIsTryingToDoHistoryRestoreScroll is true. I think that'll do what we want.

ReflowFinished happens in both cases, reconstruct and history restore.
That sounds like it could work - I'll give it a shot.
So both the patch currently posted on this bug and my updated version regress a mochitest, layout/base/tests/test_bug851445.html. What's happening there is that there's frame reconstructions during the page load itself, before nsPresShell calls RestoreRootScrollPosition. By the time that call happens, we've already cleared mRestorePos back to -1 after trying and failing to scroll to it. I think I might need to persist the mIsTryingToDoHistoryRestoreScroll flag across frame reconstructions?
Persisting mIsTryingToDoHistoryRestoreScroll across reconstructions doesn't work either, because the flag is false before test's iframe gets reloaded. So the order of stuff in the test is like so:

- iframe is scrolled to (0, 100)
- iframe is reloaded
- the y=100 is saved from the old frame and restored into the new frame
- the iframe forces a frame reconstruction by setting display:none, so again the y=100 is saved from the old frame and restored into the new frame. ScrollToRestoredPosition is called, is unable to scroll to y=100, but clears mRestorePos back to -1 because it's a frame reconstruction
- then nsPresShell calls RestoreRootScrollPosition, and at this point mRestorePos is empty so it does nothing
- the iframe forces another frame reconstruction by setting display:'', there's another frame reconstruction which saves/restores the "0" scroll position
- test ends, with the scroll position 0.

> (In reply to Timothy Nikkel (:tnikkel) from bug 1266833 comment 18)
> > I think there are two different uses of this scroll restoration code. (1)
> > When a scroll frame gets reconstructed, and (2) when we are restoring a page
> > from history and want to restore the scroll position. In (1) waiting for
> > more content to load makes no sense, because all the content is already
> > there and we are just reconstructing the scroll frame, we're not going to
> > get anymore content by waiting.

So going back to this - I think that last sentence is incorrect. It should say "In (1) waiting for more content to load makes no sense, EXCEPT if it is happening during page load". The question is - how to detect that we are in page load? The ScrollToRestoredPosition signal is too late, we need something earlier.
Created attachment 8747107 [details] [diff] [review]
Patch v2

Ok, I fixed it by just initializing that field to true when the scrollframe is created. That seems to get the desired behaviour and both tests pass. Hopefully the variable name is long enough for you, I can add more words if needed :)

Try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=216567cedca1
Attachment #8747107 - Flags: review?(tnikkel)
Attachment #8746688 - Attachment is obsolete: true
Attachment #8746688 - Flags: review?(tnikkel)
Comment on attachment 8747107 [details] [diff] [review]
Patch v2

... and of course this breaks more tests than the previous version. *sigh*
Attachment #8747107 - Flags: review?(tnikkel)
Created attachment 8747230 [details] [diff] [review]
Patch v3

Yay, things are looking ok.
Attachment #8747107 - Attachment is obsolete: true
Attachment #8747230 - Flags: review?(tnikkel)
Created attachment 8747274 [details] [diff] [review]
Patch v4

Functionally equivalent to v3, but simplified as per our discussion.
Attachment #8747230 - Attachment is obsolete: true
Attachment #8747230 - Flags: review?(tnikkel)
Attachment #8747274 - Flags: review?(tnikkel)
Attachment #8747274 - Flags: review?(tnikkel) → review+

Comment 20

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a12cdfe43e92
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox49: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla49

Updated

2 years ago
Depends on: 1269539
Comment on attachment 8747274 [details] [diff] [review]
Patch v4

Approval Request Comment
[Feature/regressing bug #]: unknown; this is a pre-existing issue but made worse by APZ
[User impact if declined]: in some cases the user can end up with a blank browser window until they take some action that triggers a repaint, such as scrolling
[Describe test coverage new/current, TreeHerder]: patch includes a test
[Risks and why]: low/medium risk. note that there was a regression from this patch (bug 1269539) which will need to be uplifted along with this. oddly, this actually *reduces* the risk, because the fix for that bug was partly a backout of the changes in this bug, and so the patches combined have a smaller overall footprint. the other bug has its own test as well.
[String/UUID change made/needed]: none
Attachment #8747274 - Flags: approval-mozilla-aurora?
wontfix'ing for 47 because even though it's a pre-existing issue, it's the combination with APZ that makes it really bad, and APZ won't be shipping in 47 (at least that's how things stand at the moment).
status-firefox47: affected → wontfix
Do the uplifts here and in bug 1269539 need to happen in any particular order?
Flags: needinfo?(bugmail.mozilla)
Comment on attachment 8747274 [details] [diff] [review]
Patch v4

Fix scrolling issue + crash, includes a test. 
Please uplift to aurora along with patch from bug 1269539
Attachment #8747274 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #23)
> Do the uplifts here and in bug 1269539 need to happen in any particular
> order?

In the order they landed on m-c is fine. They won't apply properly otherwise, I think.
Flags: needinfo?(bugmail.mozilla)

Comment 26

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/9069d3d726e2
status-firefox48: affected → fixed

Updated

2 years ago
Depends on: 1274571
You need to log in before you can comment on or make changes to this bug.