Closed Bug 1517895 Opened 5 years ago Closed 5 years ago

Update session store / session history code to use APIs that scroll the visual viewport where appropriate

Categories

(Firefox :: Session Restore, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
Firefox 67
Tracking Status
firefox67 --- fixed

People

(Reporter: botond, Assigned: botond)

References

(Blocks 2 open bugs)

Details

Attachments

(7 files)

In bug 1498812, we are making session store / session history save and restore the visual scroll offset.

Bug 1498812 uses window.scrollTo() to restore the visual scroll offset. This is problematic because window.scrollTo() is intended to modify the layout scroll offset.

In particular, the intention is for window.scrollTo() to restrict scrolling to the layout scroll range. It doesn't currently do that, due to bug 1516056, but we'd like to fix that.

Therefore, we'll need to update the session store / session history code to use a different API intended to scroll the visual viewport. This API is internal-only, and will be added in bug 1507279.
Blocks: 1516056
Blocks: ss-feature
Priority: -- → P3

(This blocks P2 bugs via bug 1516056.)

Priority: P3 → P2
Summary: Update session store / session history code to use scrollToVisual() instead of scrollTo() → Update session store / session history code to use APIs that scroll the visual viewport where appropriate

I've been looking at this, to try to get bug 1516056 and bug 1519013 unblocked.

One of the required changes is changing the setScrollPosition() function used by Android session store tests to use nsIDOMWindowUtils.scrollToVisual() instead of window.scrollTo() (since, after bug 1516056, the latter will not be able to scroll to positions outside the layout scroll range).

However, this is causing a number of failures in test_session_scroll_position.html, including the very first set of assertions.

The problem is that scrollTo() would take effect right away, while scrollToVisual() only takes effect after the next repaint. In between calling scrollToVisual() and making assertions about the resulting visual scroll position, we need to wait for a mozvisualscroll event.

Now, I think the test is already intending to do this. In particular, the test waits for an SSTabScrollCaptured event, and one of the things that triggers that is a mozvisualscroll.

However, in this case, we get an SSTabScrollCaptured event without a mozvisualscroll: it's triggered from onTabLoad which in turn is triggered from DOMTitleChanged. As a result, we make our assertions before the visual scroll happens.

Jan, do you have a suggestion for how we can revise the test to wait for the right event? Should we explicitly wait for a mozvisualscroll instead of an SSTabScrollCaptured?

Flags: needinfo?(jh+bugzilla)

Hmm, good question. Actually it seems you'd get SSTabScrollCaptured thrice during page load - once at pageshow, which was the original idea in order to capture the scroll position after the page finished loading completely, the second at DOMTitleChanged as a side effect of fixing bug 1338088, and nowadays also when we're notified of the changed session history. The problem is that the session store has historically been using DOMTitleChanged as a proxy for "page loading has progressed far enough to capture some useful state about the page" because it fires quite a bit earlier than the real "load"/"DOMContentLoaded"/"pageshow" events. At the same time, DOMTitleChanged can of course also be fired outside of pageloading, when the page genuinely updates its title, and then I didn't have a good idea to distinguish between the two cases. Plus originally the session store didn't listen for session history events, but now it does and it triggers a state save, too. I really ought to rethink this whole area, but that might be some time and won't help with your immediate problem...

So, what you could do for now

  • modify promiseBrowserEvent to optionally wait for an event to fire n times, and then wait three times for SSTabScrollCaptured to be sure that all the initial page loading activity has settled down before proceeding with setting a scroll position.
  • Or alternatively, while the session store does it's own scroll update throttling (hence theoretically the need to wait for SSTabScrollCaptured in order to be sure that the new scroll position actually made it into the session store), I've rediscovered that before closing a tab, we flush any pending scroll/form data saves. So you could indeed just wait for mozvisualscroll and then proceed with the test. I'm not sure, but it might be necessary to use the resolveAtNextTick option in that case, so that the test doesn't start closing the tab synchronously in response to the mozvisualscroll event, with bad luck possibly before the session store received the event, too, and had a chance to record a pending scroll save.

On the whole, I think I'd prefer the second option, provided it actually works and I didn't overlook anything.

Flags: needinfo?(jh+bugzilla)
Blocks: ss-SM

Made some progress here, we now get past the mentioned spot in the test, but there are still failures later in the test:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=1465402261b273aff0e0c932f4e0f9d4485ad1ec

Flags: needinfo?(jh+bugzilla)

In my local run, the test hangs here.

It's not entirely clear to me what is supposed to be generating three mozvisualscroll events. Should they be triggered by ScrollFrameHelper::RestoreState()? Is that in turn expected to be triggered from this undoCloseTab() call?

(In reply to Botond Ballo [:botond] from comment #5)

It's not entirely clear to me what is supposed to be generating three mozvisualscroll events. Should they be triggered by ScrollFrameHelper::RestoreState()? Is that in turn expected to be triggered from this undoCloseTab() call?

(If so, that's not happening: RestoreState() does not get called at all during the test_sessionStoreScrollPositionForFrames task.)

I guess the other possibility is that the scroll frames stay alive across the tab close / undo-close, and the visual scroll offset restoration is expected to happen using the first-paint mechanism added in bug 1509575. However, I'm not seeing any first-paints during that task, either.

I just discovered that there's actually a third scroll offset restoration path: SessionStore._restoreScrollPosition, which calls into Window::ScrollTo.

I'm a bit confused as to what this one is for. Is it for tests only? Or, if it's for production use, what is the relationship between that mechanism, and RestoreState()?

Btw., here is my latest Try push. There's an extra patch there compared to comment 4, implementing step 2 from bug 1519621 comment 11, which I had initially forgotten about.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8ca4f18795f72c8357fb26d94ee8f43da7f5e764

(In reply to Botond Ballo [:botond] from comment #7)

I guess the other possibility is that the scroll frames stay alive across the tab close / undo-close, and the visual scroll offset restoration is expected to happen using the first-paint mechanism added in bug 1509575.

They don't, closing the tab really gets rid of everything except for the serialised session store data, so all (scroll) frames will most definitively be gone.

(In reply to Botond Ballo [:botond] from comment #8)

I just discovered that there's actually a third scroll offset restoration path: SessionStore._restoreScrollPosition, which calls into Window::ScrollTo.

To answer your question from comment #5, yes, this is what is expected to restore the scroll position here.

I'm a bit confused as to what this one is for. Is it for tests only? Or, if it's for production use, what is the relationship between that mechanism, and RestoreState()?

RestoreState() is part of session history as a core Gecko feature. I.e. this is what remembers your scroll position when you use the back and forward buttons to navigate through the history of a tab, and it'll work even in something as simple like the GeckoViewExampleApp. As such, the PresState is only updated when you navigate away from a history entry, and we don't keep track of the current scroll position on the current page because there's no need to - the document representing the current history entry is open and alive and already knows its own scroll position.
The session store (which is somewhat more of a front-end feature, and while the core state serialisation/deserialisation logic is mostly shared between Desktop/Fennec/GeckoView [1], the rest of the logic around it has considerable implementation differences between those three platforms) on the other hand is what preserves the state even if the tab as a whole is closed for whatever reasons (tab/window/browser is closed/restarted/crashes/...), and as such it also needs to additionally preserve the scroll position of the current session history entry and this uses the mechanism you've discovered above.

If for whatever reasons (because you navigated back and then forward again for example) the current session history entry has both a PresState and the scroll position stored by the session store, then the latter should win [2], which maybe should be kept in mind here in case any planned changes would affect that, i.e. once the session store calls the new scrollTo-replacement, the ScrollFrame should stop doing its own scroll restoration, just like it would for a user scroll (for the resolution it already doesn't work, though, as per bug 1498830).

Longer-term, I'm also thinking about feeding the session store's scroll position directly into the ScrollFrameHelper's restore logic instead (see bug 1498657), but in the short-/medium-term you'll probably need to deal with the current mechanism, and make it use the new visual viewport-aware helper method, too.

I'm not sure if I can get around to taking a closer look at what goes wrong in that test until the end of this week, but in the meantime - I can't say whether this will actually provide any additional helpful information for this case, but note that you can enable some debug logging in the mobile session store via the browser.sessionstore.debug_logging pref. For Try, it's probably easiest to just hardcode it to true here.

[1] SessionHistory.jsm and things like the scroll position/form data helpers, respectively SessionStoreUtils since very recently.
[2] Though I don't think any of the current tests covers that scenario.

Thanks for the explanation. The distinction between the session store and the session history was not clear to me before.

It sounds like SessionStoreUtils.RestoreScrollPosition() should be using the new visual scrolling API as well. I will give that a try, and see if it helps with the failures.

(In reply to Botond Ballo [:botond] from comment #11)

It sounds like SessionStoreUtils.RestoreScrollPosition() should be using the new visual scrolling API as well. I will give that a try, and see if it helps with the failures.

So with this change, the test passes for me locally. However, it's still failing in automation:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=8feb36dc9173c071d0d6c009d1371a8e4ac39b0d

(There are a bunch of other failures which I believe are unrelated.)

(In reply to Botond Ballo [:botond] from comment #12)

However, it's still failing in automation:

Part of the issue here is just that the fix from comment 3 needs to be applied in other places in the test, too.

I've now gotten to the point where test_session_scroll_position.html is passing in automation:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6295c2a06e0fdb1da443fb5844fd5aba0d07d528

test_session_scroll_visual_viewport.html is still timing out here. It looks like something is calling scrollTo(0,0) or equivalent, which is taking precedence over RestoreState()'s attempt to restore the visual scroll position from the session history; as a result, the visual viewport offset does not change from (0,0), and we don't get the mozvisualscroll event.

I have not yet tracked down what is making the problematic scrollTo(0,0) call.

(In reply to Botond Ballo [:botond] from comment #15)

I have not yet tracked down what is making the problematic scrollTo(0,0) call.

No one is making a scrollTo(0,0) call; rather, the scroll frame is constructed with mLastScrollOrigin == nsGkAtoms::other [1], which basically puts it into a state where the first transaction for it will carry a main-thread scroll offset update that clobbers any restore visual scroll offset update (such as the one ScrollToRestoredPosition() is trying to make).

I tried changing the initial value of mLastScrollOrigin to nsGkAtoms::restore, but ran into assertions added in bug 1304689 (and looking over that patch, it seems like doing so might regress that bug).

It's not immediately clear me how to fix this.

[1] https://searchfox.org/mozilla-central/rev/03ebbdab952409640c6857d835d3040bf6f9e2db/layout/generic/nsGfxScrollFrame.cpp#1995

a third scroll offset restoration path

Which are those three paths? One is RestoreState() in the scroll frame, the other SessionStoreUtils.RestoreScrollPosition(), and the third...?

I.e. did you also come across the DocShell's scroll position restoring? Unfortunately I don't know whether that bit is sufficiently covered by any tests [1], but basically that code path should be encountered in conjunction with anchor link or pushState navigation happening inside the same document. I.e. the scroll frame and its PresState can naturally only save one scroll position per document, but through anchor links or pushState URL-rewriting you can get multiple Session History entries for the same document. So those additional scroll positions are saved as part of the respective SHEntry and restored on navigation through the DocShell.

[1] It certainly isn't in Android-specific tests, and any tests written with desktop in mind probably don't encounter the case where layout viewport != visual viewport.

(In reply to Jan Henning [:JanH] from comment #17)

a third scroll offset restoration path

Which are those three paths? One is RestoreState() in the scroll frame, the other SessionStoreUtils.RestoreScrollPosition(), and the third...?

The third one that I was thinking of was the first-paint based restoration added in bug 1509575, which happens when the scroll frame stays alive (in the bfcache?) and is reused.

(In reply to Botond Ballo [:botond] from comment #16)

It's not immediately clear me how to fix this.

I thought about this some more, and realized that changing the initial value of mLastScrollOrigin to nsGkAtoms::restore is the wrong approach for fixing the problem. Bug 1304689 already handles correct restoration of the origin across a frame reconstruction with the mAllowScrollOriginDowngrade mechanism. The problem is just that if ScrollToImpl() early-exits [1], we don't execute the scroll origin downgrade logic [2].

Prior to these patches, that wasn't a problem, since if we're early-exiting ScrollToImpl(), the layout scroll offset is already what we want it to be, so it doesn't matter whether or not we downgrade the origin to restore. With these patches, though, it's important that we still downgrade the origin to restore so that a visual restore update riding the same transaction doesn't get clobbered.

[1] https://searchfox.org/mozilla-central/rev/01b4b3830ea3cae2e9e431019afa6391b471c6da/layout/generic/nsGfxScrollFrame.cpp#2710
[2] https://searchfox.org/mozilla-central/rev/01b4b3830ea3cae2e9e431019afa6391b471c6da/layout/generic/nsGfxScrollFrame.cpp#2760

We scroll the visual viewport in addition to the layout viewport to make sure
that the layout viewport also ends up at the desired position; note that the
layout viewport's desired position will change in bug 1516056 as we start
clamping the layout viewport offset to the layout scroll range.

SSTabScrollCaptured can sometimes be fired for other reasons, causing us to
query the visual scroll position before it has been updated.

Depends on D19874

Assignee: nobody → botond
Flags: needinfo?(jh+bugzilla)

While as I said above that particular scenario is probably not properly tested at the moment, the Docshell needs the same treatment, doesn't it?

Flags: needinfo?(botond)

This gets frame-reconstruction-scroll-clamping.html passing again.

Depends on D19877

(In reply to Jan Henning [:JanH] from comment #26)

While as I said above that particular scenario is probably not properly tested at the moment, the Docshell needs the same treatment, doesn't it?

I added a patch to do this.

Flags: needinfo?(botond)

The changes here made frame-reconstruction-scroll-clamping.html, which was marked as failing in bug 1504659, to start passing intermittently. I debugged it, and got it passing reliably; I added the fix for that as a new patch as well.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ee808a9cfb1b
Scroll the visual viewport in ScrollToRestoredPosition(). r=tnikkel
https://hg.mozilla.org/integration/autoland/rev/02e879149e47
Use scrollToVisual() in the session store tests. r=JanH
https://hg.mozilla.org/integration/autoland/rev/f41b081164d2
Wait for mozvisualscroll explicitly in the session store tests. r=JanH
https://hg.mozilla.org/integration/autoland/rev/a0bf4a689da4
Scroll the visual viewport in SessionStoreUtils::RestoreScrollPosition(). r=JanH
https://hg.mozilla.org/integration/autoland/rev/f363798d6481
Scroll the visual viewport in nsDocShell::SetCurScrollPosEx(). r=qdot
https://hg.mozilla.org/integration/autoland/rev/aee83b1af5f1
Execute scroll origin downgrade after frame reconstruction even if early-exiting ScrollToImpl(). r=kats
https://hg.mozilla.org/integration/autoland/rev/056391cf2e01
Only allow APZ to clobber the visual viewport offset if it can clobber the layout viewport offset. r=kats

Thanks to Botond and colleagues for solving the issue, but I wonder if a bug I see is rooted in the same thing or not (I do not want to litter BugZilla unnecessarily):

In FF version 66 (and earlier) fullscreen mode makes mouseover event target glitchy returning either much smaller coordinate numbers or even zeros instead of their true values (e.g. in YouTube when you first go fullscreen on a video and then scroll down without going out from fullscreen to normal mode), so to make the code work I even have to use relatedTarget's coordinates as everything else is useless:

        eTargetBounds = event.target.getBoundingClientRect(); 
        eRelatedTargetBounds = event.relatedTarget.getBoundingClientRect();
        calculatedTop = Math.max(event.layerY, // this has a correct value in a normal mode
                                 event.pageY, // this has a correct value in a normal mode
                                 event.target.offsetTop,
                                 eTargetBounds.y,
                                 eTargetBounds.top,
                                 event.relatedTarget.offsetTop,
                                 eRelatedTargetBounds.y,
                                 eRelatedTargetBounds.top)
                      - Number(document.body.style.top.replace('px','')) + 10;

-- is this the same thing in the root that you fixed in FF67 (so the glitch will go away once I update to FF67) or it is a different phenomenon and needs a separate bug?

Thanks in advance.

Flags: needinfo?(botond)

I don't see a readily apparent connection between this bug and the issue described in comment 35. Please do file a new bug for it.

Flags: needinfo?(botond)
You need to log in before you can comment on or make changes to this bug.