Closed Bug 1846935 Opened 2 years ago Closed 1 year ago

Enable MSD physics by default on the nightly channel

Categories

(Core :: Panning and Zooming, task, P2)

task

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
firefox122 --- fixed

People

(Reporter: botond, Assigned: botond)

References

(Blocks 1 open bug, Regressed 2 open bugs)

Details

Attachments

(11 files, 1 obsolete file)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
45.07 KB, image/png
Details
No description provided.

Update on this:

The patch to enable MSD physics has not landed yet because it was found to regress helper_mainthread_scroll_bug1662379.html. I looked into that test failure.

The test scrolls a scrollable element, then performs an operation on it which triggers frame reconstruction, and checks that the frame reconstruction has the effect of resetting the scroll offset to zero. [Correction: the test actually reparents the scrollable element, and it's the reparenting which has the effect of resetting the scroll offset.]

The mechanism by which the frame reconstruction resets the scroll offset to zero is sending a scroll position update to APZ with origin ScrollOrigin::None. This origin cannot clobber APZ scrolling, so there is a possibility of the scroll position update being rejected if the user has scrolled the element in the meantime.

In the test, no user scrolling occurs at this time, but with MSD physics it seems that a small rounding error causes the check for user scrolling to have a false positive.

I fixed this in D193393 by using a more permissive check (using COORDINATE_EPSILON) for user scrolling.

Unfortunately, MSD physics seems to regress another test as well, helper_relative_scroll_smoothness.html. I'm continuing to investigate that one.

(In reply to Botond Ballo [:botond] from comment #3)

Unfortunately, MSD physics seems to regress another test as well, helper_relative_scroll_smoothness.html. I'm continuing to investigate that one.

I guess we should also tweak the msd physics parameters for the test.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #4)

(In reply to Botond Ballo [:botond] from comment #3)

Unfortunately, MSD physics seems to regress another test as well, helper_relative_scroll_smoothness.html. I'm continuing to investigate that one.

I guess we should also tweak the msd physics parameters for the test.

I did try that along with some other fixes to the test (here is my latest Try push) but it hasn't solved the failures so far.

In the try run, only motionBeginSpringConstant is changed, it does just make the animation slow down on a single key press. The test rather presses the key repeatedly. With slowdownMinDeltaRatio=0.1 and slowdownSpringConstant=20 (with motionBeginSpringConstant=50), scrolling on multiple key presses also slows down the end of the animation.

Depends on: 1864554

(In reply to Hiroyuki Ikezoe (:hiro) from comment #6)

In the try run, only motionBeginSpringConstant is changed, it does just make the animation slow down on a single key press. The test rather presses the key repeatedly. With slowdownMinDeltaRatio=0.1 and slowdownSpringConstant=20 (with motionBeginSpringConstant=50), scrolling on multiple key presses also slows down the end of the animation.

Thanks for the suggestion!

Unfortunately, the test still fails for me intermittently in local runs with those values.

I'm starting to suspect the failures may be related to a bug in the MSD scroll animation code; I filed bug 1864554 for it and will circle back to this once that's fixed.

(In reply to Botond Ballo [:botond] from comment #7)

I'm starting to suspect the failures may be related to a bug in the MSD scroll animation code; I filed bug 1864554 for it and will circle back to this once that's fixed.

I'm unsure whether bug 1864554 had an impact on the helper_relative_scroll_smoothness failures, because with the fixes the test was still failing for me intermittently in a local run.

I reduced the spring constants further and now it's passing locally on 50/50 runs.

On Try, I'm still seeing an intermittent failure, but I'm hoping it's just the pre-existing intermittent failure of the test (bug 1781383).

During local testing, I observed the test failing because we were still
collecting samples after the animations had completed. Since we are
only synthesizing wheel events that trigger animations for a certain
duration (1500 ms), it's safer to only check samples for 1500 ms.

Depends on D194018

Pushed by bballo@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3d4fe7f92237 Enable MSD physics by default on the nightly channel. r=hiro https://hg.mozilla.org/integration/autoland/rev/eeee628a5706 Use FuzzyEqualsCoordinate in the computation of userScrolled. r=hiro https://hg.mozilla.org/integration/autoland/rev/bafee3429b03 In APZ tests that need to slow down smooth scroll animations, slow them down with MSD physics too. r=hiro https://hg.mozilla.org/integration/autoland/rev/1816eff04e52 Make helper_relative_scroll_smoothness more robust. r=hiro
Backout by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a314efc08ae3 Backed out 4 changesets for mochitest plain failures on test_group_keyboard.

The failing subtest in test_group_keyboard is also helper_relative_scroll_smoothness, it has a variant which exercises keyboard scrolling, which involves some different codepaths than wheel scrolling.

I can reproduce the failure locally at a very low frequency. I've been investigating it and it looks like the MSD animation can sometimes unexpectedly overshoot its destination.

We ran into something similar in bug 1720240, where the overscroll animation (whose implementation shares the same AxisPhysicsMSDModel component) could overshoot its destination, and we had to add an explicit clamping step to avoid it (bug 1720240 comment 19). We may need to do that here as well.

Flags: needinfo?(botond)

I implemented a clamping step locally, but that means that when the overshoot would have happened, the animation now ends, and helper_relative_scroll_smoothness still fails because it expects a strictly increasing scroll position on every frame.

I'm starting to think that helper_relative_scroll_smoothness is difficult to get to work with MSD physics.

Here's my understanding of what the test, which is a regression test for bug 1692707, is trying to do:

  1. It sets up a scenario where user scrolling causes the scroll offset to be strictly increasing in a given direction (on every frame, the composited scroll offset is strictly greater than on the previous frame) through a sequence of wheel or keyboard scroll events.
  2. It then adds script-driven scrolling (e.g. scrollBy) in the same direction.
  3. It checks that the script-driven scrolling does not interfere with the "scroll offset is strictly increasing" invariant.

The difficulty with MSD physics is achieving the "scroll offset is strictly increasing" property for user scrolling (step 1) to begin with. The behaviour is quite sensitive to the exact timing of the subsequent input events arriving at the compositor, which the test doesn't have much control over. The current velocity of the MSD animation at the time the new event arrives becomes the initial velocity of the updated animation, so if the events arrive at just the right time, the velocity of the animation can keep increasing with each event, resulting in the animation reaching its destination very quickly [this is after locally fixing an issue where the animation would actually overshoot in such cases], and thus having a few composited frames (until the next event arrives) with the scroll offset not changing.

I've considered two courses of action:

  1. Run this test with Bezier physics even if MSD physics is otherwise the default. This is not ideal in that the fix bug 1692707 (which the test is intending to test) has a part that's specific to the physics (the ScrollAnimationPhysics::ApplyContentShift implementation), so this would mean the test wouldn't be effective at catching regressions in the MSD implementation of this.
  2. Relax the test's expectation from "scroll offset is strictly increasing" to "scroll offset is nondecreasing". This is also not ideal because one of the possible symptoms of bug 1692707 is that the scroll offset stays constant when it should be increasing (e.g. because an animation was incorrectly cancelled by a script-driven scroll update), so this would again reduce the test's ability to catch some regressions.

Markus, Hiro, do you have ideas for other solutions, and if not, do you have a preference between the above two?

Flags: needinfo?(mstange.moz)
Flags: needinfo?(hikezoe.birchill)

I'd suggest a mixture of 1 and 2. Run the test both with MSD physics and bezier physics respectively, and allow no position change for MSD case. And I guess we could have a cumulative scroll delta and assert the cumulative delta - the total amount of scroll delta by JS > 0?

Flags: needinfo?(hikezoe.birchill)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #17)

I'd suggest a mixture of 1 and 2. Run the test both with MSD physics and bezier physics respectively, and allow no position change for MSD case.

Thanks for the idea! I implemented this and with it, helper_relative_scroll_smoothness seems to be stable on Try.

And I guess we could have a cumulative scroll delta and assert the cumulative delta - the total amount of scroll delta by JS > 0?

I didn't implement this because I'm concerned it would be racy, since the total amount of scroll delta by JS is a main thread measurement that can lag behind the scrolling delta measured on the compositor. (Also, calculating the expected delta would involve replicating various delta-related calculations in the test JS code.)


Unfortunately, the helper_relative_scroll_smoothness failure was not the only reason for the backout. helper_mainthread_scroll_bug1662379.html is also failing with a high frequency, but only on Windows. This will be thornier to debug as I can't reproduce it locally.

Flags: needinfo?(mstange.moz)
See Also: → 1866904

(In reply to Botond Ballo [:botond] from comment #14)

it looks like the MSD animation can sometimes unexpectedly overshoot its destination.

We ran into something similar in bug 1720240, where the overscroll animation (whose implementation shares the same AxisPhysicsMSDModel component) could overshoot its destination, and we had to add an explicit clamping step to avoid it (bug 1720240 comment 19). We may need to do that here as well.

Markus says that AxisPhysicsMSDModel exhibiting overshoot behaviour when the damping ratio is >= 1 (which it is in our case) is a bug. I filed bug 1866904 for it.

(In reply to Botond Ballo [:botond] from comment #18)

Unfortunately, the helper_relative_scroll_smoothness failure was not the only reason for the backout. helper_mainthread_scroll_bug1662379.html is also failing with a high frequency, but only on Windows. This will be thornier to debug as I can't reproduce it locally.

Some investigation suggests that the reason for the failure is the same as in comment 3 ("a small rounding error causes the check for user scrolling to have a false positive"), but on Windows the magnitude of the rounding error is larger (in some sample failures I observed 1 or 2 app units, which is larger than COORDINATE_EPSILON which is about 0.6 app units).

Rather than blindly increasing the epsilon further, I should probably track down the underlying reason for the rounding error.

(In reply to Botond Ballo [:botond] from comment #20)

Rather than blindly increasing the epsilon further, I should probably track down the underlying reason for the rounding error.

I did investigate the reason for the rounding error that I observed on Linux before any patches. It turned out to be related to overshoot:

  • The test helper_mainthread_scroll_bug1662379.html does trigger overshoot as well
  • The amount of the overshoot is some large sub-pixel amount, say 0.4 pixels
  • Since, in this test, the destination of the animation is the end of the scroll range, the overshoot amount does get clamped by the AdjustDisplacement() calls here
  • However, the clamped amount is still very slightly larger than the destination, e.g. by 0.00003 pixels, due to rounding errors in the calculation

This means that the rounding error introduced in this way is (1) definitely not larger than COORDINATE_EPSILON, and (2) fixed by avoiding overshoot.

Unfortunately, this means the Windows failure has a different cause, since it occurs even after the overshoot is fixed, and the discrepancy is larger than COORDINATE_EPSILON.

(In reply to Botond Ballo [:botond] from comment #21)

Unfortunately, this means the Windows failure has a different cause, since it occurs even after the overshoot is fixed, and the discrepancy is larger than COORDINATE_EPSILON.

I think I tracked this down.

This time, the issue occurs when the timing of the wheel events happens to give the animation a somewhat slower velocity, such that the position spends a couple of frames being < 1 pixel away from the destination.

The test checks scrollTop < scrollTopMax to determine when the element has scrolled to the bottom. scrollTop is rounded to the nearest integer, so if e.g. scrollTopMax is 100, this condition can become true when the scroll position before rounding is e.g. 99.5, and the animation is still in progress on the compositor side.

The test then proceeds to do the reparenting, triggering the scroll position update back to 0. However, meanwhile the animation has made additional progress from 99.5 to 100, which gets picked up as userScrolled = true.

To fix this, I'm going to change the test to wait for a scrollend to be sure the animation is complete before proceeding.

Attachment #9364377 - Attachment description: Bug 1846935 - Make helper_relative_scroll_smoothness more robust. r=hiro → Bug 1846935 - In helper_relative_scroll_smoothness, limit the duration for which samples are checked. r=hiro
Attachment #9364376 - Attachment is obsolete: true
Pushed by bballo@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5e74dd198ced Enable MSD physics by default on the nightly channel. r=hiro https://hg.mozilla.org/integration/autoland/rev/fcb407df509c Use FuzzyEqualsCoordinate in the computation of userScrolled. r=hiro https://hg.mozilla.org/integration/autoland/rev/e8ffe9ee5915 Use 'scrollend' to wait for the animation to complete in helper_mainthread_scroll_bug1662379.html. r=hiro https://hg.mozilla.org/integration/autoland/rev/9304f7473762 Factor out a getSmoothScrollPrefs helper. r=hiro https://hg.mozilla.org/integration/autoland/rev/38758ff98011 Factor out a buildRelativeScrollSmoothnessVariants helper. r=hiro https://hg.mozilla.org/integration/autoland/rev/bc66004ed425 Run helper_relative_scroll_smoothness in both Bezier and MSD physics variants. r=hiro https://hg.mozilla.org/integration/autoland/rev/47840d3e14bd In APZ tests that need to slow down smooth scroll animations, slow them down with MSD physics too. r=hiro https://hg.mozilla.org/integration/autoland/rev/7c56678e047b In helper_relative_scroll_smoothness, relax the expectation on scroll offsets from strictly increasing to non-decreasing for MSD physics. r=hiro https://hg.mozilla.org/integration/autoland/rev/10d57f761e02 In helper_relative_scroll_smoothness, limit the duration for which samples are checked. r=hiro https://hg.mozilla.org/integration/autoland/rev/f198e511d598 Limit the starting velocity of an MSD animation to ensure it doesn't overshoot. r=mstange

:botond anything you would like to include in a release note? (Process info)
We could include it in the nightly release notes.

Flags: needinfo?(botond)

Release Note Request (optional, but appreciated)
[Why is this notable]:
This change in scrolling physics makes scroll animations triggered using a mouse wheel or keyboard feel different, and brings our behaviour more in line with platform conventions, especially on Windows.

[Affects Firefox for Android]:
Not for most users (it could affect users who attach an external keyboard or scrollwheel mouse to an Android device)

[Suggested wording]:
"Scroll animations triggered using a mouse wheel or using the keyboard (e.g. arrow keys, Page Down) are now based on a Mass-Spring-Damper physics model, giving the animations a more realistic feel."

Note that this feature is Nightly-only for now, so I'm looking for a "nightly+" flag.

relnote-firefox: --- → ?
Flags: needinfo?(botond)

(In reply to Botond Ballo [:botond] from comment #33)

[Suggested wording]:
"Scroll animations triggered using a mouse wheel or using the keyboard (e.g. arrow keys, Page Down) are now based on a Mass-Spring-Damper physics model, giving the animations a more realistic feel."

Markus, if you'd like to expand on or otherwise suggest a change to the "giving the animations a more realistic feel" part, please feel free.

Flags: needinfo?(mstange.moz)

Thanks, added a slightly reworded note to the Fx122 nightly release notes, please allow 30 minutes for the site to update.
I can edit the note later if needed based on a response to Comment 34

(In reply to Will from comment #36)

I found some docs about Chromium's smooth scrolling that may be useful here

https://docs.google.com/document/d/1JQ6jLy-r7vw_I9s3rtWAIK137fMpF_OFVtfVsgOaoTA/edit

https://docs.google.com/document/d/1ygLvYSnRprmgOcTlezdAEHls2v0NI3Q3UFwgsAu1ISE/edit#heading=h.5hciy2ir8emz

https://docs.google.com/document/d/1A3VmlY3ZR6UtJt3QQ5uuqaCOgPjV6vCMxvkpvPBe0g0/edit#heading=h.jb2u9gncyxd3

Thanks for the links! It looks like Chromium has made a number of decisions similar to us, including starting with a smooth scroll animation modelled using a Bezier curve, and then later making adjustments to the animation's feel with a motivation of making it more like Edge.

One difference is that Chrome implemented the Edge-like physics by still using a Bezier curve but with different constants to approximate the intended shape, while our implementation simulates the actual physical forces involved.

I've been digging into this more and it turns out the impulse-style animation is something Microsoft brought to Chromium for Edge. Google decided not to use it for Chrome, saying it made 'perceived smoothness' worse for mouse wheel scrolling. And even Edge doesn't use it on macOS https://bugs.chromium.org/p/chromium/issues/detail?id=1076127#c2

I found a bunch more Chromium bugs where Google folks talk about refining/tuning their smooth scrolling, I can link them or email them to you if you'd like. I had no idea the amount of attention and work went into smooth scrolling!

But yes this MSD style seems to be an improvement for Firefox, at least to me + on my laptop. What's difficult is so many people hate any change so much that they just immediately reject it without giving it a chance, it makes making the right call even more difficult for things like this (switching to MSD)

Sorry but is MSD physics supposed to apply to the tabstrip scrolling too (click the buttons as well as mouse wheel scrolling in the tab strip). I'm getting inconsistent results in Nightly and was just wondering if this is intended to be included or not

(In reply to Will from comment #39)

Sorry but is MSD physics supposed to apply to the tabstrip scrolling too (click the buttons as well as mouse wheel scrolling in the tab strip). I'm getting inconsistent results in Nightly and was just wondering if this is intended to be included or not

I believe tabscript scrolling is implemented in JS using something equivalent to element.scrollBy({top, left, behavior: "smooth"}).

This type of script-driven scrolling has used MSD physics since the introduction of behaviour: "smooth" (though a slightly different implementation with different parameters).

Therefore, I don't expect tabstrip scrolling behaviour to have changed as a result of this bug.

Regressions: 1875627

This has been enabled on Nightly for 3 cycles now. Per our policy, removing this from the Nightly release notes. Feel free to nominate again for train-specific release notes when this is ready to ride the trains to Release.

Regressions: 1877972
Flags: needinfo?(mstange.moz)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: