Closed Bug 1671284 Opened 4 years ago Closed 4 years ago

Pinching in and out quickly on ebay.com can result in perma-checkerboarding

Categories

(Core :: Panning and Zooming, defect, P2)

defect

Tracking

()

VERIFIED FIXED
84 Branch
Tracking Status
firefox81 --- disabled
firefox82 --- disabled
firefox83 --- verified
firefox84 --- verified

People

(Reporter: kats, Assigned: kats)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

On macOS, with WR off if I pinch in and out quickly on ebay.com, I see that sometimes after finishing my zoom out, the displayport is just a small area on the top-left and not the full visible area. This happens somewhat rarely but I was able to trigger it to happen much more frequently with a WIP for bug 1654290 that triggers more repaints from AsyncPanZoomController::OnScale.

It looks like the culprit might be the resolution code circa here because I was getting this output with additional logging statements added:

2020-10-14 20:32:39.426963 UTC - [Child 19858: Main Thread]: V/apz.displayport Base rect (x=0, y=0, w=76800, h=44220) -> ScreenRect (x=0, y=0, w=2560, h=1474), res=2.34517, parentRes=1
2020-10-14 20:32:39.426969 UTC - [Child 19858: Main Thread]: V/apz.displayport Final screen rect (x=-256, y=-1071.25, w=3584, h=3584) -> result (x=-3275, y=-13704, w=45847, h=45847)
2020-10-14 20:32:39.426977 UTC - [Child 19858: Main Thread]: V/apz.displayport SetDisplayPortMargins {(t=889.867, r=309.099, b=889.867, l=309.099),(103.258,1343.95),(0,1320)} on scrollId=2, newDp=(x=0, y=-13704, w=45847, h=45847)

This shows that the base rect gets converted from w=76800 app units to w=2560 screen pixels on this line. Then it goes through the margin expansion and whatnot, and gets converted from w=3584 screen pixels to w=45847 app units on this line. This makes it end up narrower than the base rect it started out as, which seems quite wrong.

The mismatch between parentRes and res seems quite suspicious here.

Further investigation seems to indicate that the main-thread resolution is actually not getting updated properly. APZ sends a series of repaint requests as the zoom out is happening. For the last few requests, we hit this branch and so don't set the resolution, because one of the earlier requests changed the MT resolution and APZ hasn't gotten the update yet. And then when APZ does get the update with the "new" resolution, it doesn't trigger a repaint request based on that. So the final zoom level never gets properly synced to the main thread.

I guess we need to trigger a repaint request from APZ if it gets a NLU call where the resolution changes, because previous repaint requests may have failed to update the MT with the latest async zoom.

Further refinement: sometimes we do trigger a repaint request, by virtue of going into this code. But in some of those cases (particularly at the end of the zoom) the scroll offset hasn't changed and so the repaint request fails the GetScrollOffsetUpdated() check here and fails to update the MT resolution. That check seems kind of silly - we should probably be checking for whether or not there is an async zoom change, rather than if the scroll position updated. I'm guessing this code has fallen out of sync with other changes over the years.

Actually it seems like most of the time we do trigger a repaint request. I'm not sure under what conditions we wouldn't, and I can't reproduce that scenario any more. Probably still happens but I don't know why. Anyway the patch at https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=dcaf2182b067fe2fdb382a2719c76715acbc01c0 seems to take care of the problem in comment 2. Not sure if I can write a test for it though.

Wrote a test. https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=6bd3926c57402bd10e76e57b5df07e3c2e7d9fdc

I still see glitchy behaviour when zooming in and out rapidly on ebay, but it's rare enough that I haven't been able to pin it down. I'm going to punt on the rest of the glitches for now and go back to bug 1654290.

Summary: Pinching in and out quickly on ebay.com can result in checkerboarding → Pinching in and out quickly on ebay.com can result in perma-checkerboarding

I want to refactor a couple of the pinch utilities so I need to migrate this
code to stop using pinchZoomOutTouchSequenceAtCenter, and use
pinchZoomOutWithTouchAtCenter instead. As a bonus this migrates the test from
being a continuation-style to async/await-style.

Instead of a having a yield-style pinchZoomOutTouchSequenceAtCenter function
and a async pinchZoomOutWithTouchAtCenter wrapper around it, we now have an
general async wrapper synthesizeNativeTouchAndWaitForTransformEnd around the
main touch synthesization function, with pinchZoomOutWithTouchAtCenter being
a helper that uses it with a specific touch sequence.

In particular this allows reusing this code with other custom touch sequences,
which will happen in the next patch.

Depends on D93685

The APZCCallbackHelper code would only enter the codepath to set the presShell
resolution if the repaint request had a scroll position update. This seems silly
because really we care about whether or not the async zoom changed. The included
test exposes this silliness by demonstrating how the presShell resolution can
get stuck at an incorrect value because the necessary repaint requests get
ignored.

The patch allows the SetResolutionAndScaleTo codepath to also be entered if
there is an async zoom on the repaint request, so that we make sure to update
the presShell resolution even if the scroll position hasn't changed.

Depends on D93686

On https://phabricator.services.mozilla.com/D93687 Botond pointed out the last patch here could regress bug 1358771 and indeed it does. I spent some time digging into exactly what's happening and am posting a bit of a braindump here.

With the current code, we only ever call SetResolutionAndScaleTo if aRequest.GetScrollOffsetUpdated() is true. This means that as part of handling the repaint request, we also explicitly scroll the scrollframe via ScrollToCSSPixelsApproximate which is guarded by the same condition here.

The call to SetResolutionAndScaleTo triggers a reflow and, as part of the post-reflow actions, the scrollframe re-clamps the scroll position. With the current code this is basically a no-op, because the scroll position we set with ScrollToCSSPixelsApproximate (prior to reflow) already gets clamped and aligned, so in the post-reflow phase no additional work needs to happen. Importantly, when we set the scroll position with ScrollToCSSPixelsApproximate we use a ScrollOrigin::Apz which means that any changes to the scroll position don't need to trigger a ScrollPositionUpdate back to APZ.

With the patch, we now sometimes call SetResolutionAndScaleTo in cases where we don't call ScrollToCSSPixelsApproximate. This means that during the post-reflow phase, if the scroll position gets mutated as part of the re-clamping step, it happens with ScrollOrigin::Clamp, and causes the main thread to send an instant/absolute ScrollPositionUpdate to APZ. The scroll position mutation does happen, specifically due to the alignment computation. The changes are sub-CSS-pixel, usually on the order of a few app units. Here's an example: curPos = (0,406741) and pt = (0,406740) and original aPt = (0,406741). Even though the change is visually negligible it does trigger the ScrollPositionUpdate which, on the APZ side, causes an abort of animations and reset of in-progress zooming actions. To the user it appears that the zoom abruptly got interrupted, and they have to lift their fingers and start the zoom over.

It seems to me that the right way to fix this problem is to ensure we re-clamp the scroll position with ScrollOrigin::Apz any time we are calling SetResolutionAndScaleTo as a result of APZ zooming. Right now that happens because the two bits of code are guarded by the same condition, but we can do it more explicitly instead.

https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=57fb81718f3a9e62cf872fc78b07f5fe93341de3 has updated patches that test the scenario in the above comment as well fixes the main patch to not trigger that bug.

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a7616ad1f3d5
Reduce callers of pinchZoomOutTouchSequenceAtCenter. r=botond
https://hg.mozilla.org/integration/autoland/rev/4c9d685ee3f5
Refactor zoom-out helpers to allow custom touch sequences. r=botond
https://hg.mozilla.org/integration/autoland/rev/1171c5ea3c82
Don't drop APZ requests that have changed the zoom but not the scroll. r=botond

Comment on attachment 9181848 [details]
Bug 1671284 - Reduce callers of pinchZoomOutTouchSequenceAtCenter. r?botond

Beta/Release Uplift Approval Request

  • User impact if declined: Sometimes doing pinch-zooming rapidly when at the ends of a page can result in a state of checkerboarding that doesn't clear by itself. Additional scrolling does clear it.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Go to ebay.ca and use trackpad to pinch zoom in and out, particularly when in the corners of the page. Sometimes the page ends up in a state of "permanent" checkerboarding, with only a small part of the page visible, and it remains that way until the user interacts with scrolling further.
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): This code has been traditionally somewhat brittle with changes causing subtle regressions. However in this case I spent a lot of time understanding the code and think I have a good understanding of what's going on. Also I added test coverage for both this scenario and a previous related bug which wasn't tested. So that should reduce the risk as well. Would prefer to bake this on Nightly for 2-3 days before uplifting to beta, just in case.
  • String changes made/needed: none
Attachment #9181848 - Flags: approval-mozilla-beta?
Attachment #9181849 - Flags: approval-mozilla-beta?
Attachment #9181850 - Flags: approval-mozilla-beta?
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

I was able to reproduce this bug using an affected Nightly build from 2020-10-14, and by following the STR from comment 13.

The issue is verified as fixed on latest Nightly 84.0a1, under macOS 10.14.

Status: RESOLVED → VERIFIED

Comment on attachment 9181848 [details]
Bug 1671284 - Reduce callers of pinchZoomOutTouchSequenceAtCenter. r?botond

Desktop zooming is an 83 feature, QA verified on nightly and we don't have bug reports since it landed 3 days ago, let's take it in beta 4, thanks.

Attachment #9181848 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9181849 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9181850 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

This is also verified as fixed on Beta 83.0b4, under macOS 10.14.

Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: