Closed Bug 1669625 Opened 4 years ago Closed 4 years ago

Lots of picture cache invalidations with desktop zooming scrollbars enabled + certain DPI ratios

Categories

(Core :: Graphics: WebRender, defect)

defect

Tracking

()

RESOLVED FIXED
83 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox81 --- unaffected
firefox82 --- unaffected
firefox83 --- fixed

People

(Reporter: gw, Assigned: tnikkel)

References

(Regression, )

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

With the patch in https://bugzilla.mozilla.org/show_bug.cgi?id=1657822 enabled, there is a lot of picture cache tile invalidation occurring when scrolling on machines with certain DPI ratios.

In my case, the AppUnitsPerDevPixel() value is 43.0 (from a zoom value of 140%).

That zoom value in combination with the patch above appears to trigger some kind of small offsets in positions and clips of elements that get sent to the WR display list.

Regressed by: 1657822
Has Regression Range: --- → yes

With gw's help we tracked this down to this line

https://searchfox.org/mozilla-central/rev/1a973762afcbc5066f73f1508b0c846872fe3952/layout/generic/nsGfxScrollFrame.cpp#5477

I think something about rounding GetScrollPosition but not rounding GetVisualViewportOffset might be the problem.

When the compositor gets back to the main thread with a scroll position update it first calls SetVisualViewportOffset and then tries to scroll the scroll frame (in ScrollFrame in APZCallbackHelper). If we use GetScrollPosition for the scrollbar position in UpdateScrollbarPosition then the UpdateScrollbarPosition call that happens from SetVisualViewportOffset is a no-op because the scroll position hasn't changed. Then when we get to ScrollToImpl it calls UpdateScrollbarPosition except it has a AutoScrollbarRepaintSuppression on the stack so this SchedulePaint call is prevented

https://searchfox.org/mozilla-central/rev/1a973762afcbc5066f73f1508b0c846872fe3952/layout/xul/nsSliderFrame.cpp#764

If we use the visual viewport offset for the scrollbar position in UpdateScrollbarPosition, then the UpdateScrollbarPosition call that happens from SetVisualViewportOffset is no longer a no-op and it gets to that SchedulePaint call above, except this time there is nothing to stop it from SchedulePaint.

When the app unit per dev pixel ratio is a nice round number we still get these SchedulePaint calls, but it doesn't cause the picture cache to be invalid. I'm haven't debugged why that is, but I think it might have to do with webrender specifics.

Attached patch afix (obsolete) — Splinter Review

This fixes the extra schedule paint this is a regression from bug 1657822.

Assignee: nobody → tnikkel

The attached patch should fix the regression from bug 1657822 (but it smushes together a different fix but to the same code that should be separated out cleanly).

So the question remains, do we expect scheduling an extra main thread paint to expose the existing known issues that webrender has when app units per dev pixel isn't a round value that Andrew is working on or are there more bugs here?

Flags: needinfo?(gwatson)
Flags: needinfo?(aosmond)

Set release status flags based on info from the regressing bug 1657822

Also, question for gw/others: how can we write a test for this so that we don't break it again? It seems rather brittle in that seemingly-unrelated changes in faraway code can trigger the picture caching to silently break, so we should try to add tests to catch this. If WR can expose a mechanism to know if picture caching is working or not then we can probably have a mochitest that does some scrolling and checks the WR-exposed API to ensure it is still working.

I should be able to write a test for the regression part that I caused (checking that we don't get a main thread paint in this situation) and we would want a test for that regardless of any webrender related issue, so I'll try to do that. But I also agree that having tests that we don't completely wreck picture caching in common scenarios is something we want.

(Just a note that the non round app units per dev pixel I am referring to come about simply by using the standard full zoom steps)

WR does have functionality in the returned results structure from render to list the number of invalidated tiles. We could probably expose that and create a test case that sets a non-round DPI ratio, scrolls, and ensures that no tiles were invalidated?

Flags: needinfo?(gwatson)

It does seem like there is some other underlying webrender issue going on here that is different from bug 1642308. I tested all relevant combinations of bug 1657822 (new scrollbar code), bug 1642308 (wr bug fixing fractional offset things), and my fix for the extra SchedulePaint call (this bug). I tested scrolling the url testcase of this bug with picture cache debugging enable to see if it was almost all green, or very red. In all cases the patch from bug 1642308 had no effect on the outcome.

I don't think doing a main thread paint (if nothing has changed) should invalidate the picture cache. I'll file a bug for this for webrender people to look into, but I will use this bug solely to handle the direct regression caused by enabling the new scrollbar code (bug 1657822), ie for eliminating the extra SchedulePaint.

Oh, and the fact that the SchedulePaint call still happens when the app units per dev pixel value is nice and round but the webrender picture caching problem doesn't happen in that situation suggests that it is a pre-existing webrender problem and not some other change or problem with the new scrollbar code.

We don't want these calls anyway, but they seem to trigger a pre-existing webrender issue.

When the compositor gets back to the main thread with a scroll position update it calls ScrollFrame in APZCCallbackHelper (https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/gfx/layers/apz/util/APZCCallbackHelper.cpp#160). ScrollFrame first calls SetVisualViewportOffset and then tries to scroll the scroll frame (ie update the layout/main thread scroll position).

With the old scrollbar code (ie before bug 1657822) we update the scrollbar position (with a call to UpdateScrollbarPosition) in ScrollFrameHelper::ScrollToImpl (https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/layout/generic/nsGfxScrollFrame.cpp#3155), ie after we update the layout scroll position. Note that we have a AutoScrollbarRepaintSuppression on the stack during this call, this prevents a SchedulePaint call from happening in nsSliderFrame as a result (https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/layout/xul/nsSliderFrame.cpp#764).

With the new scrollbar code we new update the scrollbar position in the call to SetVisualViewportOffset. But there is no AutoScrollbarRepaintSuppression on this path, so we avoid the SchedulePaint call.

We want to avoid this SchedulePaint call anyways, but it seems to tickle some pre-existing webrender issue when the app units per dev pixel value (affected by full zoom and system zoom) isn't something nice and round (like 30 or 60).

Attachment #9180124 - Attachment is obsolete: true

Filed bug 1669864 for webrender to get a test for picture caching.

Filed bug 1669865 to the underlying webrender issue.

Further, if I back out the regressing bug (bug 1657822) and just add a SchedulePaint call in PresShell::SetVisualViewportOffset then I can reproduce the picture caching problem. So it seems this truly is a pre-existing webrender issue and has nothing to do with bug 1657822, except that bug 1657822 added an extra SchedulePaint call.

Bug 1651332 added the code in question (but it was preffed off).

Blocks: 1651332
Pushed by tnikkel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9a36ab15f9da
Avoid extra, unneeded SchedulePaint calls with the new scrollbar code. r=kats
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch
Regressions: 1742274
Flags: needinfo?(aosmond)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: