Open Bug 1595680 Opened 5 years ago Updated 1 year ago

Slightly different values in display lists causing extra picture cache invalidations.

Categories

(Core :: Graphics: WebRender, defect, P3)

Performance Impact high

People

(Reporter: gw, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:animation, perf:responsiveness, top50)

Attachments

(1 file)

On certain pages, I see unexpected picture cache invalidations while scrolling.

This seems to be caused by coordinates in clips that are slightly different between display lists, which leads WR to conclude that the content has changed. I think it only happens on items that have fractional values (but I'm not certain of this).

For example, during scrolling https://reddit.com/r/rust, I see the following clip items being interned on subsequent frames:

RoundedRectangle(RectangleKey { x: 292.0, y: 6355.0, w: 74.0, h: 73.0 }, BorderRadiusAu { top_left: 36.5px×36.5px, top_right: 36.5px×36.5px, bottom_left: 36.5px×36.5px, bottom_right: 36.5px×36.5px })

RoundedRectangle(RectangleKey { x: 292.0, y: 6355.0, w: 74.0, h: 74.0 }, BorderRadiusAu { top_left: 36.983333333333334px×36.983333333333334px, top_right: 36.983333333333334px×36.983333333333334px, bottom_left: 36.983333333333334px×36.983333333333334px, bottom_right: 36.983333333333334px×36.983333333333334px })

RoundedRectangle(RectangleKey { x: 292.0, y: 6354.0, w: 74.0, h: 75.0 }, BorderRadiusAu { top_left: 37px×37px, top_right: 37px×37px, bottom_left: 37px×37px, bottom_right: 37px×37px })

From the parameters and position, we can assume these are the same primitive, but with slightly different values being produced each time a display list with a different scrolling offset is generated.
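To make the comparison concrete, here is a minimal sketch that models the three logged clip items as hashable keys and counts how many distinct entries they produce. `RoundedClipKey` is a hypothetical stand-in, not WebRender's actual key type; it collapses the four identical corner radii into one field and stores floats as raw bit patterns so that any difference, however small, yields a different key.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an interned rounded-rect clip key.
/// Floats are stored as bit patterns so the type can derive Eq and Hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct RoundedClipKey {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
    radius: u64,
}

impl RoundedClipKey {
    fn new(x: f32, y: f32, w: f32, h: f32, radius: f64) -> Self {
        RoundedClipKey {
            x: x.to_bits(),
            y: y.to_bits(),
            w: w.to_bits(),
            h: h.to_bits(),
            radius: radius.to_bits(),
        }
    }
}

/// The three clip items logged on successive frames
/// (uniform corner radii collapsed into a single value).
fn observed_keys() -> [RoundedClipKey; 3] {
    [
        RoundedClipKey::new(292.0, 6355.0, 74.0, 73.0, 36.5),
        RoundedClipKey::new(292.0, 6355.0, 74.0, 74.0, 36.983333333333334),
        RoundedClipKey::new(292.0, 6354.0, 74.0, 75.0, 37.0),
    ]
}

/// Number of distinct keys in the slice.
fn distinct_count(keys: &[RoundedClipKey]) -> usize {
    keys.iter().copied().collect::<HashSet<_>>().len()
}

fn main() {
    let keys = observed_keys();
    println!(
        "{} frames produced {} distinct keys",
        keys.len(),
        distinct_count(&keys)
    );
}
```

Note that in this data the height and the y origin change along with the radius, so the keys would still differ even if the radii were made equal.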

It looks like the sizes and/or rectangles of the primitives are being snapped inconsistently. Does this sound plausible? I can dig into this further tomorrow, but if you have any ideas or hints on where to look, let me know.

Flags: needinfo?(aosmond)
Blocks: 1536360

I usually run with an effective layout.css.devPixelsPerPx value of 1.5 (due to my OS display settings), which I thought might be relevant here. But the same issues occur if I use a whole number for layout.css.devPixelsPerPx (I tried 1.0 and 2.0).

On a slightly related note, there is https://phabricator.services.mozilla.com/D52673 which shouldn't make things worse (given the radii are changing as well).

As far as I know, we don't snap the complex clip radii anywhere, in either WR or non-WR. We only snap the bounding rect associated with the rounded clip. Presumably that would be enough to cause a picture cache invalidation?

Flags: needinfo?(aosmond)
Priority: -- → P3
Blocks: wr-perf

I traced this back as far as ClipManager::DefineClipChain and can see that the border radii are changing during scrolling.

For example, scrolling on https://reddit.com/r/amiga, I see:

x=295.000000,y=3161.000000 -> 34.200001
x=295.000000,y=4214.000000 -> 33.700001
x=295.000000,y=4214.000000 -> 33.950001
x=295.000000,y=4214.000000 -> 34.174999
x=295.000000,y=5234.000000 -> 33.599998
x=295.000000,y=5233.000000 -> 33.849998
x=295.000000,y=5233.000000 -> 34.099998
x=295.000000,y=5233.000000 -> 34.325001
x=295.000000,y=928.000000 -> 33.599998
x=295.000000,y=928.000000 -> 33.849998
x=295.000000,y=927.000000 -> 34.325001
x=295.000000,y=3162.000000 -> 33.724998

Where the x/y coordinates are the local origin of a clip region inside DisplayItemClip::ToComplexClipRegions, and the last number is the radius being set inside the rounded rect clip region.
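As a rough way to quantify the spread in that log, the sketch below counts how many distinct radius values appear as logged, and how many remain after rounding each one to a whole pixel. The rounding step is purely an assumption for illustration; nothing here claims that Gecko or WR rounds radii this way, and `distinct_after` is a hypothetical helper.

```rust
/// Radii copied from the log above, in the order they were printed.
const LOGGED_RADII: [f32; 12] = [
    34.200001, 33.700001, 33.950001, 34.174999, 33.599998, 33.849998,
    34.099998, 34.325001, 33.599998, 33.849998, 34.325001, 33.724998,
];

/// Count distinct values after applying `quantize` to each radius.
/// Values are compared by bit pattern, the way a hash-based key would see them.
fn distinct_after(radii: &[f32], quantize: impl Fn(f32) -> f32) -> usize {
    let mut bits: Vec<u32> = radii.iter().map(|&r| quantize(r).to_bits()).collect();
    bits.sort_unstable();
    bits.dedup();
    bits.len()
}

fn main() {
    // Identity: the values exactly as logged.
    let raw = distinct_after(&LOGGED_RADII, |r| r);
    // Assumed quantization: round to the nearest whole pixel.
    let snapped = distinct_after(&LOGGED_RADII, |r| r.round());
    println!("distinct as logged: {}, distinct after rounding: {}", raw, snapped);
}
```

All of the logged radii fall within half a pixel of 34, which is why whole-pixel rounding collapses them to one value in this particular sample.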

It's expected that the y coord is different (due to the way Gecko includes the scroll offset in local primitive coordinates).

However, the changing border radius is causing issues. WR sees the content as different, hashes it to a different value to intern, and thus invalidates various tiles during scrolling.
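The mechanism described here can be sketched as a toy model: an interner hands out one handle per distinct key, and a tile is treated as invalid whenever the handle it depends on differs from the previous frame's. `Interner` and `count_invalidations` are hypothetical names, and this is far simpler than WR's real interning and tile dependency tracking; the drifting radii are the three values logged at y=4214 above.

```rust
use std::collections::HashMap;

/// Hypothetical interner: maps each distinct radius (by bit pattern)
/// to a stable handle.
#[derive(Default)]
struct Interner {
    handles: HashMap<u32, usize>,
}

impl Interner {
    /// Returns the existing handle for this radius, or allocates a new one.
    fn intern(&mut self, radius: f32) -> usize {
        let next = self.handles.len();
        *self.handles.entry(radius.to_bits()).or_insert(next)
    }
}

/// Counts frames whose clip handle differs from the previous frame's handle.
fn count_invalidations(radius_per_frame: &[f32]) -> usize {
    let mut interner = Interner::default();
    let mut previous: Option<usize> = None;
    let mut invalidations = 0;
    for &radius in radius_per_frame {
        let handle = interner.intern(radius);
        if previous.map_or(false, |p| p != handle) {
            invalidations += 1;
        }
        previous = Some(handle);
    }
    invalidations
}

fn main() {
    // Radii logged on successive frames for the clip at y=4214.
    let drifting = [33.700001_f32, 33.950001, 34.174999];
    // The same clip if the radius were identical on every frame.
    let stable = [34.0_f32, 34.0, 34.0];
    println!("drifting radii: {} invalidations", count_invalidations(&drifting));
    println!("stable radii: {} invalidations", count_invalidations(&stable));
}
```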

Questions:

  1. What is the cause of the change in the border radii in this case?
  2. Should they be getting rounded, and if so, where?
  3. Other backends don't seem to invalidate in this case - what is the difference here? Do they round after DL construction?
Flags: needinfo?(mstange)
Flags: needinfo?(jmuizelaar)

Andrew and I can look at this tomorrow.

Flags: needinfo?(jmuizelaar) → needinfo?(aosmond)
Attached video cache.mp4

Forgive the aspect ratio, but this recording shows some of the tiles being invalidated.

The tiles that are invalidating unexpectedly are the leftmost tiles that flash as they are being scrolled.

There are a couple of tiles where a video is playing - these are expected to invalidate every frame.

There are also some tiles on the right being invalidated - some of these are due to the fixed position elements on the right, but some of these might also be seeing the same issue as the tiles on the left.

So there are a bunch of things going on here:

  • There is a lot of JavaScript running whenever you scroll and on a timer basis. This primarily appears to be for loading/removing content from the DOM tree as you move around the page. If JavaScript is disabled, all the invalidations go away.

  • When the content is modified, we get a scene rebuild, and this subtly impacts the layout of the tree. It isn't just the border radii -- if I snap them like everything else, then I still see a similar number of cache invalidations.

  • There are scroll positions on the page where we oscillate -- best guess at this time is that due to the scroll position, it decides to fetch more content, and it inserts it. After it has inserted, our relative scroll position is different, and it decides to remove content. This gets us into a continuous repainting loop. This also happens without WebRender.

  • The level of invalidation seen with WebRender does not occur without it. Without WebRender, it becomes obvious that there is an animation in the upper left-hand corner of each linked YouTube video. That contributes to some of the extra invalidations observed with WebRender, since it doesn't shrink the size of the picture cache entry if you are scrolling while that is animating.

  • Also of note is that this reminds me of bug 1541072 which has been reported with and without WebRender. If the visible layout is changing due to subtle changes in the underlying content off screen, then that could explain all of the issues above.

Miko - NI-ing you so we can talk about this when you are back.

Flags: needinfo?(mikokm)
Blocks: picture-cache-perf
No longer blocks: wr-perf
Flags: needinfo?(mikokm)
Flags: needinfo?(aosmond)
Performance Impact: --- → P1
No longer blocks: 1536360

The Performance Priority Calculator has determined this bug's performance priority to be P1. If you'd like to request re-triage, you can reset the Performance flag to "?" or needinfo the triage sheriff.

Platforms: [x] Windows [x] macOS [x] Linux
Impact on site: Causes noticeable jank
Websites affected: Major
[x] Affects animation smoothness

Severity: normal → S3

The severity field for this bug is set to S3. However, the Performance Impact field flags this bug as having a high impact on the performance.
:gw, could you consider increasing the severity of this performance-impacting bug? Alternatively, if you think the performance impact is lower than previously assessed, could you request a re-triage from the performance team by setting the Performance Impact flag to "?"?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gwatson)
Flags: needinfo?(gwatson)
Flags: needinfo?(mstange.moz)