Closed Bug 1930507 Opened 10 months ago Closed 9 months ago

gfx performance regression caused by sidebar transform

Categories

(Core :: Graphics: WebRender, defect)

defect

Tracking

()

RESOLVED FIXED
135 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox132 --- wontfix
firefox133 --- wontfix
firefox134 --- wontfix
firefox135 --- fixed

People

(Reporter: david.turner, Unassigned)

References

(Depends on 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(8 files, 1 obsolete file)

Steps to reproduce:

I have an automated test which plays back a 720p60 video on a Raspberry Pi 4 (Raspberry Pi OS bookworm, aarch64) and records the framerate. This test is GPU-limited on a Pi4.

Actual results:

In Firefox 131.0 I see an average framerate of around 50-51fps. In Firefox 132.0 I see an average framerate of around 43-44fps. I git bisected the issue down to git commit 18dc02a0, which is https://phabricator.services.mozilla.com/D220846 , Bug 1915230

The difference is very clear:

  • Running 18dc02a0^ gives FPS measurements: 50.4 49.8 50.0 50.0 49.8
  • Running 18dc02a0 gives FPS measurements: 43.3 44.1 43.4 43.6 42.7

I've confirmed the performance is still degraded in the latest nightly build.

I'm very confused why this change has caused this performance reduction given I don't even have the sidebar shown. But I've gone back and forth before/after this commit several times and the change is consistent.

Expected results:

No performance regression.

Keywords: regression
Regressed by: 1915230

The Bugbug bot thinks this bug should belong to the 'Core::Graphics: WebRender' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Graphics: WebRender
Product: Firefox → Core

:emilio, since you are the author of the regressor, bug 1915230, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(emilio)
Attached file video_player.html

Attached a simple reproducer (it's just a video player with an FPS display...), insert any 720p60 H.264 video. The main thing is the regression will only be visible on a platform GPU-constrained enough that it's dropping frames playing this video.

That is a bit surprising... Wild guess, if you remove this line, does it get fixed?

That forces the tabbox to be a stacking context, and I wonder if somehow WebRender is doing something not-too-clever there.

Flags: needinfo?(emilio) → needinfo?(david.turner)

Yep, confirmed that removing that line fixes the problem.

What's extra weird is that I see the same performance degradation (which is also fixed by removing that line) with fullscreen video, so there's definitely something weird going on.

Flags: needinfo?(david.turner)

Could you try to make that line translate: 0 instead? Does it also get fixed in that case?

Trying to see if it's will-change specific, or not. I'm suspecting this code is causing us to always consider the tab box animated, which isn't great!

Flags: needinfo?(david.turner)

Is that not the intention of will-change? If not, how should we treat will-change differently?

will-change doesn't mean it is changing. We already did the same thing
for opacity years ago.

In particular, will-change is useful to prevent side effects when the
property changes (like going from being a containing block for fixed-pos
descendants to not being it), but it doesn't seem useful to force
layerization, and it slows down things like video playback.

Remove some now-dead code, which also gives us a frame bit, yay.

Assignee: nobody → emilio
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

(In reply to Emilio Cobos Álvarez (:emilio) from comment #6)

Could you try to make that line translate: 0 instead? Does it also get fixed in that case?

Trying to see if it's will-change specific, or not. I'm suspecting this code is causing us to always consider the tab box animated, which isn't great!

If we are hitting that code it means we hitting the blob fallback code when drawing our browser chrome? That doesn't seem good. Why is that happening?

Could you try to make that line translate: 0 instead? Does it also get fixed in that case?

I've just tested and that doesn't fix the problem. Also D228650 does not help.

Flags: needinfo?(david.turner)

(In reply to Timothy Nikkel (:tnikkel) from comment #7)

Is that not the intention of will-change? If not, how should we treat will-change differently?

It already forces a stacking context and all the other side effects of transform. I don't think we should treat it as active unconditionally.

In any case we should probably move it to a different bug if it doesn't help. Maybe just making it a stacking context is what's causing the regression?

Set release status flags based on info from the regressing bug 1915230

As a quick check I just enabled the tile-cache tree printer by making the print at picture.rs:5783 unconditional, but the trees look identical (aside from IDs) in the good and bad cases. I'd have thought this would show if the stacking context was resulting in the content getting put in another layer of picture, but my understanding of all this is very limited.

Yeah that was my guess too... Glenn / Nical, any idea on where to look next?

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(gwatson)
Assignee: emilio → nobody
Status: ASSIGNED → NEW
See Also: → 1930674

Comment on attachment 9436827 [details]
Bug 1930507 - Don't consider will-change: transform an always-active layer. r=nical,#gfx-reviewers

Revision D228650 was moved to bug 1930674. Setting attachment 9436827 [details] to obsolete.

Attachment #9436827 - Attachment is obsolete: true

Could you try enabling gfx.webrender.debug.profiler and gfx.webrender.debug.gpu-time-queries and seeing if there's any obvious differences in either the graphs or the counters (or take a screenshot / video of them, paste here and I'll see if I can spot any obvious differences)?

Flags: needinfo?(gwatson)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #11)

(In reply to Timothy Nikkel (:tnikkel) from comment #7)

Is that not the intention of will-change? If not, how should we treat will-change differently?

It already forces a stacking context and all the other side effects of transform. I don't think we should treat it as active unconditionally.

The reason it forces a stacking context and all the other side effects of a transform is so that it can animate without any extra layout work, ie the whole point of it is to animate.

Maybe you should share a profile using https://profiler.firefox.com/ in graphics preset?

I've uploaded the charts as I'm not too sure what to look for.

  • debug.profiler default UI: debug_profiler_{good,bad}.png - there's a little more time in Frame CPU total but nothing drastic. I've also captured separate screenshots with just the memory parts which got cropped off: gpu_mem_{good,bad}.png.
  • gpu-time-queries doesn't work on Raspberry Pi because our GPU drivers don't implement the timers extension. I've captured it on my x86-64 laptop (x86_time_queries_{good,bad}.png) where the bad case does spend significantly more time in "Composite"
  • Profiler captures, good case: https://share.firefox.dev/4fKp2xc, bad case: https://share.firefox.dev/40MbPjy

Overall there's definitely a difference in the render tree produced with will-change: translate but it's not obvious to me why it seems to be harder work. The "bad" case has fewer primitives/flips/pictures but more "images".

Attached image debug_profiler_bad.png
Attached image debug_profiler_good.png
Attached image gpu_mem_bad.png
Attached image gpu_mem_good.png

Set as S3, but might be S2 depending on how many users it has a noticeable effect on?

Severity: -- → S3

Surprisingly, renderdoc captures with (labeled bad) and without (labeled good) the will-change: translate; in browser/themes/shared/sidebar.css look pretty much identical.

These were made from wrench captures, forcing wrench to re-do the entire frame in both cases so it isn't quite what's happening in the browser (regarding invalidation/caching/etc.), but it's the best we can get in a tool like renderdoc.

Set release status flags based on info from the regressing bug 1915230

Bug 1933823 probably explains what's going on... I believe that will-change somehow messes up our opaque region tracking.

See Also: → 1933823

Yeah, this would do. will-change adds a wrapping item, so bye opaque region. That's rather unfortunate. We should consider ancestor items at least.

Ironically, bug 1908471 will fix this if my diagnostic is correct, because it moves the background to the container element (#browser). Can you confirm once that lands?

Flags: needinfo?(david.turner)

(Not that we should rely on that, but that'd confirm the diagnostic)

Depends on: 1908471

Bug 1933823 probably explains what's going on... I believe that will-change somehow messes up our opaque region tracking.

Aha, that makes perfect sense and explains what I've been seeing (why the webrender traces looked very similar between good/bad, and why things got even worse when I ran my compositor on sw rendering)

Ironically, bug 1908471 will fix this if my diagnostic is correct, because it moves the background to the container element (#browser). Can you confirm once that lands?

Yep, confirmed that applying D229676 / Bug 1908471 makes the badness go away.

Flags: needinfo?(david.turner)
Depends on: 1933952

Ok, I guess:

  • This is fixed by bug 1908471.
  • Poor / footgunny opaque region tracking is tracked in bug 1933952.
  • A test in bug 1933958 should prevent us from accidentally regressing this... At least once we start running our Linux tests in a more modern version of ubuntu which is in the works.

So I guess closing as a dupe for now as there's not much else to do here? Let me know if you disagree tho.

Status: NEW → RESOLVED
Closed: 9 months ago
Duplicate of bug: 1908471
Resolution: --- → DUPLICATE
Flags: needinfo?(nical.bugzilla)

That sounds good to me, thanks! My own tests will also catch if this regresses and I'll make some noise.

No longer duplicate of bug: 1908471
Resolution: DUPLICATE → FIXED
Target Milestone: --- → 135 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: