Closed Bug 1739908 Opened 3 years ago Closed 3 years ago

bad GPU performance regression on Intel since Firefox 94 with Canvas2D

Categories

(Core :: Graphics: WebRender, defect)

x86_64
Windows 10
defect

Tracking


RESOLVED FIXED
97 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox94 --- wontfix
firefox95 --- wontfix
firefox96 --- wontfix
firefox97 --- fixed

People

(Reporter: tempel.julian, Assigned: bobowen)

References

(Regression)

Details

(Keywords: perf, power, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0

Steps to reproduce:

With an Intel Gemini Lake mobile device, open www.vsynctester.com

Actual results:

Since Firefox 94, it drops lots of frames. According to GPU driver sensors (read with the Core Temp tool for convenience), GPU power consumption jumps to ~8 W, so the whole SoC starts to throttle its clocks.

Expected results:

With Firefox 93, GPU power consumption was reported at ~3 W and rendering was correspondingly much smoother. In both cases, WebRender with GPU acceleration was used.

It is 100% reproducible. The latest Windows 11 and Intel graphics drivers are in use; the display is 1080p at 60 Hz.

I've created GPU performance profiler traces for Firefox 93 and the latest Nightly:
93:
https://share.firefox.dev/3bLA7PU

nightly:
https://share.firefox.dev/3kdZajb

Please don't just switch to software WebRender as a workaround: Gemini Lake only has two weak CPU cores, so putting more load on the CPU would be bad as well. Incidentally, Linux with native OpenGL instead of ANGLE is affected too.

Component: Untriaged → Graphics: WebRender
Product: Firefox → Core
Blocks: gfx-triage
Keywords: perf, power, regression
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64

Are you able to bisect this with mozregression?

Flags: needinfo?(tempel.julian)

2021-11-09T15:00:48.060000: DEBUG : Found commit message:
Bug 1709603: Use a separate permanent canvas back buffer when texture has synchronization. r=lsalzman

Differential Revision: https://phabricator.services.mozilla.com/D125201

2021-11-09T15:00:48.060000: DEBUG : Did not find a branch, checking all integration branches
2021-11-09T15:00:48.076000: INFO : The bisection is done.
2021-11-09T15:00:48.076000: INFO : Stopped

Flags: needinfo?(tempel.julian)

Excellent, thank you!

Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(bobowencode)
Regressed by: 1709603
Has Regression Range: --- → yes

Hi, thanks for reporting this.

The Nightly performance trace doesn't seem to have any GPU Process tracks; would you be able to create another one, please?

Flags: needinfo?(bobowencode) → needinfo?(tempel.julian)
Flags: needinfo?(tempel.julian)

Bob, does the new trace help?

Flags: needinfo?(bobowencode)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #6)

Bob, does the new trace help?

I've had a fairly quick look and it's a little difficult to tell, but there are some longer spikes in the later one, although I'm not sure if that is just because it contains more of the overall run; it is quite a bit longer.

The wait time in the canvas code actually seems smaller in the Nightly trace, but then it seems like the issue is more load on the GPU.
The change could cause extra copying of surfaces, but only in the case where the first write in each frame covers the entire canvas.
Maybe vsynctester does that; I'll have to check when I get a chance.

Flags: needinfo?(bobowencode)

It's unfortunately extremely common for Canvas2D content to "reset before drawing" by doing a full-screen draw that clears the canvas at the beginning of each frame, before proceeding with the rest of the frame.
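
As a rough illustration (hypothetical page code, not taken from vsynctester itself), the pattern looks something like this minimal sketch:

    // Common "reset before drawing" Canvas2D pattern: the first write of every
    // frame covers the whole canvas before the rest of the frame is drawn.
    const canvas = document.querySelector('canvas')!;
    const ctx = canvas.getContext('2d')!;

    function drawFrame(now: number) {
      // Full-canvas write at the start of the frame (the case discussed above).
      ctx.fillStyle = '#000';
      ctx.fillRect(0, 0, canvas.width, canvas.height);

      // ...then the rest of the frame is drawn on top.
      ctx.fillStyle = '#0f0';
      ctx.fillRect((now / 10) % canvas.width, 20, 40, 40);

      requestAnimationFrame(drawFrame);
    }
    requestAnimationFrame(drawFrame);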

Bob mentioned he'd follow up.

No longer blocks: gfx-triage
Flags: needinfo?(bobowencode)

I remembered when talking to jgilbert that before the remote canvas we were using a PersistentBufferProviderBasic on Windows anyway.
I think this uses a permanent back buffer with a similar copy at the end as well.

So it would be interesting to know if you see the same performance issues with the pref gfx.canvas.remote set to false in about:config.
You'll need to restart the browser for it to take effect.
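
For convenience, the same pref can also be set from a user.js file in the profile directory; this is just an alternative to the about:config step described above:

    // user.js in the Firefox profile directory -- applied at next startup.
    // Turns off remote canvas, so drawing uses the non-remote path described above.
    user_pref("gfx.canvas.remote", false);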

Flags: needinfo?(bobowencode) → needinfo?(tempel.julian)

Indeed, with gfx.canvas.remote = false GPU power consumption is again at normal 2W.

Flags: needinfo?(tempel.julian)

The severity field is not set for this bug.
:jimm, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)
Summary: bad GPU performance regression on Intel since Firefox 94 → bad GPU performance regression on Intel since Firefox 94 with Canvas2D

Maybe our current canvas work will help here?

Severity: -- → S4
Flags: needinfo?(jmathies) → needinfo?(lsalzman)

I did notice one thing that we're doing extra from the GPU point of view with remote versus non-remote canvas.
When I switched to using a permanent back buffer, I didn't fix up PersistentBufferProviderShared::PreservesDrawingState, so we push and pop the clips and transforms every frame.
I'm doubtful that would cause such a big difference, but here is a test build with that fixed if you wouldn't mind testing:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ZpWsYuLPSga-hC5ZXaBCjQ/runs/0/artifacts/public/build/install/sea/target.installer.exe

Flags: needinfo?(tempel.julian)

Yes, still bad with that build.

Flags: needinfo?(tempel.julian)
Flags: needinfo?(lsalzman)

I've thought some more about the difference, from the GPU's point of view, between the current remote canvas and non-remote.
I realised that if we have a hidden canvas, where the texture isn't forwarded, we might keep copying into the front buffer for remote, whereas for non-remote that copy only happens on forwarding.
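
For context, the "hidden canvas" case can be as simple as the following sketch (hypothetical page code): a canvas that is drawn to every frame but never attached to the document, so its texture is never handed to the compositor.

    // A canvas that is never inserted into the DOM: it is drawn to every
    // frame, but nothing is ever forwarded to the compositor for display.
    const hidden = document.createElement('canvas');
    hidden.width = 256;
    hidden.height = 256;
    const hctx = hidden.getContext('2d')!;

    function tick() {
      hctx.clearRect(0, 0, hidden.width, hidden.height);
      hctx.fillText(new Date().toISOString(), 10, 20);
      // e.g. used only for off-screen measurement or readback, never displayed
      requestAnimationFrame(tick);
    }
    requestAnimationFrame(tick);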

It'll need some more work, but here's a build that tries to work around that, would you mind trying it out:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ex-RYD1cTHmivZBRMxM4Dw/runs/0/artifacts/public/build/install/sea/target.installer.exe

Flags: needinfo?(tempel.julian)

Yep, with that build, power consumption is back to normal. :)

Flags: needinfo?(tempel.julian)

(In reply to walmartguy from comment #17)

Yep, with that build, power consumption is back to normal. :)

Great, thanks for testing.
Now I just need to make sure it doesn't break things in other ways. :-)

Will the fix also improve situation for Linux/Android builds with native GL Webrender?

(In reply to walmartguy from comment #19)

Will the fix also improve situation for Linux/Android builds with native GL Webrender?

No, I think this would only be affecting Windows, sorry.

(In reply to Bob Owen (:bobowen) from comment #18)

(In reply to walmartguy from comment #17)

Yep, with that build, power consumption is back to normal. :)

Great, thanks for testing.
Now I just need to make sure it doesn't break things in other ways. :-)

Unfortunately that change did have other problems.
I hope I've come up with an alternative though.
This one doesn't start using the permanent back buffer unless we try to read lock a front buffer (for a copy or snapshot) that is still in use by the compositor.
Hopefully this will mean that we only use one buffer (and no copies) for the unforwarded case (like on vsynctester), and also for the case where the JS always writes to the full canvas at the start of each frame and doesn't need a snapshot between frames.
Anyway, if you wouldn't mind testing another test build, sorry:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Q06ArVpvTse5YOWTst_UMw/runs/0/artifacts/public/build/install/sea/target.installer.exe
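
To make the intended buffer flow easier to follow, here is a rough TypeScript-style pseudocode model of the behaviour described above. Apart from mPermanentBackBuffer, all names are hypothetical; this is a simplified sketch of the idea, not the actual Gecko implementation:

    // Simplified model: keep drawing into the front buffer, and only switch to
    // a permanent back buffer the first time we would have to read lock a front
    // buffer that the compositor is still using.
    interface Texture {
      stillInUseByCompositor(): boolean;
    }

    class BufferProviderSketch {
      private permanentBackBuffer: Texture | null = null; // ~ mPermanentBackBuffer

      constructor(
        private frontBuffer: Texture,
        private allocate: () => Texture,
        private copy: (from: Texture, to: Texture) => void,
      ) {}

      // Called whenever the canvas needs a buffer to draw into (or snapshot).
      borrowDrawTarget(): Texture {
        if (this.permanentBackBuffer) {
          return this.permanentBackBuffer; // already switched over
        }
        if (!this.frontBuffer.stillInUseByCompositor()) {
          // Common case (e.g. an unforwarded canvas): one buffer, no copies.
          return this.frontBuffer;
        }
        // Contention: only now fall back to a permanent back buffer,
        // preserving the existing contents with a one-off copy.
        this.permanentBackBuffer = this.allocate();
        this.copy(this.frontBuffer, this.permanentBackBuffer);
        return this.permanentBackBuffer;
      }
    }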

Flags: needinfo?(tempel.julian)

Power consumption is still low with that build.

I'll recheck the situation on Linux, just to be sure.

Flags: needinfo?(tempel.julian)

(In reply to walmartguy from comment #22)

Power consumption is still low with that build.

I'll recheck the situation on Linux, just to be sure.

Excellent, thanks for all the testing.
I'll get those changes up for review.

This measure was originally put in to help with what was believed to be an issue
with ClearCachedResources, but we now think it was down to textures being
re-forwarded on tab switch when already read locked.
A change in bug 1717209 fixed this, so I think we can safely remove
mTextureLockIsUnreliable, which would cause some complications with the
following patch.

Assignee: nobody → bobowencode
Status: NEW → ASSIGNED

This removes some of the changes that meant we started using
mPermanentBackBuffer straight away; we now wait until we actually try to
lock a read-locked texture.
While this might still give a very small risk of contention, it gives
improvements in the following two circumstances.

  • If a canvas texture is never forwarded and never read locked, it means we will
    only use one texture with no copies.
  • If a canvas is always fully overwritten at the start of the frame (and a
    snapshot is not taken between frames), then we avoid a copy on each frame.

This also adds back in code so that on an OPEN_READ_WRITE lock we cache the data
surface if required, because that texture will be the new front buffer and we
won't be using mPermanentBackBuffer at that point.

Depends on D132601

Tested that build, it's still fine.

I've retested Linux and the situation is different from what I initially thought. It hasn't regressed with 94, but 93 was already relatively bad, with ~8.2 W SoC power consumption (Xorg EGL backend forced on, no Xorg compositor active). Anyway, it's not as pressing a matter as the >11 W total SoC power on Windows with 93.

I'm going to wait and land this as soon as we merge, and then give it a little while on Nightly just in case we do see a re-emergence of the lock contention.

Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/366bdd769b86
p1: Remove PersistentBufferProviderShared::mTextureLockIsUnreliable. r=lsalzman
https://hg.mozilla.org/integration/autoland/rev/757d31ebc575
p2: Only use PersistentBufferProviderShared::mPermanentBackBuffer when first needed. r=lsalzman
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch

The patch landed in nightly and beta is affected.
:bobowen, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(bobowencode)

== Change summary for alert #32689 (as of Fri, 10 Dec 2021 11:29:07 GMT) ==

Improvements:

Ratio  Test      Platform                   Options               Absolute values (old vs new)
7%     pdfpaint  windows10-64-shippable-qr  e10s stylo webrender  614.64 -> 573.73
5%     pdfpaint  windows10-64-shippable-qr  e10s stylo webrender  611.55 -> 579.65

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=32689

(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #31)

The patch landed in nightly and beta is affected.
:bobowen, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

It doesn't look like any issues have been introduced/re-introduced, but given the time of year I think uplifting a performance bug to Beta is probably not the right call.

Flags: needinfo?(bobowencode)
