bad GPU performance regression on Intel since Firefox 94 with Canvas2D
Categories
(Core :: Graphics: WebRender, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox94 | --- | wontfix |
firefox95 | --- | wontfix |
firefox96 | --- | wontfix |
firefox97 | --- | fixed |
People
(Reporter: tempel.julian, Assigned: bobowen)
References
(Regression)
Details
(Keywords: perf, power, regression)
Attachments
(2 files)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0
Steps to reproduce:
With an Intel Gemini Lake mobile device, open www.vsynctester.com
Actual results:
Since Firefox 94, it drops lots of frames. According to the GPU driver sensors (e.g. as shown by the tool Core Temp), GPU power consumption has jumped to ~8W, and as a result the whole SoC starts to throttle its clocks.
Expected results:
With Firefox 93, GPU power consumption was reported at ~3W and the result was much smoother accordingly. In both cases, Webrender with GPU acceleration was used.
It is 100% reproducible. Latest Windows 11 and Intel graphics drivers used, display is 1080p 60Hz.
I've created GPU performance profiler traces for Firefox 93 and latest Nightly:
93:
https://share.firefox.dev/3bLA7PU
nightly:
https://share.firefox.dev/3kdZajb
Please don't just switch to software WebRender as a workaround. Gemini Lake only has two weak CPU cores, and putting more load on the CPU would be bad too. By the way, Linux with native OpenGL instead of ANGLE is affected too.
Comment 1•3 years ago
Are you able to bisect this with mozregression?
Reporter | ||
Comment 2•3 years ago
2021-11-09T15:00:48.060000: DEBUG : Found commit message:
Bug 1709603: Use a separate permanent canvas back buffer when texture has synchronization. r=lsalzman
Differential Revision: https://phabricator.services.mozilla.com/D125201
2021-11-09T15:00:48.060000: DEBUG : Did not find a branch, checking all integration branches
2021-11-09T15:00:48.076000: INFO : The bisection is done.
2021-11-09T15:00:48.076000: INFO : Stopped
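For readers unfamiliar with how a bisection like the one logged above narrows down a regression, here is a minimal sketch of the idea. It is not mozregression's code or API; the function name, the build labels and the predicate are all hypothetical, and it assumes the builds are ordered in time with a known good start and a known bad end.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are in chronological order, builds[0] is good,
    builds[-1] is bad, and every build after the first bad one is bad too.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return builds[hi]


# Hypothetical build labels; we pretend the regression starts at "b5".
builds = [f"b{i}" for i in range(8)]
culprit = first_bad(builds, lambda b: int(b[1:]) >= 5)
```

Each step halves the remaining range, so only a handful of builds need to be tested by hand even across many commits.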
Comment 3•3 years ago
Excellent, thank you!
Assignee | ||
Comment 4•3 years ago
Hi, thanks for reporting this.
The nightly performance trace doesn't seem to have any GPU Process tracks. Would you be able to create another one, please?
Reporter | ||
Comment 5•3 years ago
Bigger new trace: https://share.firefox.dev/30acbTu
Assignee | ||
Comment 7•3 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #6)
Bob, does the new trace help?
I've had a fairly quick look and it's a little difficult to tell, but there are some longer spikes in the later one, although I'm not sure if that is because it contains more of the overall run. It is quite a bit longer.
The wait time in the canvas code actually seems smaller in the Nightly trace, but then it seems like the issue is more load on the GPU.
The change could cause extra copying of surfaces, but only in the case where the first write in each frame covers the entire canvas.
Maybe vsynctester does that, I'll have to check when I get a chance.
Comment 8•3 years ago
It's unfortunately extremely common for canvas2d to "reset before drawing" by doing a full screen draw to clear at the beginning of the frame before proceeding to the rest of the frame.
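To make the cost of that pattern concrete, here is a toy model of the copy counts discussed in comments 7 and 8. It is only a sketch under stated assumptions, not Gecko code: both function names are hypothetical, a frame is reduced to a single boolean (does its first write cover the whole canvas?), and it assumes that a swapped-buffer scheme can skip the copy for such frames while a permanent back buffer copies once per frame.

```python
from typing import Sequence


def copies_with_swapped_buffers(frames: Sequence[bool]) -> int:
    """Assumed older behaviour: old pixels are copied into the new buffer
    only when the frame's first write does NOT cover the entire canvas."""
    return sum(0 if full_overwrite else 1 for full_overwrite in frames)


def copies_with_permanent_back_buffer(frames: Sequence[bool]) -> int:
    """Assumed regressed behaviour: one back-to-front copy on every frame."""
    return len(frames)


# One second at 60Hz of a page that clears the whole canvas before drawing.
clear_first = [True] * 60
# The same duration for a page that only ever draws partial updates.
partial_only = [False] * 60

extra_for_clear_first = (
    copies_with_permanent_back_buffer(clear_first)
    - copies_with_swapped_buffers(clear_first)
)
extra_for_partial_only = (
    copies_with_permanent_back_buffer(partial_only)
    - copies_with_swapped_buffers(partial_only)
)
```

Under these assumptions the clear-first page pays 60 extra full-canvas copies per second, while the partial-update page pays none, which matches the "only in the case where the first write covers the entire canvas" remark.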
Comment 9•3 years ago
Bob mentioned he'd follow up.
Assignee | ||
Comment 10•3 years ago
I remembered when talking to jgilbert that before the remote canvas we were using a PersistentBufferProviderBasic on Windows anyway.
I think this uses a permanent back buffer with a similar copy at the end as well.
So it would be interesting to know if you see the same performance issues with the pref gfx.canvas.remote set to false in about:config.
You'll need to restart the browser for it to take effect.
Reporter | ||
Comment 11•3 years ago
Indeed, with gfx.canvas.remote = false GPU power consumption is again at normal 2W.
Comment 12•3 years ago
The severity field is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 13•3 years ago
Maybe our current canvas work will help here?
Assignee | ||
Comment 14•3 years ago
I did notice one extra thing that we're doing, from the GPU's point of view, with the remote versus the non-remote canvas.
When I switched to using a permanent back buffer, I didn't fix up PersistentBufferProviderShared::PreservesDrawingState, so we push and pop the clips and transforms every frame.
I'm doubtful that would cause such a big difference, but here is a test build with that fixed if you wouldn't mind testing:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ZpWsYuLPSga-hC5ZXaBCjQ/runs/0/artifacts/public/build/install/sea/target.installer.exe
Assignee | ||
Comment 16•3 years ago
Thought some more about the difference from the GPU's point of view with current remote canvas and non-remote.
I realised that if we had a hidden canvas, where the texture isn't forwarded, we might keep copying into the front buffer for remote.
Whereas for non-remote that only happens on forwarding.
It'll need some more work, but here's a build that tries to work around that, would you mind trying it out:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ex-RYD1cTHmivZBRMxM4Dw/runs/0/artifacts/public/build/install/sea/target.installer.exe
Reporter | ||
Comment 17•3 years ago
Yep, with that build, power consumption is back to normal. :)
Assignee | ||
Comment 18•3 years ago
(In reply to walmartguy from comment #17)
Yep, with that build, power consumption is back to normal. :)
Great, thanks for testing.
Now I just need to see if it/make sure it doesn't break things in other ways. :-)
Reporter | ||
Comment 19•3 years ago
Will the fix also improve situation for Linux/Android builds with native GL Webrender?
Assignee | ||
Comment 20•3 years ago
(In reply to walmartguy from comment #19)
Will the fix also improve situation for Linux/Android builds with native GL Webrender?
No, I think this would only be affecting Windows, sorry.
Assignee | ||
Comment 21•3 years ago
(In reply to Bob Owen (:bobowen) from comment #18)
(In reply to walmartguy from comment #17)
Yep, with that build, power consumption is back to normal. :)
Great, thanks for testing.
Now I just need to see if it/make sure it doesn't break things in other ways. :-)
Unfortunately that change did have other problems.
I hope I've come up with an alternative though.
This one doesn't start using the permanent back buffer unless we try to read lock a front buffer (for copy or snapshot), that is still in use by the compositor.
Hopefully this will mean that we only use one buffer (and no copies) for the unforwarded case (like on vsynctester) and also for the case that the JS always writes to the full canvas at the start of frames and doesn't need a snapshot between frames.
Anyway, if you wouldn't mind testing another test build, sorry:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Q06ArVpvTse5YOWTst_UMw/runs/0/artifacts/public/build/install/sea/target.installer.exe
Reporter | ||
Comment 22•3 years ago
Power consumption is still low with that build.
I'll recheck situation on Linux, just to be sure.
Assignee | ||
Comment 23•3 years ago
(In reply to walmartguy from comment #22)
Power consumption is still low with that build.
I'll recheck situation on Linux, just to be sure.
Excellent, thanks for all the testing.
I'll get those changes up for review.
Assignee | ||
Comment 24•3 years ago
This measure was originally put in to help with what was believed to be an issue
with ClearCachedResources, but we now think it was down to textures being
re-forwarded on tab switch when already read locked.
A change in bug 1717209 fixed this, so I think we can safely remove
mTextureLockIsUnreliable, which would cause some complications with the
following patch.
Assignee | ||
Comment 25•3 years ago
This removes some of the changes that meant we started using
mPermanentBackBuffer straight away and we now wait until we actually try and
lock a read locked texture.
While this might still give a very small risk of contention, it gives
improvements in the following two circumstances.
- If a canvas texture is never forwarded and never read locked, it means we will
  only use one texture with no copies.
- If a canvas is always fully overwritten at the start of the frame (and a
  snapshot is not taken between frames), then we avoid a copy on each frame.
This also adds back in code so that on an OPEN_READ_WRITE lock we cache the data
surface if required, because that texture will be the new front buffer and we
won't be using mPermanentBackBuffer at that point.
Depends on D132601
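The lazy switch described in this patch can be illustrated with a small toy model. This is a sketch under stated assumptions and not the actual PersistentBufferProviderShared logic: the Frame and ToyProvider names are hypothetical, the switch to the permanent back buffer is assumed to be one-way, and each frame spent on the back buffer is simplified to exactly one copy.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Frame:
    forwarded: bool         # texture was handed to the compositor (may be read locked)
    full_overwrite: bool    # first write of the frame covers the whole canvas
    snapshot: bool = False  # a snapshot was taken since the previous frame


class ToyProvider:
    """Toy stand-in for a buffer provider; counts full-canvas copies."""

    def __init__(self) -> None:
        self.use_permanent_back_buffer = False
        self.copies = 0

    def draw(self, frame: Frame) -> None:
        needs_old_pixels = frame.snapshot or not frame.full_overwrite
        if frame.forwarded and needs_old_pixels:
            # Assumed trigger: we would have to lock a texture that the
            # compositor still has read locked, so fall back for good.
            self.use_permanent_back_buffer = True
        if self.use_permanent_back_buffer:
            self.copies += 1  # simplified: one copy per frame from now on


def run(frames: Iterable[Frame]) -> int:
    provider = ToyProvider()
    for frame in frames:
        provider.draw(frame)
    return provider.copies


# Never forwarded (the vsynctester-like case in this model): no copies.
hidden = [Frame(forwarded=False, full_overwrite=False)] * 60
# Forwarded, but always cleared first and never snapshotted: no copies.
cleared = [Frame(forwarded=True, full_overwrite=True)] * 60
# One partial-update frame after ten cleared ones triggers the fallback.
mixed = (
    [Frame(True, True)] * 10 + [Frame(True, False)] + [Frame(True, True)] * 5
)
results = (run(hidden), run(cleared), run(mixed))
```

In this model the two circumstances listed above both end up with zero copies, and copies only start once a frame actually needs pixels from a texture the compositor is still holding.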
Assignee | ||
Comment 26•3 years ago
Reporter | ||
Comment 27•3 years ago
Tested that build, it's still fine.
I've retested Linux and the situation is different than I initially thought. It hasn't regressed with 94; 93 was already relatively bad at ~8.2W SoC power consumption (Xorg EGL backend forced on, no Xorg compositor active). Anyway, that is not as pressing a matter as the >11W total SoC on Windows with 93.
Assignee | ||
Comment 28•3 years ago
I'm going to wait and land this as soon as we merge and give it a little while on Nightly just in case we do see a re-emergence of the lock contention.
Comment 29•3 years ago
Comment 30•3 years ago
bugherder |
https://hg.mozilla.org/mozilla-central/rev/366bdd769b86
https://hg.mozilla.org/mozilla-central/rev/757d31ebc575
Comment 31•3 years ago
The patch landed in nightly and beta is affected.
:bobowen, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.
For more information, please visit auto_nag documentation.
Comment 32•3 years ago
== Change summary for alert #32689 (as of Fri, 10 Dec 2021 11:29:07 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
7% | pdfpaint | windows10-64-shippable-qr | e10s stylo webrender | 614.64 -> 573.73 |
5% | pdfpaint | windows10-64-shippable-qr | e10s stylo webrender | 611.55 -> 579.65 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=32689
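As a quick sanity check on the table above, the reported ratios can be reproduced from the absolute values. This sketch assumes the ratio is the relative change against the old value and that lower pdfpaint numbers are better; the helper name is hypothetical and this is not Perfherder's own code.

```python
def improvement_percent(old: float, new: float) -> float:
    """Relative reduction from old to new, as a percentage of old."""
    return (old - new) / old * 100.0


# (old, new) pairs copied from the two pdfpaint rows in the alert summary.
rows = [(614.64, 573.73), (611.55, 579.65)]
percents = [improvement_percent(old, new) for old, new in rows]
rounded = [round(p) for p in percents]
```

With these inputs the unrounded values come out at roughly 6.7% and 5.2%, consistent with the 7% and 5% shown in the alert.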
Assignee | ||
Comment 33•3 years ago
(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #31)
The patch landed in nightly and beta is affected.
:bobowen, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.
It doesn't look like any issues have been introduced/re-introduced, but given the time of year I think uplifting a performance bug to Beta is probably not the right call.