Closed Bug 1876506 Opened 2 years ago Closed 2 years ago

[macOS] Firefox Freezes When Rendering Multiple Canvas

Categories

(Core :: Graphics: Canvas2D, defect)

defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox122 --- unaffected
firefox123 + fixed
firefox124 --- fixed

People

(Reporter: jamie, Assigned: lsalzman)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(1 file)

When using Firefox on macoS, on a website that uses multiple canvas elements, the UI freezes and becomes unresponsive for up to 10 seconds when the canvas is redrawn. This makes some websites completely unusable.

The canvas elements in question are wide (about ~4000 pixels) and postioned inside a div with "overflow: scroll;" and a z-index, so most of the width is not visible.

So far I've not been able to reproduce this outside of our product, but I will try to create a simple page that reproduces the issue. In the mean time, I am very happy to provide access - please just drop me an email from a @mozilla address, and I'll provide you with credentials.

Using mozregression, I've traced the issue to the following revision. I can reliably reproduce it on every Mac I have available to test with (macOS 13 and macOS 10.15).

https://phabricator.services.mozilla.com/D195952

The Bugbug bot thinks this bug should belong to the 'Core::Graphics: Canvas2D' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: General → Graphics: Canvas2D
Product: Firefox → Core
Keywords: regression
Regressed by: 1829026

:lsalzman, since you are the author of the regressor, bug 1829026, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(lsalzman)

[Tracking Requested - why for this release]:

This seems like a bad regression in FF 123 from changes which helped a lot with performance in other cases, so we should try to avoid backing them out from beta.

Jamie, you can send the credentials to jmuizelaar@mozilla.com

Certainly! You should have received an email a few minutes ago.

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true

The bug is marked as tracked for firefox123 (beta). However, the bug still isn't assigned.

:bhood, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(bhood)

Set release status flags based on info from the regressing bug 1829026

Blocks: gfx-triage
Severity: -- → S3

This is hanging in RemoteTextureMap::GetRemoteTexture: https://share.firefox.dev/4bbk4s4

See Also: → 1874384

I can reproduce this. Sotaro, any idea what might be going wrong? I'll forward you the credentials so that you can try locally.

Flags: needinfo?(sotaro.ikeda.g)

I have a guess, I think that we may be using an old transaction ID when we destroy the remote texture:
https://searchfox.org/mozilla-central/rev/b75080bb8b11844d18cb5f9ac6e68a866ef8e243/gfx/layers/client/TextureRecorded.cpp#36

Because we don't update it if the canvas is not dirty:
https://searchfox.org/mozilla-central/rev/b75080bb8b11844d18cb5f9ac6e68a866ef8e243/gfx/layers/ShareableCanvasRenderer.cpp#91

And when we destroy the texture and mark it as wanting to wait for the transaction to complete, we early exit because it is old:
https://searchfox.org/mozilla-central/rev/b75080bb8b11844d18cb5f9ac6e68a866ef8e243/gfx/layers/RemoteTextureMap.cpp#772

Simplest patch would be to just always update the cached transaction ID, even if not dirty.

try with shippable builds: https://treeherder.mozilla.org/jobs?repo=try&revision=5e0c18c981718335848180a02d32d820e4081775

CopyToSwapChain was silently failing, causing no texture to get pushed
to RemoteTextureMap, so that when a wait on it was occurring, it would
timeout.

The failure occurred in DrawTargetWebgl::FlushFromSkia, because the
DT's size actually exceeded the value of the texture limit pref when
it was attempting to allocate a temporary texture to blend back a
Skia layer to the WebGL framebuffer. This is fixed by allowing layering
to bypass this limit, as it is always expected that layer blending
succeed.

To guard against future instances of this bug, CopyToSwapChain now returns
a boolean result so that it is fallible and can signal to CanvasTranslator
that it needs to take appropriate fallback measures on failure.

Assignee: nobody → lsalzman
Status: NEW → ASSIGNED
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(lsalzman)
Flags: needinfo?(bhood)
Pushed by lsalzman@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6f78746c35f4 Check for CopyToSwapChain failure. r=aosmond
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch

With my changes landed in nightly, the pause no longer happens for me using the reported testcase. Can you verify with latest nightly build, please?

Flags: needinfo?(jamie)
Duplicate of this bug: 1874384

I can confirm the patch has resolved this for me in the latest Nightly! Thank you all for a quick fix for this :)

Given that this bug is going to have a wider-spread impact than perhaps realised, I'm worried that when the FF 123 release happens on macOS next month, many more users will start reporting this problem on various websites where large canvas objects are used.

I'm not familiar with the current uplifting process into mozilla-beta, as I know it needs Phabricator access, but if you're not already giving this consideration, could you kindly do so?

Flags: needinfo?(jamie)

Comment on attachment 9376781 [details]
Bug 1876506 - Check for CopyToSwapChain failure. r?aosmond

Beta/Release Uplift Approval Request

  • User impact if declined: Users may experience freezing of the browser when using canvas2d.
  • Is this code covered by automated tests?: Unknown
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): While we don't yet have a minimized testcase, it seems like we understand pretty well the cause of this and why this fixes the reported issue(s).
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9376781 - Flags: approval-mozilla-beta?
See Also: 1874384

Comment on attachment 9376781 [details]
Bug 1876506 - Check for CopyToSwapChain failure. r?aosmond

Approved for 123 beta 5, thanks.

Attachment #9376781 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: