Closed Bug 1851432 Opened 1 year ago Closed 1 year ago

Crash in [@ webrender::renderer::upload::upload_to_texture_cache]

Categories

(Core :: Graphics, defect)

Firefox 119
x86_64
Linux
defect

RESOLVED FIXED
119 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox-esr115 --- fixed
firefox117 --- wontfix
firefox118 --- fixed
firefox119 --- fixed

People

(Reporter: o2q2tcedsh0, Assigned: sotaro)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(4 files)

Crash report: https://crash-stats.mozilla.org/report/index/6b9aa9c1-1fa5-49d5-b423-4b5620230901

MOZ_CRASH Reason: Unexpected external texture 1007 for the texture cache update of ExternalImageId(1589)

Top 10 frames of crashing thread:

0  libxul.so  MOZ_Crash  mfbt/Assertions.h:281
0  libxul.so  RustMozCrash  mozglue/static/rust/wrappers.cpp:18
1  libxul.so  mozglue_static::panic_hook  mozglue/static/rust/lib.rs:96
2  libxul.so  core::ops::function::Fn::call  library/core/src/ops/function.rs:79
3  libxul.so  <alloc::boxed::Box<F, A> as core::ops::function::Fn<Args>>::call  library/alloc/src/boxed.rs:2007
3  libxul.so  std::panicking::rust_panic_with_hook  library/std/src/panicking.rs:709
4  libxul.so  std::panicking::begin_panic_handler::{{closure}}  library/std/src/panicking.rs:597
5  libxul.so  std::sys_common::backtrace::__rust_end_short_backtrace  library/std/src/sys_common/backtrace.rs:151
6  libxul.so  rust_begin_unwind  library/std/src/panicking.rs:593
7  libxul.so  core::panicking::panic_fmt  library/core/src/panicking.rs:67

The bug has a crash signature, thus the bug will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Glenn, this is failing with Unexpected external texture in WebRender. Is there a way to catch this panic condition earlier such that we can see what might trigger this problem?

Flags: needinfo?(gwatson)
Severity: -- → S2

I don't think that's possible, since the assert is on the result of the handler.lock callback, which calls into Gecko right at that point. We will probably need to repro this one locally to debug what's happening.

Flags: needinfo?(gwatson)

Before the crash I was on https://www.wetteronline.de/wettertrend/berlin?start=8

With the Firefox 117 Snap on Ubuntu 23.04, memory usage goes from 1.5GB up to 10.5GB. With Fedora 38 and Firefox 117, the CPU utilization increases to 98%. This happens only on this website.

Maddi, to help us try to replicate this, would you please give us some more information on your specific hardware setup? To do this, navigate to "about:support", click "Copy text to clipboard" and then paste that text as an attachment to this Bug.

Flags: needinfo?(o2q2tcedsh0)
Attached file raw-firefox-117.txt
Flags: needinfo?(o2q2tcedsh0)
Attached file raw-firefox-117.txt

I was able to report the complete crash of the browser in Devuan 5.0.0 daedalus.
https://crash-stats.mozilla.org/report/index/070b9ea9-d99f-4488-8687-1cc5d0230906

For me, this bug started happening after updating from 115 to 116, when webgl.out-of-process.async-present.force-sync started defaulting to false; changing it back to true fixes it. I couldn't reproduce it on Windows, but I did on another Linux machine.
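
For anyone who wants the workaround to survive restarts, here is a minimal sketch that writes the pref into a profile's user.js. The pref name comes from the comment above; the helper names (user_pref_line, pin_pref) and the profile path you pass in are hypothetical, and this is only one possible way to set the pref (about:config works as well).

```python
from pathlib import Path

# Pref name taken from the comment above.
PREF_NAME = "webgl.out-of-process.async-present.force-sync"


def user_pref_line(name: str, value: bool) -> str:
    """Format a boolean pref as a user.js line."""
    return f'user_pref("{name}", {"true" if value else "false"});'


def pin_pref(profile_dir: Path, name: str = PREF_NAME, value: bool = True) -> Path:
    """Append the pref to <profile_dir>/user.js unless that exact line exists.

    profile_dir is an assumption: pass the Firefox profile directory
    shown on about:support.
    """
    user_js = profile_dir / "user.js"
    line = user_pref_line(name, value)
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if line not in existing.splitlines():
        with user_js.open("a", encoding="utf-8") as fh:
            # Keep the new pref on its own line if the file lacks a trailing newline.
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(line + "\n")
    return user_js
```

Calling pin_pref(Path("/path/to/profile")) twice leaves a single entry in user.js, so it is safe to rerun.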

Report: https://crash-stats.mozilla.org/report/index/024b971e-5055-4a52-9e80-82a260230907

stdout:
[GFX1-]: unexpected remote texture size: Size(0,0) expected: Size(1144,128)
[Parent 606909, IPC I/O Parent] WARNING: Message needs unreceived descriptors channel:7ff4c8e6aab0 message-type:11337733 header()->num_handles:1 num_fds:0 fds_i:0: file /build/firefox/src/firefox-117.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc:467
Exiting due to channel error.
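
To check whether other logs show the same symptom, a small sketch like the following can pull the size mismatch out of captured stdout. The regular expression is derived only from the single GFX1 line quoted above, so treat the exact message format as an assumption; find_size_mismatches is a hypothetical helper name.

```python
import re

# Pattern modelled on the one log line quoted above; other builds may word it differently.
SIZE_MISMATCH = re.compile(
    r"unexpected remote texture size: Size\((\d+),(\d+)\) expected: Size\((\d+),(\d+)\)"
)


def find_size_mismatches(log_text):
    """Return a list of ((actual_w, actual_h), (expected_w, expected_h)) pairs."""
    results = []
    for match in SIZE_MISMATCH.finditer(log_text):
        actual_w, actual_h, expected_w, expected_h = map(int, match.groups())
        results.append(((actual_w, actual_h), (expected_w, expected_h)))
    return results


sample = "[GFX1-]: unexpected remote texture size: Size(0,0) expected: Size(1144,128)"
print(find_size_mismatches(sample))  # prints [((0, 0), (1144, 128))]
```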

My way to repro is:

  1. Go to https://orteil.dashnet.org/cookieclicker/
  2. Click Options -> Import save and paste the text from https://drive.google.com/file/d/1cQrKEFkWUCRrpxJ_X3zTFLSo-UCBIBMC/view?usp=sharing
  3. Click Options again to close the menu, then scroll all the way to the bottom (not sure why this is needed, but it seems to make the crash happen much faster)
  4. To speed up the crash even more, open the console and set Game.fps to any number higher than the default 30, e.g.: Game.fps = 1000
  5. Refresh the page if it has not crashed after a few minutes and try again

lubasowo0, can you try getting a regression window using https://mozilla.github.io/mozregression/?

Flags: needinfo?(lubasowo0)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #12)

lubasowo0, can you try getting a regression window using https://mozilla.github.io/mozregression/?

I've done it, but I doubt that it's very useful: it just bisected to the config change (on nightly it happened between 113 and 114, on release between 115 and 116; mozregression gave me a nightly change):
https://hg.mozilla.org/mozilla-central/rev/2b90b458178fa4de234b11771cb670c65c0cea03

I can rerun it with that config set to false each time, but I have a feeling that it has always been broken anyway. Let me know what you think.

Flags: needinfo?(lubasowo0)

Okay, I was wrong that it was always broken. Running mozregression --bad 113 --good 2023-03-14 --pref "webgl.out-of-process.async-present.force-sync:false" bisected to this commit:
https://hg.mozilla.org/integration/autoland/rev/b1b50f07c34b820350697dcbc6086f8b4c336be1

This is the point where the bug was introduced; later changes to the config revealed it.
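
For repeating the bisection with the pref forced to either value, the command line can be assembled programmatically. This is a sketch only: the flags (--bad, --good, --pref) and their values are the ones quoted in this comment, while mozregression_argv is a hypothetical helper and nothing here launches mozregression itself.

```python
import shlex

# Pref name as used in the mozregression invocation above.
PREF = "webgl.out-of-process.async-present.force-sync"


def mozregression_argv(good, bad, pref_value):
    """Build the argument list for a bisection with the pref forced on or off."""
    value = "true" if pref_value else "false"
    return [
        "mozregression",
        "--bad", str(bad),
        "--good", str(good),
        "--pref", f"{PREF}:{value}",
    ]


argv = mozregression_argv(good="2023-03-14", bad=113, pref_value=False)
# Print a shell-safe rendering; the list could instead be handed to subprocess.run(argv).
print(shlex.join(argv))
```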

(In reply to lubasowo0 from comment #14)

Okay, I was wrong that it was always broken. Running mozregression --bad 113 --good 2023-03-14 --pref "webgl.out-of-process.async-present.force-sync:false" bisected to this commit:
https://hg.mozilla.org/integration/autoland/rev/b1b50f07c34b820350697dcbc6086f8b4c336be1

This is the point where the bug was introduced; later changes to the config revealed it.

Adding 'regressed by' bug 1826280 based on the regression range.

Keywords: regression
Regressed by: 1826280

bug 1851377 has a user crashing on cookieclicker with Linux.

See Also: → 1851377

Set release status flags based on info from the regressing bug 1826280

:sotaro, since you are the author of the regressor, bug 1826280, could you take a look?

For more information, please visit BugBot documentation.

Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)

With the STR from comment 11, I could reproduce the problems.

There were 2 symptoms.
[1] Tab crash caused by running out of file descriptors
[2] The parent process crash from comment 1.

When [2] happened, SharedSurface_DMABUF::ToSurfaceDescriptor() returned nothing.

Then WebGLContext::PushRemoteTexture() fell back to readback. RemoteTexture does not expect the texture type to change.
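
The failure mode described above can be illustrated with a toy model. This is not Gecko code and not the landed patch: TextureKind, ToyRemoteTextureOwner and push are hypothetical names, and the only behaviour modelled is "a missing descriptor leads to a readback texture, and a change of texture kind is reported as a failure instead of being pushed".

```python
from enum import Enum, auto
from typing import Optional


class TextureKind(Enum):
    DMABUF = auto()    # a surface descriptor was available
    READBACK = auto()  # fallback path used when no descriptor is returned


class ToyRemoteTextureOwner:
    """Toy stand-in: remembers the first texture kind and rejects a later change."""

    def __init__(self) -> None:
        self.kind: Optional[TextureKind] = None

    def push(self, descriptor: Optional[str]) -> bool:
        # A missing descriptor models ToSurfaceDescriptor() returning nothing.
        kind = TextureKind.DMABUF if descriptor is not None else TextureKind.READBACK
        if self.kind is None:
            self.kind = kind
        if kind is not self.kind:
            # Error handling: report the failure instead of pushing a mismatched texture.
            return False
        return True


owner = ToyRemoteTextureOwner()
print(owner.push("dmabuf-descriptor"))  # first push fixes the kind
print(owner.push(None))                 # descriptor missing, kind would change
```

In this model the second push is rejected, which corresponds to handling the error at the push site rather than hitting an assertion later in the compositor.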

SharedSurface::ToSurfaceDescriptor() returns Nothing() only with the following.

Depends on: 1851377

The out-of-file-descriptors problem is going to be handled in bug 1851377.
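
When reproducing symptom [1], it may help to watch the number of open file descriptors in the content process. A minimal sketch, assuming a Linux system with /proc mounted; count_open_fds is a hypothetical helper and the PID has to be looked up separately (for example on about:processes).

```python
import os


def count_open_fds(pid="self"):
    """Count entries in /proc/<pid>/fd.

    Returns None when /proc is unavailable or the directory cannot be read
    (non-Linux system, process gone, or insufficient permissions).
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


# With no argument this inspects the current Python process.
print(count_open_fds())
```

A steadily growing count while the page runs would be consistent with a descriptor leak, though it does not by itself identify the source.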

Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/31ec6f05fd5d Add error handling to WebGLContext::PushRemoteTexture() r=gfx-reviewers,lsalzman
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 119 Branch

The patch landed in nightly and beta is affected.
:sotaro, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox118 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(sotaro.ikeda.g)

Comment on attachment 9352859 [details]
Bug 1851432 - Add error handling to WebGLContext::PushRemoteTexture()

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox might crash on Linux during WebGL/Canvas2D(accelerated)
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change just adds error handling that is already used.
  • String changes made/needed: none
  • Is Android affected?: No
Flags: needinfo?(sotaro.ikeda.g)
Attachment #9352859 - Flags: approval-mozilla-beta?

Comment on attachment 9352859 [details]
Bug 1851432 - Add error handling to WebGLContext::PushRemoteTexture()

Approved for landing on mozilla-beta before the merge, will be in the 118.0 release candidate, thanks.

Attachment #9352859 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Do we need this on ESR115 also? It grafts cleanly.

Flags: needinfo?(sotaro.ikeda.g)

Comment on attachment 9352859 [details]
Bug 1851432 - Add error handling to WebGLContext::PushRemoteTexture()

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Firefox might crash on Linux during WebGL/Canvas2D(accelerated)
  • User impact if declined: Firefox might crash on Linux during WebGL/Canvas2D(accelerated)
  • Fix Landed on Version: 119
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change just adds error handling that is already used.
Flags: needinfo?(sotaro.ikeda.g)
Attachment #9352859 - Flags: approval-mozilla-esr115?

Comment on attachment 9352859 [details]
Bug 1851432 - Add error handling to WebGLContext::PushRemoteTexture()

Approved for 115.3esr

Attachment #9352859 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+