Closed Bug 1761233 Opened 2 years ago Closed 2 years ago

Crash in [@ webrender::render_api::RenderApi::send_transaction]

Categories

(Core :: Graphics: WebRender, defect)

Tracking

VERIFIED FIXED
107 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox-esr102 --- verified
firefox99 --- wontfix
firefox100 --- wontfix
firefox101 --- wontfix
firefox102 --- wontfix
firefox103 --- wontfix
firefox104 --- wontfix
firefox105 --- wontfix
firefox106 --- wontfix
firefox107 --- verified

People

(Reporter: aryx, Assigned: jfkthame)

References

(Depends on 1 open bug, Blocks 1 open bug, Regression)

Details

(5 keywords)

Attachments

(2 files)

Crash signature has been active for some time but got more frequent for Firefox 100.0a1 and v99 betas (16 and 21 crash reports for the last 2 betas).

Crash report: https://crash-stats.mozilla.org/report/index/1e3a845e-2c3e-4257-915b-e5ad20220322

Reason: EXCEPTION_BREAKPOINT

Top 10 frames of crashing thread:

0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:17
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:91
2 xul.dll core::ops::function::Fn::call<void  ../9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/ops/function.rs:227
3 xul.dll std::panicking::rust_panic_with_hook ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/std/src/panicking.rs:610
4 xul.dll std::panicking::begin_panic_handler::closure$0 ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/std/src/panicking.rs:500
5 xul.dll std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::closure$0, never$> ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/std/src/sys_common/backtrace.rs:139
6 xul.dll std::panicking::begin_panic_handler ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/std/src/panicking.rs:498
7 xul.dll core::panicking::panic_fmt ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/core/src/panicking.rs:116
8 xul.dll core::panicking::panic ../9d1b2106e23b1abd32fce1f17267604a5102f57a//library/core/src/panicking.rs:48
9 xul.dll webrender::render_api::RenderApi::send_transaction gfx/wr/webrender/src/render_api.rs:1257

Glenn, could you evaluate this failure next week?

Flags: needinfo?(gwatson)

It's not clear to me what could cause this send to fail - perhaps when one side of the connection is closed unexpectedly or something like that. Nical or Sotaro may have some ideas on what could cause this.

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(gwatson)

I found one possibility and created Bug 1765366.

Flags: needinfo?(sotaro.ikeda.g)
Depends on: 1765366

I found one possibility and created Bug 1765366.

Hmm, Bug 1765366 did not address the problem :(

Some crashes had the following reason in crash reports. https://crash-stats.mozilla.org/report/index/ffc105bb-c3c3-4f12-b512-c97160220420

MOZ_CRASH Reason (Sanitized) : assertion failed: self.fonts.templates.has_font(&shared_font_key)

The crash stack says that it happened at self.api_sender.send(), but the actual assertion should be hit in self.resources.update(&mut transaction).

On Linux and macOS, crash reports pointed at "self.resources.update(&mut transaction)", so self.resources.update() seems to have caused the crashes.

Since Bug 1749526, partial WR transactions are processed. This seems to be related to the problem.

See Also: → 1749526

:lsalzman, can you comment on comment 5?

Flags: needinfo?(lsalzman)

Some of the Windows stack traces seem to point the finger at self.api_sender.send() itself returning None as in here: https://hg.mozilla.org/mozilla-central/file/0d591d3bc99786bdb3cb057203a3831110d00800/gfx/wr/webrender/src/render_api.rs#l1259

The Linux and macOS traces don't seem to give us sane information, other than something in ApiResources::update trying to do an unwrap() on None? Inside ApiResources::update, I see some calls to unwrap() on blob_image_handler, could that be invalid there?

Flags: needinfo?(lsalzman)
Depends on: 1765725
Crash Signature: [@ webrender::render_api::RenderApi::send_transaction] → [@ core::option::expect_failed | webrender::render_api::RenderApi::send_transaction] [@ webrender::render_api::RenderApi::send_transaction]

With Bug 1765725, the MOZ_CRASH Reason (Sanitized) "called Option::unwrap() on a None value" case seems to have changed to "no blob image template".
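
For context on why the reason string changes: in Rust, `Option::unwrap()` on `None` panics with a generic message, while `Option::expect(msg)` panics with `msg` itself. A minimal sketch of that difference, not the WebRender code; `template_len_unwrap`, `template_len_expect`, `panic_message` and the `HashMap` of templates are hypothetical names used only for illustration:

```rust
use std::collections::HashMap;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical lookup that panics with the generic `unwrap()` message
/// when `key` is missing.
fn template_len_unwrap(templates: &HashMap<u32, String>, key: u32) -> usize {
    templates.get(&key).unwrap().len()
}

/// Same lookup, but the panic message names the missing thing.
fn template_len_expect(templates: &HashMap<u32, String>, key: u32) -> usize {
    templates.get(&key).expect("no blob image template").len()
}

/// Runs `f`, catches a panic if one occurs, and returns its message.
/// Returns `None` when `f` completes normally.
fn panic_message<F: FnOnce()>(f: F) -> Option<String> {
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(()) => None,
        Err(payload) => {
            // Panic payloads are usually either `&'static str` or `String`.
            if let Some(s) = payload.downcast_ref::<&str>() {
                Some(s.to_string())
            } else if let Some(s) = payload.downcast_ref::<String>() {
                Some(s.clone())
            } else {
                Some(String::from("<non-string panic payload>"))
            }
        }
    }
}
```

Calling `panic_message(|| { template_len_expect(&templates, 7); })` on an empty map yields the message "no blob image template", whereas the `unwrap()` variant yields the generic `Option::unwrap()` message. So the new reason string narrows down which lookup failed, but the underlying missing resource is the same.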

:jrmuizel, :nical, can you comment on it?

Flags: needinfo?(jmuizelaar)

That would indicate that somehow the content process is not properly keeping track of the blob recordings and referring to an already deleted (or yet to be added) blob. It would be very hard to make any progress on this without a way to reproduce the crash.

Flags: needinfo?(nical.bugzilla)

Sotaro fixed related bug 1765366 in 101, but we're still seeing crashes with this signature in Nightly 102.

All recent crashes on Firefox Nightly had "[GFX1-]: Failed sanitizing font" in GraphicsCriticalError, and the MOZ_CRASH Reason (Sanitized) was "assertion failed: self.fonts.templates.has_font(&shared_font_key)".

Every time I edit a page in Confluence and change the style of a paragraph, Firefox crashes with this signature. This is reproducible across different machines, new profiles and release channels.

:lsalzman, can you comment on Comment 14?

Flags: needinfo?(lsalzman)

This signature is now ranked in the top50 for Firefox 101.0.1. I don't think that this is a startup crash, but if you restart Firefox after the crash and restore the same website it may crash again.

Keywords: top50, topcrash

(In reply to sjw from comment #14)

Every time I edit a page in Confluence and change the style of a paragraph, Firefox crashes with this signature. This is reproducible across different machines, new profiles and release channels.

Jonathan, are you able to reproduce this and see if there is something strange with font sanitization going on here?

Flags: needinfo?(lsalzman) → needinfo?(jfkthame)

I could see bug 1749526 being related, but I doubt it is the root cause. The purpose of the patch in bug 1749526 was to make crashes more consistent, rather than resolve them. What you might have seen before it was effectively random signatures with that entry in the critical log, because we crashed elsewhere due to missing some other resource that would have been added in the parent command list. Now the only things missing are what we failed to process specifically.
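
To illustrate the distinction, here is a rough sketch under one reading of the comment above: before the change, a failure while processing a command list could leave later, unrelated resources unregistered; afterwards, only the resource that actually failed is missing. This is not the WebRender implementation; `AddFont`, `sanitizes`, `apply_until_failure`, `apply_partial` and `missing` are hypothetical names, and the "stop at the first failure" model of the old behaviour is an assumption:

```rust
use std::collections::HashSet;

/// Hypothetical command: register a font, which may fail sanitization.
struct AddFont {
    key: u32,
    sanitizes: bool,
}

/// Assumed old behaviour: the first failure aborts the rest of the list,
/// so unrelated later fonts also end up unregistered.
fn apply_until_failure(cmds: &[AddFont]) -> HashSet<u32> {
    let mut fonts = HashSet::new();
    for cmd in cmds {
        if !cmd.sanitizes {
            break;
        }
        fonts.insert(cmd.key);
    }
    fonts
}

/// Partial processing: skip only the command that failed and keep going.
fn apply_partial(cmds: &[AddFont]) -> HashSet<u32> {
    cmds.iter()
        .filter(|cmd| cmd.sanitizes)
        .map(|cmd| cmd.key)
        .collect()
}

/// Fonts that later commands would reference but that were never registered.
fn missing(required: &[u32], fonts: &HashSet<u32>) -> Vec<u32> {
    required
        .iter()
        .copied()
        .filter(|key| !fonts.contains(key))
        .collect()
}
```

With fonts 1 and 3 valid and font 2 failing sanitization, the first model leaves both 2 and 3 missing, so a later lookup can fail on either one, while the second leaves only 2 missing. That matches the observation that the crashes now consistently point at the font that failed to sanitize.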

80% of the crashes are Windows 11. Adoption is probably why the crash rate is climbing.

URLs seem somewhat consistent, although many seem to be behind a login wall. Of interest is a recipe blog which might be useful for an STR:

https://cookieandkate.com/vegetarian-stuffed-peppers-recipe/

Depends on: 1777272

hxxps://derplayer.neocities.org/repo/other/firefox_crash.html causes this crash on Android bp-f27826e1-531b-44ea-8aa0-fb1bb0220916

(In reply to Kevin Brosnan [:kbrosnan] from comment #21)

hxxps://derplayer.neocities.org/repo/other/firefox_crash.html causes this crash on Android bp-f27826e1-531b-44ea-8aa0-fb1bb0220916

This started crashing on Bug 1749526.

6:41.01 INFO: No more integration revisions, bisection finished.
6:41.01 INFO: Last good revision: 1df57205c9076705f8812f88454d41fb9badffbb
6:41.01 INFO: First bad revision: 0e684da127ac3b068c2ff5086cf46d3e2abba592
6:41.01 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=1df57205c9076705f8812f88454d41fb9badffbb&tochange=0e684da127ac3b068c2ff5086cf46d3e2abba592

Regressed by: 1749526
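
For readers unfamiliar with how a "last good / first bad" pair like the one above is found: bisection is a binary search over an ordered list of revisions, assuming every revision before the regression is good and every revision from it onwards is bad. A minimal sketch of that search, not the actual bisection tool; `first_bad_revision` and the `is_bad` callback (standing in for "build this revision and load the test page") are hypothetical:

```rust
/// Returns the index of the first bad revision among `count` ordered
/// revisions, or `None` if the newest revision is not bad.
/// Assumes that once a revision is bad, all later ones are bad too.
fn first_bad_revision<F>(count: usize, mut is_bad: F) -> Option<usize>
where
    F: FnMut(usize) -> bool,
{
    if count == 0 || !is_bad(count - 1) {
        return None;
    }
    // Invariant: revision `hi` is bad, and every revision below `lo` is good.
    let (mut lo, mut hi) = (0usize, count - 1);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_bad(mid) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    Some(lo)
}
```

The revision just before the returned index is the "last good" one, and the number of builds tested grows only logarithmically with the size of the range.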

The bug is linked to a topcrash signature, which matches the following criteria:

  • Top 20 desktop browser crashes on release (startup)
  • Top 5 GPU process crashes on release

For more information, please visit auto_nag documentation.

Set release status flags based on info from the regressing bug 1749526

Attached file ARIALBD.ttf

Font file from the site that is causing the crash. Since this involves a font maybe jfkthame should have a look?

Based on the topcrash criteria, the crash signatures linked to this bug are not in the topcrash signatures anymore.

For more information, please visit auto_nag documentation.

The bug is linked to a topcrash signature, which matches the following criteria:

  • Top 20 desktop browser crashes on release (startup)
  • Top 5 GPU process crashes on release
  • Top 5 desktop browser crashes on Windows on release (startup)

For more information, please visit auto_nag documentation.

(In reply to Kevin Brosnan [:kbrosnan] from comment #25)

Created attachment 9295649 [details]
ARIALBD.ttf

Font file from the site that is causing the crash. Since this involves a font maybe jfkthame should have a look?

The font is faulty, but OTS is failing to correctly sanitize it. I'll file a PR upstream.

Flags: needinfo?(jfkthame)

See https://github.com/khaledhosny/ots/pull/250. I'll put up a patch for Firefox shortly.

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ddbd657d52dd
Apply VDMX sanitization fix from https://github.com/khaledhosny/ots/pull/250 to avoid generating invalid "sanitized" data. r=gfx-reviewers,lsalzman

Comment on attachment 9297344 [details]
Bug 1761233 - Apply VDMX sanitization fix from https://github.com/khaledhosny/ots/pull/250 to avoid generating invalid "sanitized" data. r=#gfx-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined: Full-browser crash on visiting site with a particular kind of bad webfont
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Visit site from comment 21.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Small, well-understood fix from upstream; tested in upstream CI.
  • String changes made/needed:
  • Is Android affected?: Yes

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Full-browser crash on visiting site with a particular kind of bad webfont
  • User impact if declined:
  • Fix Landed on Version: 107
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky):
Attachment #9297344 - Flags: approval-mozilla-esr102?
Attachment #9297344 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Comment on attachment 9297344 [details]
Bug 1761233 - Apply VDMX sanitization fix from https://github.com/khaledhosny/ots/pull/250 to avoid generating invalid "sanitized" data. r=#gfx-reviewers

This is an old issue and we had no crashes in 106 beta with this signature. Given that we are building our last beta today and that the patch isn't in Nightly yet, the benefit/risk ratio doesn't seem good enough for an uplift in our last beta. Thanks.

Attachment #9297344 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Flags: needinfo?(jmuizelaar)
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 107 Branch
QA Whiteboard: [qa-triaged]

Reproduced the issue on Firefox 100.0a1 (2022-03-24) on macOS 13 by using the link from Comment 21.

The issue is fixed on Firefox 107.0a1 (2022-10-06). Tests were performed on macOS 13, Windows 11 and Ubuntu 20.04.

(In reply to Pascal Chevrel:pascalc from comment #33)

This is an old issue and we had no crashes in 106 beta with this signature. Given that we are building our last beta today and that the patch isn't in Nightly yet, the benefit/risk ratio doesn't seem good enough for an uplift in our last beta. Thanks.

Now that 106 is released there is a report of the signature in crashes with 106 in Element (#firefox:mozilla.org). The reporter believes there are crash reports from his 106 crashes.

Comment on attachment 9297344 [details]
Bug 1761233 - Apply VDMX sanitization fix from https://github.com/khaledhosny/ots/pull/250 to avoid generating invalid "sanitized" data. r=#gfx-reviewers

Approved for 102.5esr.

Attachment #9297344 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+

Verified the fix on Firefox 102.5ESR as well. Tests were performed on Ubuntu 22.04, Windows 11 and macOS 13.0.

Status: RESOLVED → VERIFIED
Flags: qe-verify+