Closed Bug 1820479 Opened 2 years ago Closed 2 years ago

Windows 32-bit Crash in [@ OOM | large | mozalloc_abort | std::alloc::_::__rg_oom]

Categories

(Core :: Graphics: WebRender, defect)

x86
Windows
defect

RESOLVED INVALID
Tracking Status
firefox-esr102 --- unaffected
firefox110 --- unaffected
firefox111 --- wontfix
firefox112 --- wontfix

People

(Reporter: aryx, Unassigned)

Details

(Keywords: crash)

Crash Data

Frequent crash signature (200 crashes from 104 installations of 111.0b8, released 3 days ago), new in the Firefox 111 branch (not observed for v110 with this signature). All crashes are on Windows with 32-bit Firefox builds, ~90% of them on Windows 7, and many on AMD CPUs.

Is this a regression from bug 1808549?

Crash report: https://crash-stats.mozilla.org/report/index/e0c17d7c-6fa6-4c1e-b854-39e540230304

MOZ_CRASH Reason: out of memory: 0x0000000000100000 bytes requested

Top 10 frames of crashing thread:

0  mozglue.dll  MOZ_Crash  mfbt/Assertions.h:261
0  mozglue.dll  mozalloc_abort  memory/mozalloc/mozalloc_abort.cpp:26
1  mozglue.dll  mozalloc_handle_oom  memory/mozalloc/mozalloc_oom.cpp:51
2  xul.dll  mozglue_static::oom_hook::hook  mozglue/static/rust/lib.rs:115
3  xul.dll  std::alloc::rust_oom  library/std/src/alloc.rs:355
4  xul.dll  std::alloc::_::__rg_oom  library/std/src/alloc.rs:351
5  xul.dll  __rust_alloc_error_handler  
6  xul.dll  webrender::renderer::Renderer::render_impl  gfx/wr/webrender/src/renderer/mod.rs:1440
7  xul.dll  webrender::renderer::Renderer::render  gfx/wr/webrender/src/renderer/mod.rs:1197
8  xul.dll  webrender_bindings::bindings::wr_renderer_render  gfx/webrender_bindings/src/bindings.rs:614
Flags: needinfo?(nical.bugzilla)
Blocks: gfx-triage
Severity: -- → S2

The bug is marked as tracked for firefox111 (beta). We have limited time to fix this, the soft freeze is in 2 days. However, the bug still isn't assigned.

:bhood, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit auto_nag documentation.

Flags: needinfo?(bhood)
Flags: needinfo?(bhood)

Jeff, can you and Glenn have a direct chat about this to see if there's any course of action we can take here?

Flags: needinfo?(jmuizelaar)
Flags: needinfo?(nical.bugzilla)

This is a reminder regarding comment #1!

The bug is marked as tracked for firefox111 (release). We have limited time to fix this, the soft freeze is in 14 days. However, the bug still isn't assigned.

:bhood, have there been any updates since comment 2?
This is a new crash in 111; are there any indicators of the regressor?

Flags: needinfo?(bhood)

I was absent from the Triage meeting last week; I will see if I can get some movement today.

Flags: needinfo?(bhood)

I think this is probably a signature change.

Flags: needinfo?(jmuizelaar)

Andrew, is this an example of something that should be considered an irrelevant report, based on the comments in bug 1818038? If so, what should I be doing with this, if anything?

Flags: needinfo?(continuation)

It isn't the crash reports that are irrelevant; it is the specific frame alloc::_::__rg_oom, which appears to be part of the Rust OOM reporting infrastructure rather than the code that actually ran out of memory.

Ideally that frame would get added to the irrelevant signature list, which would split the crashes into different signatures based on the particular Rust code that is OOMing. That would also reduce the volume of any single signature, which would make it look less alarming to release management.
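
To illustrate why adding the frame to the irrelevant list splits the signature, here is a minimal sketch of frame-based signature generation. It is a simplified model, not Socorro's actual implementation: the `IRRELEVANT` and `PREFIX` sets and the `build_signature` helper are assumptions made up for this example, and the `OOM | large` part of the real signature is left out. Only the frame names are taken from the stack in this bug.

```python
# Simplified, hypothetical model of frame-based signature generation.
# These sets are illustrative assumptions, NOT Socorro's real lists.
IRRELEVANT = {
    "MOZ_Crash",
    "mozalloc_handle_oom",
    "mozglue_static::oom_hook::hook",
    "std::alloc::rust_oom",
    "__rust_alloc_error_handler",
}
PREFIX = {"mozalloc_abort"}

# Frame names copied from the crashing thread reported in this bug.
FRAMES = [
    "MOZ_Crash",
    "mozalloc_abort",
    "mozalloc_handle_oom",
    "mozglue_static::oom_hook::hook",
    "std::alloc::rust_oom",
    "std::alloc::_::__rg_oom",
    "__rust_alloc_error_handler",
    "webrender::renderer::Renderer::render_impl",
    "webrender::renderer::Renderer::render",
    "webrender_bindings::bindings::wr_renderer_render",
]


def build_signature(frames, irrelevant=IRRELEVANT, prefix=PREFIX):
    """Walk frames from the top of the stack and join the interesting ones."""
    parts = []
    for frame in frames:
        if frame in irrelevant:
            continue  # skipped entirely, never appears in the signature
        parts.append(frame)
        if frame not in prefix:
            break  # first non-prefix frame ends the signature
    return " | ".join(parts)


if __name__ == "__main__":
    # Every Rust OOM ends at the same generic frame.
    print(build_signature(FRAMES))
    # With the generic frame ignored, the allocating caller shows up instead.
    print(build_signature(FRAMES, irrelevant=IRRELEVANT | {"std::alloc::_::__rg_oom"}))
```

Under these assumptions, every Rust OOM collapses into one signature ending in `std::alloc::_::__rg_oom` until that frame is ignored, at which point the signature ends in the caller (here `webrender::renderer::Renderer::render_impl`).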

Failing that, you can try faceting on proto signature in crash stats to try to get a sense of what the crashes are. The largest bucket looks like WebRender, but the second largest bucket is Glean (telemetry), so I'm sure it is a mix of random code that allocates lots of data in Rust.
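
The faceting idea can also be approximated offline. The sketch below assumes you have exported a list of proto signature strings and that frames in them are separated by `|`. The helper names, the marker-frame approach, and the sample strings are all hypothetical (the `glean_hypothetical::record` frame in particular is invented for illustration), so the output says nothing about the real distribution.

```python
from collections import Counter


def top_level_crate(proto_signature, marker="__rust_alloc_error_handler"):
    """Return the crate name of the frame right after the marker frame."""
    frames = [f.strip() for f in proto_signature.split("|")]
    if marker not in frames:
        return "unknown"
    idx = frames.index(marker)
    if idx + 1 >= len(frames):
        return "unknown"
    # "webrender::renderer::Renderer::render_impl" -> "webrender"
    return frames[idx + 1].split("::")[0]


def bucket_by_caller(proto_signatures):
    """Count crashes per crate that triggered the Rust OOM handler."""
    return Counter(top_level_crate(p) for p in proto_signatures)


# Made-up sample input; not real crash data.
SAMPLE = [
    "mozalloc_abort | std::alloc::_::__rg_oom | __rust_alloc_error_handler"
    " | webrender::renderer::Renderer::render_impl",
    "mozalloc_abort | std::alloc::_::__rg_oom | __rust_alloc_error_handler"
    " | webrender::renderer::Renderer::render_impl",
    "mozalloc_abort | std::alloc::_::__rg_oom | __rust_alloc_error_handler"
    " | glean_hypothetical::record",
]

if __name__ == "__main__":
    for crate, count in bucket_by_caller(SAMPLE).most_common():
        print(f"{crate}: {count}")
```

Grouping by crate rather than by full frame keeps the buckets coarse, which is enough to see whether one component dominates or the crashes are a mix.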

Like Jeff said, this is probably a signature change, though of course it is possible it is a regression at the same time. But honestly the volume doesn't look super high to me for an OOM.

Flags: needinfo?(continuation)

The WR bucket does look to be a known / existing OOM location, when we need to resize the texture cache (and no code related to that has changed recently), so I think it's reasonable to assume that this is probably a signature change.

This is a reminder regarding comment #1!

The bug is marked as tracked for firefox111 (release). We have limited time to fix this, the soft freeze is in 13 days. However, the bug still isn't assigned.

Removing tracking and setting 111/112 to wontfix based on comments 9 and 10.

Since the crash volume is low (less than 15 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit auto_nag documentation.

Severity: S2 → S3

Deploying https://github.com/mozilla-services/socorro/pull/6380 has made this signature go away. Any other OOM work can happen in the individual signature bugs.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INVALID
No longer blocks: gfx-triage