Closed Bug 1840146 Opened 2 years ago Closed 1 month ago

Crash in [@ mozilla::ipc::UntypedEndpoint::Bind]

Categories

(Core :: IPC, defect)

Tracking

RESOLVED DUPLICATE of bug 1993381
Tracking Status
firefox-esr140 --- unaffected
firefox143 --- unaffected
firefox144 --- unaffected
firefox145 --- affected

People

(Reporter: mccr8, Unassigned)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/a89cdf88-7c13-4495-955a-4926c0230623

MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(IsValid())

Top 9 frames of crashing thread:

0  libxul.so  mozilla::ipc::UntypedEndpoint::Bind  ipc/glue/Endpoint.h:77
1  libxul.so  mozilla::dom::ContentChild::Init  dom/ipc/ContentChild.cpp:758
2  libxul.so  mozilla::dom::ContentProcess::Init  dom/ipc/ContentProcess.cpp:138
3  libxul.so  XRE_InitChildProcess  toolkit/xre/nsEmbedFunctions.cpp:618
4  firefox-bin  content_process_main  ipc/contentproc/plugin-container.cpp:57
4  firefox-bin  main  browser/app/nsBrowserApp.cpp:375
5  libc.so.6  __libc_start_call_main  /usr/src/debug/glibc/glibc/sysdeps/nptl/libc_start_call_main.h:58
6  libc.so.6  __libc_start_main@GLIBC_2.2.5  /usr/src/debug/glibc/glibc/csu/libc-start.c:360
7  firefox-bin  _start  

These are all from ContentChild::Init, so maybe there's some kind of OOM early in startup? (Well, I found one where it is CompositorManagerChild instead.)

Whether an UntypedEndpoint is valid generally does not depend on any IPC traffic, but on whether it was initialized. This endpoint was returned from https://searchfox.org/mozilla-central/rev/217cc028cb388d786e564a765df90669358616ad/ipc/glue/ProcessChild.cpp#130-134 as newly constructed, so it could only be invalid if TakeInitialPort() returned an invalid port.

The only way that TakeInitialPort() should return an invalid port (assuming this is the first and only call, which seems correct) is if the port was never initialized in ChildThread::Init (https://searchfox.org/mozilla-central/rev/217cc028cb388d786e564a765df90669358616ad/ipc/chromium/src/chrome/common/child_thread.cc#43-44).
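
As a rough illustration of that reasoning, here is a toy model of a port that is set once during thread init and taken once later. ToyPort and ToyChildThread are hypothetical names, and the plain int handle is an assumption for the sketch; this is not the real Gecko code.

```cpp
#include <utility>

// Hypothetical stand-in for an IPC port handle (not the real Gecko type).
struct ToyPort {
  int handle = -1;  // -1 models "never initialized"
  bool IsValid() const { return handle >= 0; }
};

// Hypothetical holder modelling "set once at thread init, taken once later".
class ToyChildThread {
 public:
  // Models the step that only runs if the I/O thread actually starts.
  void Init(int aHandle) { mInitialPort.handle = aHandle; }

  // Moves the port out and leaves an invalid one behind.
  ToyPort TakeInitialPort() { return std::exchange(mInitialPort, ToyPort{}); }

 private:
  ToyPort mInitialPort;
};
```

In this model, a holder whose Init never ran hands out an invalid port on the very first take, which is the situation the assertion in Bind would then trip over.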

That seems like the likely cause here. Looking at the crash report, it is notable that there are no other threads present beyond the main thread. That suggests that the IPC I/O Thread failed to start, which on Linux corresponds to pthread_create returning an error: https://searchfox.org/mozilla-central/rev/217cc028cb388d786e564a765df90669358616ad/ipc/chromium/src/base/platform_thread_posix.cc#124. The return value there is ignored, so this assertion is probably the first place we'd notice the failure to start the IPC I/O Thread.

We should perhaps move the assertion closer to the actual failure to make it clearer what's going on here, but I'm not sure how feasible it is to recover from the IPC I/O thread failing to start.
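
A minimal sketch of what checking at the point of failure could look like, assuming a POSIX platform. CreateThreadChecked and NoopEntry are hypothetical helpers for illustration, not the code in platform_thread_posix.cc, and the logging is only a placeholder for whatever assertion or annotation would be appropriate there.

```cpp
#include <pthread.h>

#include <cstdio>
#include <cstring>

// Trivial thread entry point used only to exercise the helper below.
void* NoopEntry(void*) { return nullptr; }

// Hypothetical helper: wraps pthread_create and surfaces its result instead
// of dropping it. Returns 0 on success, or the error number pthread_create
// reported (for example EAGAIN when resources are exhausted).
int CreateThreadChecked(pthread_t* aThread, const pthread_attr_t* aAttr,
                        void* (*aEntry)(void*), void* aArg) {
  int rv = pthread_create(aThread, aAttr, aEntry, aArg);
  if (rv != 0) {
    // Report the failure where it happens, so a later crash is not the
    // first visible symptom.
    std::fprintf(stderr, "pthread_create failed: %s\n", std::strerror(rv));
  }
  return rv;
}
```

A caller could then assert on a non-zero result right where the thread is created, which would make the crash reason point at thread creation rather than at the endpoint.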

The severity field is not set for this bug.
:jld, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(jld)

The crash seems low-volume, and this may be one of those cases where some kind of crash is unavoidable due to resource exhaustion.

Severity: -- → S3
Flags: needinfo?(jld)

These have shot through the roof on macOS, in the GPU process, since bug 1985082 landed on trunk in build id 20251008091336.

Regressions: 1985082
Keywords: regression
Regressed by: 1985082
No longer regressions: 1985082

(In reply to Steven Michaud [:smichaud] (Retired) from comment #4)

> These have shot through the roof on macOS, in the GPU process, since bug 1985082 landed on trunk in build id 20251008091336.

Note, though, that almost all of these are on macOS 15.7.2, which has not yet been released. Presumably it's a beta version.

This may turn out to be some kind of Apple bug.

https://crash-stats.mozilla.org/search/?signature=%3Dmozilla%3A%3Aipc%3A%3AUntypedEndpoint%3A%3ABind&platform=Mac%20OS%20X&date=%3E%3D2025-10-05T19%3A04%3A00.000Z&date=%3C2025-10-12T19%3A04%3A00.000Z&_facets=signature&_facets=platform_version&_facets=process_type&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform_version&_columns=process_type#facet-platform_version

Set release status flags based on info from the regressing bug 1985082

:bradwerth, since you are the author of the regressor, bug 1985082, could you take a look?

It's also puzzling that these macOS GPU process crashes are only in build id 20251008091336. There have been several trunk builds since then with the bug 1985082 patch.

These crashes were fixed by bug 1993381.

Flags: needinfo?(bwerth)

(In reply to Steven Michaud [:smichaud] (Retired) from comment #4)

> These have shot through the roof on macOS

Here's a page which has a table with an "Installs" column: https://crash-stats.mozilla.org/topcrashers/?product=Firefox&version=145.0a1&process_type=gpu&platform=Mac%20OS%20X

It shows 138 crashes across only 2 installs, so the high count just reflects a crash loop on those installs.

Status: NEW → RESOLVED
Closed: 1 month ago
Duplicate of bug: 1993381
Resolution: --- → DUPLICATE