Open Bug 1909490 Opened 11 months ago Updated 10 months ago

Crash in [@ IPC::MessageReader::FatalError | IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read | mozilla::dom::PWebGLParent::OnMessageReceived]

Categories

(Core :: Graphics: CanvasWebGL, defect)

Unspecified
Linux
defect

Tracking

()

Tracking Status
firefox-esr115 --- affected
firefox-esr128 --- affected
firefox128 --- wontfix
firefox129 --- fix-optional
firefox130 --- affected
firefox131 --- affected

People

(Reporter: lizzard, Assigned: aosmond, NeedInfo)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This signature has been spiking in Nightly 130. It first appeared 2024-07-08.

Crash report: https://crash-stats.mozilla.org/report/index/995e69f2-ac57-40b5-848f-b0ad60240722

MOZ_CRASH Reason: MOZ_CRASH(IPC FatalError in the parent process!)

Top 10 frames:

0  libxul.so  mozilla::ipc::FatalError(char const*, bool)  ipc/glue/ProtocolUtils.cpp:203
1  libxul.so  mozilla::ipc::IProtocol::HandleFatalError(char const*)  ipc/glue/ProtocolUtils.cpp:403
2  libxul.so  IPC::MessageReader::FatalError(char const*) const  ipc/chromium/src/chrome/common/ipc_message_utils.h:205
2  libxul.so  IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read(IPC::MessageReader*, mozilla:...  ipc/glue/BigBuffer.cpp:86
3  libxul.so  IPC::ReadParam<mozilla::ipc::BigBuffer>(IPC::MessageReader*)  ipc/chromium/src/chrome/common/ipc_message_utils.h:485
3  libxul.so  mozilla::dom::PWebGLParent::OnMessageReceived(IPC::Message const&)  ipc/ipdl/PWebGLParent.cpp:296
4  libxul.so  mozilla::gfx::PCanvasManagerParent::OnMessageReceived(IPC::Message const&)  ipc/ipdl/PCanvasManagerParent.cpp:248
5  libxul.so  mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecyc...  ipc/glue/MessageChannel.cpp:1820
5  libxul.so  mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecyclePro...  ipc/glue/MessageChannel.cpp:1739
5  libxul.so  mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, ...  ipc/glue/MessageChannel.cpp:1530

OS: Unspecified → Linux

Kelsey, does this look related to any recent change?

Flags: needinfo?(jgilbert)

I don't think so.
Maybe @aosmond was doing something with WebGL IPC?

Flags: needinfo?(jgilbert) → needinfo?(aosmond)

Looking at the IPC fatal error messages, they all seem to be caused by a failure to map in a shmem somewhere in the canvas protocol tree:
https://searchfox.org/mozilla-central/rev/0c55d51c0d2a9b672e42ad40ea54f90267f92a8e/ipc/glue/BigBuffer.cpp#86

Strangely, it seems to have peaked on Nightly and Release at the same time.

Flags: needinfo?(aosmond)

The bug is linked to a topcrash signature, which matches the following criteria:

  • Top 20 desktop browser crashes on beta
  • Top 5 desktop browser crashes on Linux on beta

For more information, please visit BugBot documentation.

Keywords: topcrash

The severity field is not set for this bug.
:jgilbert, could you have a look please?

Flags: needinfo?(jgilbert)

This is unlikely to be purely an OOM: many of the crash reports show a very large amount of available virtual memory, suggesting there was plenty of room to map in the buffer.

We add the alignment offset to mPendingCmdsPos before checking whether or not the new command will fit:
https://searchfox.org/mozilla-central/rev/4496b3ed9bb535832e4826f09fbcb645b559a32d/dom/canvas/WebGLChild.cpp#63

If we fail, we flush:
https://searchfox.org/mozilla-central/rev/4496b3ed9bb535832e4826f09fbcb645b559a32d/dom/canvas/WebGLChild.cpp#67

And use mPendingCmdsPos as the length:
https://searchfox.org/mozilla-central/rev/4496b3ed9bb535832e4826f09fbcb645b559a32d/dom/canvas/WebGLChild.cpp#80

If that exceeds the allocated size of the buffer, the mapping will fail.
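
To make the sequence above concrete, here is a simplified model of the bookkeeping. It is a sketch, not the actual WebGLChild code: PendingCmds, FlushResult, AlignUp, AppendPadFirst and AppendCheckFirst are hypothetical names, and it assumes the buffer capacity is not necessarily a multiple of the alignment, so the padded position can land past the end of the allocation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the child-side pending command buffer.
struct PendingCmds {
  std::size_t capacity;  // allocated size of the shared buffer
  std::size_t pos;       // end of the bytes written so far
};

// What a flush would report: whether it happened and with which length.
struct FlushResult {
  bool flushed;
  std::size_t len;
};

// Round pos up to a multiple of alignment (must be a power of two).
inline std::size_t AlignUp(std::size_t pos, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (pos + alignment - 1) & ~(alignment - 1);
}

// Pattern described above: the padding is committed to pos before the
// fit check, so a failed check flushes with the padded length.
inline FlushResult AppendPadFirst(PendingCmds& buf, std::size_t cmdSize,
                                  std::size_t alignment) {
  buf.pos = AlignUp(buf.pos, alignment);
  if (buf.pos + cmdSize > buf.capacity) {
    FlushResult result{true, buf.pos};  // may exceed buf.capacity
    buf.pos = cmdSize;  // command starts a fresh buffer (assumed to fit)
    return result;
  }
  buf.pos += cmdSize;
  return {false, 0};
}

// Alternative ordering: compute the padded position in a local and only
// commit it once the command is known to fit.
inline FlushResult AppendCheckFirst(PendingCmds& buf, std::size_t cmdSize,
                                    std::size_t alignment) {
  const std::size_t aligned = AlignUp(buf.pos, alignment);
  if (aligned + cmdSize > buf.capacity) {
    FlushResult result{true, buf.pos};  // only the bytes actually written
    buf.pos = cmdSize;  // command starts a fresh buffer (assumed to fit)
    return result;
  }
  buf.pos = aligned + cmdSize;
  return {false, 0};
}
```

With a capacity of 10, a position of 9 and an 8-byte alignment, the first ordering reports a flush length past the end of the allocation, while the second reports only the bytes that were written.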

Assignee: nobody → aosmond

Probably some websites and/or frameworks changed in a way that made the above condition more likely to be hit, which would explain why it is happening on all release channels at the same time.

WebGLChild::mPendingCmdsPos represents the end position of
mPendingCmdsShmem. When checking to see if we can fit the next
command in the current buffer, we would add alignment overhead
first. If there was insufficient room, we would flush the buffer
with a slightly too big mPendingCmdsPos. Most of the time this is
fine since we would not read the zero initialized bytes anyways,
but sometimes it would overflow mPendingCmdsShmem's actual size.
This would cause us to fail to map in the buffer in the compositor
process and crash.

I think I was too tired while writing this patch. It does fix a mistake in calculating the size, but it won't fix the mapping crash, since the mapping uses the size written in the BigBuffer.

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

Keywords: topcrash