Closed Bug 1908797 Opened 11 months ago Closed 11 months ago

Crash in [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData]

Categories

(Core :: Networking: HTTP, defect, P2)

Tracking

RESOLVED DUPLICATE of bug 1909494
Tracking Status
firefox128 --- wontfix
firefox129 --- wontfix
firefox130 --- wontfix

People

(Reporter: mccr8, Unassigned)

Details

(Keywords: crash, topcrash, Whiteboard: [necko-triaged][necko-priority-new])

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/5ebccaa7-3003-4164-b76b-6f82e0240718

MOZ_CRASH Reason: MOZ_CRASH(IPC FatalError in the parent process!)

Top 10 frames:

0  libxul.so  mozilla::ipc::FatalError(char const*, bool)  ipc/glue/ProtocolUtils.cpp:203
1  libxul.so  mozilla::ipc::IProtocol::HandleFatalError(char const*)  ipc/glue/ProtocolUtils.cpp:403
2  libxul.so  IPC::WriteSequenceParam<char const&>(IPC::MessageWriter*, std::remove_referen...  ipc/chromium/src/chrome/common/ipc_message_utils.h:592
2  libxul.so  IPC::ParamTraits<nsTSubstring<char> >::Write(IPC::MessageWriter*, nsTSubstrin...  ipc/glue/IPCMessageUtilsSpecializations.h:92
2  libxul.so  IPC::WriteParam<nsTSubstring<char> const&>(IPC::MessageWriter*, nsTSubstring<...  ipc/chromium/src/chrome/common/ipc_message_utils.h:445
2  libxul.so  mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData(nsresult c...  ipc/ipdl/PHttpBackgroundChannelParent.cpp:170
2  libxul.so  mozilla::net::HttpBackgroundChannelParent::OnTransportAndData(nsresult const&...  netwerk/protocol/http/HttpBackgroundChannelParent.cpp:229
2  libxul.so  std::_Function_handler<bool (nsTDependentSubstring<char> const&, unsigned lon...  /builds/worker/fetches/sysroot-x86_64-linux-gnu/usr/include/c++/8/bits/std_function.h:282
3  libxul.so  std::function<bool (nsTDependentSubstring<char> const&, unsigned long, unsign...  /builds/worker/fetches/sysroot-x86_64-linux-gnu/usr/include/c++/8/bits/std_function.h:687
3  libxul.so  mozilla::net::nsHttp::SendDataInChunks<nsTDependentSubstring<char> >(nsTStrin...  netwerk/protocol/http/nsHttp.h:316

I don't know if this is actionable, but there were 3 crashes in a single day of Nightly builds, which seems a bit high.

A couple of signatures that look related are now in the Nightly topcrash list.

Crash Signature: [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData] → [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData] [@ IPC::MessageReader::FatalError | IP…
Keywords: topcrash
Crash Signature: IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read | mozilla::dom::PWebGLParent::OnMessageReceived ] → IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read | mozilla::dom::PWebGLParent::OnMessageReceived ] [@ IPC::MessageReader::FatalError | mozilla::ipc::data_pipe_detail::DataPipeRead<T> ]

Please file separate bugs for these signatures. It isn't clear that they are related at all, as this is IPC being used by different areas of the browser.

This one looks graphics related: [@ IPC::MessageReader::FatalError | IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read | mozilla::dom::PWebGLParent::OnMessageReceived ]

This is a bundle of unrelated crashes in storage, fetch and maybe other places: [@ IPC::MessageReader::FatalError | mozilla::ipc::data_pipe_detail::DataPipeRead<T> ]. We probably need frames added to the signature prefix list to split it up a bit.

Crash Signature: [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData] [@ IPC::MessageReader::FatalError | IP… → [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData]
Flags: needinfo?(ehenry)

Thanks - I filed new issues for 3 of these signatures.

Flags: needinfo?(ehenry)

Over half the crashes in the last 3 months were a single buildid: 20240719093654
(and 2nd most, though with only 14, is the other buildid from 0719)

This smells like something landed and was backed out. Any ideas?

This is a bad length somehow getting passed for an IPC send.

Flags: needinfo?(continuation)

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #4)

> Over half the crashes in the last 3 months were a single buildid: 20240719093654
> (and 2nd most, though with only 14, is the other buildid from 0719)
>
> This smells like something landed and was backed out. Any ideas?

It looks to me like the 20240719093654 build is the 128.0 release build. It doesn't surprise me that a release build would have a lot of the crashes because it is more stable than other channels.

Flags: needinfo?(continuation)

It seems it's hitting https://hg.mozilla.org/mozilla-central/file/ada7576ed0b64f7509e35f1e8cfa1508f0965b5a/ipc/chromium/src/chrome/common/ipc_message_utils.h#l596 because of an IPC message that's too big. Better diagnostics to catch such things might help. I doubt it's hitting negative aCount.

However, SendDataInChunks() should limit us to 128K: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttp.h#307

Did anything change that might affect these length calculations?
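
For readers following the length discussion, the chunking behaviour described above (each send capped at 128K) can be sketched roughly as follows. This is a standalone illustration, not the code in nsHttp.h: `SendInChunks`, `SendFunc` and `kChunkSize` are hypothetical names, and `std::string_view` stands in for the Mozilla string types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <string_view>

// Hypothetical cap, chosen to match the 128K figure quoted in the comment.
constexpr std::size_t kChunkSize = 128 * 1024;

// Hypothetical callback: receives one chunk, its stream offset, and its length.
using SendFunc =
    std::function<bool(std::string_view, std::uint64_t, std::uint32_t)>;

// Splits `data` into pieces of at most kChunkSize bytes and hands each piece
// to `send`. Stops and returns false as soon as one send fails.
bool SendInChunks(std::string_view data, std::uint64_t offset,
                  const SendFunc& send) {
  assert(send && "callback must be callable");
  std::size_t start = 0;
  while (start < data.size()) {
    const std::size_t len = std::min(kChunkSize, data.size() - start);
    if (!send(data.substr(start, len), offset,
              static_cast<std::uint32_t>(len))) {
      return false;
    }
    start += len;
    offset += len;
  }
  return true;
}
```

If chunking of this kind is in effect, no single send should carry more than 128K of payload, which is why an oversized message would be surprising here.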

Severity: -- → S2
Flags: needinfo?(nika)
Flags: needinfo?(continuation)
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-new]

Where are you getting that from? I see "SharedMemory::WriteHandle failed" as the IPC fatal error message value. I don't know of anything that has changed.

Flags: needinfo?(continuation)

Also, why did it spike for ~4 days, then go (mostly) quiet, if it is release? We released 128.0.2 and 128.0.3, but those have 4 and 6 crashes, vs 150 for 128.0.

And 130 Nightly is 34.

So it seems to somehow be tied to specific builds...

As for where: the stack traces. i.e. https://crash-stats.mozilla.org/report/index/8fffc8d6-b9ed-49ae-9038-9ddb50240729#tab-details
Where do you see "SharedMemory::WriteHandle failed" in that report?

Flags: needinfo?(continuation)

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #8)

> As for where: the stack traces. i.e. https://crash-stats.mozilla.org/report/index/8fffc8d6-b9ed-49ae-9038-9ddb50240729#tab-details

Ah. That says it is crashing on line 602, which is after the validity check for byte_length passed, as far as I can see.
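
As an illustration of the kind of byte-length validity check mentioned above, here is a minimal sketch of an overflow-checked length computation. It is an assumption about the general shape of such a check, not the code in ipc_message_utils.h: `CheckedByteLength` is a hypothetical helper, and the 32-bit limit is assumed for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns count * elemSize if the result fits in 32 bits,
// or std::nullopt if the element count or the byte length is out of range.
std::optional<std::uint32_t> CheckedByteLength(std::size_t count,
                                               std::size_t elemSize) {
  assert(elemSize > 0);  // element types always have a nonzero size
  constexpr std::uint64_t kMax = std::numeric_limits<std::uint32_t>::max();
  if (count > kMax || elemSize > kMax) {
    return std::nullopt;
  }
  // Both factors fit in 32 bits, so the 64-bit product cannot overflow.
  const std::uint64_t bytes = static_cast<std::uint64_t>(count) * elemSize;
  if (bytes > kMax) {
    return std::nullopt;
  }
  return static_cast<std::uint32_t>(bytes);
}
```

A failure after a check like this has passed points away from a bad length and toward whatever happens when the bytes are actually written.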

> Where do you see "SharedMemory::WriteHandle failed" in that report?

It is in the "crash annotations" tab under "SharedMemory::WriteHandle failed". IIRC, Nika did add a mechanism at some point where we'd fall back to sending large messages over shmem under certain conditions I don't recall. Maybe that's kicking in but then shmem fails for some reason? That crash report you linked has 0 bytes of available page file. I'm not sure if that's unusual on Linux or not but it seems bad.
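
The hypothesis above (a large payload falls back to shared memory, and the shared-memory step then fails) can be sketched as below. Everything here is hypothetical: `WriteBuffer`, `kInlineLimit`, the `ShmemAllocator` hook and the error text are invented for illustration, and the real threshold and conditions are not stated in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

enum class WriteResult { Inline, SharedMemory, FatalError };

// Hypothetical threshold; the real value is not given in this bug.
constexpr std::size_t kInlineLimit = 64 * 1024;

// Hypothetical hook: returns true if a shared-memory region of the requested
// size could be created and its handle written into the message.
using ShmemAllocator = std::function<bool(std::size_t)>;

// Small payloads are written inline. Larger ones go through shared memory,
// and a failure there is reported as a fatal error rather than retried.
WriteResult WriteBuffer(std::size_t byteLength,
                        const ShmemAllocator& allocShmem, std::string* error) {
  assert(error != nullptr);
  if (byteLength <= kInlineLimit) {
    return WriteResult::Inline;
  }
  if (allocShmem(byteLength)) {
    return WriteResult::SharedMemory;
  }
  *error = "shared memory write failed (hypothetical message)";
  return WriteResult::FatalError;
}
```

Under this model a perfectly valid length can still end in a fatal error when the system cannot provide shared memory, which would be consistent with the report showing 0 bytes of available page file.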

Flags: needinfo?(continuation)

This seems to be the same as bug 1909494 (the crash reason is also SharedMemory::WriteHandle failed).

Status: NEW → RESOLVED
Closed: 11 months ago
Duplicate of bug: 1909494
Resolution: --- → DUPLICATE
Severity: S2 → S3
Flags: needinfo?(nika)