Crash in [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | IPC::WriteSequenceParam<T> | IPC::ParamTraits<nsTSubstring<T> >::Write | mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData]
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
People
(Reporter: mccr8, Unassigned)
Details
(Keywords: crash, topcrash, Whiteboard: [necko-triaged][necko-priority-new])
Crash Data
Crash report: https://crash-stats.mozilla.org/report/index/5ebccaa7-3003-4164-b76b-6f82e0240718
MOZ_CRASH Reason: MOZ_CRASH(IPC FatalError in the parent process!)
Top 10 frames:
0 libxul.so mozilla::ipc::FatalError(char const*, bool) ipc/glue/ProtocolUtils.cpp:203
1 libxul.so mozilla::ipc::IProtocol::HandleFatalError(char const*) ipc/glue/ProtocolUtils.cpp:403
2 libxul.so IPC::WriteSequenceParam<char const&>(IPC::MessageWriter*, std::remove_referen... ipc/chromium/src/chrome/common/ipc_message_utils.h:592
2 libxul.so IPC::ParamTraits<nsTSubstring<char> >::Write(IPC::MessageWriter*, nsTSubstrin... ipc/glue/IPCMessageUtilsSpecializations.h:92
2 libxul.so IPC::WriteParam<nsTSubstring<char> const&>(IPC::MessageWriter*, nsTSubstring<... ipc/chromium/src/chrome/common/ipc_message_utils.h:445
2 libxul.so mozilla::net::PHttpBackgroundChannelParent::SendOnTransportAndData(nsresult c... ipc/ipdl/PHttpBackgroundChannelParent.cpp:170
2 libxul.so mozilla::net::HttpBackgroundChannelParent::OnTransportAndData(nsresult const&... netwerk/protocol/http/HttpBackgroundChannelParent.cpp:229
2 libxul.so std::_Function_handler<bool (nsTDependentSubstring<char> const&, unsigned lon... /builds/worker/fetches/sysroot-x86_64-linux-gnu/usr/include/c++/8/bits/std_function.h:282
3 libxul.so std::function<bool (nsTDependentSubstring<char> const&, unsigned long, unsign... /builds/worker/fetches/sysroot-x86_64-linux-gnu/usr/include/c++/8/bits/std_function.h:687
3 libxul.so mozilla::net::nsHttp::SendDataInChunks<nsTDependentSubstring<char> >(nsTStrin... netwerk/protocol/http/nsHttp.h:316
I don't know if this is actionable, but there were 3 crashes on a single day of Nightlies, which seems a bit high.
Comment 1•11 months ago
A couple of signatures that look related are now in the Nightly topcrash list.
Updated•11 months ago
Reporter
Comment 2•11 months ago
Please file separate bugs for these signatures. It isn't clear that they are related at all, as this is IPC being used by different areas of the browser.
This one looks graphics related: [@ IPC::MessageReader::FatalError | IPC::ParamTraits<mozilla::ipc::BigBuffer>::Read | mozilla::dom::PWebGLParent::OnMessageReceived ]
This is a bundle of random stuff in storage, fetch and maybe other places: [@ IPC::MessageReader::FatalError | mozilla::ipc::data_pipe_detail::DataPipeRead<T> ] We probably need stuff added to the prefix list to split it up a bit.
Comment 3•11 months ago
Thanks - I filed new issues for 3 of these signatures.
Comment 4•11 months ago
Over half the crashes in the last 3 months were a single buildid: 20240719093654
(and 2nd most, though with only 14, is the other buildid from 0719)
This smells like something landed and was backed out. Any ideas?
This is a bad length somehow getting passed for an IPC send.
Reporter
Comment 5•11 months ago
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #4)
Over half the crashes in the last 3 months were a single buildid: 20240719093654
(and 2nd most, though with only 14, is the other buildid from 0719)
This smells like something landed and was backed out. Any ideas?
It looks to me like the 20240719093654 build is the 128.0 release build. It doesn't surprise me that a release build would have a lot of the crashes because it is more stable than other channels.
Comment 6•11 months ago
It seems it's hitting https://hg.mozilla.org/mozilla-central/file/ada7576ed0b64f7509e35f1e8cfa1508f0965b5a/ipc/chromium/src/chrome/common/ipc_message_utils.h#l596 because of an IPC message that's too big. Better diagnostics to catch such things might help. I doubt it's hitting negative aCount.
However, SendDataInChunks() should limit us to 128K: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttp.h#307
Did anything change that might affect these length calculations?
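For readers unfamiliar with the chunking being discussed, here is a minimal sketch of the general idea: split a buffer into pieces no larger than a fixed cap and hand each piece to a send callback. This is not the actual `SendDataInChunks()` from nsHttp.h; the names `SendInChunks`, `ChunkSender`, and `kMaxChunkSize` are hypothetical, and the 128K value is taken only from the limit mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Hypothetical cap, mirroring the 128K limit mentioned in the comment above.
constexpr std::size_t kMaxChunkSize = 128 * 1024;

// Hypothetical callback type: receives one chunk and its offset in the
// original buffer, and returns false to abort the transfer.
using ChunkSender = std::function<bool(std::string_view, std::size_t)>;

// Splits `data` into chunks of at most `maxChunk` bytes and passes each one
// to `send`. Returns false as soon as a send fails, true otherwise.
bool SendInChunks(std::string_view data, const ChunkSender& send,
                  std::size_t maxChunk = kMaxChunkSize) {
  assert(maxChunk > 0);  // A zero cap would never make progress.
  std::size_t offset = 0;
  while (offset < data.size()) {
    const std::size_t len = std::min(maxChunk, data.size() - offset);
    if (!send(data.substr(offset, len), offset)) {
      return false;
    }
    offset += len;
  }
  return true;
}
```

With a cap like this in place, no single send should carry more than `maxChunk` bytes of payload, which is why an oversized message would be surprising on this path.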
Reporter
Comment 7•11 months ago
Where are you getting that from? I see "SharedMemory::WriteHandle failed" as the IPC fatal error message value. I don't know of anything that has changed.
Comment 8•11 months ago
Also, why did it spike for ~4 days, then go (mostly) quiet, if it is release? We released 128.0.2 and 128.0.3, but those have 4 and 6 crashes, versus 150 for 128.0.
And 130 Nightly has 34.
So it seems to somehow be tied to specific builds...
As for where: the stack traces. i.e. https://crash-stats.mozilla.org/report/index/8fffc8d6-b9ed-49ae-9038-9ddb50240729#tab-details
Where do you see "SharedMemory::WriteHandle failed" in that report?
Reporter
Comment 9•11 months ago
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #8)
As for where: the stack traces. i.e. https://crash-stats.mozilla.org/report/index/8fffc8d6-b9ed-49ae-9038-9ddb50240729#tab-details
Ah. That says it is crashing on line 602, which is after the validity check for byte_length passed, as far as I can see.
Where do you see "SharedMemory::WriteHandle failed" in that report?
It is in the "crash annotations" tab under "SharedMemory::WriteHandle failed". IIRC, Nika did add a mechanism at some point where we'd fall back to sending large messages over shmem under certain conditions I don't recall. Maybe that's kicking in but then shmem fails for some reason? That crash report you linked has 0 bytes of available page file. I'm not sure if that's unusual on Linux or not but it seems bad.
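To make the hypothesis above concrete, here is a rough sketch of the suspected failure mode: a payload above some threshold is routed through shared memory, and if the shared memory allocation fails, the write ends in a fatal error. Everything here is an assumption for illustration. `WritePayload`, `WriteResult`, `ShmemAllocator`, and the threshold value are hypothetical and are not the real IPC code, whose actual fallback conditions are not stated in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical outcome of serializing one payload.
enum class WriteResult { Inline, SharedMemory, FatalError };

// Hypothetical threshold; the real condition for falling back to shmem is
// not given in this bug.
constexpr std::size_t kHypotheticalShmemThreshold = 64 * 1024;

// Hypothetical allocator hook: returns true if a shared memory region of the
// requested size could be created.
using ShmemAllocator = std::function<bool(std::size_t)>;

// Small payloads are written inline. Larger ones need shared memory, and a
// failed allocation (for example under memory pressure) becomes a fatal error.
WriteResult WritePayload(std::size_t byteLength,
                         const ShmemAllocator& allocShmem) {
  assert(static_cast<bool>(allocShmem));  // The hook must be callable.
  if (byteLength < kHypotheticalShmemThreshold) {
    return WriteResult::Inline;
  }
  if (allocShmem(byteLength)) {
    return WriteResult::SharedMemory;
  }
  return WriteResult::FatalError;
}
```

Under this model a perfectly valid length can still crash the sender when the system cannot provide the shared memory, which would fit a report showing 0 bytes of available page file.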
Comment 10•11 months ago
This seems to be the same as bug 1909494 (the crash reason is also SharedMemory::WriteHandle failed).
Reporter
Updated•9 months ago