Closed Bug 1368343 Opened 7 years ago Closed 7 years ago

Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError... with nsDocumentOpenInfo::OnStartRequest

Categories

(Core :: IPC, defect)

54 Branch
x86
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: philipp, Assigned: mrbkap)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-251522a2-c24a-4a65-9420-9b3540170528.
=============================================================

Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp:33
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp:428
2 	xul.dll 	mozilla::ipc::FatalError(char const*, char const*, bool) 	ipc/glue/ProtocolUtils.cpp:304
3 	xul.dll 	mozilla::ipc::IProtocol::HandleFatalError(char const*, char const*) 	ipc/glue/ProtocolUtils.cpp:483
4 	xul.dll 	mozilla::ipc::IProtocol::FatalError(char const* const) 	ipc/glue/ProtocolUtils.cpp:472
5 	xul.dll 	mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write(mozilla::devtools::PHeapSnapshotTempFileHelperChild const*, IPC::Message*, bool) 	obj-firefox/ipc/ipdl/PHeapSnapshotTempFileHelperChild.cpp:335
6 	xul.dll 	mozilla::dom::PExternalHelperAppChild::SendDivertToParentUsing(mozilla::net::PChannelDiverterChild*, mozilla::dom::PBrowserChild*) 	obj-firefox/ipc/ipdl/PExternalHelperAppChild.cpp:132
7 	xul.dll 	mozilla::dom::ExternalHelperAppChild::DivertToParent(nsIDivertableChannel*, nsIRequest*, mozilla::dom::TabChild*) 	uriloader/exthandler/ExternalHelperAppChild.cpp:120
8 	xul.dll 	mozilla::dom::ExternalHelperAppChild::OnStartRequest(nsIRequest*, nsISupports*) 	uriloader/exthandler/ExternalHelperAppChild.cpp:77
9 	xul.dll 	nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) 	uriloader/base/nsURILoader.cpp:289
10 	xul.dll 	mozilla::net::HttpChannelChild::DoOnStartRequest(nsIRequest*, nsISupports*) 	netwerk/protocol/http/HttpChannelChild.cpp:552
11 	xul.dll 	mozilla::net::HttpChannelChild::OnStartRequest(nsresult const&, mozilla::net::nsHttpResponseHead const&, bool const&, mozilla::net::nsHttpHeaderArray const&, bool const&, bool const&, unsigned int const&, nsCString const&, nsCString const&, mozilla::net::NetAddr const&, mozilla::net::NetAddr const&, unsigned int const&, nsCString const&) 	netwerk/protocol/http/HttpChannelChild.cpp:483
12 	xul.dll 	mozilla::net::StartRequestEvent::Run() 	netwerk/protocol/http/HttpChannelChild.cpp:348
13 	xul.dll 	mozilla::net::ChannelEventQueue::RunOrEnqueue(mozilla::net::ChannelEvent*, bool) 	obj-firefox/dist/include/mozilla/net/ChannelEventQueue.h:138
14 	xul.dll 	mozilla::net::HttpChannelChild::RecvOnStartRequest(nsresult const&, mozilla::net::nsHttpResponseHead const&, bool const&, mozilla::net::nsHttpHeaderArray const&, bool const&, bool const&, unsigned int const&, nsCString const&, nsCString const&, mozilla::net::NetAddr const&, mozilla::net::NetAddr const&, short const&, unsigned int const&, nsCString const&) 	netwerk/protocol/http/HttpChannelChild.cpp:398
15 	xul.dll 	mozilla::net::PHttpChannelChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PHttpChannelChild.cpp:608
16 	xul.dll 	mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PContentChild.cpp:5676
17 	xul.dll 	mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp:1812
18 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message&&) 	ipc/glue/MessageChannel.cpp:1747
19 	xul.dll 	mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) 	ipc/glue/MessageChannel.cpp:1620
20 	xul.dll 	mozilla::ipc::MessageChannel::MessageTask::Run() 	ipc/glue/MessageChannel.cpp:1653
21 	xul.dll 	mozilla::ValidatingDispatcher::Runnable::Run() 	xpcom/threads/Dispatcher.cpp:257
22 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1264
23 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/threads/nsThreadUtils.cpp:389
24 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:124
25 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:301
26 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:231
27 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:211
28 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:156
29 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:269
30 	xul.dll 	XRE_RunAppShell() 	toolkit/xre/nsEmbedFunctions.cpp:869
31 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:269
32 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:231
33 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:211
34 	xul.dll 	XRE_InitChildProcess(int, char** const, XREChildData const*) 	toolkit/xre/nsEmbedFunctions.cpp:693
35 	xul.dll 	mozilla::BootstrapImpl::XRE_InitChildProcess(int, char** const, XREChildData const*) 	toolkit/xre/Bootstrap.cpp:65
36 	firefox.exe 	content_process_main(mozilla::Bootstrap*, int, char** const) 	ipc/contentproc/plugin-container.cpp:64
37 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:115
38 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
39 	kernel32.dll 	BaseThreadInitThunk 	
40 	ntdll.dll 	__RtlUserThreadStart 	
41 	ntdll.dll 	_RtlUserThreadStart

this content-process crash signature is newly showing up from a couple of different installations in 54.0b11.
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_54_0b10_RELEASE&tochange=FIREFOX_54_0b11_RELEASE
Component: Untriaged → Memory Allocator
Component: Memory Allocator → IPC
Nick, this looks heap-snapshot-related, have you been working on this or do you know who can take a look at it?
Flags: needinfo?(nfitzgerald)
I'm not doing devtools stuff anymore, but I'm always happy to review patches when needed.

Stack looks pretty deep in IPC glue code, and it isn't clear to me what heap snapshot specific thing might be related. Maybe related to content process sandboxing? Heap snapshot code reles on handing the content process a writable fd.

Forwarding needinfo to Jim.
Flags: needinfo?(nfitzgerald) → needinfo?(jimb)
(In reply to [:philipp] from comment #0)
> 5 	xul.dll 
> mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write(mozilla::devtools:
> :PHeapSnapshotTempFileHelperChild const*, IPC::Message*, bool) 
> obj-firefox/ipc/ipdl/PHeapSnapshotTempFileHelperChild.cpp:335

Is it possible that this wasn't actually PHeapSnapshotTempFileHelperChild, but some other bitwise-identical method that the linker merged with it?

> 6 	xul.dll 
> mozilla::dom::PExternalHelperAppChild::SendDivertToParentUsing(mozilla::net::
> PChannelDiverterChild*, mozilla::dom::PBrowserChild*) 
> obj-firefox/ipc/ipdl/PExternalHelperAppChild.cpp:132

For example, either the PBrowserChild or PChannelDiverterChild serializer?  (Actors are serialized by sending their routing ID, so those methods should all be more or less identical.)

> 7 	xul.dll 
> mozilla::dom::ExternalHelperAppChild::DivertToParent(nsIDivertableChannel*,
> nsIRequest*, mozilla::dom::TabChild*) 
> uriloader/exthandler/ExternalHelperAppChild.cpp:120

In particular, is it possible that either the `diverter` or `tabChild` variables on this line could be either nullptr or already deleted?  (Deleted from IPDL's point of view, at least; actual use-after-free would probably have crashed sooner, I think.)
oh, this signature has totally ceased in 54.0b12 again, so it just might have been a fluke. looking at similar signatures on the beta channel, they mostly seem to occur in just one version: http://bit.ly/2qDAC5b
Would it be possible for me to get some URLs for this? This could be caused by trying to send a null PBrowser at [1]. It would be really useful to be able to see a testcase where this happens.

[1] http://searchfox.org/mozilla-central/rev/972b7a5149bbb2eaab48aafa87678c33d5f2f045/uriloader/exthandler/ExternalHelperAppChild.cpp#120
(In reply to Blake Kaplan (:mrbkap) from comment #5)
> Would it be possible for me to get some URLs for this? This could be caused
> by trying to send a null PBrowser at [1]. It would be really useful to be
> able to see a testcase where this happens.
None of the crashes have URLs associated with them, unfortunately.
I don't see (In reply to Jed Davis [:jld] (⏰UTC-6) from comment #3)
> (In reply to [:philipp] from comment #0)
> > 5 	xul.dll 
> > mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write(mozilla::devtools:
> > :PHeapSnapshotTempFileHelperChild const*, IPC::Message*, bool) 
> > obj-firefox/ipc/ipdl/PHeapSnapshotTempFileHelperChild.cpp:335
> 
> Is it possible that this wasn't actually PHeapSnapshotTempFileHelperChild,
> but some other bitwise-identical method that the linker merged with it?

I'm pretty sure this must be it. From looking at that stack, there's just no way the heap snapshot code could be involved at all.

Given the frame for mozilla::net::HttpChannelChild::OnStartRequest, in netwerk/protocol/http/HttpChannelChild.cpp:483, I think this probably belongs to the networking team.

I asked #jsapi's bot "mrgiggles" who can review changes to HttpChannelChild.cpp, and picked Honza. Honza, if you're not the right person to look into this, could you needinfo someone more suitable? Thanks!
Flags: needinfo?(jimb)
Flags: needinfo?(honzab.moz)
Wrong Honza if this is devtools?
Flags: needinfo?(honzab.moz) → needinfo?(jimb)
If you need someone responsible for diversion by http ipc, it's Jason Duell.
If you need someone responsible for devtools (the top most frame), it's Honza Odvarko.
In comment 7 (directly above the needinfo), I explain why I think this is HTTP-related. The devtools function name is a red herring, as jdm suggested in comment 3.

Thanks for the referral.
Flags: needinfo?(jimb) → needinfo?(jduell.mcbugs)
:philipp,

So is this crash till happening (and how much?).  I'm trying to prioritize this appropriately.

My hunch is it will be hard to fix this without STR, but we can see where we get.
Flags: needinfo?(madperson)
yeah, the issue is still around - it's also a little bit hard to quantify its impact since the signature seems to shift in each build. when using the query from http://bit.ly/2uAYA6E it seems like there are around 130 daily crash reports on release, 70 on beta and 8 on nightly.

many user comments are saying they were attempting to download a photo album from facebook at the time of the crash.

now when i start looking at it in more detail, the crashes started on nightly with build 20170409030205 and on beta with 54.0b4 which would point to bug 1350058 as the common source of the crashes...
Blocks: 1350058
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write] → [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write] [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc:…
Flags: needinfo?(madperson)
Summary: Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::devtools::PHeapSnapshotTempFileHelperChild::Write → Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError... with nsDocumentOpenInfo::OnStartRequest
I think that my patch in bug 1274706 should fix this crash as well.
See Also: → 1381934
Blake--re: comment 13: Did you mean a different bug?  You don't have a patch in bug 1274706
Flags: needinfo?(mrbkap)
Oops, philipp read my mind, it was bug 1381934.
Flags: needinfo?(mrbkap)
there's a report still after the patch has landed on nightly: bp-a449ccf2-1b2d-4003-8a21-ed1dd0170802
Crash Signature: mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::dom::PBrowserChild::Write] → mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::dom::PBrowserChild::Write] [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::Fat…
Assignee: nobody → mrbkap
Comment on attachment 8903804 [details]
Bug 1368343 - Be paranoid about null TabChild.

https://reviewboard.mozilla.org/r/175200/#review180652
Attachment #8903804 - Flags: review?(wmccloskey) → review+
Pushed by mrbkap@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dc69549547d0
Be paranoid about null TabChild. r=billm
https://hg.mozilla.org/mozilla-central/rev/dc69549547d0
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
do you think we should uplift this to 56 as well?
Flags: needinfo?(mrbkap)
Comment on attachment 8903804 [details]
Bug 1368343 - Be paranoid about null TabChild.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1301056
[User impact if declined]: Possible random crashes.
[Is this code covered by automated tests?]: No. We don't have solid STR.
[Has the fix been verified in Nightly?]: We're keeping an eye on the crash reports. I don't see any from any of the signatures here since the patch was landed.
[Needs manual test from QE? If yes, steps to reproduce]: n/a
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: No, this is adding a null check in a place where we otherwise would crash.
[String changes made/needed]: n/a
Flags: needinfo?(mrbkap)
Flags: needinfo?(jduell.mcbugs)
Attachment #8903804 - Flags: approval-mozilla-beta?
Comment on attachment 8903804 [details]
Bug 1368343 - Be paranoid about null TabChild.

Take this one in 56 and if it fixes the crash. Beta56+.
Attachment #8903804 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.