Closed Bug 1328106 Opened 3 years ago Closed 3 years ago

[e10s] Crash in Abort | __delete__()d actor | mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::net::PTCPSocket::Transition

Categories

(Core :: DOM: Core & HTML, defect, critical)

Version: 43 Branch
Platform: All, Windows
Type: defect
Priority: Not set
Severity: critical

Tracking

RESOLVED FIXED
mozilla54
Tracking Status
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: philipp, Assigned: xeonchen)

Details

(Keywords: crash, regression)

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-7fe10df9-55ea-4851-b111-e74622170101.
=============================================================

Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp:33
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp:436
2 	xul.dll 	mozilla::ipc::LogicError(char const*) 	ipc/glue/ProtocolUtils.cpp:322
3 	xul.dll 	mozilla::net::PTCPSocket::Transition(mozilla::ipc::Trigger, mozilla::net::PTCPSocket::State*) 	obj-firefox/ipc/ipdl/PTCPSocket.cpp:37
4 	xul.dll 	mozilla::net::PTCPSocketParent::SendCallback(nsString const&, CallbackData const&, unsigned int const&) 	obj-firefox/ipc/ipdl/PTCPSocketParent.cpp:71
5 	xul.dll 	mozilla::dom::TCPSocketParent::SendEvent(nsAString_internal const&, CallbackData, mozilla::dom::TCPReadyState) 	dom/network/TCPSocketParent.cpp:412
6 	xul.dll 	mozilla::dom::TCPSocket::FireEvent(nsAString_internal const&) 	dom/network/TCPSocket.cpp:517
7 	xul.dll 	mozilla::dom::TCPSocket::OnTransportStatus(nsITransport*, nsresult, __int64, __int64) 	dom/network/TCPSocket.cpp:1042
8 	xul.dll 	nsTransportStatusEvent::Run() 	netwerk/base/nsTransportUtils.cpp:77
9 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1067
10 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:311
11 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:124
12 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:225
13 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:156
14 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:262
15 	xul.dll 	nsAppStartup::Run() 	toolkit/components/startup/nsAppStartup.cpp:283
16 	xul.dll 	XREMain::XRE_mainRun() 	toolkit/xre/nsAppRunner.cpp:4401
17 	xul.dll 	XREMain::XRE_main(int, char** const, nsXREAppData const*) 	toolkit/xre/nsAppRunner.cpp:4534
18 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:4625

This crash signature has been regressing on Windows since Firefox 48. It only occurs when e10s is on, so its volume is also increasing as more users get e10s enabled by default.
According to correlation data, it is also strongly correlated with the presence of some otherwise rare modules like "ksproxy.ax", which is part of Windows ("WDM Streaming ActiveMovie Proxy").

Correlations for Firefox Release:
(97.92% in signature vs 00.39% overall) Module "ksproxy.ax" = true
(97.57% in signature vs 00.35% overall) Module "Kswdmcap.ax" = true
(97.57% in signature vs 00.76% overall) Module "mfc42.dll" = true
(99.31% in signature vs 03.91% overall) Module "devenum.dll" = true
(94.10% in signature vs 00.07% overall) abort_message = ###!!! ABORT: __delete__()d actor: file c:/builds/moz2_slave/m-rel-w32-00000000000000000000/build/src/ipc/glue/ProtocolUtils.cpp, line 409
(97.22% in signature vs 10.52% overall) Module "odbc32.dll" = true
(86.46% in signature vs 00.36% overall) Module "vidcap.ax" = true
(100.0% in signature vs 37.37% overall) dom_ipc_enabled = 1
(97.92% in signature vs 36.04% overall) Module "d3d9.dll" = true
(100.0% in signature vs 39.98% overall) reason = EXCEPTION_BREAKPOINT
(99.31% in signature vs 40.67% overall) Module "msdmo.dll" = true
(98.61% in signature vs 44.42% overall) Module "ksuser.dll" = true
(98.96% in signature vs 45.03% overall) Module "explorerframe.dll" = true
(97.57% in signature vs 43.41% overall) Module "quartz.dll" = true
(36.81% in signature vs 00.98% overall) Module "dpapi.dll" = true
(44.79% in signature vs 10.12% overall) Module "odbcint.dll" = true
(37.15% in signature vs 04.34% overall) Module "policymanager.dll" = true
(37.15% in signature vs 04.37% overall) Module "msvcp110_win.dll" = true
TCPSocketParent::SendEvent is missing a check for mIPCOpen before communicating with the child actor.
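To illustrate the kind of guard described above, here is a minimal, self-contained sketch. It is not the actual patch or Mozilla code: MockSocketParent, ActorDestroy(), and Sent() are hypothetical stand-ins for the IPDL-generated actor, and only the mIPCOpen flag name comes from this bug. The idea is simply that the parent tracks whether the IPC channel is still open and drops events once the actor has been deleted, instead of sending on a dead actor and aborting.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the parent-side socket actor (assumption, not Mozilla code).
class MockSocketParent {
public:
  // Simulates actor teardown: after this call the channel must not be used.
  void ActorDestroy() { mIPCOpen = false; }

  // Guarded send: returns false and does nothing if the channel is closed.
  bool SendEvent(const std::string& aType) {
    if (!mIPCOpen) {
      return false;  // actor already deleted; drop the event
    }
    mSent.push_back(aType);  // stands in for the real IPC send
    return true;
  }

  // Events that were actually sent while the channel was open.
  const std::vector<std::string>& Sent() const { return mSent; }

private:
  bool mIPCOpen = true;            // flag name taken from the comment above
  std::vector<std::string> mSent;  // hypothetical record of sent events
};
```

With this sketch, an event sent before ActorDestroy() is recorded, while a late event (for example, a transport status notification arriving after teardown) is silently dropped rather than reaching the deleted actor.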
Sounds like an issue that goes back to bug 885982 then (and is rising in frequency as we roll out e10s to an increasingly large set of users).
Blocks: 885982
Component: General → Networking
Version: 50 Branch → 43 Branch
Shouldn't this be filed under the DOM component?
Flags: needinfo?(josh)
Component: Networking → DOM
Flags: needinfo?(josh)
Gary, would you like to help on this?
Flags: needinfo?(xeonchen)
No problem, I'll take it
Assignee: nobody → xeonchen
Flags: needinfo?(xeonchen)
Comment on attachment 8829342 [details]
Bug 1328106 - check IPC state before sending event;

https://reviewboard.mozilla.org/r/106462/#review107580
Attachment #8829342 - Flags: review?(josh) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/52e77106022d
check IPC state before sending event; r=jdm
https://hg.mozilla.org/mozilla-central/rev/52e77106022d
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
hi, do you think we should uplift this patch?
Flags: needinfo?(xeonchen)
Looks like a good candidate for 52/53 uplift.
(In reply to [:philipp] from comment #10)
> hi, do you think we should uplift this patch?

Sorry for the late reply.
Sure, I think we should uplift this.
Flags: needinfo?(xeonchen)
Comment on attachment 8829342 [details]
Bug 1328106 - check IPC state before sending event;

Approval Request Comment
[Feature/Bug causing the regression]: bug 885982
[User impact if declined]: crashes when e10s is enabled
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: not sure.
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: N/A
[Is the change risky?]: no
[Why is the change risky/not risky?]: it only adds a null-pointer check before use.
[String changes made/needed]: N/A

Hi philipp, do you know if it has been verified in Nightly?
Flags: needinfo?(madperson)
Attachment #8829342 - Flags: approval-mozilla-beta?
Attachment #8829342 - Flags: approval-mozilla-aurora?
I think it wasn't crashing frequently enough on Nightly to say with confidence that the patch has worked, but so far there have been no more crashes there since the patch landed, which is a good sign :))
Flags: needinfo?(madperson)
Comment on attachment 8829342 [details]
Bug 1328106 - check IPC state before sending event;

e10s crash fix, let's get this in aurora53 and beta52.  Should be in 52.0b4.
Attachment #8829342 - Flags: approval-mozilla-beta?
Attachment #8829342 - Flags: approval-mozilla-beta+
Attachment #8829342 - Flags: approval-mozilla-aurora?
Attachment #8829342 - Flags: approval-mozilla-aurora+
Component: DOM → DOM: Core & HTML