Bug 828034 - crash in mozilla::ipc::RPCChannel::EnteredCxxStack
Keywords: crash, regression, topcrash
Product: Core
Classification: Components
Component: Plug-ins
Version: 20 Branch
Platform: All Windows 7
Importance: P1 critical
Target Milestone: mozilla21
Assigned To: Aaron Klotz [:aklotz]
QA Contact: Manuela Muntean [Away]
: Benjamin Smedberg [:bsmedberg]
Depends on:
Blocks: 805591 829909
Reported: 2013-01-08 14:16 PST by Scoobidiver (away)
Modified: 2013-02-28 06:45 PST

Proposed crash fix (808 bytes, patch)
2013-01-23 16:04 PST, Aaron Klotz [:aklotz]
benjamin: review+
Proposed crash fix, rev. 2 (3.83 KB, patch)
2013-01-24 13:14 PST, Aaron Klotz [:aklotz]
benjamin: review+
bajaj.bhavana: approval-mozilla-aurora+

Description Scoobidiver (away) 2013-01-08 14:16:03 PST
It first showed up in 20.0a1/20130106. The regression range is:

Signature 	mozilla::ipc::RPCChannel::EnteredCxxStack() | mozilla::ipc::RPCChannel::CxxStackFrame::CxxStackFrame(mozilla::ipc::RPCChannel&, mozilla::ipc::RPCChannel::Direction, IPC::Message const*) | mozilla::plugins::PPluginInstanceParent::CallUpdateWindow()
UUID	5e4f9196-94c3-4ac3-85a6-6ebc42130108
Date Processed	2013-01-08 19:28:47
Uptime	539
Last Crash	3.0 weeks before submission
Install Age	9.0 minutes since version was first installed.
Install Time	2013-01-08 19:19:16
Product	Firefox
Version	21.0a1
Build ID	20130108033457
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7600
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 23 stepping 10
Crash Address	0x0
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x2a42, AdapterSubsysID: 360b103c, AdapterDriverVersion:
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- 
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x2a42
Total Virtual Memory	2147352576
Available Virtual Memory	1411579904
System Memory Use Percentage	87
Available Page File	415076352
Available Physical Memory	126259200

Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::ipc::RPCChannel::EnteredCxxStack 	obj-firefox/dist/include/mozilla/ipc/RPCChannel.h:197
1 	xul.dll 	mozilla::ipc::RPCChannel::CxxStackFrame::CxxStackFrame 	obj-firefox/dist/include/mozilla/ipc/RPCChannel.h:250
2 	xul.dll 	mozilla::ipc::RPCChannel::Call 	ipc/glue/RPCChannel.cpp:136
3 	xul.dll 	mozilla::plugins::PPluginInstanceParent::CallUpdateWindow 	obj-firefox/ipc/ipdl/PPluginInstanceParent.cpp:1076
4 	xul.dll 	nsWindow::OnPaint 	widget/windows/nsWindowGfx.cpp:203
5 	xul.dll 	nsWindow::ProcessMessage 	widget/windows/nsWindow.cpp:4802
6 	xul.dll 	nsWindow::WindowProcInternal 	widget/windows/nsWindow.cpp:4407
7 	xul.dll 	CallWindowProcCrashProtected 	xpcom/base/nsCrashOnException.cpp:32
8 	xul.dll 	nsWindow::WindowProc 	widget/windows/nsWindow.cpp:4359
9 	user32.dll 	InternalCallWinProc 	
10 	user32.dll 	UserCallWinProcCheckWow 	
11 	user32.dll 	CallWindowProcAorW 	
12 	user32.dll 	CallWindowProcW 	
13 	xul.dll 	mozilla::plugins::PluginInstanceParent::PluginWindowHookProc 	dom/plugins/ipc/PluginInstanceParent.cpp:1862
14 	user32.dll 	InternalCallWinProc 	
15 	user32.dll 	UserCallWinProcCheckWow 	
16 	user32.dll 	CallWindowProcAorW 	
17 	user32.dll 	CallWindowProcW 	
18 	xul.dll 	PluginWndProcInternal 	dom/plugins/base/nsPluginNativeWindowWin.cpp:327
19 	xul.dll 	CallWindowProcCrashProtected 	xpcom/base/nsCrashOnException.cpp:32
20 	xul.dll 	PluginWndProc 	dom/plugins/base/nsPluginNativeWindowWin.cpp:356
21 	user32.dll 	InternalCallWinProc 	
22 	user32.dll 	GetRealWindowOwner 	
23 	user32.dll 	DispatchClientMessage 	
24 	user32.dll 	__fnDWORD 	
25 	ntdll.dll 	KiUserCallbackDispatcher 	
26 	ntdll.dll 	KiUserApcDispatcher 	
27 	user32.dll 	DispatchMessageW 	
28 	xul.dll 	nsAppShell::ProcessNextNativeEvent 	widget/windows/nsAppShell.cpp:328
29 	xul.dll 	nsBaseAppShell::OnProcessNextEvent 	widget/xpwidgets/nsBaseAppShell.cpp:280

More reports at:|+mozilla%3A%3Aipc%3A%3ARPCChannel%3A%3ACxxStackFrame%3A%3ACxxStackFrame%28mozilla%3A%3Aipc%3A%3ARPCChannel%26%2C+mozilla%3A%3Aipc%3A%3ARPCChannel%3A%3ADirection%2C+IPC%3A%3AMessage+const*%29+|+mozilla%3A%3Aplugins%3A%3APPluginInstanceParent%3A%3ACallUpdateWindow%28%29
Comment 1 Scoobidiver (away) 2013-01-09 04:35:23 PST
It's the #9 top browser crasher in 21.0a1.
Comment 2 Alex Keybl [:akeybl] 2013-01-13 16:21:47 PST
KaiRo - would you mind grabbing URLs and correlations?
Comment 3 Robert Kaiser 2013-01-15 08:29:07 PST
Note that we had a crash with the same signature a few months back (Fx15/16) in bug 770805 and then Benjamin found a fix, so CCing him again here.

2 	about:blank
...and a long list of other sites that are probably using Flash, including some adult video site, apparently.

There are no correlation reports for this signature.
Comment 4 Benjamin Smedberg [:bsmedberg] 2013-01-15 11:58:16 PST
See bug 829909, which almost certainly has the same root cause. Regression from bug 805591.
Comment 5 Manuela Muntean [Away] 2013-01-18 05:02:51 PST
I've tried to reproduce the crashes on both latest Nightly and Aurora, with intense stress testing, but without any luck.

I've also tried to lower dom.ipc.plugins.hangUITimeoutSecs so that the Plugin Hang UI triggers more easily, but still no Firefox crash.

My attempts were with multiple tabs and multiple windows, all with Flash content, both on Windows 7 and Windows 8.
Comment 6 Aaron Klotz [:aklotz] 2013-01-23 16:04:12 PST
Created attachment 705626 [details] [diff] [review]
Proposed crash fix

I was finally able to obtain some useful information from new correlation reports indicating that nearly half of the recent crashes occurred on 32-bit, single-core machines. I fired up an old dual-core desktop of mine, forced Windows to boot with only one core, and eventually was able to reproduce with WinDbg attached!

The debugger is showing that PluginModuleParent::CleanupFromTimeout is firing as expected. Usually when the IPC channel is closed from here, its status is already indicating an error (i.e. the I/O thread reported the channel error first). OTOH, occasionally the channel state was still showing that everything was OK (indicating that CleanupFromTimeout beat the I/O thread to the punch), so the channel was not closing with error. A crash whose stack matches this signature would follow.

It looks like the correct thing to do here is for CleanupFromTimeout() to call CloseWithError() on the channel instead of doing a regular Close().
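
A minimal sketch of the race described above, using a toy channel rather than the real IPC channel code. The names ToyChannel, OnIOThreadError and Cleanup, and the three-state enum, are hypothetical and exist only for illustration; only Close() and CloseWithError() are named in this bug, and their bodies here are a simplified assumption about how they behave.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified channel states; not the real Gecko enum.
enum class ChannelState { Connected, Error, Closed };

struct ToyChannel {
    ChannelState state = ChannelState::Connected;
    std::vector<std::string> notifications;  // what the listener was told

    // Models the I/O thread noticing that the child process is gone.
    void OnIOThreadError() {
        if (state == ChannelState::Connected) state = ChannelState::Error;
    }

    // Regular close: reports an error only if one was already recorded.
    void Close() {
        if (state == ChannelState::Closed) return;
        notifications.push_back(state == ChannelState::Error ? "error"
                                                             : "clean close");
        state = ChannelState::Closed;
    }

    // Close that records the error itself if nobody has done so yet.
    void CloseWithError() {
        if (state == ChannelState::Closed) return;
        state = ChannelState::Error;
        Close();
    }
};

// Runs one cleanup. ioThreadFirst selects which side wins the race;
// useCloseWithError selects which close routine the cleanup calls.
std::string Cleanup(bool ioThreadFirst, bool useCloseWithError) {
    ToyChannel ch;
    if (ioThreadFirst) ch.OnIOThreadError();
    if (useCloseWithError) {
        ch.CloseWithError();
    } else {
        ch.Close();
    }
    return ch.notifications.back();
}
```

In this model, Cleanup(false, false) is the problematic ordering: the cleanup runs before the I/O thread has reported anything, so the channel closes as if nothing went wrong. With useCloseWithError set, both orderings end in an error notification.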
Comment 7 Benjamin Smedberg [:bsmedberg] 2013-01-24 07:19:08 PST
Comment on attachment 705626 [details] [diff] [review]
Proposed crash fix

I don't think this can hurt.
Comment 8 Aaron Klotz [:aklotz] 2013-01-24 13:14:10 PST
Created attachment 706038 [details] [diff] [review]
Proposed crash fix, rev. 2

Sorry, the last revision broke a bunch of tests on try. We need to select which Close* function to call depending on whether the child process was terminated directly from ShouldContinueFromReplyTimeout or from a separate thread via the Plugin Hang UI.
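
A rough sketch of that selection, not the patch itself. TerminatedBy, ToyChannel and CleanupFromTimeoutSketch are hypothetical names. The mapping shown (Hang UI termination uses CloseWithError(), direct termination from the reply timeout uses the regular Close()) is an assumption inferred from the earlier analysis, since this comment does not spell out which case gets which call.

```cpp
#include <string>

// Hypothetical label for how the plugin child process was terminated.
enum class TerminatedBy { ReplyTimeout, HangUI };

// Toy stand-in for the channel that only records which close was called.
struct ToyChannel {
    std::string lastCall;
    void Close() { lastCall = "Close"; }
    void CloseWithError() { lastCall = "CloseWithError"; }
};

// Assumed mapping: when the Hang UI killed the child from another thread,
// the I/O thread may not have flagged the error yet, so the cleanup forces
// an error close. Otherwise it falls back to the regular close.
void CleanupFromTimeoutSketch(ToyChannel& channel, TerminatedBy origin) {
    if (origin == TerminatedBy::HangUI) {
        channel.CloseWithError();
    } else {
        channel.Close();
    }
}
```

Passing the origin in explicitly keeps the decision in one place instead of having the cleanup guess from the channel state, which is what raced in the first revision.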
Comment 9 Aaron Klotz [:aklotz] 2013-01-24 13:42:43 PST
Try in progress:
Comment 10 Ryan VanderMeulen [:RyanVM] 2013-01-24 18:12:44 PST
Comment 11 Ryan VanderMeulen [:RyanVM] 2013-01-26 16:55:57 PST
Comment 12 Robert Kaiser 2013-01-28 08:31:16 PST
This landed on the 26th, but the Nightly from 27th still crashes, e.g. bp-22d1907c-3776-4cd6-b3f2-1bab72130127 :(
Comment 13 Scoobidiver (away) 2013-01-28 08:35:48 PST
(In reply to Robert Kaiser from comment #12)
> This landed on the 26th
Yes for the date, no for the build. It first landed in 21.0a1/20130128. See
Comment 14 Robert Kaiser 2013-01-28 08:40:55 PST
(In reply to Scoobidiver from comment #13)
> (In reply to Robert Kaiser from comment #12)
> > This landed on the 26th
> Yes for the date, no for the build. It first landed in 21.0a1/20130128. See
> pushloghtml?fromchange=f18b12139151&tochange=80fed51ae074

Oh, interesting. And also, that's a relief. Haven't seen crashes with the 28 build yet, but it's still rather early on that day, so let's see.
Comment 15 Robert Kaiser 2013-01-30 13:31:58 PST
Looks like we're good here, and like bug 829909 is good as well. No crashes after the builds from the 27th.
Comment 16 Aaron Klotz [:aklotz] 2013-01-30 14:07:58 PST
Comment on attachment 706038 [details] [diff] [review]
Proposed crash fix, rev. 2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 805591
User impact if declined: Intermittent crashes when Plugin Hang UI terminates a plugin
Testing completed (on m-c, etc.): Landed on m-c on Jan 28, no crashes for this signature on Nightly since
Risk to taking this patch (and alternatives if risky): None
String or UUID changes made by this patch: None
Comment 17 Aaron Klotz [:aklotz] 2013-01-31 13:05:28 PST
checkin-needed for Aurora, please.
Comment 18 Ryan VanderMeulen [:RyanVM] 2013-02-01 07:15:51 PST
Comment 19 Manuela Muntean [Away] 2013-02-08 07:26:08 PST
I don't see any crash reports in Socorro after 2013-02-01.

Here are the reports for the first and third signatures of this bug within the last week; I couldn't find any reports for the second signature.
Comment 20 Robert Kaiser 2013-02-08 08:30:14 PST
Yes, as I said in comment #15, we're good on this one.
Comment 21 Manuela Muntean [Away] 2013-02-28 06:41:58 PST
There aren't any new crashes reported in Socorro for any of the 3 signatures of this bug within the last month. There are no new crashes after Firefox 20 beta 1 either.

Reports are available here:
