Closed Bug 593467 Opened 14 years ago Closed 13 years ago

Carbon paint events are sometimes sent to Cocoa plugins, results in crash

Categories

(Core Graveyard :: Plug-ins, defect)

All
macOS
defect
Not set
critical

Tracking

(blocking2.0 -)

RESOLVED DUPLICATE of bug 634387
Tracking Status
blocking2.0 --- -

People

(Reporter: marcia, Unassigned)

Details

(Keywords: crash, regression, reproducible)

Seen while reviewing Mac trunk crash stats. http://tinyurl.com/2gy74uq links to the current crashes. This is currently the #3 top crash for Mac on the trunk. According to the crash stats table, crashes started using the 2010090100 build. There are no comments in any of the crash reports.

Frame  	Module  	Signature  	Source
0 	XUL 	NS_StackWalk 	xpcom/base/nsStackWalk.cpp:1488
1 	XUL 	nsTraceRefcntImpl::WalkTheStack 	xpcom/base/nsTraceRefcntImpl.cpp:891
2 	XUL 	NS_DebugBreak_P 	xpcom/base/nsDebugImpl.cpp:336
3 	XUL 	mozilla::plugins::PPluginInstanceChild::OnCallReceived 	PPluginInstanceChild.cpp:1947
4 	XUL 	mozilla::plugins::PPluginModuleChild::OnCallReceived 	PPluginModuleChild.cpp:546
5 	XUL 	mozilla::ipc::RPCChannel::DispatchIncall 	ipc/glue/RPCChannel.cpp:510
6 	XUL 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:434
7 	XUL 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:343
8 	XUL 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:451
9 	XUL 	base::MessagePumpCFRunLoopBase::RunWorkSource 	ipc/chromium/src/base/message_pump_mac.mm:291
10 	CoreFoundation 	__CFRunLoopDoSources0 	
11 	CoreFoundation 	__CFRunLoopRun 	
12 	CoreFoundation 	CFRunLoopRunSpecific 	
13 	CoreFoundation 	CFRunLoopRunInMode 	
14 	HIToolbox 	RunCurrentEventLoopInMode 	
15 	HIToolbox 	ReceiveNextEventCommon 	
16 	HIToolbox 	BlockUntilNextEventMatchingListInMode 	
17 	AppKit 	_DPSNextEvent 	
18 	AppKit 	-[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] 	
19 	AppKit 	-[NSApplication run] 	
20 	XUL 	base::MessagePumpNSApplication::DoRun 	ipc/chromium/src/base/message_pump_mac.mm:677
21 	XUL 	base::MessagePumpCFRunLoopBase::Run 	ipc/chromium/src/base/message_pump_mac.mm:213
22 	XUL 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:219
23 	XUL 	XRE_InitChildProcess 	toolkit/xre/nsEmbedFunctions.cpp:432
24 	plugin-container 	main 	ipc/app/MozillaRuntimeMain.cpp:87
25 	plugin-container 	plugin-container@0xef5 	
26 		@0x6
http://gcc.gnu.org/onlinedocs/gcc/Return-Address.html

— Built-in Function: void * __builtin_frame_address (unsigned int level)

On some machines it may be impossible to determine the frame address of any function other than the current one; in such cases, or when the top of the stack has been reached, this function will return 0 if the first frame pointer is properly initialized by the startup code.

The stack-walking code needs a null check of bp.
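For illustration, a minimal sketch of the kind of null check meant here. It assumes a frame-pointer ABI where each frame record holds the caller's saved bp and a return address; `FrameRecord` and `WalkFrames` are hypothetical names, not the actual nsStackWalk.cpp code. In a real walker the starting bp would come from `__builtin_frame_address`, and per the gcc documentation quoted above a frame address of 0 has to be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame record: saved caller bp plus the return address.
// This layout is an assumption for the sketch, not taken from nsStackWalk.cpp.
struct FrameRecord {
    FrameRecord* caller;   // saved bp of the caller; null at the top of the stack
    void* returnAddress;   // pc reported for this frame
};

// Follows the chain of frame records and collects return addresses.
// The walk stops as soon as bp is null, so a null starting bp (or a null
// saved bp further up the chain) is never dereferenced.
std::vector<void*> WalkFrames(const FrameRecord* bp, std::size_t maxFrames) {
    std::vector<void*> pcs;
    while (bp != nullptr && pcs.size() < maxFrames) {
        pcs.push_back(bp->returnAddress);
        bp = bp->caller;
    }
    return pcs;
}
```

With this shape, starting from a null bp yields an empty result instead of a read through address 0, and `maxFrames` bounds the walk if the chain is corrupted into a cycle.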
Blocks: 326594
OS: Mac OS X → Windows 7
OS: Windows 7 → Mac OS X
Sorry, Safari seems to like to perform random acts of violence against the OS field :(
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b6pre) Gecko/20100909 Firefox/4.0b6pre

I was playing the video in the article and was about to press Cmd-L to focus the location bar, but I ended up just pressing Cmd for a second or so. The cursor beachballed and the plugin crashed:

http://crash-stats.mozilla.com/report/index/bp-f743edd6-5e8e-40de-a9a1-6ac1a2100909
I think the crash is probably a regression from 590179,
though perhaps the bug is whatever is causing NS_DebugBreak_P to be called.
Blocks: 590179
Hmm.  Is this code calling NS_DebugBreak_P in opt builds?

We build with -O3 -fomit-frame-pointer on Linux, and have for a good long while.  Are we just not hitting NS_DebugBreak_P there?
(In reply to comment #5)
> We build with -O3 -fomit-frame-pointer on Linux, and have for a good long
> while.

For the record, we currently do -Os -fomit-frame-pointer.

http://mxr.mozilla.org/mozilla-central/source/configure.in#2196
(In reply to comment #4)
> I think the crash is probably a regression from 590179,
> though perhaps the bug is whatever is causing NS_DebugBreak_P to be called.

Yes, this is just fallout from a real, serious bug, and the origin of the bug is being masked by gcc inlining FatalError() :S.  Benoit had a patch to make FatalError virtual to prevent the inlining, but apparently it hasn't landed.
I was planning on landing it this weekend but if anyone gets a chance they can land it today for me.
Depends on: 589371
Blocking, and over to Benoit then.
Assignee: nobody → b56girard
blocking2.0: ? → betaN+
I landed 589371. The next traces reported should now tell us which IPC message is causing the crash.
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b7pre) Gecko/20100924 Firefox/4.0b7pre

I encountered this in http://milesmaedamixes.com/

I'd just spent 20 minutes listening. Then I tried to switch tabs from this page, and it beachballed.

http://crash-stats.mozilla.com/report/index/d988e7fb-3939-40b6-b07c-e59bc2100924
http://crash-stats.mozilla.com/report/index/23cf0e30-5815-49f7-9dfd-17b292100924

It happened again. I was watching a video on YouTube, and after a couple of minutes I tried to close the tab while the video was playing. It beachballed and the plugin crashed.
Both times the error is coming from PPluginInstanceChild.cpp:1658, the same invalid IPC message. I'll take a look when I get home.
The Paint IPC message is causing the crash. The NPEvent must be corrupted and failing deserialization. The problem is that the OS X code doesn't use the Paint IPC message (it uses HandleEvent). How do we end up with that message?

Any ideas?
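To make the failure mode concrete, here is a rough sketch of how a deserializer that only knows one event model ends up rejecting a message carrying an event type it does not recognize. Everything in it (`EventKind`, `Event`, `Serialize`, `Deserialize`, the numeric values and the 8-byte wire layout) is made up for illustration; it is not the real NPEvent serialization code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical event kinds known to the receiver. NOT the real NPAPI constants.
enum class EventKind : std::uint32_t { CocoaDraw = 1, CocoaKey = 2, CocoaMouse = 3 };

struct Event {
    EventKind kind;
    std::uint32_t payload;
};

// Writes a raw kind tag and a payload as two 32-bit values in host byte order.
// The sender takes a raw integer so the sketch can emit tags the receiver
// does not know about.
std::vector<std::uint8_t> Serialize(std::uint32_t rawKind, std::uint32_t payload) {
    std::vector<std::uint8_t> out(2 * sizeof(std::uint32_t));
    std::memcpy(out.data(), &rawKind, sizeof(rawKind));
    std::memcpy(out.data() + sizeof(rawKind), &payload, sizeof(payload));
    return out;
}

// Returns false for a wrong-sized buffer or an unknown kind tag. In this
// sketch the caller decides what to do on failure; it does not abort here.
bool Deserialize(const std::vector<std::uint8_t>& in, Event* out) {
    assert(out != nullptr);
    if (in.size() != 2 * sizeof(std::uint32_t)) return false;
    std::uint32_t rawKind = 0;
    std::uint32_t payload = 0;
    std::memcpy(&rawKind, in.data(), sizeof(rawKind));
    std::memcpy(&payload, in.data() + sizeof(rawKind), sizeof(payload));
    switch (rawKind) {
        case 1:
        case 2:
        case 3:
            out->kind = static_cast<EventKind>(rawKind);
            out->payload = payload;
            return true;
        default:
            return false;  // unknown event type, e.g. one from another event model
    }
}
```

In this sketch an event tagged with a kind outside the known set fails cleanly at `Deserialize`; whether the receiver then drops the message or treats it as fatal is a separate policy decision.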
OS: Mac OS X → Windows XP
OS: Windows XP → Mac OS X
I have gotten this crash quite a few times now on OS X. I often get it after focusing the Flash plugin, waiting a while (or just watching the Flash content), and then using Cmd-T to open a new tab, or pressing a key in general. I cannot reproduce it consistently, though.

Keypress events maybe?

I can provide my Breakpad links if they'd be any use.
See bug 601538. That bug causes Flash to crash when I hit Cmd-T (but not all the time).
Hardware: x86 → All
Assignee: b56girard → joshmoz
Summary: Crash in [@ NS_StackWalk ] → Carbon paint events are sometimes sent to Cocoa plugins, results in crash
From what I have found, this bug seems to be happening when we are sending messages to a plugin process for a 'view' that has been destroyed.

Steps to reproduce:
- Open a YouTube video in several tabs (8 worked for me) and let them start playing
- Quickly cycle through them using ctrl-tab (which builds up draw events)
- Stop cycling and start closing the tabs

If you attach gdb to the plugin process, you sometimes see a stack trace similar to the ones on crash-stats (although I did get a couple of different ones). This is also not just a Flash problem, since I can reproduce it with QuickTime running OOP.
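As a rough model of the race described above (draw events queued for a view that is destroyed before they are delivered), the sketch below checks liveness at dispatch time and drops stale paints. `PaintQueue`, `PendingPaint` and the integer instance ids are hypothetical; this is not how the plugin IPC code is structured, only one way to picture a guard.

```cpp
#include <cassert>
#include <deque>
#include <set>

// Hypothetical queued paint request, addressed by an instance id.
struct PendingPaint {
    int instanceId;
};

class PaintQueue {
public:
    void Create(int id) {
        assert(live_.count(id) == 0);  // an id is only registered once
        live_.insert(id);
    }
    void Destroy(int id) { live_.erase(id); }
    void Enqueue(int id) { pending_.push_back(PendingPaint{id}); }

    // Drains the queue and returns how many paints were delivered.
    // Paints whose target was destroyed while they were queued are dropped
    // instead of being sent to a view that no longer exists.
    int Flush() {
        int delivered = 0;
        while (!pending_.empty()) {
            PendingPaint p = pending_.front();
            pending_.pop_front();
            if (live_.count(p.instanceId) != 0) {
                ++delivered;
            }
        }
        return delivered;
    }

private:
    std::set<int> live_;
    std::deque<PendingPaint> pending_;
};
```

Cycling tabs corresponds to many `Enqueue` calls, and closing a tab to `Destroy`; with the liveness check in `Flush`, paints for the closed tab are discarded rather than dispatched.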
Keywords: reproducible
Keywords: regression
Whiteboard: [hardblocker]
I'm going to change this to blocking2.0-. I suspect this crash still exists but I can't reproduce it any more and I don't think we have evidence that this happens often enough to block a release. I'll keep an eye on bug reports and crash stats and go back to blocking if I see this happening more.

If we could come up with more reliable repro steps that would be great. I suspect this would be pretty easy to fix if we could reproduce it reliably.
blocking2.0: betaN+ → -
Comment 19 reproduces it occasionally. Alternately, my steps:

- Open any YouTube video and let it start playing.
- Click any of the video links in "Suggestions" without pausing or stopping the current video.
- When the new page loads, Flash will have crashed.

This happens for me 50% of the time or more on OS X 10.6.
Actually, that sounds like bug 572134 to me. Symptoms are similar so it's hard to tell the two apart.
I wonder if this is a dupe of 572134. Maybe we just didn't realize it before NPN_ConvertPoint and NPP_SetWindow calls started making it much worse. The repro steps for this are the same, and if we were destroying an instance before a paint call we might incorrectly assume that the destroyed instance was a Carbon plugin.
Duping for now, will re-open if we see this again.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
I don't think this is a duplicate. I looked into the problem before in comment #14, and the problem was that the widget view of the plug-in would somehow be set to Carbon events. Those events would then get bubbled down to the plug-in and fail to (de)serialize, causing a crash. I asked you a few questions about it on IRC.

Now I'm not sure if this is still an issue with the recent changes. If we see (de)serialization problems and console errors, this is likely the problem.
Scott saw something similar in Comment #18 independently.
OK, I'll un-dupe. I think you looked at this more closely than I did.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Whiteboard: [hardblocker]
Not sure if this helps much, but with GDB hooked up running the STR from comment 19 (variations of which have been crashing Flash ever since it went OOP on Mac), this is the stack trace I got:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000076
0x00000076 in ?? ()
(gdb) bt
#0  0x00000076 in ?? ()
Cannot access memory at address 0x76
#1  0x17beaccb in unregister_ShockwaveFlash ()
#2  0x17a0e4b9 in dyld_stub_sprintf ()
#3  0x17bec60f in unregister_ShockwaveFlash ()
#4  0x17bd4960 in unregister_ShockwaveFlash ()
#5  0x17bdafa5 in unregister_ShockwaveFlash ()
#6  0x17dcf6b7 in main ()
#7  0x17dc8502 in main ()
#8  0x17dcc1e1 in main ()
#9  0x17dcdac7 in main ()
#10 0x17af7437 in dyld_stub_sprintf ()
#11 0x17c56fa1 in main ()
#12 0x17c61235 in main ()
#13 0x17c5f13d in main ()
#14 0x17bcc06b in FlashPlayer_10_2_151_49_FlashPlayer ()
#15 0x01534f8e in mozilla::plugins::PluginModuleChild::NPP_Destroy (this=0x581f218, instance=0x5078630) at PluginModuleChild.h:359
#16 0x01534809 in mozilla::plugins::PluginInstanceChild::AnswerNPP_Destroy (this=0x5078630, aResult=0xbfffc9c0) at /Users/jag/moz-hg/mozilla/mozilla/dom/plugins/PluginInstanceChild.cpp:3152
#17 0x0166e5a0 in mozilla::plugins::PPluginInstanceChild::OnCallReceived (this=0x5078630, __msg=@0xbfffcbfc, __reply=@0xbfffcb5c) at PPluginInstanceChild.cpp:1750
#18 0x01660cb3 in mozilla::plugins::PPluginModuleChild::OnCallReceived (this=0x581f218, __msg=@0xbfffcbfc, __reply=@0xbfffcb5c) at PPluginModuleChild.cpp:574
#19 0x0156113d in mozilla::ipc::RPCChannel::DispatchIncall (this=0x581f220, call=@0xbfffcbfc) at /Users/jag/moz-hg/mozilla/mozilla/ipc/glue/RPCChannel.cpp:512
#20 0x0156151e in mozilla::ipc::RPCChannel::Incall (this=0x581f220, call=@0xbfffcbfc, stackDepth=0) at /Users/jag/moz-hg/mozilla/mozilla/ipc/glue/RPCChannel.cpp:498
#21 0x01561ae7 in mozilla::ipc::RPCChannel::OnMaybeDequeueOne (this=0x581f220) at /Users/jag/moz-hg/mozilla/mozilla/ipc/glue/RPCChannel.cpp:429
#22 0x01564be9 in DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()> (obj=0x581f220, method={__pfn = 0x15618ec <mozilla::ipc::RPCChannel::OnMaybeDequeueOne()>, __delta = 0}, arg=@0x500e5ec) at tuple.h:383
#23 0x01564c25 in RunnableMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run (this=0x500e5d0) at task.h:307
Right before plugin-container crashes I see:

###!!! ASSERTION: Must have a native view to convert coordinates.: 'inView', file /Users/jag/moz-hg/mozilla/mozilla/layout/generic/nsPluginUtilsOSX.mm, line 160

a few times.
How recent is your build? This sounds like bug 572134.
Fresh pull from today. Just verified that it has the patch from that bug applied. I also managed to reproduce this with 4.0b10. But yeah, given this bug's title, the crash I'm seeing is a different bug.
Okay, so ignore comments 28 - 31, different bug indeed.

I just got this today in a recent (day old or so) trunk build:

###!!! ASSERTION: Attempted to serialize unknown event type.: 'Not Reached', file ../../dist/include/mozilla/plugins/NPEventOSX.h, line 110
###!!! ASSERTION: Attempted to de-serialize unknown event type.: 'Not Reached', file ../../dist/include/mozilla/plugins/NPEventOSX.h, line 206
###!!! ASSERTION: IPDL error:: 'Error', file PPluginInstanceChild.cpp, line 2133
###!!! ASSERTION: error deserializing (better message TODO): 'Error', file PPluginInstanceChild.cpp, line 2134
###!!! ABORT: [PPluginInstanceChild] abort()ing as a result: file PPluginInstanceChild.cpp, line 2136
mozilla::plugins::PPluginInstanceChild::FatalError(char const*) const+0x0000009E [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x016589B4]
mozilla::plugins::PPluginInstanceChild::OnCallReceived(IPC::Message const&, IPC::Message*&)+0x00000FE3 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x016609FB]
mozilla::plugins::PPluginModuleChild::OnCallReceived(IPC::Message const&, IPC::Message*&)+0x0000007B [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01653E0F]
mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&)+0x000000DF [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01554221]
mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long)+0x000002B8 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01554602]
mozilla::ipc::RPCChannel::OnMaybeDequeueOne()+0x000001FB [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01554BCB]
void DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, bool (mozilla::ipc::RPCChannel::*)(), Tuple0 const&)+0x00000038 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01557CCD]
RunnableMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run()+0x00000039 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01557D09]
mozilla::ipc::RPCChannel::RefCountedTask::Run()+0x0000001C [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01556634]
mozilla::ipc::RPCChannel::DequeueTask::Run()+0x0000001C [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01557D82]
MessageLoop::RunTask(Task*)+0x0000007D [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01805913]
MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&)+0x00000035 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01806049]
MessageLoop::DoWork()+0x000000FD [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x018065D3]
base::MessagePumpCFRunLoopBase::RunWork()+0x0000004A [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x0187089C]
base::MessagePumpCFRunLoopBase::RunWorkSource(void*)+0x00000017 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x018708E1]
__CFRunLoopDoSources0+0x000004B1 [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x0003F361]
__CFRunLoopRun+0x0000042F [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x0003CF8F]
CFRunLoopRunSpecific+0x000001C4 [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x0003C464]
CFRunLoopRunInMode+0x00000061 [/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x0003C291]
RunCurrentEventLoopInMode+0x00000188 [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x00035004]
ReceiveNextEventCommon+0x00000162 [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x00034DBB]
BlockUntilNextEventMatchingListInMode+0x00000051 [/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x00034C40]
_DPSNextEvent+0x0000034F [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x0004878D]
-[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]+0x0000009C [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x00047FCE]
-[NSApplication run]+0x00000335 [/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x0000A247]
base::MessagePumpNSApplication::DoRun(base::MessagePump::Delegate*)+0x00000082 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x0187043A]
base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*)+0x000000AF [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x018709CD]
MessageLoop::RunInternal()+0x0000007A [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01805D0E]
MessageLoop::RunHandler()+0x00000011 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01805D25]
MessageLoop::Run()+0x00000023 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x01805D89]
XRE_InitChildProcess+0x00000999 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/XUL +0x00047EB6]
main+0x00000051 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/plugin-container.app/Contents/MacOS/plugin-container +0x00000DF7]
start+0x00000036 [/Users/jag/moz-hg/ff-debug/i386/dist/MinefieldDebug.app/Contents/MacOS/plugin-container.app/Contents/MacOS/plugin-container +0x00000D7A]
Erh, and I can't reproduce it, but it happened for me when I had several YouTube tabs open, one playing, and I opened a new tab (about:blank).
(In reply to comment #33)
> Erh, and I can't reproduce it, but it happened for me when I had several
> YouTube tabs open, one playing, and I opened a new tab (about:blank).

I assume when you opened a new tab above you were using the keyboard (Cmd-T), and in comment 3 Juan says all he had to do was press a modifier key. This sounds like bug 634387, and while I was fixing that I realized that the parent stack doesn't reliably point to the problematic event. Duping this against that bug; reopen if this is still a problem.
Assignee: joshmoz → nobody
Status: REOPENED → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Product: Core → Core Graveyard