Closed Bug 550026 Opened 16 years ago Closed 15 years ago

browser-fatal 0 == mCxxStackFrames assertion failure in ~RPCChannel on plugin crash during AnswerProcessSomeEvents()

Categories

(Core Graveyard :: Plug-ins, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.3a3

People

(Reporter: karlt, Assigned: cjones)

Attachments

(4 files, 2 obsolete files)

This is a rather big problem: when we spin a nested event loop, we should only be processing native (GTK) events, never XPCOM events.
Hm, this is an unfortunate series of events. I guess the fix is to somehow get at the nsAppShell directly?
I'm not sure, why do you need to get at it directly, instead of, say, calling g_main_context_iteration directly?
We could, but then we'll have to add a Qt impl soon. Maybe the increased control is worth a fragmented impl, though (AppShell does some funky stuff).
When an IPP spins a nested GLib event loop, XPCOM events can get run.
The fundamental issue is that the PluginCrashed event ran while code for that plugin was still on the C++ stack. We want to prevent that from happening. Unfortunately it's not as simple as directly running the glib event loop. As karl said, XPCOM events can be run from the glib loop under a specific circumstance: two NS_DispatchToMainThread()s before returning to the "main" event loop. This is pretty likely in this situation. A better fix would be to call directly into nsBaseAppShell::DoProcessNextNativeEvent(), which prevents XPCOM events from running as long as a nested glib event loop isn't entered on top of DoProcessNextNativeEvent(). But there are two problems with this solution. First, the unholy marriage of nsThread and nsAppShell leaves DoProcessNextNativeEvent() private and buried away in a place that's hard to get to. We'd need some ugly hacks to touch it. Second, there are other places in the code that spin nested glib event loops off native events, so this solution is incomplete. I'm going to implement a fix for deferring PluginCrashed until the plugin code is off the stack (which we need regardless) and file a followup about looking into using DoProcessNextNativeEvent() or g_main_context_iteration() to cut down on XPCOM event processing in this "nested" loop.
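For readers following along, the "defer PluginCrashed until the plugin code is off the stack" idea can be sketched roughly as below. This is a minimal illustration, not the patch attached to this bug: Channel, StackFrame, NotifyOrDefer and DemoDepthAtNotification are hypothetical names, and the frame counter only loosely mirrors the mCxxStackFrames counter named in the bug title.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel that counts how many of its own
// C++ frames are currently on the stack.
class Channel {
public:
    // RAII guard: constructed whenever channel/plugin code is entered.
    class StackFrame {
    public:
        explicit StackFrame(Channel& c) : mChannel(c) { ++mChannel.mFrames; }
        ~StackFrame() {
            // When the outermost frame unwinds, run whatever was deferred.
            if (--mChannel.mFrames == 0) mChannel.RunDeferred();
        }
        StackFrame(const StackFrame&) = delete;
        StackFrame& operator=(const StackFrame&) = delete;
    private:
        Channel& mChannel;
    };

    // Run the task now if no channel code is on the stack; otherwise
    // queue it until the outermost StackFrame is destroyed.
    void NotifyOrDefer(std::function<void()> task) {
        if (mFrames == 0) task();
        else mDeferred.push_back(std::move(task));
    }

    int Frames() const { return mFrames; }

private:
    void RunDeferred() {
        std::vector<std::function<void()>> pending;
        pending.swap(mDeferred);  // tolerate tasks that queue more tasks
        for (auto& t : pending) t();
    }

    int mFrames = 0;
    std::vector<std::function<void()>> mDeferred;
};

// Returns the frame depth observed when the simulated crash
// notification finally ran.
int DemoDepthAtNotification() {
    Channel ch;
    int depthSeen = -1;
    {
        Channel::StackFrame outer(ch);       // e.g. a call into the plugin
        {
            Channel::StackFrame nested(ch);  // e.g. the nested event loop
            ch.NotifyOrDefer([&] { depthSeen = ch.Frames(); });
            assert(depthSeen == -1);         // still deferred
        }
        assert(depthSeen == -1);             // outer frame still live
    }
    return depthSeen;
}
```

In this sketch the notification only runs once the depth is back to zero, so nothing it tears down can still be referenced by a caller higher up the stack. Running tasks from a destructor is a simplification; real code would need to think about exceptions and about cancelling the queued notification.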
I don't see any way to write a deterministic test abstracting this crash, so I'm going to settle for a nondeterministic one in which the failure mode is spurious pass. Hopefully I can get it to fail locally once or twice ...
bent/jimm, is it possible for the windows/*Channel stuff to process XPCOM events off of windows messages? If so, the windows code is vulnerable to this bug too, and it might be nice to have a windows impl of the test I'm writing.
Amazingly enough, I was able to repro this bug using the attached test 4/6 times first time I tried (!!!). r? to karlt because this test is rather complicated, and hopefully you can think of a better approach :).
Assignee: nobody → jones.chris.g
Attachment #430250 - Flags: review?(karlt)
Turned out to be rather more complicated than I expected. This test turned up several bugs :S.
Attachment #430269 - Flags: review?(benjamin)
So... if this is a task, will the chromium message loop NestableTasksAllowed code prevent the problem automatically? Or do we actually set NestableTasksAllowed when in this nested-event-loop situation?
For async-streams I'm also going to need tasks which are only delivered from the toplevel message loop: I think we're going to need a more general solution.
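To make the "only delivered from the toplevel message loop" requirement concrete, here is a toy loop that holds non-nestable tasks back while it is running re-entrantly. It is an assumption-laden sketch: ToyLoop and DemoOrder are made-up names, and this is not Chromium's MessageLoop nor a claim about how its PostNonNestableTask() is implemented.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy message loop (hypothetical; illustrative only).
class ToyLoop {
public:
    using Task = std::function<void()>;

    void PostTask(Task t)            { mQueue.push_back({std::move(t), true}); }
    void PostNonNestableTask(Task t) { mQueue.push_back({std::move(t), false}); }

    // Drain the queue. When called re-entrantly (depth > 1), tasks marked
    // non-nestable are set aside and re-queued for the top-level loop.
    void RunUntilIdle() {
        ++mDepth;
        std::deque<Entry> skipped;
        while (!mQueue.empty()) {
            Entry e = std::move(mQueue.front());
            mQueue.pop_front();
            if (mDepth > 1 && !e.nestable) {
                skipped.push_back(std::move(e));
                continue;
            }
            e.task();
        }
        // Put skipped tasks back at the front, preserving their order.
        for (auto it = skipped.rbegin(); it != skipped.rend(); ++it)
            mQueue.push_front(std::move(*it));
        --mDepth;
    }

private:
    struct Entry {
        Task task;
        bool nestable;
    };
    std::deque<Entry> mQueue;
    int mDepth = 0;
};

// Records the order in which tasks run when a nested loop is spun.
std::vector<std::string> DemoOrder() {
    ToyLoop loop;
    std::vector<std::string> log;
    loop.PostTask([&] {
        log.push_back("outer-begin");
        loop.PostNonNestableTask([&] { log.push_back("crash-notification"); });
        loop.PostTask([&] { log.push_back("native-event"); });
        loop.RunUntilIdle();  // nested loop
        log.push_back("outer-end");
    });
    loop.RunUntilIdle();      // top-level loop
    return log;
}
```

Here the crash notification is posted before the nested loop spins but only runs after "outer-end", once control is back in the top-level loop.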
Attachment #430269 - Flags: review?(benjamin) → review-
(In reply to comment #9)
> r? to karlt because this test is rather complicated, and hopefully you can
> think of a better approach :).

FWIW this test worked probably >75% of the time for me yesterday (amazingly enough), but a simpler test would still be nice.
Modified Chris's patch a bit. Chris and I talked about how to do this in a way that was not so reliant on timing. There's still a tweak factor here: the usleep in CrasherThread(). In a debug build values between 100 and 3000 gave reliable crashes with 1 and 2 cores active. 30000 even gives crashes most of the time. I've picked 200 us to be nearer the shorter end of the scale, for the sake of optimized builds, and in case we decide not to run so many browser process events.
Attachment #430250 - Attachment is obsolete: true
Attachment #430513 - Flags: review?(jones.chris.g)
Attachment #430250 - Flags: review?(karlt)
Including missing file.
Attachment #430513 - Attachment is obsolete: true
Attachment #430513 - Flags: review?(jones.chris.g)
Attachment #430514 - Flags: review?(jones.chris.g)
Attachment #430514 - Flags: review?(jones.chris.g) → review+
Comment on attachment 430269: Don't call PluginCrashed while plugin code is still on the stack
Requesting re-review based on IRC chat about both SetNestableTasksEnabled() and PostNonNestableTask() being broken in our "embedding". I forgot what the final question(s) were (and I ran out of backscroll), but my opinion is not to block this on fixing those.
Attachment #430269 - Flags: review- → review?(benjamin)
Comment on attachment 430269: Don't call PluginCrashed while plugin code is still on the stack
This still makes me nervous because it looks like we'd be busy-waiting on Windows, but I guess that the nested event loop ought to exit when the crash happens, so we should be safe.
Attachment #430269 - Flags: review?(benjamin) → review+
Needed to modify the test to handle other events on timers. The event causing the failure for me was gtk_selection_retrieval_timeout from the previous test, test_copyText.html, but possibly other events could be scheduled for other times. http://hg.mozilla.org/projects/electrolysis/rev/ab7b233c05fe
One of the patches I pushed early this morning caused this tinderbox crash http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1268299882.1268300695.3799.gz Luckily valgrind caught what appears to be causing the crash ###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv WARNING: Failed to send message!: file /home/cjones/mozilla/mozilla-central/dom/plugins/PluginScriptableObjectParent.cpp, line 216 ==888== Thread 1: ==888== Invalid read of size 8 ==888== at 0x69ECE2C: mozilla::plugins::PluginScriptableObjectParent::Unprotect() (PluginScriptableObjectParent.cpp:703) ==888== by 0x69EF77D: mozilla::plugins::ProtectedActor<mozilla::plugins::PluginScriptableObjectParent, mozilla::plugins::ProtectedActorTraits<mozilla::plugins::PluginScriptableObjectParent> >::~ProtectedActor() (PluginScriptableObjectUtils.h:288) ==888== by 0x69EB1FD: mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke(NPObject*, void*, _NPVariant const*, unsigned int, _NPVariant*) (PluginScriptableObjectParent.cpp:228) ==888== by 0x66F0BF4: CallNPMethodInternal(JSContext*, JSObject*, unsigned int, long*, long*, int) (nsJSNPRuntime.cpp:1433) ==888== by 0x66F0D8C: CallNPMethod(JSContext*, JSObject*, unsigned int, long*, long*) (nsJSNPRuntime.cpp:1483) ==888== by 0x82E0EB0: js_Invoke (jsinterp.cpp:1370) ==888== by 0x82CC9FC: js_Interpret (jsops.cpp:2277) ==888== by 0x82E0F02: js_Invoke (jsinterp.cpp:1378) ==888== by 0x82E1232: js_InternalInvoke (jsinterp.cpp:1435) ==888== by 0x82588AB: JS_CallFunctionValue (jsapi.cpp:4951) ==888== by 0x6036B45: nsJSContext::CallEventHandler(nsISupports*, void*, void*, nsIArray*, nsIVariant**) (nsJSEnvironment.cpp:2161) ==888== by 0x60C2883: nsJSEventListener::HandleEvent(nsIDOMEvent*) (nsJSEventListener.cpp:228) ==888== Address 0x1c9b0650 is 48 bytes inside a block of size 64 free'd ==888== at 0x4C24D68: free (vg_replace_malloc.c:325) ==888== by 0x6B6704F: moz_free (nsTraceMalloc.c:1264) ==888== by 0x69EC728: 
mozilla::plugins::PluginScriptableObjectParent::~PluginScriptableObjectParent() (mozalloc.h:228) ==888== by 0x69DBA73: mozilla::plugins::PluginInstanceParent::DeallocPPluginScriptableObject(mozilla::plugins::PPluginScriptableObjectParent*) (PluginInstanceParent.cpp:705) ==888== by 0x6A0EE84: mozilla::plugins::PPluginInstanceParent::DeallocSubtree() (PPluginInstanceParent.cpp:1462) ==888== by 0x6A03B85: mozilla::plugins::PPluginModuleParent::DeallocSubtree() (PPluginModuleParent.cpp:723) ==888== by 0x6A03746: mozilla::plugins::PPluginModuleParent::OnChannelError() (PPluginModuleParent.cpp:599) ==888== by 0x69F3143: mozilla::ipc::AsyncChannel::NotifyMaybeChannelError() (AsyncChannel.cpp:304) ==888== by 0x69F4595: void DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()>(mozilla::ipc::AsyncChannel*, void (mozilla::ipc::AsyncChannel::*)(), Tuple0 const&) (tuple.h:383) ==888== by 0x69F43A9: RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run() (task.h:307) ==888== by 0x6BA4611: MessageLoop::RunTask(Task*) (message_loop.cc:336) ==888== by 0x6BA4681: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:344) ==888== ==888== Invalid read of size 4 ==888== at 0x69ECE5E: mozilla::plugins::PluginScriptableObjectParent::Unprotect() (PluginScriptableObjectParent.cpp:704) ==888== by 0x69EF77D: mozilla::plugins::ProtectedActor<mozilla::plugins::PluginScriptableObjectParent, mozilla::plugins::ProtectedActorTraits<mozilla::plugins::PluginScriptableObjectParent> >::~ProtectedActor() (PluginScriptableObjectUtils.h:288) ==888== by 0x69EB1FD: mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke(NPObject*, void*, _NPVariant const*, unsigned int, _NPVariant*) (PluginScriptableObjectParent.cpp:228) ==888== by 0x66F0BF4: CallNPMethodInternal(JSContext*, JSObject*, unsigned int, long*, long*, int) (nsJSNPRuntime.cpp:1433) ==888== by 0x66F0D8C: CallNPMethod(JSContext*, 
JSObject*, unsigned int, long*, long*) (nsJSNPRuntime.cpp:1483) ==888== by 0x82E0EB0: js_Invoke (jsinterp.cpp:1370) ==888== by 0x82CC9FC: js_Interpret (jsops.cpp:2277) ==888== by 0x82E0F02: js_Invoke (jsinterp.cpp:1378) ==888== by 0x82E1232: js_InternalInvoke (jsinterp.cpp:1435) ==888== by 0x82588AB: JS_CallFunctionValue (jsapi.cpp:4951) ==888== by 0x6036B45: nsJSContext::CallEventHandler(nsISupports*, void*, void*, nsIArray*, nsIVariant**) (nsJSEnvironment.cpp:2161) ==888== by 0x60C2883: nsJSEventListener::HandleEvent(nsIDOMEvent*) (nsJSEventListener.cpp:228) ==888== Address 0x1c9b0658 is 56 bytes inside a block of size 64 free'd ==888== at 0x4C24D68: free (vg_replace_malloc.c:325) ==888== by 0x6B6704F: moz_free (nsTraceMalloc.c:1264) ==888== by 0x69EC728: mozilla::plugins::PluginScriptableObjectParent::~PluginScriptableObjectParent() (mozalloc.h:228) ==888== by 0x69DBA73: mozilla::plugins::PluginInstanceParent::DeallocPPluginScriptableObject(mozilla::plugins::PPluginScriptableObjectParent*) (PluginInstanceParent.cpp:705) ==888== by 0x6A0EE84: mozilla::plugins::PPluginInstanceParent::DeallocSubtree() (PPluginInstanceParent.cpp:1462) ==888== by 0x6A03B85: mozilla::plugins::PPluginModuleParent::DeallocSubtree() (PPluginModuleParent.cpp:723) ==888== by 0x6A03746: mozilla::plugins::PPluginModuleParent::OnChannelError() (PPluginModuleParent.cpp:599) ==888== by 0x69F3143: mozilla::ipc::AsyncChannel::NotifyMaybeChannelError() (AsyncChannel.cpp:304) ==888== by 0x69F4595: void DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()>(mozilla::ipc::AsyncChannel*, void (mozilla::ipc::AsyncChannel::*)(), Tuple0 const&) (tuple.h:383) ==888== by 0x69F43A9: RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run() (task.h:307) ==888== by 0x6BA4611: MessageLoop::RunTask(Task*) (message_loop.cc:336) ==888== by 0x6BA4681: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:344) 
==888== ==888== Invalid read of size 4 ==888== at 0x69ECE8E: mozilla::plugins::PluginScriptableObjectParent::Unprotect() (PluginScriptableObjectParent.cpp:706) ==888== by 0x69EF77D: mozilla::plugins::ProtectedActor<mozilla::plugins::PluginScriptableObjectParent, mozilla::plugins::ProtectedActorTraits<mozilla::plugins::PluginScriptableObjectParent> >::~ProtectedActor() (PluginScriptableObjectUtils.h:288) ==888== by 0x69EB1FD: mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke(NPObject*, void*, _NPVariant const*, unsigned int, _NPVariant*) (PluginScriptableObjectParent.cpp:228) ==888== by 0x66F0BF4: CallNPMethodInternal(JSContext*, JSObject*, unsigned int, long*, long*, int) (nsJSNPRuntime.cpp:1433) ==888== by 0x66F0D8C: CallNPMethod(JSContext*, JSObject*, unsigned int, long*, long*) (nsJSNPRuntime.cpp:1483) ==888== by 0x82E0EB0: js_Invoke (jsinterp.cpp:1370) ==888== by 0x82CC9FC: js_Interpret (jsops.cpp:2277) ==888== by 0x82E0F02: js_Invoke (jsinterp.cpp:1378) ==888== by 0x82E1232: js_InternalInvoke (jsinterp.cpp:1435) ==888== by 0x82588AB: JS_CallFunctionValue (jsapi.cpp:4951) ==888== by 0x6036B45: nsJSContext::CallEventHandler(nsISupports*, void*, void*, nsIArray*, nsIVariant**) (nsJSEnvironment.cpp:2161) ==888== by 0x60C2883: nsJSEventListener::HandleEvent(nsIDOMEvent*) (nsJSEventListener.cpp:228) ==888== Address 0x1c9b065c is 60 bytes inside a block of size 64 free'd ==888== at 0x4C24D68: free (vg_replace_malloc.c:325) ==888== by 0x6B6704F: moz_free (nsTraceMalloc.c:1264) ==888== by 0x69EC728: mozilla::plugins::PluginScriptableObjectParent::~PluginScriptableObjectParent() (mozalloc.h:228) ==888== by 0x69DBA73: mozilla::plugins::PluginInstanceParent::DeallocPPluginScriptableObject(mozilla::plugins::PPluginScriptableObjectParent*) (PluginInstanceParent.cpp:705) ==888== by 0x6A0EE84: mozilla::plugins::PPluginInstanceParent::DeallocSubtree() (PPluginInstanceParent.cpp:1462) ==888== by 0x6A03B85: mozilla::plugins::PPluginModuleParent::DeallocSubtree() 
(PPluginModuleParent.cpp:723) ==888== by 0x6A03746: mozilla::plugins::PPluginModuleParent::OnChannelError() (PPluginModuleParent.cpp:599) ==888== by 0x69F3143: mozilla::ipc::AsyncChannel::NotifyMaybeChannelError() (AsyncChannel.cpp:304) ==888== by 0x69F4595: void DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()>(mozilla::ipc::AsyncChannel*, void (mozilla::ipc::AsyncChannel::*)(), Tuple0 const&) (tuple.h:383) ==888== by 0x69F43A9: RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run() (task.h:307) ==888== by 0x6BA4611: MessageLoop::RunTask(Task*) (message_loop.cc:336) ==888== by 0x6BA4681: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:344) ==888== My tentative hypothesis is that we're running NotifyChannelError from the "nested event loop", but valgrind isn't giving me enough context frames to be sure. That would be really bad :(.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
OK, my hypothesis is right :( :( :(. ==1287== Invalid read of size 8 ==1287== at 0x69ECE2C: mozilla::plugins::PluginScriptableObjectParent::Unprotect() (PluginScriptableObjectParent.cpp:703) ==1287== by 0x69EF77D: mozilla::plugins::ProtectedActor<mozilla::plugins::PluginScriptableObjectParent, mozilla::plugins::ProtectedActorTraits<mozilla::plugins::PluginScriptableObjectParent> >::~ProtectedActor() (PluginScriptableObjectUtils.h:288) ==1287== by 0x69EB1FD: mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke(NPObject*, void*, _NPVariant const*, unsigned int, _NPVariant*) (PluginScriptableObjectParent.cpp:228) ==1287== by 0x66F0BF4: CallNPMethodInternal(JSContext*, JSObject*, unsigned int, long*, long*, int) (nsJSNPRuntime.cpp:1433) ==1287== by 0x66F0D8C: CallNPMethod(JSContext*, JSObject*, unsigned int, long*, long*) (nsJSNPRuntime.cpp:1483) ==1287== by 0x82E0EB0: js_Invoke (jsinterp.cpp:1370) ==1287== by 0x82CC9FC: js_Interpret (jsops.cpp:2277) ==1287== by 0x82E0F02: js_Invoke (jsinterp.cpp:1378) ==1287== by 0x82E1232: js_InternalInvoke (jsinterp.cpp:1435) ==1287== by 0x82588AB: JS_CallFunctionValue (jsapi.cpp:4951) ==1287== by 0x6036B45: nsJSContext::CallEventHandler(nsISupports*, void*, void*, nsIArray*, nsIVariant**) (nsJSEnvironment.cpp:2161) ==1287== by 0x60C2883: nsJSEventListener::HandleEvent(nsIDOMEvent*) (nsJSEventListener.cpp:228) ==1287== by 0x5E2B510: nsEventListenerManager::HandleEventSubType(nsListenerStruct*, nsIDOMEventListener*, nsIDOMEvent*, nsPIDOMEventTarget*, unsigned int, nsCxPusher*) (nsEventListenerManager.cpp:1082) ==1287== by 0x5E2BAD5: nsEventListenerManager::HandleEvent(nsPresContext*, nsEvent*, nsIDOMEvent**, nsPIDOMEventTarget*, unsigned int, nsEventStatus*, nsCxPusher*) (nsEventListenerManager.cpp:1198) ==1287== by 0x5E5A97B: nsEventTargetChainItem::HandleEvent(nsEventChainPostVisitor&, unsigned int, int, nsCxPusher*) (nsEventDispatcher.cpp:201) ==1287== by 0x5E58833: 
nsEventTargetChainItem::HandleEventTargetChain(nsEventChainPostVisitor&, unsigned int, nsDispatchingCallback*, int, nsCxPusher*) (nsEventDispatcher.cpp:326) ==1287== by 0x5E594EF: nsEventDispatcher::Dispatch(nsISupports*, nsPresContext*, nsEvent*, nsIDOMEvent*, nsEventStatus*, nsDispatchingCallback*, nsCOMArray<nsPIDOMEventTarget>*) (nsEventDispatcher.cpp:601) ==1287== by 0x5A2E6BC: DocumentViewerImpl::LoadComplete(unsigned int) (nsDocumentViewer.cpp:1027) ==1287== by 0x6504E45: nsDocShell::EndPageLoad(nsIWebProgress*, nsIChannel*, unsigned int) (nsDocShell.cpp:5746) ==1287== by 0x650481A: nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, unsigned int) (nsDocShell.cpp:5624) ==1287== by 0x653625F: nsDocLoader::FireOnStateChange(nsIWebProgress*, nsIRequest*, int, unsigned int) (nsDocLoader.cpp:1314) ==1287== by 0x6534E71: nsDocLoader::doStopDocumentLoad(nsIRequest*, unsigned int) (nsDocLoader.cpp:926) ==1287== by 0x653498C: nsDocLoader::DocLoaderIsEmpty(int) (nsDocLoader.cpp:802) ==1287== by 0x653453A: nsDocLoader::OnStopRequest(nsIRequest*, nsISupports*, unsigned int) (nsDocLoader.cpp:697) ==1287== by 0x57A8A47: nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, unsigned int) (nsLoadGroup.cpp:680) ==1287== by 0x5D656A4: nsDocument::DoUnblockOnload() (nsDocument.cpp:7093) ==1287== by 0x5D6547F: nsDocument::UnblockOnload(int) (nsDocument.cpp:7040) ==1287== by 0x5D5A9E1: nsDocument::DispatchContentLoadedEvents() (nsDocument.cpp:4024) ==1287== by 0x5D749B7: nsRunnableMethod<nsDocument, void>::Run() (nsThreadUtils.h:282) ==1287== by 0x6B266DE: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527) ==1287== Address 0x1effed30 is 48 bytes inside a block of size 64 free'd ==1287== at 0x4C24D68: free (vg_replace_malloc.c:325) ==1287== by 0x6B6704F: moz_free (nsTraceMalloc.c:1264) ==1287== by 0x69EC728: mozilla::plugins::PluginScriptableObjectParent::~PluginScriptableObjectParent() (mozalloc.h:228) ==1287== by 0x69DBA73: 
mozilla::plugins::PluginInstanceParent::DeallocPPluginScriptableObject(mozilla::plugins::PPluginScriptableObjectParent*) (PluginInstanceParent.cpp:705) ==1287== by 0x6A0EE84: mozilla::plugins::PPluginInstanceParent::DeallocSubtree() (PPluginInstanceParent.cpp:1462) ==1287== by 0x6A03B85: mozilla::plugins::PPluginModuleParent::DeallocSubtree() (PPluginModuleParent.cpp:723) ==1287== by 0x6A03746: mozilla::plugins::PPluginModuleParent::OnChannelError() (PPluginModuleParent.cpp:599) ==1287== by 0x69F3143: mozilla::ipc::AsyncChannel::NotifyMaybeChannelError() (AsyncChannel.cpp:304) ==1287== by 0x69F4595: void DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()>(mozilla::ipc::AsyncChannel*, void (mozilla::ipc::AsyncChannel::*)(), Tuple0 const&) (tuple.h:383) ==1287== by 0x69F43A9: RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run() (task.h:307) ==1287== by 0x6BA4611: MessageLoop::RunTask(Task*) (message_loop.cc:336) ==1287== by 0x6BA4681: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:344) ==1287== by 0x6BA4A7F: MessageLoop::DoWork() (message_loop.cc:444) ==1287== by 0x69F79BC: mozilla::ipc::DoWorkRunnable::Run() (MessagePump.cpp:75) ==1287== by 0x6B266DE: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:527) ==1287== by 0x6AB5B0C: NS_ProcessPendingEvents_P(nsIThread*, unsigned int) (nsThreadUtils.cpp:200) ==1287== by 0x689F1AE: nsBaseAppShell::NativeEventCallback() (nsBaseAppShell.cpp:125) ==1287== by 0x687A069: nsAppShell::EventProcessorCallback(_GIOChannel*, GIOCondition, void*) (nsAppShell.cpp:70) ==1287== by 0xB4D4BCD: g_main_context_dispatch (gmain.c:1960) ==1287== by 0xB4D8597: g_main_context_iterate (gmain.c:2591) ==1287== by 0xB4D86BF: g_main_context_iteration (gmain.c:2654) ==1287== by 0x69E34DD: mozilla::plugins::PluginModuleParent::AnswerProcessSomeEvents() (PluginModuleParent.cpp:883) ==1287== by 0x6A0353E: 
mozilla::plugins::PPluginModuleParent::OnCallReceived(IPC::Message const&, IPC::Message*&) (PPluginModuleParent.cpp:543) ==1287== by 0x69FA39E: mozilla::ipc::RPCChannel::DispatchIncall(IPC::Message const&) (RPCChannel.cpp:472) ==1287== by 0x69FA2B3: mozilla::ipc::RPCChannel::Incall(IPC::Message const&, unsigned long) (RPCChannel.cpp:458) ==1287== by 0x69F99BE: mozilla::ipc::RPCChannel::Call(IPC::Message*, IPC::Message*) (RPCChannel.cpp:299) ==1287== by 0x6A17768: mozilla::plugins::PPluginScriptableObjectParent::CallInvoke(long const&, nsTArray<mozilla::plugins::Variant> const&, mozilla::plugins::Variant*, bool*) (PPluginScriptableObjectParent.cpp:223) ==1287== by 0x69EB133: mozilla::plugins::PluginScriptableObjectParent::ScriptableInvoke(NPObject*, void*, _NPVariant const*, unsigned int, _NPVariant*) (PluginScriptableObjectParent.cpp:214) ==1287== by 0x66F0BF4: CallNPMethodInternal(JSContext*, JSObject*, unsigned int, long*, long*, int) (nsJSNPRuntime.cpp:1433) ==1287== by 0x66F0D8C: CallNPMethod(JSContext*, JSObject*, unsigned int, long*, long*) (nsJSNPRuntime.cpp:1483) ==1287== I don't know of an easy fix offhand, backing out. Probably need to bite the bullet and fix PostNonNestableTask().
Hm, looks like there's a problem with the test, too:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1268304664.1268305935.18840.gz#err0
NPP_Destroy
WARNING: Should have crashed in ProcessBrowserEvents: 'glib warning', file /builds/slave/mozilla-central-linux-debug/build/toolkit/xre/nsSigHandlers.cpp, line 225
** (<unknown>:5558): WARNING **: Should have crashed in ProcessBrowserEvents
NEXT ERROR 16 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_crash_nested_loop.html | p.crashInNestedLoop() should throw an exception
Actually, the test failure might be due to processing an XPCOM event that ends up calling into the plugin through NPRuntime.
(In reply to comment #22)
> I don't know of an easy fix offhand, backing out. Probably need to bite the
> bullet and fix PostNonNestableTask().

One workaround might be somehow only processing X events in this nested loop, if that's possible.
(In reply to comment #25)
> (In reply to comment #22)
> > I don't know of an easy fix offhand, backing out. Probably need to bite the
> > bullet and fix PostNonNestableTask().
>
> One workaround might be somehow only processing X events in this nested loop,
> if that's possible.

I tried a few gross hacks here, but nothing worked. Putting aside until karlt can weigh in.
This makes the early "Don't call PluginCrashed..." simpler, but we still need it so that we can cancel the PluginCrashed event.
Attachment #432006 - Flags: review?(bent.mozilla)
(In reply to comment #23)
> ** (<unknown>:5558): WARNING **: Should have crashed in ProcessBrowserEvents

I made the plugin thread sleeps more thorough as an experiment:
http://hg.mozilla.org/projects/electrolysis/rev/c713ff1a0d4a
If that doesn't help, I might try re-adding cjones' crashing-thread-ready condvars or adding some logging.
Attachment #432006 - Flags: review?(bent.mozilla) → review+
Depends on: 551875
I landed the test again with the crashing-thread-ready condvars and some logging.
http://hg.mozilla.org/projects/electrolysis/rev/5b9325736070
http://hg.mozilla.org/projects/electrolysis/rev/6b944ce388e8

Of 10 test runs:

3 seem to have run as intended with a plugin crash detected during a cross-process nested loop.

15 INFO Running /tests/modules/plugin/test/test_crash_nested_loop.html...
++DOMWINDOW == 16 (0xaeaee08) [serial = 18] [outer = 0xa9f61c0]
pldhash: for the table at address 0xc059948, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xc0598e0 == 8
++DOMWINDOW == 17 (0xc059fe0) [serial = 19] [outer = (nil)]
++DOMWINDOW == 18 (0xc0432f8) [serial = 20] [outer = 0xc059fb0]
For application/x-test found plugin libnptest.so
Detected nested glib event loop
Begin crash sequence
Spinning mini nested loop ...
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
... quitting mini nested loop; processed 5 tasks
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
WARNING: Failed to send message!: file /builds/slave/electrolysis-linux-debug/build/dom/plugins/PluginScriptableObjectParent.cpp, line 217
16 INFO TEST-PASS | /tests/modules/plugin/test/test_crash_nested_loop.html | p.crashInNestedLoop() should throw an exception

2 runs have had the plugin crash too early, before the browser processes any events. (The test still reports a pass.)

15 INFO Running /tests/modules/plugin/test/test_crash_nested_loop.html...
++DOMWINDOW == 16 (0xad557c0) [serial = 18] [outer = 0x9b92be8]
pldhash: for the table at address 0x9a79e38, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0x9a79dd0 == 8
++DOMWINDOW == 17 (0xa39bf88) [serial = 19] [outer = (nil)]
++DOMWINDOW == 18 (0x9735040) [serial = 20] [outer = 0xa39bf58]
For application/x-test found plugin libnptest.so
Detected nested glib event loop
Begin crash sequence
Time to process browser events
###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv
WARNING: Failed to send message!: file /builds/slave/electrolysis-linux-debug/build/dom/plugins/PluginScriptableObjectParent.cpp, line 216
16 INFO TEST-PASS | /tests/modules/plugin/test/test_crash_nested_loop.html | p.crashInNestedLoop() should throw an exception

1 run has passed, so the plugin process crashed before returning from pluginCrashInNestedLoop, but I'm not sure why NPP_Destroy (in the plugin, I assume) is called within the nested loop.

12 INFO Running /tests/modules/plugin/test/test_copyText.html...
NPP_Destroy
++DOMWINDOW == 15 (0xc598f40) [serial = 17] [outer = 0xb569570]
For application/x-test found plugin libnptest.so
Detected nested glib event loop
Spinning mini nested loop ...
NPP_Destroy
... quitting mini nested loop; processed 12 tasks
13 INFO TEST-PASS | /tests/modules/plugin/test/test_copyText.html | undefined

4 have failed after NPP_Destroy ran in the nested loop.

15 INFO Running /tests/modules/plugin/test/test_crash_nested_loop.html...
++DOMWINDOW == 16 (0xb17d280) [serial = 18] [outer = 0xb3564e8]
pldhash: for the table at address 0xc0486e8, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xc048680 == 8
++DOMWINDOW == 17 (0xbb71dd8) [serial = 19] [outer = (nil)]
++DOMWINDOW == 18 (0xc0498c8) [serial = 20] [outer = 0xbb71da8]
For application/x-test found plugin libnptest.so
Detected nested glib event loop
Begin crash sequence
Spinning mini nested loop ...
NPP_Destroy
... quitting mini nested loop; processed 5 tasks
Spinning mini nested loop ...
... quitting mini nested loop; processed 0 tasks
WARNING: Should have crashed in ProcessBrowserEvents: 'glib warning', file /builds/slave/electrolysis-linux-debug/build/toolkit/xre/nsSigHandlers.cpp, line 225
** (<unknown>:3304): WARNING **: Should have crashed in ProcessBrowserEvents
NEXT ERROR 16 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_crash_nested_loop.html | p.crashInNestedLoop() should throw an exception
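As a rough picture of what a "crashing-thread-ready" handshake looks like, the sketch below has the main thread wait on a condition variable until the crasher thread reports that it is ready, and only then start the stand-in for the nested loop. All names (Handshake, CrasherThread, RunHandshakeDemo) are hypothetical, the crash is simulated by setting a flag rather than killing the process, and the 200 us delay is borrowed from the value discussed earlier in this bug.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Shared state between the main thread and the crasher thread.
struct Handshake {
    std::mutex lock;
    std::condition_variable cv;
    bool crasherReady = false;
    bool crashed = false;  // stands in for the real process crash
};

// Hypothetical crasher thread: announce readiness, pause, then "crash".
// A real test would terminate the process here instead of setting a flag.
void CrasherThread(Handshake& h, std::chrono::microseconds delay) {
    {
        std::lock_guard<std::mutex> guard(h.lock);
        h.crasherReady = true;
    }
    h.cv.notify_all();
    std::this_thread::sleep_for(delay);
    {
        std::lock_guard<std::mutex> guard(h.lock);
        h.crashed = true;
    }
    h.cv.notify_all();
}

// Returns true if the simulated crash was observed while the main thread
// was inside its stand-in for the nested event loop.
bool RunHandshakeDemo() {
    Handshake h;
    std::thread crasher(CrasherThread, std::ref(h),
                        std::chrono::microseconds(200));
    bool sawCrashInLoop = false;
    {
        std::unique_lock<std::mutex> l(h.lock);
        // Do not start "processing browser events" until the crasher
        // thread has confirmed that it is running.
        h.cv.wait(l, [&] { return h.crasherReady; });
        // Stand-in for the nested loop: wait (with a generous timeout)
        // for the crash to be signalled.
        sawCrashInLoop = h.cv.wait_for(l, std::chrono::seconds(5),
                                       [&] { return h.crashed; });
    }
    crasher.join();
    return sawCrashInLoop;
}
```

The handshake removes one source of timing dependence (the crasher thread not having started yet), but the sleep before the crash remains a tweak factor, as in the real test.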
For thread-specific signals such as SIGSEGV generated here, I assume that the crash reporter runs on that thread while the other threads continue. The crash reporter waits for the server process to respond, so I suspect this is why the plugin process does not terminate as fast as I was expecting.
http://hg.mozilla.org/mozilla-central/annotate/048fb26978db/toolkit/crashreporter/google-breakpad/src/client/linux/crash_generation/crash_generation_client.cc#l72

I see two options here:

1) Disconnect the crash reporter's signal handler before generating the fatal signal in the plugin process. This seems less favourable because it would seem best to test crash reporting.

2) The test seems to be initiating the crash reporter usually at the desired time. We could simply continue to loop until the child process terminates (assuming that the crash continues to happen at the desired time often enough.)

I have a remaining question though: Why is NPP_Destroy being called, apparently on detection of a child process crash? The test plugin is the only code that I know prints this message:
http://hg.mozilla.org/mozilla-central/annotate/4b8936ac4a31/modules/plugin/test/testplugin/nptest.cpp#l788
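Option 2 amounts to "keep spinning until process termination is actually observed" rather than spinning a fixed number of times. The sketch below shows that shape from the vantage point of a hypothetical parent that can call waitpid(); in the real test the loop would live elsewhere and the details would differ. SpinUntilChildExits and DemoChildExitCode are made-up names, and the child simply calls _exit() so the example stays harmless.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Poll until the child is gone, doing one unit of "work" per iteration.
// Returns the number of iterations spun, or -1 on a waitpid() error.
int SpinUntilChildExits(pid_t child, int* statusOut) {
    int iterations = 0;
    for (;;) {
        pid_t r = waitpid(child, statusOut, WNOHANG);
        if (r == child) return iterations;           // child has terminated
        if (r == -1 && errno != EINTR) return -1;    // unexpected failure
        // Stand-in for processing one batch of browser events.
        usleep(1000);
        ++iterations;
    }
}

// Fork a child that terminates abruptly and report its exit code,
// or -1 if anything went wrong.
int DemoChildExitCode() {
    pid_t child = fork();
    if (child == -1) return -1;
    if (child == 0) {
        usleep(5000);  // let the parent start spinning first
        _exit(42);     // abrupt termination without running atexit handlers
    }
    int status = 0;
    if (SpinUntilChildExits(child, &status) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the loop is that it does not care how long termination takes, which is what matters if a crash reporter delays the exit by an unknown amount.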
NPP_Destroy was from destruction of a previous instance.
Fixed comment 30.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a3
The test seems much more reliable when the crash reporter is not involved in the plugin crash. Using _exit instead of a signal, it passed a dozen runs on e10s so landed on m-c: http://hg.mozilla.org/mozilla-central/rev/b381eacdbca5
Flags: in-testsuite+
Depends on: 561770
Product: Core → Core Graveyard