Closed
Bug 680130
Opened 13 years ago
Closed 8 years ago
Plugin child instance hang/crash in mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout()
Categories
(Core Graveyard :: Plug-ins, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: jimm, Unassigned)
References
Details
(Keywords: crash, Whiteboard: [read comment 12])
Crash Data
These and other similar crashes are "by design" due to the child timeout logic added in bug 677711. Causes vary, but basically this is a case where the parent is frozen and the child times out, causing it to kill itself to free the parent.
This should probably serve as a meta bug, since in some of these cases we should be able to find a workaround for the parent hang.
Reporter
Updated•13 years ago
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PPluginScriptableObjectChild::FatalError(char const* const) ] → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PPluginScriptableObjectChild::FatalError(char const* const) ]
[@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReply…
Reporter
Comment 1•13 years ago
Most common offenders so far based on a manual sampling, ordered by most common:
mozilla::plugins::PluginInstanceChild::ShowPluginFrame
send:
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/ipc/PluginInstanceChild.cpp#2995
recv:
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/ipc/PluginInstanceParent.cpp#500
mozilla::plugins::PPluginModuleChild::CallNPN_UserAgent
mozilla::plugins::PPluginInstanceChild::CallPStreamNotifyConstructor
mozilla::plugins::PPluginScriptableObjectChild::CallNPN_Evaluate
mozilla::plugins::PPluginScriptableObjectChild::CallInvokeDefault
Roughly 60%-70% of the timeouts are in ShowPluginFrame.
Reporter
Comment 2•13 years ago
I don't think ShowPluginFrame is a real hang, but I do think it might indicate a serious bottleneck in our plugin rendering code. This reminds me of all the slow performance bugs we had back when we released OOPP.
Reporter
Comment 3•13 years ago
More accurate stats, from some simple screen scraping of crash-stats over the last two days:
52 : 'mozilla::plugins::PPluginInstanceChild::SendShow(..)'
19 : 'mozilla::plugins::PluginModuleChild::GetUserAgent()'
7 : 'mozilla::plugins::PluginScriptableObjectChild::Evaluate(..)'
2 : 'mozilla::plugins::child::_posturlnotify'
2 : 'mozilla::plugins::PluginScriptableObjectChild::ScriptableInvokeDefault(..)'
2 : 'mozilla::plugins::child::_geturlnotify'
1 : 'mozilla::plugins::PluginInstanceChild::NPN_GetValue(NPNVariable,void*)'
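A tally like the one above can be reproduced with a few lines of Python. This is only a sketch: it assumes the calling frame has already been extracted from each sampled report into a plain list (the scraping itself is not shown), the frame names are abbreviated from the list above, and `tally_frames` is a hypothetical helper name.

```python
from collections import Counter


def tally_frames(frames):
    """Return (frame, count, share) tuples, most common first."""
    counts = Counter(frames)
    total = sum(counts.values())
    return [(frame, n, n / total) for frame, n in counts.most_common()]


# One entry per sampled crash report, using the counts listed above.
sampled = (
    ["PPluginInstanceChild::SendShow"] * 52
    + ["PluginModuleChild::GetUserAgent"] * 19
    + ["PluginScriptableObjectChild::Evaluate"] * 7
    + ["child::_posturlnotify"] * 2
    + ["PluginScriptableObjectChild::ScriptableInvokeDefault"] * 2
    + ["child::_geturlnotify"] * 2
    + ["PluginInstanceChild::NPN_GetValue"] * 1
)

for frame, n, share in tally_frames(sampled):
    print(f"{n:3d} : {frame} ({share:.0%})")
```

With these numbers SendShow accounts for 52 of 85 reports, about 61%, which is in line with the earlier 60%-70% estimate for ShowPluginFrame.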
Reporter
Comment 4•13 years ago
Note that the dependent bugs aren't necessarily critical, since the timeout code from bug 677711 will be disabled when we merge. But we want to keep it enabled on mozilla-central so we can continue to diagnose.
Related to this, I'm trying to get bug 679238 finished up so we have more visibility on the parent side when this happens.
Also, it would be great if we could somehow generate an accurate list of the most common stacks leading up to ShouldContinueFromReplyTimeout() in the child, similar to what I generated with some screen scraping in comment 3.
Reporter
Updated•13 years ago
Summary: Plugin child instance crash in mozalloc_abort(char const* const) → Plugin child instance crash in ShouldContinueFromReplyTimeout()
Comment 5•13 years ago
[@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout()] has risen sharply over the last few days. Is there an underlying problem causing this?
Reporter
Comment 6•13 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #5)
> [@ mozalloc_abort(char const* const) | NS_DebugBreak_P |
> mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout()] has
> risen sharply over the last few days. Is there an underlying problem
> causing this?
The crash-on-parent-hang code was disabled in aurora (9.0a1, 9.0a2) on 10/11:
http://hg.mozilla.org/releases/mozilla-aurora/rev/c4862aaec55b
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A9.0a2&platform=windows&query_search=signature&query_type=contains&reason_type=contains&date=10%2F14%2F2011%2009%3A23%3A46&range_value=1&range_unit=weeks&hang_type=any&process_type=plugin&plugin_field=filename&plugin_query_type=exact&do_query=1&signature=mozalloc_abort%28char%20const*%20const%29%20|%20msvcr80.dll%400xe456
which clearly shut these crashes down as intended.
As for 10.0a1, the particular stack you mention started showing up on the same date:
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A10.0a1&platform=windows&query_search=signature&query_type=contains&reason_type=contains&date=10%2F14%2F2011 09%3A23%3A46&range_value=1&range_unit=weeks&hang_type=any&process_type=plugin&plugin_field=filename&plugin_query_type=exact&do_query=1&signature=mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla%3A%3Aplugins%3A%3APluginModuleChild%3A%3AShouldContinueFromReplyTimeout()
This looks like it's caused by the crash code, which is an intended crash. The other signatures related to this, for example:
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A10.0a1&platform=windows&query_search=signature&query_type=contains&reason_type=contains&date=10%2F14%2F2011%2009%3A23%3A46&range_value=1&range_unit=weeks&hang_type=any&process_type=plugin&plugin_field=filename&plugin_query_type=exact&do_query=1&signature=mozalloc_abort%28char%20const*%20const%29%20|%20_RTC_Terminate
stop on 10/10. So, I don't know; maybe the signature swap was caused by a Socorro change, and the dates on the two branches just happened to match up? I landed the aurora change when I was able to get around to it; it wasn't correlated with any other landings or changes. So I don't think the changes on the two branches are related.
Reporter
Comment 7•13 years ago
The other common crash stack went away as well:
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A10.0a1&platform=windows&query_search=signature&query_type=contains&reason_type=contains&date=10%2F14%2F2011%2009%3A23%3A46&range_value=1&range_unit=weeks&hang_type=any&process_type=plugin&plugin_field=filename&plugin_query_type=exact&do_query=1&signature=mozalloc_abort%28char%20const*%20const%29%20|%20msvcr80.dll%400xe456
Reporter
Comment 8•13 years ago
Looks like:
https://bugzilla.mozilla.org/show_bug.cgi?id=691912
https://github.com/mozilla/socorro/commit/4209b3a5559150805a66d0e1ca98e3f3a92cff60
might be it. That added *abort to the skip list, though I'm not sure when that went live on our production servers.
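To illustrate how a skip list change alone could swap one signature for another, here is a toy model in Python. It is not Socorro's actual code: it assumes a simplified rule in which a frame matching a listed pattern is kept in the signature and generation continues to the next frame, while the first non-matching frame ends the signature. `build_signature` is a hypothetical name, the frame names are shortened to bare function names, and listing `NS_DebugBreak_P` as a pattern is an assumption made for the example.

```python
from fnmatch import fnmatchcase


def build_signature(frames, prefix_patterns):
    """Join frames top-down, stopping after the first frame that matches no pattern."""
    parts = []
    for frame in frames:
        parts.append(frame)
        if not any(fnmatchcase(frame, pattern) for pattern in prefix_patterns):
            break  # an "interesting" frame ends the signature
    return " | ".join(parts)


# Top of the stack first, as in the crash signatures quoted in this bug.
stack = [
    "mozalloc_abort",
    "NS_DebugBreak_P",
    "mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout",
]

print(build_signature(stack, []))
print(build_signature(stack, ["*abort"]))
print(build_signature(stack, ["*abort", "NS_DebugBreak_P"]))
```

Under this model the same stack produces a different signature each time the pattern list grows, so reports can move from one signature to another without any change in the browser itself.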
Comment 9•13 years ago
Hmm, there were a few skiplist changes in that Socorro release, right. Not sure this one was it, but if the others all went away and are replaced by this signature, it's alright anyhow. Just wondered about the rise, which was determined to be a false alarm, so everything's alright. :)
Comment 10•13 years ago
(In reply to Jim Mathies [:jimm] from comment #2)
> I don't think ShowPluginFrame is a real hang, but I do think it might
> indicate a serious bottleneck in our plugin rendering code. This reminds me
> of all the slow performance bugs we had back when we released OOPP.
I have bp-af631221-5284-4d27-8ec1-74e322111025 and bp-3f3901d8-9cf5-43bf-9520-b31762111025 "[@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout() ]", which says that this Bug 680130 is related.
It occurs when I restart Nightly with over 220 tabs (using TM+). When I use Aurora or "Release" I seldom have trouble with that few tabs and need to go up to 250 or more to crash on startup.
Updated•13 years ago
Reporter
Updated•13 years ago
Crash Signature: [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PPluginScriptableObjectChild::FatalError(char const* const) ]
[@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReply… → [@ mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout() ]
[@ hang | mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyT…
Depends on: 711971
Summary: Plugin child instance crash in ShouldContinueFromReplyTimeout() → Plugin child instance hang/crash in mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout()
Reporter
Comment 12•13 years ago
These crashes are triggered by the code that landed in bug 677711 which aborts the child if the child detects a parent hang after 15 seconds. This code was disabled on all branches on 1/7/12. We'll re-enable this code once bug 679238 is fixed. Currently we don't get good parent side stacks, so the child aborts were pretty useless, aside from giving us an idea of how many parent hangs we experience.
Whiteboard: [read comment 12]
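The behavior described in comment 12 can be modeled roughly as follows. This is an illustrative Python sketch, not the actual C++ in PluginModuleChild: `call_parent`, `ParentHang`, and `REPLY_TIMEOUT_SECONDS` are hypothetical names, only the 15-second value comes from the comment, and the sketch raises an exception where the real child aborts its own process.

```python
import queue

REPLY_TIMEOUT_SECONDS = 15  # timeout value quoted in comment 12


class ParentHang(Exception):
    """Stands in for the child aborting itself, so the sketch stays testable."""


def call_parent(reply_queue, timeout=REPLY_TIMEOUT_SECONDS):
    """Block until the parent replies; give up if nothing arrives in time."""
    try:
        return reply_queue.get(timeout=timeout)
    except queue.Empty:
        # In the real child this is where the process is deliberately
        # aborted, which frees a parent that is blocked on the child.
        raise ParentHang(f"no reply from parent after {timeout}s") from None


# A responsive parent: the reply is already waiting, so the call returns at once.
replies = queue.Queue()
replies.put("NPN_UserAgent reply")
print(call_parent(replies))
```

Because the abort is deliberate, each such crash report marks a parent that failed to answer in time rather than a defect in the child, which is why the reports are only useful once parent side stacks are available.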
Comment 14•11 years ago
This is on my Bug List so I am checking in.
(In reply to Jim Mathies [:jimm] (away 4/1-4/21) from comment #12)
> These crashes are triggered by the code that landed in bug 677711 which
> aborts the child if the child detects a parent hang after 15 seconds. This
> code was disabled on all branches on 1/7/12. We'll re-enable this code once
> bug 679238 is fixed. Currently we don't get good parent side stacks, so the
> child aborts were pretty useless, aside from giving us an idea of how many
> parent hangs we experience.
In Bug 697739 Comment 12 I used a Registry Modification to prevent BSODs that I was getting from Watchdog.sys -- I think that defeated (fixed, for me) that Code (parent hang).
I noticed that this bug report was opened 2011-08-18 and that most replies were made within 5 months, with the exception of Stephen Donner's "Depends" update, which is itself more than a year old.
All other bugs mentioned on this page are closed.
I suggest closing this bug report.
Updated•9 years ago
Crash Signature: mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout() ]
[@ mozalloc_abort(char const* const) | _RTC_Terminate ]
[@ hang | mozalloc_abort(char const* const) | _RTC_Terminate ] → mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout() ]
[@ mozalloc_abort(char const* const) | _RTC_Terminate ]
[@ hang | mozalloc_abort(char const* const) | _RTC_Terminate ]
[@ mozalloc_abort | NS_DebugBreak_P | mozilla::plugins::Plug…
Comment 15•8 years ago
I'm marking this bug as WORKSFORME because the crash signature hasn't appeared in the crash logs for a long time (over half a year).
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Updated•3 years ago
Product: Core → Core Graveyard