Closed
Bug 677711
Opened 13 years ago
Closed 13 years ago
[OOPP] Trigger child shutdown on parent time outs
Categories
(Core Graveyard :: Plug-ins, defect)
Tracking
(firefox9- fixed, firefox10-, firefox11-)
RESOLVED
FIXED
mozilla9
People
(Reporter: jimm, Assigned: jimm)
References
Details
(Whiteboard: [qa-])
Attachments
(2 files, 1 obsolete file)
9.40 KB,
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
1.00 KB,
patch
|
christian
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•13 years ago
|
||
Assignee | ||
Comment 2•13 years ago
|
||
Been experimenting with this today. This actually works pretty well at getting out from under some of the parent side hangs I have test cases for. (bug 675645 & bug 611826 are good examples). I started with just using NS_ABORT(), but that triggered a messy set of assertions in the parent that didn't look good. So I came up with this, which shuts down the module. The parent handles this fine and the child doesn't actually have to crash. Down side to this is we don't get crash stacks for either process. Also looked for a way to get a bit of data over to the parent async but so far no luck. I'll dig into that a bit more tomorrow. I also experimented with destroying and re-creating plugin child windows which are often the root of the problem. Unfortunately flash did not like that at all, crashing in most cases. (I believe plugin api consumers are supposed to support swapping out to new windows, but apparently flash doesn't support it.)
Assignee | ||
Comment 3•13 years ago
|
||
One other note - the 45 second time out in the child case is way too long. If we're going to do this, we should set that lower, at least for the child. If the browser is hung, I don't think any user is going to sit around staring at a frozen window for almost a minute. Plus, we should be proactive about freeing up the browser anyway. I've been testing with a five second time out which works quite well. That might be a little short.. maybe the number should be somewhere in between that and the parent hang time.
Comment 4•13 years ago
|
||
We really should just NS_RUNTIMEABORT. What assertions do you see in the parent?
Assignee | ||
Comment 5•13 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4) > We really should just NS_RUNTIMEABORT. What assertions do you see in the > parent? http://pastebin.mozilla.org/1295747 This also brings up the "plugin container problem" Windows dialog in debug mode on Windows. I didn't like that at all. I'd rather it get shutdown by the parent cleanly.. but I'm open to whatever. I was hoping a could get some stack data over to the parent somehow, was planning on looking into that today. It looks like we have a independent crash reporter pipe open for communication I might be able to use.
Comment 6•13 years ago
|
||
###!!! ABORT: terminating child process: file f:/Mozilla/firefox/mc/dom/plugins/ipc/PluginModuleChild.cpp, line 532 is expected, that's what NS_RUNTIMEABORT prints. ###!!! ASSERTION: Mismatched RPC stack frames: 'this == mChannel->mTopFrame', file f:/Mozilla/firefox/mc/ipc/glue/WindowsMessageLoop.cpp, line 612 This sounds like it ought to be investigated. ###!!! ASSERTION: IPDL error:: 'Error', file f:/Mozilla/firefox/MC-DBG/ipc/ipdl/PPluginScriptableObjectChild.cpp, line 1226 This doesn't make much sense: once you issue a NS_RUNTIMEABORT, the child ought to be dead, but it's still running and issuing assertions? ###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv expected, this is normal Also the windows crash dialog is normal if you don't have crash reporting enabled: if it is enabled you should be getting a normal minidump.
Comment 7•13 years ago
|
||
It might not work properly if you're killing the parent process. plugin-container doesn't write its own minidumps, it asks the chrome process to do so, and if you've killed that it will probably just give up and you'll get the Windows dialog.
Comment 8•13 years ago
|
||
We're not killing the parent process; the plugin process is just committing suicide.
Assignee | ||
Comment 9•13 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #6) > ###!!! ASSERTION: Mismatched RPC stack frames: 'this == > mChannel->mTopFrame', file > f:/Mozilla/firefox/mc/ipc/glue/WindowsMessageLoop.cpp, line 612 > This sounds like it ought to be investigated. > > ###!!! ASSERTION: IPDL error:: 'Error', file > f:/Mozilla/firefox/MC-DBG/ipc/ipdl/PPluginScriptableObjectChild.cpp, line > 1226 > This doesn't make much sense: once you issue a NS_RUNTIMEABORT, the child > ought to be dead, but it's still running and issuing assertions? In debug mode, the child process gets hung up by the windows error collection dialog, so I think stuff underneath continues to run. I can debug a bit and see what's going on. This spews while the windows error reporting dialog is visible and is "checking for solutions".
Assignee | ||
Comment 10•13 years ago
|
||
Note, even with the use of runtime abort, we still don't get minidumps: WARNING: [PluginModuleParent::ActorDestroy] abnormal shutdown without minidump! This is in ActorDestroy of PluginModuleParent. Not sure yet if there's a way to go ahead and trigger some sort of report. Whether we shut down the module or just crash, it seems some work will need to be done in getting crash data stored away.
Assignee | ||
Comment 11•13 years ago
|
||
Ok, with the crash reporter enabled this is working out pretty well.
Assignee: nobody → jmathies
Attachment #551907 -
Attachment is obsolete: true
Attachment #552165 -
Flags: review?(benjamin)
Assignee | ||
Comment 12•13 years ago
|
||
Hmm, I have an unused timeoutMs var in the prefs observer, I'll remove that.
Assignee | ||
Comment 13•13 years ago
|
||
release builds: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmathies@mozilla.com-b017b3bf473d
Comment 14•13 years ago
|
||
Comment on attachment 552165 [details] [diff] [review] patch v.1 Please mark tracking-firefoxN on this when it lands, since we'll probably want to track this very carefully. And file a followup on getting a parent-side stack when we do this child-side abort, since we really want a hang-pair in this case.
Attachment #552165 -
Flags: review?(benjamin) → review+
Assignee | ||
Comment 15•13 years ago
|
||
Landed on inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/bef7bcdcd41e Note to drivers: we probably want to disable this feature in releases. This will depend on the frequency stacks generated by these changes show up in crash stats. To disable - the following release pref should be set to 0: +// How long a plugin process will wait for a response from the parent +// to a synchronous request before terminating itself. After this +// point the child assumes the parent is hung. +pref("dom.ipc.plugins.parentTimeoutSecs", 15);
status-firefox8:
--- → fixed
tracking-firefox8:
--- → ?
Assignee | ||
Comment 16•13 years ago
|
||
This isn't going to make the last 8 inbound merge, so when tracking 9 flags arrive in bugzilla they need to be set on this bug.
status-firefox8:
fixed → ---
tracking-firefox8:
? → ---
Comment 17•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/bef7bcdcd41e
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
Assignee | ||
Updated•13 years ago
|
tracking-firefox9:
--- → ?
Comment 18•13 years ago
|
||
Is this an approval request? This is already in 9, I presume. If it is an approval request, please unset the tracking flag and set the approval request in the patch with an explanation of the risk / reward.
Assignee | ||
Comment 19•13 years ago
|
||
(In reply to Asa Dotzler [:asa] from comment #18) > Is this an approval request? This is already in 9, I presume. If it is an > approval request, please unset the tracking flag and set the approval > request in the patch with an explanation of the risk / reward. We want this disabled in releases for now. I'll post a patch to disable this in 9 now that we've branched.
Assignee | ||
Comment 20•13 years ago
|
||
Assignee | ||
Updated•13 years ago
|
Attachment #563783 -
Flags: approval-mozilla-aurora?
Attachment #563783 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 21•13 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #20) > Created attachment 563783 [details] [diff] [review] [diff] [details] [review] > disable for aurora and up patch https://hg.mozilla.org/releases/mozilla-aurora/rev/c4862aaec55b
Assignee | ||
Updated•13 years ago
|
status-firefox9:
--- → fixed
Updated•13 years ago
|
Comment 23•13 years ago
|
||
[Triage Comment] Please provide STR here by tomorrow at 12:00PM PT 12/14. We will be holding our FF9 sign-offs later that afternoon, and need to be able to verify. Thanks!
Assignee | ||
Comment 24•13 years ago
|
||
(In reply to Alex Keybl [:akeybl] from comment #23) > [Triage Comment] > Please provide STR here by tomorrow at 12:00PM PT 12/14. We will be holding > our FF9 sign-offs later that afternoon, and need to be able to verify. > Thanks! This code is only enabled on mc, so you don't have to worry about verifying anything. To verify it's disabled, you can check the pref in 'disable for aurora and up patch' is properly set.
Comment 25•13 years ago
|
||
Thanks Jim. So to be completely clear... dom.ipc.plugins.parentTimeoutSecs == 0 in Firefox 9 means this is FIXED?
Assignee | ||
Comment 26•13 years ago
|
||
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #25) > Thanks Jim. So to be completely clear... > > dom.ipc.plugins.parentTimeoutSecs == 0 in Firefox 9 means this is FIXED? Disabled really. There's no need to confirm the feature is "fixed", the reports this generates have been showing up in crash stats since the first patch landed. I don't see the stacks from this in aurora builds so it's properly disabled. So yes, basically, confirmed fixed on mc and confirmed disabled on aurora. :) The crash stack in question is: mozalloc_abort(char const* const) | NS_DebugBreak_P | mozilla::plugins::PluginModuleChild::ShouldContinueFromReplyTimeout() We may enabled in the future for all releases, but we'll file a new bug on that.
Comment 27•13 years ago
|
||
Okay, thanks Jim. I'm going to mark this one qa-. We can verify the new bug when it gets enabled.
Whiteboard: [qa?] → [qa-]
Comment 28•13 years ago
|
||
Looks like this isn't fixed on FF10.. dom.ipc.plugins.parentTimeoutSecs=15 .. and is thus suspected to be causing bug 683967. Jim - can you prepare the patch for FF10/11?
Comment 29•13 years ago
|
||
Looks like bsmedberg will actually create the disable patch in 683967. Untracking here.
Updated•2 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•