Closed
Bug 919878
Opened 11 years ago
Closed 11 years ago
Removing a tab before ::Show() completes causes assertions and child process to terminate
Categories
(Core :: IPC, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: markh, Assigned: billm)
References
Details
Attachments
(1 file)
12.43 KB,
patch
|
mattwoodrow
:
review+
|
Details | Diff | Splinter Review |
STR:
* Set browser.tabs.remote=true
* In a "browser" scratchpad, execute:
gBrowser.selectedTab = gBrowser.addTab();
gBrowser.removeCurrentTab();
In a debug build, the following will be seen on the console:
[Child 11280] WARNING: nsWindow::GetNativeData not implemented for this type: file o:/src/mozilla-git/central2/widget/xpwidgets/PuppetWidget.cpp, line 672
++DOCSHELL 03461140 == 2 [id = 2]
++DOMWINDOW == 3 (03744B30) [serial = 4] [outer = 00000000]
[Parent 10348] WARNING: No docshells for remote frames!: file o:/src/mozilla-git/central2/content/base/src/nsFrameLoader.cpp, line 571
<repeated a few times>
[Parent 10348] WARNING: Can't allocate graphics resources, aborting subprocess: file o:/src/mozilla-git/central2/dom/ipc/TabParent.cpp, line 1397
###!!! [Parent][SyncChannel] Error: Value error: message was deserialized, but contained an illegal value
###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
...
[Parent 10348] WARNING: RemoveObserver() called for unregistered observer: file o:/src/mozilla-git/central2/hal/Hal.cpp, line 205
[Parent 10348] WARNING: RemoveObserver() called for unregistered observer: file o:/src/mozilla-git/central2/hal/Hal.cpp, line 205
[Parent 10348] WARNING: RemoveObserver() called for unregistered observer: file o:/src/mozilla-git/central2/hal/Hal.cpp, line 205
[Parent 10348] WARNING: RemoveObserver() called for unregistered observer: file o:/src/mozilla-git/central2/hal/Hal.cpp, line 205
[Parent 10348] WARNING: RemoveObserver() called for unregistered observer: file o:/src/mozilla-git/central2/hal/Hal.cpp, line 205
[Parent 10348] WARNING: file o:/src/mozilla-git/central2/ipc/chromium/src/base/process_util_win.cc, line 184
The problem is, roughly:
* The parent creates the browser element and calls SendPBrowserConstructor to do the client side.
* TabParent::Show() is called - this sends an async message to the child.
* We then remove the browser element - XULElement::RemoveChildAt() is called, which ends up destroying the TabParent, and in the process calls TabParent::SetOwnerElement(null)
* The child process, acting on the ::Show() message, send a message back which ends up in TabParent::AllocPRenderFrameParent(). This sees frameLoader being null (due to the SetOwnerElement(null)) and hits NS_WARNING("Can't allocate graphics resources, aborting subprocess");
* This returns null, which we see as the "Error: Value error: message was deserialized, but contained an illegal value" and we terminate the child.
* A little later, we end up at process_util_win.cc, line 184, which is attempting to get the PID for the child, and fails.
* Later, and somewhat non-deterministically, we end up with a variety of failures and assertions about condition variables and or InvalidHandle exceptions as we try and close the process handle. I haven't dug into these, but they start to appear within seconds (or up to a minute) of doing the above. I suspect we are getting confused about the child state and/or not cleaning everything up correctly.
I'm marking this as blocking the crash-e10s bug (899758), as even though there isn't really a "crash" here, the parent reports it as one and about:tabcrashed may be displayed.
Reporter | ||
Comment 1•11 years ago
|
||
(In reply to Mark Hammond (:markh) from comment #0)
> * Later, and somewhat non-deterministically, we end up with a variety of
> failures and assertions about condition variables and or InvalidHandle
> exceptions as we try and close the process handle.
I've narrowed down the InvalidHandle exception while closing the process handle - see bug 920397.
Updated•11 years ago
|
Priority: -- → P1
Assignee | ||
Comment 2•11 years ago
|
||
I can reproduce this. I'll see if I can figure it out.
Assignee: nobody → wmccloskey
Assignee | ||
Comment 3•11 years ago
|
||
Mark's comment tells pretty much what happens. The TabParent is asked to allocate a RenderFrameParent. It tries to find the FrameLoader and can't get one because the tab has already been closed by that time. So it returns NULL. IPDL takes that as a sign that it should kill the child process. We're not very good at killing the child, so that goes badly.
However, we shouldn't be killing the child in the first place. This patch splits RenderFrameParent to have a mostly empty constructor, with the main logic going in an Init method. The Init method returns a success/fail boolean to the child.
Attachment #832635 -
Flags: review?(matt.woodrow)
Updated•11 years ago
|
Attachment #832635 -
Flags: review?(matt.woodrow) → review+
Assignee | ||
Comment 4•11 years ago
|
||
Comment 5•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Comment 6•11 years ago
|
||
Backed out along with the other changesets in https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=b92529577644 for something in that push causing shutdown hangs on OS X, eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=30796582&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=30802180&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=30816360&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=30818266&tree=Mozilla-Inbound
Backout:
https://hg.mozilla.org/mozilla-central/rev/25c9def33304
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 7•11 years ago
|
||
This appears to be the patch that was causing the shutdown hangs. I'll need to investigate more.
Assignee | ||
Comment 8•11 years ago
|
||
I did a try run with debugging info:
https://tbpl.mozilla.org/?tree=Try&rev=13892bdf30ec
It looks like the problem is triggered some time during the social tests. They create some extra processes, and during that time they exercise the code added in this patch. I see this message:
19:23:00 INFO - [Parent 1191] WARNING: Can't allocate graphics resources. May already be shutting down.: file ../../../dom/ipc/TabParent.cpp, line 1515
Then we add some code that tries to wait until it gets a SIGCHLD from the child process created. It never comes, and we hang at shutdown waiting for it. I'll investigate more tomorrow, but I thought I'd post this in case somebody wants to look into it.
Assignee | ||
Comment 9•11 years ago
|
||
I'm still waiting for try results, but I think I tracked down the problem here. Bug 951908.
Depends on: 951908
Assignee | ||
Comment 10•11 years ago
|
||
Comment 11•11 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 12•11 years ago
|
||
Do we need this on any other branches?
Target Milestone: mozilla28 → mozilla29
Reporter | ||
Comment 13•11 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #12)
> Do we need this on any other branches?
While they are technically affected, I don't think that's necessary.
Comment 14•11 years ago
|
||
We do hit bug 960605 on Aurora, so I'd love if you'd reconsider for Fx28 at least :)
Reporter | ||
Comment 15•11 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #14)
> We do hit bug 960605 on Aurora, so I'd love if you'd reconsider for Fx28 at
> least :)
Fair enough :) Bill - any objections to nominating this for Aurora?
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 16•11 years ago
|
||
I think bug 951908 is the one we want for aurora. I've requested a backport for that. Bug 960605 has more details.
Flags: needinfo?(wmccloskey)
You need to log in
before you can comment on or make changes to this bug.
Description
•