Crash in [@ IPCError-browser | AddChildDoc binding to nonexistant proxy!]
Categories
(Core :: Disability Access APIs, defect)
Tracking
()
People
(Reporter: aryx, Assigned: Jamie)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files)
Almost all crashes running on Windows 10, all on Nightly with Fission enabled.
Previous instance of this crash signature: bug 1574286
Crash report: https://crash-stats.mozilla.org/report/index/80169e22-82b7-4e55-96c4-509bd0201122
Reason: EXCEPTION_BREAKPOINT
Top 10 frames of crashing thread:
0 ntdll.dll NtWaitForAlertByThreadId
1 ntdll.dll RtlSleepConditionVariableSRW
2 kernelbase.dll SleepConditionVariableSRW
3 mozglue.dll mozilla::detail::ConditionVariableImpl::wait mozglue/misc/ConditionVariable_windows.cpp:50
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1143
5 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:109
6 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:327
7 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:309
8 xul.dll nsBaseAppShell::Run widget/nsBaseAppShell.cpp:137
9 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:602
Comment 1•3 years ago
|
||
-
100% of the (45) crashes from 81 - 85 Nightly have Fission enabled. Let's track this for Fission MVP to monitor the crash volume.
-
Is the stack trace in comment 0 really from the crashing thread or process?
-
btw, "nonexistant" should be spelled "nonexistent".
Assignee | ||
Comment 2•3 years ago
|
||
(In reply to Chris Peterson [:cpeterson] from comment #1)
- Is the stack trace in comment 0 really from the crashing thread or process?
Sort of. The error occurs in the parent process, which returns an IPC error, but IPC errors deliberately crash the content process. The problem is that the stack is useless, since we don't know what code path the child process took to send the request to the parent, nor do we know what code path the parent process took to return the error. So, it's all guesswork from here. :(
Assignee | ||
Comment 3•3 years ago
|
||
I've given this some thought and I really don't know what's causing this. To learn more, we could start with DIAGNOSTIC_ASSERT(false) where we throw the IPC error. This would give us the stack that led to the error, since AddChildDoc is called from a few places.
Updated•3 years ago
|
Assignee | ||
Comment 4•3 years ago
|
||
Dropping severity due to low crash volume.
CPeterson and I discussed this via email. A diagnostic assert will only fire on Nightly and I'm not sure if we have enough of these crashes there now to give us useful data. On the other hand, a beta assert would cause a full browser crash for affected users instead of just a tab crash. We'd get data, but perhaps at the cost of user pain. For now, we've decided to just wait and see what happens when the Fission beta experiment starts. We should be able to uplift a beta assert fairly quickly if we need the data.
Comment 5•3 years ago
|
||
Jamie, the Fission beta experiment has been ongoing since Fx89. What would like to collect from the beta experiment to make progress here?
Assignee | ||
Comment 6•3 years ago
|
||
Interestingly, if I'm reading the data correctly, we aren't seeing these crashes at all on beta. I guess we don't crash for IPC errors on beta?
There are 70+ crashes this week on Nightly, so I think adding the diagnostic assert on Nightly might be useful.
Updated•3 years ago
|
Assignee | ||
Comment 7•3 years ago
|
||
Returning an IPC failure causes a crash in the content process on Nightly.
However, the crash doesn't have a useful stack.
A diagnostic assert will cause a full browser crash instead of a tab crash, which isn't ideal.
On the other hand, it will hopefully allow us to get much more useful info about the crash.
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/91426e1b5ca1 Add diagnostic assert to DocAccessibleParent::AddChildDoc when binding to a nonexistent proxy. r=eeejay
Comment 9•3 years ago
|
||
bugherder |
Comment 10•3 years ago
|
||
Here's a crash report:
https://crash-stats.mozilla.org/report/index/04c29944-b806-4a23-ba6c-391650210519
because of the added assertion.
Assignee | ||
Comment 11•3 years ago
•
|
||
Ah. That's more useful.
This is happening when adding an in-process iframe. Looking at the crash dump, I see that the document to which it's being added is itself an OOP iframe, but that OOP iframe document has been shut down. I think it got shut down because the OOP iframe's embedder got destroyed.
To put this another way, we have:
top level -> OOP iframe -> in-proc iframe
Top level gets destroyed, destroying OOP iframe's DocAccessibleParent in its wake, but the OOP iframe is really still alive for now.
Nika, is there any chance that an OOP iframe can transfer ownership to an embedder in a different process from its original embedder? In the above example, could top level switch processes without recreating OOP iframe?
If so, I'll need to make sure embedders don't shut down OOP iframes.
If not, I can just ignore any requests coming in for destroyed documents.
Comment 12•3 years ago
|
||
(In reply to James Teh [:Jamie] from comment #11)
Nika, is there any chance that an OOP iframe can transfer ownership to an embedder in a different process from its original embedder? In the above example, could top level switch processes without recreating OOP iframe?
No, there's no way to re-use an oop iframe in a single document. We can only swap subframes between documents in the very specific situation of dragging a tab into a new window in the browser UI, and it's not possible to do in content.
Assignee | ||
Comment 13•3 years ago
|
||
(In reply to Nika Layzell [:nika] (ni? for response) from comment #12)
We can only swap subframes between documents in the very specific situation of dragging a tab into a new window in the browser UI
Ah... but if the top level DocAccessibleParent gets destroyed (even if that's triggered by a parent process action), that will destroy the OOP iframe DocAccessibleParent too.
Is the "Move Tab -> Move to New Window" command in the tab context menu enough to trigger this? I can't seem to pull off dragging a tab to a new window with the mouse, even using my screen reader's mouse commands.
When I do this, it seems that the actual DOM document (in the content process) doesn't get recreated and so we also keep the PDocAccessible actor. Could dragging a tab to another window ever cause the actual DOM document (or PBrowser actor) to be destroyed/recreated?
I do notice some unrelated minor a11y breakage when dragging a tab to a new window. Is there some notification in the parent process I can listen to for when a tab gets moved to a new window?
Comment 14•3 years ago
|
||
(In reply to James Teh [:Jamie] from comment #13)
Ah... but if the top level DocAccessibleParent gets destroyed (even if that's triggered by a parent process action), that will destroy the OOP iframe DocAccessibleParent too.
It should not be destroyed in this situation :-)
Is the "Move Tab -> Move to New Window" command in the tab context menu enough to trigger this? I can't seem to pull off dragging a tab to a new window with the mouse, even using my screen reader's mouse commands.
Yes, that should be an effective way to trigger that code path.
When I do this, it seems that the actual DOM document (in the content process) doesn't get recreated and so we also keep the PDocAccessible actor. Could dragging a tab to another window ever cause the actual DOM document (or PBrowser actor) to be destroyed/recreated?
It should not :-)
I do notice some unrelated minor a11y breakage when dragging a tab to a new window. Is there some notification in the parent process I can listen to for when a tab gets moved to a new window?
I believe this is currently generally performed using swapDocShells
on the browser custom element, which should fire some events here: https://searchfox.org/mozilla-central/rev/2f1a015b004b79f1145c81cdf86b15481a5630e2/toolkit/content/widgets/browser-custom-element.js#1542.
Assignee | ||
Comment 15•3 years ago
|
||
Assignee | ||
Comment 16•3 years ago
|
||
(In reply to James Teh [:Jamie] from comment #13)
I do notice some unrelated minor a11y breakage when dragging a tab to a new window.
Filed bug 1712233 for this.
Comment 17•3 years ago
|
||
Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0620992c6ba8 When constructing a new DocAccessibleParent for an in-process iframe, handle the case where the parent document has already been destroyed. r=eeejay
Comment 18•3 years ago
|
||
bugherder |
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Comment 19•3 years ago
|
||
The crash has resurfaced
https://crash-stats.mozilla.org/report/index/7624dc19-bf06-411f-8e29-3e7c70210528
Assignee | ||
Comment 20•3 years ago
|
||
(In reply to lomombwlo from comment #19)
The crash has resurfaced
https://crash-stats.mozilla.org/report/index/7624dc19-bf06-411f-8e29-3e7c70210528
This one is bug 1713680.
Description
•