[fission enabled] Crash when reloading https://hsivonen.fi/fission-host.html
Categories
(Core :: DOM: Navigation, enhancement, P3)
Tracking
()
People
(Reporter: hsivonen, Assigned: farre)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression, Whiteboard: [stockwell unknown])
Attachments
(1 file)
47 bytes, text/x-phabricator-request | pascalc: approval-mozilla-beta+
With fission.oopif.attribute set to true but using Basic Layers, when reloading https://hsivonen.fi/fission-host.html, I got https://crash-stats.mozilla.org/report/index/36407b33-bb6b-49f6-b177-aecae0190402
Comment 1•5 years ago (Reporter)
Does this look actionable with this level of info?
Comment 2•5 years ago
It looks like the BrowsingContext object is null when it is received for the PBrowser constructor in the content process, as a segfault appears to be occurring when trying to read the mType field.
In this case, I'm imagining that this was created for a subframe. The BrowsingContext for the subframe should be created in https://searchfox.org/mozilla-central/rev/201450283cddc9e409cec707acb65ba6cf6037b1/dom/base/nsFrameLoader.cpp#2592-2598, and then sent over IPC to the parent.
The crash address doesn't make a ton of sense for the offset of the type field in a BrowsingContext, however, so I'm not sure if that guess is correct.
Forwarding ni? to :farre, who might be able to look at this more.
Comment 4•5 years ago
[Tracking Requested - why for this release]: seems to be a regression in beta
Comment 6•5 years ago (Assignee)
I can repro the crash locally; the stack trace for me is:
#0 0x00007fcc2f03d82b in nsDocShell::SetTreeOwner(nsIDocShellTreeOwner*) (this=0x7fcc17a50000, aTreeOwner=<optimized out>) at /home/farre/src/gecko/work-1/docshell/base/nsDocShell.cpp:3172
#1 0x00007fcc2f211337 in nsWebBrowser::Create(nsIWebBrowserChrome*, nsIWidget*, mozilla::OriginAttributes const&, mozilla::dom::BrowsingContext*, bool)
(aContainerWindow=<optimized out>, aParentWidget=<optimized out>, aOriginAttributes=..., aBrowsingContext=0x7fcc17a50000, aDisableHistory=false)
at /home/farre/src/gecko/work-1/toolkit/components/browser/nsWebBrowser.cpp:147
#2 0x00007fcc2dcc5a3d in mozilla::dom::TabChild::Init(mozIDOMWindowProxy*) (this=<optimized out>, aParent=<optimized out>) at /home/farre/src/gecko/work-1/dom/ipc/TabChild.cpp:524
#3 0x00007fcc2dc8b9c9 in mozilla::dom::ContentChild::RecvPBrowserConstructor(mozilla::dom::PBrowserChild*, mozilla::dom::IdType<mozilla::dom::TabParent> const&, mozilla::dom::IdType<mozilla::dom::TabParent> const&, mozilla::dom::IPCTabContext const&, unsigned int const&, mozilla::dom::IdType<mozilla::dom::ContentParent> const&, mozilla::dom::BrowsingContext*, bool const&)
(this=<optimized out>, aActor=0x7fcc16ca9858, aTabId=..., aSameTabGroupAs=..., aContext=..., aChromeFlags=<optimized out>, aCpID=..., aBrowsingContext=0x7fcc1a987240, aIsForBrowser=@0x7ffddd40734f: true)
at /home/farre/src/gecko/work-1/dom/ipc/ContentChild.cpp:1831
#4 0x00007fcc2bd6ee54 in mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) (this=0x7fcc35f6d820, msg__=...) at /home/farre/src/gecko/work-1/obj-linux-release/ipc/ipdl/PContentChild.cpp:6427
#5 0x00007fcc2bc90332 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) (this=0x7fcc35f174d8, aMsg=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:2151
#6 0x00007fcc2bc8f396 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) (this=0x7fcc35f174d8, aMsg=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:2078
#7 0x00007fcc2bc8fb37 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) (this=0x7fcc35f174d8, aTask=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:1937
#8 0x00007fcc2bc8febe in mozilla::ipc::MessageChannel::MessageTask::Run() (this=0x7fcc17a22120) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:1968
#9 0x00007fcc2b68fe2e in mozilla::SchedulerGroup::Runnable::Run() (this=0x7fcc1793a480) at /home/farre/src/gecko/work-1/xpcom/threads/SchedulerGroup.cpp:295
#10 0x00007fcc2b69e7b6 in nsThread::ProcessNextEvent(bool, bool*) (this=<optimized out>, aMayWait=<optimized out>, aResult=<optimized out>) at /home/farre/src/gecko/work-1/xpcom/threads/nsThread.cpp:1180
#11 0x00007fcc2b6a04a8 in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7fcc179241a0, aMayWait=false) at /home/farre/src/gecko/work-1/xpcom/threads/nsThreadUtils.cpp:486
#12 0x00007fcc2bc9274a in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (this=0x7fcc35f94a10, aDelegate=0x7ffddd408468) at /home/farre/src/gecko/work-1/ipc/glue/MessagePump.cpp:88
#13 0x00007fcc2bc228a9 in MessageLoop::RunInternal() (this=<optimized out>) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:315
#14 0x00007fcc2bc228a9 in MessageLoop::RunHandler() (this=<optimized out>) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:308
#15 0x00007fcc2bc228a9 in MessageLoop::Run() (this=0x7fcc3133b6a9) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:290
#16 0x00007fcc2df5a799 in nsBaseAppShell::Run() (this=0x7fcc19af94c0) at /home/farre/src/gecko/work-1/widget/nsBaseAppShell.cpp:137
#17 0x00007fcc2f4084d4 in XRE_RunAppShell() () at /home/farre/src/gecko/work-1/toolkit/xre/nsEmbedFunctions.cpp:919
#18 0x00007fcc2bc228a9 in MessageLoop::RunInternal() (this=<optimized out>) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:315
#19 0x00007fcc2bc228a9 in MessageLoop::RunHandler() (this=<optimized out>) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:308
#20 0x00007fcc2bc228a9 in MessageLoop::Run() (this=0x7fcc3133b6a9) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:290
#21 0x00007fcc2f40819d in XRE_InitChildProcess(int, char**, XREChildData const*) (aArgc=<optimized out>, aArgv=<optimized out>, aChildData=<optimized out>)
at /home/farre/src/gecko/work-1/toolkit/xre/nsEmbedFunctions.cpp:757
#22 0x00005557908a6d7b in content_process_main(mozilla::Bootstrap*, int, char**) (bootstrap=0x7fcc35f4b6b0, argc=<optimized out>, argv=<optimized out>)
at /home/farre/src/gecko/work-1/browser/app/../../ipc/contentproc/plugin-container.cpp:56
#23 0x00005557908a6d7b in main(int, char**, char**) (argc=<optimized out>, argv=0x7ffddd409808, envp=0x7ffddd409890) at /home/farre/src/gecko/work-1/browser/app/nsBrowserApp.cpp:263
which is:
if (mTabChild) {
  nsCOMPtr<nsITabChild> oldTabChild = do_QueryReferent(mTabChild);
  MOZ_RELEASE_ASSERT(oldTabChild == newTabChild,
                     "Cannot cahnge TabChild during nsDocShell lifetime!");
}
and that crash at least seems to be in the correct place!
Comment 7•5 years ago
(In reply to Calixte Denizet (:calixte) from comment #4)
[Tracking Requested - why for this release]: seems to be a regression in beta
This happens only when fission.oopif.attribute is set to true. This is our fission testing pref and is false by default so P1 isn't accurate here. Since Andreas is already working on it, I'm assigning it P2.
Comment 8•5 years ago (Assignee)
And now I got the other stack as well.
#0 0x00007f4168714b9c in mozilla::dom::BrowsingContext::IsContent() const (this=0x0) at /home/farre/src/gecko/work-1/obj-linux/dist/include/mozilla/dom/BrowsingContext.h:153
#1 0x00007f416db95bfe in nsWebBrowser::Create(nsIWebBrowserChrome*, nsIWidget*, mozilla::OriginAttributes const&, mozilla::dom::BrowsingContext*, bool)
(aContainerWindow=0x7f41534ae9a8, aParentWidget=0x7f415174fc00, aOriginAttributes=..., aBrowsingContext=0x0, aDisableHistory=false)
at /home/farre/src/gecko/work-1/toolkit/components/browser/nsWebBrowser.cpp:108
#2 0x00007f416aeedef8 in mozilla::dom::TabChild::Init(mozIDOMWindowProxy*) (this=0x7f41534ae800, aParent=0x0) at /home/farre/src/gecko/work-1/dom/ipc/TabChild.cpp:524
#3 0x00007f416ae5f445 in mozilla::dom::ContentChild::RecvPBrowserConstructor(mozilla::dom::PBrowserChild*, mozilla::dom::IdType<mozilla::dom::TabParent> const&, mozilla::dom::IdType<mozilla::dom::TabParent> const&, mozilla::dom::IPCTabContext const&, unsigned int const&, mozilla::dom::IdType<mozilla::dom::ContentParent> const&, mozilla::dom::BrowsingContext*, bool const&)
(this=0x7f4178b6f820, aActor=0x7f41534ae860, aTabId=..., aSameTabGroupAs=..., aContext=..., aChromeFlags=@0x7ffc3596129c: 0, aCpID=..., aBrowsingContext=0x0, aIsForBrowser=@0x7ffc35961287: true)
at /home/farre/src/gecko/work-1/dom/ipc/ContentChild.cpp:1831
#4 0x00007f4166df8c30 in mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) (this=0x7f4178b6f820, msg__=...) at /home/farre/src/gecko/work-1/obj-linux/ipc/ipdl/PContentChild.cpp:6427
#5 0x00007f416ae6822e in mozilla::dom::ContentChild::OnMessageReceived(IPC::Message const&) (this=0x7f4178b6f820, aMsg=...) at /home/farre/src/gecko/work-1/dom/ipc/ContentChild.cpp:3740
#6 0x00007f4166c46512 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) (this=0x7f4178b174f8, aMsg=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:2151
#7 0x00007f4166c4507a in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) (this=0x7f4178b174f8, aMsg=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:2078
#8 0x00007f4166c458fc in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) (this=0x7f4178b174f8, aTask=...) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:1937
#9 0x00007f4166c45e05 in mozilla::ipc::MessageChannel::MessageTask::Run() (this=0x7f41538544a0) at /home/farre/src/gecko/work-1/ipc/glue/MessageChannel.cpp:1968
#10 0x00007f41660af639 in mozilla::SchedulerGroup::Runnable::Run() (this=0x7f4151751b00) at /home/farre/src/gecko/work-1/xpcom/threads/SchedulerGroup.cpp:295
#11 0x00007f41660dbfba in nsThread::ProcessNextEvent(bool, bool*) (this=0x7f4153885050, aMayWait=true, aResult=0x7ffc359630a7) at /home/farre/src/gecko/work-1/xpcom/threads/nsThread.cpp:1180
#12 0x00007f41660df5a3 in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7f4153885050, aMayWait=true) at /home/farre/src/gecko/work-1/xpcom/threads/nsThreadUtils.cpp:486
#13 0x00007f4166c498a3 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (this=0x7f4178b9d420, aDelegate=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/glue/MessagePump.cpp:110
#14 0x00007f4166c4a499 in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) (this=0x7f4178b9d420, aDelegate=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/glue/MessagePump.cpp:271
#15 0x00007f4166b6366f in MessageLoop::RunInternal() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:315
#16 0x00007f4166b635e5 in MessageLoop::RunHandler() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:308
#17 0x00007f4166b6359a in MessageLoop::Run() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:290
#18 0x00007f416b4d7613 in nsBaseAppShell::Run() (this=0x7f41538404a0) at /home/farre/src/gecko/work-1/widget/nsBaseAppShell.cpp:137
#19 0x00007f416dfaf864 in XRE_RunAppShell() () at /home/farre/src/gecko/work-1/toolkit/xre/nsEmbedFunctions.cpp:919
#20 0x00007f4166c4a2f3 in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) (this=0x7f4178b9d420, aDelegate=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/glue/MessagePump.cpp:238
#21 0x00007f4166b6366f in MessageLoop::RunInternal() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:315
#22 0x00007f4166b635e5 in MessageLoop::RunHandler() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:308
#23 0x00007f4166b6359a in MessageLoop::Run() (this=0x7ffc35963558) at /home/farre/src/gecko/work-1/ipc/chromium/src/base/message_loop.cc:290
#24 0x00007f416dfaf040 in XRE_InitChildProcess(int, char**, XREChildData const*) (aArgc=13, aArgv=0x7ffc359639b8, aChildData=0x7ffc35963860) at /home/farre/src/gecko/work-1/toolkit/xre/nsEmbedFunctions.cpp:757
#25 0x00007f416dfb9f47 in mozilla::BootstrapImpl::XRE_InitChildProcess(int, char**, XREChildData const*) (this=0x7f4178b4b6b0, argc=15, argv=0x7ffc359639b8, aChildData=0x7ffc35963860)
at /home/farre/src/gecko/work-1/toolkit/xre/Bootstrap.cpp:67
#26 0x000055bf29d2686a in content_process_main(mozilla::Bootstrap*, int, char**) (bootstrap=0x7f4178b4b6b0, argc=15, argv=0x7ffc359639b8)
at /home/farre/src/gecko/work-1/browser/app/../../ipc/contentproc/plugin-container.cpp:56
#27 0x000055bf29d2696c in main(int, char**, char**) (argc=16, argv=0x7ffc359639b8, envp=0x7ffc35963a40) at /home/farre/src/gecko/work-1/browser/app/nsBrowserApp.cpp:263
and here a crash in mozilla::dom::BrowsingContext::IsContent seems way more reasonable since the bc passed to RecvPBrowserConstructor is indeed null.
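To illustrate the failure mode in frame #0 of this stack (IsContent() called with this=0x0), here is a minimal, self-contained sketch of a null-checked variant. FakeBrowsingContext, ContentCheck and CheckIsContent are hypothetical names invented for this illustration; they are not Gecko APIs, and this is not the patch that landed for this bug.

```cpp
#include <cassert>  // assert() is used in the usage note below

// Hypothetical stand-in for mozilla::dom::BrowsingContext (illustration only).
struct FakeBrowsingContext {
  enum class Type { Chrome, Content };
  Type mType;
  bool IsContent() const { return mType == Type::Content; }
};

// Three-way result so a caller can tell "null" apart from "chrome".
enum class ContentCheck { NullContext, Chrome, Content };

// Reads mType only after checking the pointer, instead of calling
// IsContent() on a null context as in the crashing frame.
ContentCheck CheckIsContent(const FakeBrowsingContext* aContext) {
  if (!aContext) {
    return ContentCheck::NullContext;
  }
  return aContext->IsContent() ? ContentCheck::Content : ContentCheck::Chrome;
}
```

With this sketch, `assert(CheckIsContent(nullptr) == ContentCheck::NullContext);` holds. A guard like this would only hide the symptom; the null context arriving in RecvPBrowserConstructor is the actual problem.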
Comment 9•5 years ago (Assignee)
Comment 10•5 years ago (Assignee)
Part 1 fixes the stack from Comment 8. Interestingly enough, the reason for that one was that we didn't sync browsing contexts to all content processes that were subscribed to their group.
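As a rough model of what "sync to every subscribed process" means, here is a small sketch. FakeContextGroup and its methods are hypothetical and only mimic the idea; the real Part 1 patch works on BrowsingContext over IPC and is not reproduced here.

```cpp
#include <cassert>  // assert() is used in the usage note below
#include <map>
#include <set>

using ProcessId = int;  // hypothetical identifier for a content process
using ContextId = int;  // hypothetical identifier for a browsing context

// Hypothetical model of a browsing context group (illustration only).
class FakeContextGroup {
 public:
  void Subscribe(ProcessId aProcess) { mSubscribers.insert(aProcess); }

  // Announce a new context to every subscribed process, not only to the
  // process that happened to create it.
  void Attach(ContextId aContext) {
    for (ProcessId process : mSubscribers) {
      mKnown[process].insert(aContext);
    }
  }

  // True if aProcess has been told about aContext.
  bool Knows(ProcessId aProcess, ContextId aContext) const {
    auto it = mKnown.find(aProcess);
    return it != mKnown.end() && it->second.count(aContext) > 0;
  }

 private:
  std::set<ProcessId> mSubscribers;
  std::map<ProcessId, std::set<ContextId>> mKnown;
};
```

If processes 1 and 2 are subscribed and context 42 is attached, `assert(group.Knows(2, 42));` holds. In this model, a process that was skipped during the sync would have no entry to look up, which corresponds to the null BrowsingContext in the stack from Comment 8.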
Comment 11•5 years ago (Assignee)
The stack from Comment 6 ends up here: https://searchfox.org/mozilla-central/source/docshell/base/nsDocShell.cpp#3170-3171
Nika might know more here; otherwise I'll continue investigating on Monday.
Comment 12•5 years ago
(In reply to Neha Kochar [:neha] from comment #7)
This happens only when fission.oopif.attribute is set to true. This is our fission testing pref and is false by default so P1 isn't accurate here. Since Andreas is already working on it, I'm assigning it P2.
I'm not so sure this only happens with the pref set. At least some of the crash reports don't have a user-set value for a fission pref.
Comment 13•5 years ago
Also, I had pointed Andreas at https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=239638663&repo=autoland&lineNumber=3639 which was a similar (identical?) crash on trunk, definitely without the pref set.
Comment 14•5 years ago
This is almost certainly caused by a flaw in the existing system for doing the iframe fission attribute. Once we get proper oop iframe switching (which is in bug 1539163), this attribute can probably be retired.
Effectively, this is caused because fission-attribute remote frames are loaded in some "web" content process, and if the content process supply is exhausted, it may round-robin and accidentally end up in the same process as its embedder. If that happens, then we get this assertion failure.
This can be handled by increasing the processCount to a silly large number (e.g. the fix from my test: https://searchfox.org/mozilla-central/rev/6dab6dad9cc852011a14275a8b2c2c03ed7600a7/dom/ipc/tests/test_force_oop_iframe.html#17).
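A toy model of the round-robin collision, assuming a fixed pool and strictly sequential picks. FakeProcessPool and AnyFrameSharesEmbedderProcess are hypothetical names for this sketch; Gecko's real process selection is more involved and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical round-robin picker over a fixed pool of "web" content
// processes (illustration only).
class FakeProcessPool {
 public:
  explicit FakeProcessPool(std::size_t aProcessCount)
      : mCount(aProcessCount), mNext(0) {
    assert(aProcessCount > 0);
  }

  // Returns the index of the next process, wrapping around when the
  // pool is exhausted.
  std::size_t PickNext() {
    std::size_t picked = mNext;
    mNext = (mNext + 1) % mCount;
    return picked;
  }

 private:
  std::size_t mCount;
  std::size_t mNext;
};

// The embedder takes the first process; each remote frame takes the next.
// Returns true if any frame wraps around into the embedder's process.
bool AnyFrameSharesEmbedderProcess(std::size_t aProcessCount,
                                   std::size_t aFrames) {
  FakeProcessPool pool(aProcessCount);
  const std::size_t embedder = pool.PickNext();
  for (std::size_t i = 0; i < aFrames; ++i) {
    if (pool.PickNext() == embedder) {
      return true;
    }
  }
  return false;
}
```

In this model, a pool of 4 with 4 remote frames wraps back to the embedder, while a very large pool does not for the same number of frames. That matches the workaround of raising processCount.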
Comment 15•5 years ago
Unassigning Andreas, as this isn't something worth spending effort on trying to fix.
Comment 17•5 years ago
Pushed by afarre@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/12d67626c02a Part 1: Sync a BrowsingContext to it's groups. r=nika
Comment 18•5 years ago (Assignee)
So I queued part 1 for landing, but we should keep this bug open since that doesn't fix the issue we see in Comment 6.
Comment 19•5 years ago
bugherder
(In reply to Andreas Farre [:farre] from comment #18)
So I queued part 1 for landing, but we should keep this bug open since that doesn't fix the issue we see in Comment 6.
Reading that comment I assume we should reopen the bug?
Comment 21•5 years ago (Assignee)
Re-opening this to continue tracking the crash-stack in comment 6
Comment 22•5 years ago
Comment on attachment 9057923 [details]
Bug 1541038 - Part 1: Sync a BrowsingContext to it's groups. r=nika
Beta/Release Uplift Approval Request
- User impact if declined: Possible crashes in beta when fission oop pref turned on. Crash stack aggregation already shows this is a problem, and that this patch fixed similar crash stacks in nightly.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Already tested on nightly for a while with no other complaints
- String changes made/needed:
Comment 23•5 years ago
Comment on attachment 9057923 [details]
Bug 1541038 - Part 1: Sync a BrowsingContext to it's groups. r=nika
Crash fix on beta, uplift approved for 67 beta 13, thanks.
Comment 24•5 years ago
bugherder uplift
Comment 33•4 years ago
The remaining work here is fixing the crash at nsDocShell::SetTreeOwner as seen in the backtrace in comment 6. This is the same as bug 1560220 so closing this as dup.