Crash in [@ mozilla::dom::CanonicalBrowsingContext::SetCrossGroupOpenerId] via XULFrameElement::LoadSrc()
Categories
(Core :: XUL, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox-esr128 | --- | unaffected |
firefox136 | --- | unaffected |
firefox137 | --- | unaffected |
firefox138 | --- | wontfix |
firefox139 | --- | wontfix |
People
(Reporter: mccr8, Unassigned, NeedInfo)
Details
(Keywords: crash, regression)
Crash Data
Crash report: https://crash-stats.mozilla.org/report/index/0c7b0283-30b1-44d5-a0d3-68ddd0250314
MOZ_CRASH Reason:
MOZ_DIAGNOSTIC_ASSERT(mCrossGroupOpenerId == 0) (Can only set CrossGroupOpenerId once)
Top 10 frames:
0 xul.dll mozilla::dom::CanonicalBrowsingContext::SetCrossGroupOpenerId(unsigned long l... docshell/base/CanonicalBrowsingContext.cpp:2853
0 xul.dll nsFrameLoader::Create(mozilla::dom::Element*, bool, nsIOpenWindowInfo*) dom/base/nsFrameLoader.cpp:441
1 xul.dll mozilla::dom::XULFrameElement::LoadSrc() dom/xul/XULFrameElement.cpp:104
2 xul.dll mozilla::dom::XULFrameElement::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/xul/XULFrameElement.cpp:156
3 xul.dll mozilla::dom::Element::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/base/Element.cpp:2225
3 xul.dll nsStyledElement::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/base/nsStyledElement.cpp:197
4 xul.dll nsXULElement::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/xul/nsXULElement.cpp:558
5 xul.dll mozilla::dom::Element::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/base/Element.cpp:2225
5 xul.dll nsStyledElement::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/base/nsStyledElement.cpp:197
6 xul.dll nsXULElement::BindToTree(mozilla::dom::BindContext&, nsINode&) dom/xul/nsXULElement.cpp:558
Looks like a regression in the 20250313102934 build. Here are the set of changes added in that build. These crashes are all via XULFrameElement::LoadSrc().
Almost all of the crashes have URLs like this: https://shopee.com.br/buyer/login?next=https%3A%2F%2Fshopee.com.br%2F
Emilio, any ideas what might have caused this? Thanks.
Comment 1•3 months ago
|
||
This is a parent process crash. If I had to guess, it'd be a browser front-end side change... But I don't see anything too suspicious there, and I don't see how this can happen by code inspection. Front-end can set this, but in the stack from comment 0 we have just created the browsing context, so... Nika and Farre seem familiar with this flag, maybe they have ideas?
Comment 2•3 months ago
|
||
In this situation, we're creating a pop-up window from Gecko, which can be seen from the ContentParent::RecvCreateWindow
call higher up on the stack, which in turn is calling nsWindowWatcher::OpenWindowWithRemoteTab
. During this event loop, we receive a document load event (presumably for the pop-up window), and insert the initial browser into the DOM.
It's actually likely that the BrowsingContext was created earlier, because the content process generally creates the BrowsingContext in situations like these. The BrowsingContext is actually carried in the nsIOpenWindowInfo (via. the mNextRemoteBrowser
member). The fact that we set the CrossGroupOpenerId
multiple times vaguely suggests that frontend code is inserting two separate <browser>
elements which both use the same nsIOpenWindowInfo
. I expect that we might crash soon after this call in that case even if we didn't hit this assertion, as a BrowsingContext
cannot be embedded twice.
The implementation of nsBrowserAccess
was recently moved to a separate module in bug 1946400, which could potentially be related to a behaviour change on the frontend. I expect that inserting the same nsIOpenWindowInfo
twice is likely something only possible in a rare edge-case caused by failure in some earlier part of window creation.
One possibility worth noting is that it is possible that the pop-up creation was originally going to be targeted to create a new tab, but fell back to opening a new window due to returning null
with a success result (https://searchfox.org/mozilla-central/rev/a8a00d67c6c7118f0b95cffa26740202c3b9e6f3/dom/ipc/ContentParent.cpp#5334-5337). If this happens in a case where the browser was actually inserted, and then not returned, it could lead to this behaviour.
Leaving a ni? for :gijs in case there were some changes to the frontend window opening logic recently around bug 1946400.
Comment 3•3 months ago
|
||
(In reply to Nika Layzell [:nika] (ni? for response) from comment #2)
In this situation, we're creating a pop-up window from Gecko, which can be seen from the
ContentParent::RecvCreateWindow
call higher up on the stack, which in turn is callingnsWindowWatcher::OpenWindowWithRemoteTab
. During this event loop, we receive a document load event (presumably for the pop-up window), and insert the initial browser into the DOM.
Nitpick but to be sure we're on the same page, it's DOMContentLoaded (via DispatchContentLoadedEvents
and the sole call to DispatchTrustedEvent
in there), right?
It's actually likely that the BrowsingContext was created earlier, because the content process generally creates the BrowsingContext in situations like these. The BrowsingContext is actually carried in the nsIOpenWindowInfo (via. the
mNextRemoteBrowser
member). The fact that we set theCrossGroupOpenerId
multiple times vaguely suggests that frontend code is inserting two separate<browser>
elements which both use the samensIOpenWindowInfo
. I expect that we might crash soon after this call in that case even if we didn't hit this assertion, as aBrowsingContext
cannot be embedded twice.
How could we verify this, perhaps with a separate diagnostic assert? Can docshell/dom code detect the reuse of the nsIOpenWindowInfo
?
I confess to not being very familiar with this code, so I'm not sure I have immediate ideas on how this reuse would happen. Off-hand, if we're talking about initialOpenWindowInfo
, AFAICT that is only read here and then used further down... but I don't see how it'd be reused.
Are we sure this is reuse of nsIOpenWindowInfo - and in particular, could it be that somehow we have tried to set a crossGroupOpenerId
on the initial tab for some reason (ie via some other path that may not even involve nsIOpenWindowInfo
at all) before we hit the code I linked in the previous para?
The implementation of
nsBrowserAccess
was recently moved to a separate module in bug 1946400, which could potentially be related to a behaviour change on the frontend. I expect that inserting the samensIOpenWindowInfo
twice is likely something only possible in a rare edge-case caused by failure in some earlier part of window creation.One possibility worth noting is that it is possible that the pop-up creation was originally going to be targeted to create a new tab, but fell back to opening a new window due to returning
null
with a success result (https://searchfox.org/mozilla-central/rev/a8a00d67c6c7118f0b95cffa26740202c3b9e6f3/dom/ipc/ContentParent.cpp#5334-5337). If this happens in a case where the browser was actually inserted, and then not returned, it could lead to this behaviour.Leaving a ni? for :gijs in case there were some changes to the frontend window opening logic recently around bug 1946400.
There were not meant/supposed to be any functional/timing changes in that bug. Given the history on one of the other changes I can understand the suspicion but I don't see anything obvious when re-reviewing the change now...
The timing seems plausible-ish, though it is strange that there are 0 crashes on 137 nightly, but there are crashes on 138 nightly as well as 137 beta, which vaguely hints at something that landed in 138 and uplifted to 137 - 1946400 landed in 137 nightly, just over a week before merge day. There are multiple crashes each build ID (on Nightly) starting March 13th with the 10.29am build, afaict.
I also noticed that in correlations,
(85.19% in signature vs 08.96% overall) accessibility = 1
and looking at https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_137_0b4_RELEASE&tochange=FIREFOX_137_0b6_RELEASE there are a number of tabgroups related uplifts, including things like:
Jeremy Swinarton — Bug 1951319: Deferred session restore re-opens tab groups a=pascalc
and quite a number of the crashes have panorama / tab grouping extensions, so I wonder if there is some correlation there (but can't check for correlations with prefs). (That bug is also in the pushlog from comment 0.)
Bouncing needinfo given I'm still a bit lost here and hopeful that Nika has more context / can help move the guessing game forward a little. :-)
Comment 4•3 months ago
|
||
(In reply to :Gijs (he/him) from comment #3)
Nitpick but to be sure we're on the same page, it's DOMContentLoaded (via
DispatchContentLoadedEvents
and the sole call toDispatchTrustedEvent
in there), right?
Yes, it is DOMContentLoaded
(fired here: https://hg.mozilla.org/mozilla-central/file/a1cc47ed2278fe3b16902c3b70ba4303030f17f6/dom/base/Document.cpp#l8369)
How could we verify this, perhaps with a separate diagnostic assert? Can docshell/dom code detect the reuse of the
nsIOpenWindowInfo
?
I don't think there's currently any mechanism for it with how the code was written. It's probably possible to add some extra assertions for that by adding some tracking state in the object, but that would need to be implemented.
I confess to not being very familiar with this code, so I'm not sure I have immediate ideas on how this reuse would happen. Off-hand, if we're talking about
initialOpenWindowInfo
, AFAICT that is only read here and then used further down... but I don't see how it'd be reused.
I believe the way this would be being re-used is if the nsIOpenWindowInfo
is consumed during the call to createContentWindowInFrame(...)
(https://searchfox.org/mozilla-central/rev/81b1c51a6b1cfe1dc2a5c8b39f3346bc1d167790/dom/ipc/ContentParent.cpp#5318), which is generally used to create a new tab. If that call to createContentWindowInFrame
did not throw while returning null
, we'd enter this case (https://searchfox.org/mozilla-central/rev/81b1c51a6b1cfe1dc2a5c8b39f3346bc1d167790/dom/ipc/ContentParent.cpp#5334-5337), which would cause us to also call OpenWindowWithRemoteTab
, re-using the same nsIOpenWindowInfo
(as it's assumed in the case where frontend returns null
that the nsIOpenWindowInfo
was not consumed).
Perhaps some of the changes you mention later in this bug changed how tabs are created & managed in frontend content, such that there's a new error codepath being taken, or such that the tab is inserted in a "lazy" state, meaning the browser appears to be "null", but then is initialized asynchronously with the provided nsIOpenWindowInfo
? I'm unsure how this exactly could be happening.
Leaving the ni? on me for now, so I can get back to this soon, as I haven't had much of a chance to look into it yet unfortunately.
Comment 5•3 months ago
|
||
(In reply to Nika Layzell [:nika] (ni? for response) from comment #4)
Perhaps some of the changes you mention later in this bug changed how tabs are created & managed in frontend content, such that there's a new error codepath being taken, or such that the tab is inserted in a "lazy" state, meaning the browser appears to be "null", but then is initialized asynchronously with the provided
nsIOpenWindowInfo
? I'm unsure how this exactly could be happening.
Adding Dão here in case he has ideas on this front wrt bug 1951319 or other recent tab groups changes.
Comment 6•3 months ago
|
||
I don't know anything furthat than what's been discussed already.
Comment 7•3 months ago
|
||
Sorry for the late reply. Generally, nothing in here seems rings a bell for me, so I'd expect this to be unrelated to tab groups.
(In reply to :Gijs (he/him) from comment #5)
(In reply to Nika Layzell [:nika] (ni? for response) from comment #4)
Perhaps some of the changes you mention later in this bug changed how tabs are created & managed in frontend content, such that there's a new error codepath being taken, or such that the tab is inserted in a "lazy" state, meaning the browser appears to be "null", but then is initialized asynchronously with the provided
nsIOpenWindowInfo
? I'm unsure how this exactly could be happening.Adding Dão here in case he has ideas on this front wrt bug 1951319 or other recent tab groups changes.
Bug 1951319 has been uplifted and shipped in 137. Do we know if this started happening in 137?
Comment 8•3 months ago
|
||
(In reply to Dão Gottwald [:dao] from comment #7)
Adding Dão here in case he has ideas on this front wrt bug 1951319 or other recent tab groups changes.
Bug 1951319 has been uplifted and shipped in 137. Do we know if this started happening in 137?
Yes, comment #3 has analysis of this. Note that this is a diagnostic assert so it won't be crashing on release or late beta, only nightly and early beta.
Updated•2 months ago
|
Updated•1 month ago
|
Description
•