Crash in [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed]
Categories
(Core :: DOM: Content Processes, defect, P1)
Tracking
()
People
(Reporter: jseward, Assigned: u608768)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(6 files, 1 obsolete file)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
This bug is for crash report bp-0b708881-01c2-4302-8627-350170190901.
This appears 11 times in 11 different installations in the Windows nightlies
of 20190829214656.
Top 10 frames of crashing thread:
0 ntdll.dll NtWaitForAlertByThreadId
1 ntdll.dll RtlSleepConditionVariableSRW
2 kernelbase.dll SleepConditionVariableSRW
3 mozglue.dll mozilla::detail::ConditionVariableImpl::wait mozglue/misc/ConditionVariable_windows.cpp:50
4 xul.dll struct already_AddRefed<nsIRunnable> mozilla::ThreadEventQueue<mozilla::PrioritizedEventQueue>::GetEvent xpcom/threads/ThreadEventQueue.cpp:153
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1134
6 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
Reporter | ||
Comment 1•5 years ago
|
||
From the stack I have no idea what component is involved. Andrew, any idea?
Comment 2•5 years ago
|
||
"CommonCreateWindow Unexpected aChromeFlags passed" makes it sound like DOM: Content Processes.
Kashav, could this be a regression from bug 1562264? It involved window flags. I'm not sure when this crash first showed up.
Updated•5 years ago
|
So bug 1562264 introduced this assertion, but bug 1576204 made the failure condition more strict. I'll take a look.
Updated•5 years ago
|
Its not only Windows. i receiving similar crashes on Linux
https://crash-stats.mozilla.org/report/index/56ca36a0-3c40-4884-9dce-aed460191009
https://crash-stats.mozilla.org/report/index/31314859-e184-4973-9720-1cb130191008
https://crash-stats.mozilla.org/report/index/56d7b8bc-6784-47ed-bda6-c8b290191001
Updated•5 years ago
|
Comment 5•5 years ago
|
||
This is still happening.
I haven't been able to reproduce this at all.
(In reply to V. Korn from comment #4)
Its not only Windows. i receiving similar crashes on Linux
Do you have STR?
(In reply to :kashav from comment #6)
I haven't been able to reproduce this at all.
(In reply to V. Korn from comment #4)
Its not only Windows. i receiving similar crashes on Linux
Do you have STR?
Its pure random in my case.
Just got one today https://crash-stats.mozilla.org/report/index/997ccd62-98ee-48c0-89c1-800e70191111
Comment 8•5 years ago
|
||
Updating platform to reflect that this is happening on all three platforms. There is one comment that it is happen on macOS on https://forums.macrumors.com/threads/macos-10-15-catalina-on-unsupported-macs.2183772/page-261, but the reporter said even that is intermittent.
(In reply to :kashav from comment #6)
I haven't been able to reproduce this at all.
(In reply to V. Korn from comment #4)
Its not only Windows. i receiving similar crashes on Linux
Do you have STR?
You asked me if i can STR, well, cant say its 100% sure and that its usable , but in my case usually this lead to similar crash (just got another one):
prereq - make sure you have layers.acceleration.force-enabled true, this will lead to about:support like this
HW_COMPOSITING
force_enabled by user: Force-enabled by pref
blocked by env: Acceleration blocked by platform
OPENGL_COMPOSITING
force_enabled by user: Force-enabled by pref
Now
- Open Firefox and browse a little
- Go suspend and resume from suspend
- In about:support click Toggle Device Reset or you will get severe screen corruption ( Bug 1511508 , nobody seems interested in)
- Now tricky part - in Yandex.Mail open any mail and click any link lead to other site.
- You got crash. Like this - https://crash-stats.mozilla.org/report/index/7cf4a4ad-caa4-4d7a-9913-9a84b0191114
Need to mention thing which may be important or may be not - i have notification from Yandex Mail via push, so after coming from suspend i get push about new mail, click on it, it opens Yandex.mail where i open new mail, click on button and its crashes me, dunno if this anyhow related to cause.
Comment 10•5 years ago
|
||
Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115
Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.
Assignee | ||
Comment 11•5 years ago
|
||
(In reply to V. Korn from comment #10)
Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.
This is really helpful, thanks! Just to confirm, the Yandex.Mail tab is not open when you click the notification, right?
Assignee | ||
Comment 12•5 years ago
|
||
(In reply to V. Korn from comment #10)
Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.
I was able to reproduce using these steps. Thank you so much, looking now.
Assignee | ||
Comment 13•5 years ago
|
||
We may arrive at a situation where content attempts to create a window without a
parent opener. One such example is when we open a new window as a result of
clicking an OS notification. In these situations, the content-calculated remote
or fission chrome flags may disagree with what the parent calculated, and we'd
crash in ContentParent::CommonCreateWindow.
Updated•5 years ago
|
Assignee | ||
Comment 14•5 years ago
|
||
This ended up being more complex than it seemed, kmag said he may be able to provide more details.
Assignee | ||
Comment 15•5 years ago
|
||
Content process crashes but test doesn't really fail right now
Comment 16•5 years ago
|
||
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Updated•5 years ago
|
Comment 17•5 years ago
|
||
FWIW, I have reproduced this twice today while browsing twitter, but I haven't seen it elsewhere and it's not 100% reproducible.
Comment 18•5 years ago
|
||
Sorry for the delay. This issue is really tricky.
The problem happens when a ServiceWorker calls the Clients.openWindow
method. In that case, we try to call the window provider in a content process, but without a parent window. That requests then bounces back up to the parent process, where all sorts of things start to go wrong.
The first problem is that we have no context window for the child to copy Fission and e10s flags from, so they're wrong from the start. But then we also have no context window for the parent to find an appropriate browser window to attach the new tab to, so it just grabs the most recent one. That window could be wrong in any number of ways, from incorrect Fission flags, to incorrect e10s flags, to incorrect private browsing flags. However, we ignore all of those problems for the initial window open, and just stick it in the browser window we find, no matter how poor a match it is.
When said window tries to open another window, though, we do have a parent, so we copy its Fission/e10s flags and send the open request to the parent process. In this case, the parent process can find a browser window based on the opener, and unlike the no-opener case, it does sanity check the window's flags against the browser window, finds them incorrect, and decides to crash.
So... as for how to fix this... I'm not sure. I think we probably want to stop requesting these window openings from a child process to begin with, but that probably makes returning a WindowClient
for them tricky. Either way, we need to choose appropriate flags for the new window, and I guess falling back to the default is our best option. But if we keep opening the window from a child process, that's only the first part of the fix. We also need to fix the parent side to find an appropriate browser window to attach the new tab to based on its flags, and probably to create one if it doesn't exist.
Updated•5 years ago
|
Comment 19•5 years ago
|
||
We can stop opening the windows in a child process. My understanding from https://bugzilla.mozilla.org/show_bug.cgi?id=1293277#c337 and related discussion was that a lot of extra work went into issuing the window open from the content process because there previously wasn't a way to open them in the parent.
I don't think the WindowClient issue is a problem, especially if we can just directly create a DocumentChannel and let it do its thing. DocumentChannel already knows how to create client id's on behalf of content processes. (Specifically, it knows how to add a ClientChannelHelper that knows how to do the right thing.)
A place to start from might be the existing non-e10s logic path of calling ClientOpenWindowInCurrentProcess
which bypasses the actors entirely. But obviously doing the right fission/documentchannel stuff.
Implementing this could potentially help avoid some of the complexity in bug 1600068 where Eden is trying to make the SW openWindow redirect case deal with process switches due to redirects.
Updated•5 years ago
|
Updated•5 years ago
|
Comment 20•5 years ago
|
||
Following up on this bug since it is marked a P1 but hasn't had any activity in 20 days. If someone could kindly update on whether we are still awaiting patch review or considering another path forward, it would be appreciated.
Assignee | ||
Comment 21•5 years ago
•
|
||
I'm currently working on implementing the solution proposed in comment 19 (some more context: https://mozilla.logbot.info/domfission/20191211#c16811715). I think I'll have a rough patch up in the next few days.
Updated•5 years ago
|
Comment 24•5 years ago
|
||
Promoting this bug from Fission M5 to M4.1 milestone since Kashav says this bug will fix some Service Workers mochitests (duplicate bug 1600068).
Comment 25•5 years ago
|
||
(In reply to :kashav from comment #21)
I'm currently working on implementing the solution proposed in comment 19 (some more context: https://mozilla.logbot.info/domfission/20191211#c16811715). I think I'll have a rough patch up in the next few days.
Kashav, do you have an update on this bug? Thanks
Comment 26•5 years ago
|
||
It looks like the work on changing shutdown hang signatures, or something, cause this signature to be split up into a gazillion different signatures. I added a few.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 27•5 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #26)
It looks like the work on changing shutdown hang signatures, or something, cause this signature to be split up into a gazillion different signatures. I added a few.
This is an unintended side-effect. I'm filing a bug to fix it.
Comment 28•5 years ago
|
||
Is this crash in code that is only enabled in the Nightly channel? Over the last six months, all (but one) crash reports with similar signatures are from Nightly (*.0a1) builds:
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Assignee | ||
Comment 32•5 years ago
|
||
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 33•5 years ago
|
||
Depends on D55044
Assignee | ||
Comment 34•5 years ago
|
||
Depends on D65026
Assignee | ||
Comment 35•5 years ago
|
||
Depends on D64392
Assignee | ||
Comment 36•5 years ago
|
||
This also removes LaunchObserver, which was only needed for Fennec.
Depends on D64392
Updated•5 years ago
|
Updated•5 years ago
|
Comment 37•5 years ago
|
||
Comment 38•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/7ceda016024f
https://hg.mozilla.org/mozilla-central/rev/aed6d12e8453
https://hg.mozilla.org/mozilla-central/rev/e5e5997b9679
https://hg.mozilla.org/mozilla-central/rev/561107bf64cc
https://hg.mozilla.org/mozilla-central/rev/3cfd12a6d13c
https://hg.mozilla.org/mozilla-central/rev/a3d0e1edc1ce
Updated•5 years ago
|
Updated•3 years ago
|
Description
•