Closed Bug 1578070 Opened 5 years ago Closed 5 years ago

Crash in [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed]

Categories

(Core :: DOM: Content Processes, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla76
Fission Milestone M4.1
Tracking Status
firefox-esr68 --- unaffected
firefox72 --- disabled
firefox73 --- disabled
firefox74 --- disabled
firefox75 --- disabled
firefox76 --- fixed

People

(Reporter: jseward, Assigned: u608768)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(6 files, 1 obsolete file)

This bug is for crash report bp-0b708881-01c2-4302-8627-350170190901.

This appears 11 times in 11 different installations in the Windows nightlies
of 20190829214656.

Top 10 frames of crashing thread:

0 ntdll.dll NtWaitForAlertByThreadId 
1 ntdll.dll RtlSleepConditionVariableSRW 
2 kernelbase.dll SleepConditionVariableSRW 
3 mozglue.dll mozilla::detail::ConditionVariableImpl::wait mozglue/misc/ConditionVariable_windows.cpp:50
4 xul.dll struct already_AddRefed<nsIRunnable> mozilla::ThreadEventQueue<mozilla::PrioritizedEventQueue>::GetEvent xpcom/threads/ThreadEventQueue.cpp:153
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1134
6 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290

From the stack I have no idea what component is involved. Andrew, any idea?

Flags: needinfo?(continuation)

"CommonCreateWindow Unexpected aChromeFlags passed" makes it sound like DOM: Content Processes.

Kashav, could this be a regression from bug 1562264? It involved window flags. I'm not sure when this crash first showed up.

Component: Widget: Win32 → DOM: Content Processes
Flags: needinfo?(continuation)
Flags: needinfo?(kmadan)

So bug 1562264 introduced this assertion, but bug 1576204 made the failure condition more strict. I'll take a look.

Flags: needinfo?(kmadan)
Flags: needinfo?(kmadan)

This is still happening.

I haven't been able to reproduce this at all.

(In reply to V. Korn from comment #4)

Its not only Windows. i receiving similar crashes on Linux

Do you have STR?

Flags: needinfo?(kmadan) → needinfo?(phazon)

(In reply to :kashav from comment #6)

I haven't been able to reproduce this at all.

(In reply to V. Korn from comment #4)

Its not only Windows. i receiving similar crashes on Linux

Do you have STR?

Its pure random in my case.
Just got one today https://crash-stats.mozilla.org/report/index/997ccd62-98ee-48c0-89c1-800e70191111

Flags: needinfo?(phazon)

Updating platform to reflect that this is happening on all three platforms. There is one comment that it is happen on macOS on https://forums.macrumors.com/threads/macos-10-15-catalina-on-unsupported-macs.2183772/page-261, but the reporter said even that is intermittent.

OS: Windows 10 → All
Hardware: Unspecified → All

(In reply to :kashav from comment #6)

I haven't been able to reproduce this at all.

(In reply to V. Korn from comment #4)

Its not only Windows. i receiving similar crashes on Linux

Do you have STR?

You asked me if i can STR, well, cant say its 100% sure and that its usable , but in my case usually this lead to similar crash (just got another one):
prereq - make sure you have layers.acceleration.force-enabled true, this will lead to about:support like this

HW_COMPOSITING
force_enabled by user: Force-enabled by pref
blocked by env: Acceleration blocked by platform
OPENGL_COMPOSITING
force_enabled by user: Force-enabled by pref

Now

  1. Open Firefox and browse a little
  2. Go suspend and resume from suspend
  3. In about:support click Toggle Device Reset or you will get severe screen corruption ( Bug 1511508 , nobody seems interested in)
  4. Now tricky part - in Yandex.Mail open any mail and click any link lead to other site.
  5. You got crash. Like this - https://crash-stats.mozilla.org/report/index/7cf4a4ad-caa4-4d7a-9913-9a84b0191114

Need to mention thing which may be important or may be not - i have notification from Yandex Mail via push, so after coming from suspend i get push about new mail, click on it, it opens Yandex.mail where i open new mail, click on button and its crashes me, dunno if this anyhow related to cause.

Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115

Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.

(In reply to V. Korn from comment #10)

Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115

Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.

This is really helpful, thanks! Just to confirm, the Yandex.Mail tab is not open when you click the notification, right?

(In reply to V. Korn from comment #10)

Correction: Further testing shows that steps 1-3 seems to be not related.
That IS related is notification thing.
I just started Firefox, got notification of new mail, clicked on it, Yandex.Mail opened, i clicked on new mail. on link inside and got crash https://crash-stats.mozilla.org/report/index/4da92377-6f60-4851-8224-9f82a0191115

Simple opening Yandex.Mail directly and clicking on links inside mails didnt produced crash so far.

I was able to reproduce using these steps. Thank you so much, looking now.

Assignee: nobody → kmadan
Status: NEW → ASSIGNED

We may arrive at a situation where content attempts to create a window without a
parent opener. One such example is when we open a new window as a result of
clicking an OS notification. In these situations, the content-calculated remote
or fission chrome flags may disagree with what the parent calculated, and we'd
crash in ContentParent::CommonCreateWindow.

Fission Milestone: --- → M5
Priority: -- → P2

This ended up being more complex than it seemed, kmag said he may be able to provide more details.

Flags: needinfo?(kmaglione+bmo)

Content process crashes but test doesn't really fail right now

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

FWIW, I have reproduced this twice today while browsing twitter, but I haven't seen it elsewhere and it's not 100% reproducible.

Sorry for the delay. This issue is really tricky.

The problem happens when a ServiceWorker calls the Clients.openWindow method. In that case, we try to call the window provider in a content process, but without a parent window. That requests then bounces back up to the parent process, where all sorts of things start to go wrong.

The first problem is that we have no context window for the child to copy Fission and e10s flags from, so they're wrong from the start. But then we also have no context window for the parent to find an appropriate browser window to attach the new tab to, so it just grabs the most recent one. That window could be wrong in any number of ways, from incorrect Fission flags, to incorrect e10s flags, to incorrect private browsing flags. However, we ignore all of those problems for the initial window open, and just stick it in the browser window we find, no matter how poor a match it is.

When said window tries to open another window, though, we do have a parent, so we copy its Fission/e10s flags and send the open request to the parent process. In this case, the parent process can find a browser window based on the opener, and unlike the no-opener case, it does sanity check the window's flags against the browser window, finds them incorrect, and decides to crash.

So... as for how to fix this... I'm not sure. I think we probably want to stop requesting these window openings from a child process to begin with, but that probably makes returning a WindowClient for them tricky. Either way, we need to choose appropriate flags for the new window, and I guess falling back to the default is our best option. But if we keep opening the window from a child process, that's only the first part of the fix. We also need to fix the parent side to find an appropriate browser window to attach the new tab to based on its flags, and probably to create one if it doesn't exist.

Flags: needinfo?(kmaglione+bmo)
See Also: → 1601978
Flags: needinfo?(bugmail)

We can stop opening the windows in a child process. My understanding from https://bugzilla.mozilla.org/show_bug.cgi?id=1293277#c337 and related discussion was that a lot of extra work went into issuing the window open from the content process because there previously wasn't a way to open them in the parent.

I don't think the WindowClient issue is a problem, especially if we can just directly create a DocumentChannel and let it do its thing. DocumentChannel already knows how to create client id's on behalf of content processes. (Specifically, it knows how to add a ClientChannelHelper that knows how to do the right thing.)

A place to start from might be the existing non-e10s logic path of calling ClientOpenWindowInCurrentProcess which bypasses the actors entirely. But obviously doing the right fission/documentchannel stuff.

Implementing this could potentially help avoid some of the complexity in bug 1600068 where Eden is trying to make the SW openWindow redirect case deal with process switches due to redirects.

Flags: needinfo?(bugmail)
See Also: → 1600068
Priority: P2 → P1

Following up on this bug since it is marked a P1 but hasn't had any activity in 20 days. If someone could kindly update on whether we are still awaiting patch review or considering another path forward, it would be appreciated.

I'm currently working on implementing the solution proposed in comment 19 (some more context: https://mozilla.logbot.info/domfission/20191211#c16811715). I think I'll have a rough patch up in the next few days.

Attachment #9109150 - Attachment is obsolete: true

Promoting this bug from Fission M5 to M4.1 milestone since Kashav says this bug will fix some Service Workers mochitests (duplicate bug 1600068).

Fission Milestone: M5 → M4.1

(In reply to :kashav from comment #21)

I'm currently working on implementing the solution proposed in comment 19 (some more context: https://mozilla.logbot.info/domfission/20191211#c16811715). I think I'll have a rough patch up in the next few days.

Kashav, do you have an update on this bug? Thanks

Flags: needinfo?(kmadan)

It looks like the work on changing shutdown hang signatures, or something, cause this signature to be split up into a gazillion different signatures. I added a few.

Crash Signature: [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed] → [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __poll ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | _fini ] [@ IPCError-brows…
Crash Signature: IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | mozilla::ipc::MessagePump::Run ] → IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __GI___poll ]
Crash Signature: IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __GI___poll ] → IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __GI___poll ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFla…
Crash Signature: aChromeFlags passed | js::LifoAlloc::ensureUnusedApproximateColdPath ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __psynch_cvwait | <name omitted> | <name omitted> | mozilla::dom::ContentChild::ProvideWindowCommon ] → aChromeFlags passed | js::LifoAlloc::ensureUnusedApproximateColdPath ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __psynch_cvwait | <name omitted> | <name omitted> | mozilla::dom::ContentChild::ProvideWindowCommon ] [@ IPCE…

(In reply to Andrew McCreight [:mccr8] from comment #26)

It looks like the work on changing shutdown hang signatures, or something, cause this signature to be split up into a gazillion different signatures. I added a few.

This is an unintended side-effect. I'm filing a bug to fix it.

Crash Signature: [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __poll ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | _fini ] [@ IPCError-brows… → [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | _fini] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __GI___poll] [@ IPCError…
Attachment #9112143 - Attachment description: Bug 1578070 - Add a test (WIP) → Bug 1578070 - Add a test, r=nika
Attachment #9129194 - Attachment description: Bug 1578070 - wip → Bug 1578070 - Use the parent process for Clients.openWindow(), r=nika
Attachment #9112143 - Attachment description: Bug 1578070 - Add a test, r=nika → Bug 1578070 - Add a test, r=asuth

Depends on D65026

Blocks: 1620052

This also removes LaunchObserver, which was only needed for Fennec.

Depends on D64392

Attachment #9130629 - Attachment description: Bug 1578070 - Rename ClientOpenWindowInCurrentProcess, r=nika → Bug 1578070 - Rename ClientOpenWindowInCurrentProcess and unskip test_openWindow.html, r=nika
Crash Signature: [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | _fini] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | __GI___poll] [@ IPCError… → [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed]
Pushed by kmadan@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7ceda016024f Add a test, r=asuth https://hg.mozilla.org/integration/autoland/rev/aed6d12e8453 Only fire MozLayerTreeReady / MozLayerTreeCleared for top-level document paints, r=mconley https://hg.mozilla.org/integration/autoland/rev/e5e5997b9679 Add ClientInfo to WindowGlobalParent, r=nika https://hg.mozilla.org/integration/autoland/rev/561107bf64cc Use the parent process for Clients.openWindow(), r=nika,asuth https://hg.mozilla.org/integration/autoland/rev/3cfd12a6d13c Use nsWindowWatcher::GetBrowsingContextByName in GeckoViewOpenWindow, r=farre https://hg.mozilla.org/integration/autoland/rev/a3d0e1edc1ce Rename ClientOpenWindowInCurrentProcess and unskip test_openWindow.html, r=nika
Flags: needinfo?(kmadan)
Flags: in-testsuite+
Regressed by: 1562264
Regressions: 1622749
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: