Closed Bug 1747475 Opened 2 years ago Closed 2 years ago

EGL/KDE X11/AMD mesa 21 r600: Crash in [@ nsWindow::EnsureGdkWindow] at "We're missing GdkWindow!"

Categories

(Core :: Widget: Gtk, defect)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
98 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox95 --- unaffected
firefox96 --- unaffected
firefox97 --- fixed
firefox98 --- fixed

People

(Reporter: mccr8, Assigned: stransky)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files, 1 obsolete file)

Crash report: https://crash-stats.mozilla.org/report/index/766cc2de-4246-4b42-8553-33b760211221

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(mGdkWindow) (We're missing GdkWindow!)

Top 10 frames of crashing thread:

0 libxul.so nsWindow::EnsureGdkWindow widget/gtk/nsWindow.cpp:5262
1 libxul.so nsWindow::GetCompositorWidgetInitData widget/gtk/nsWindow.cpp:9014
2 libxul.so mozilla::gfx::GPUProcessManager::CreateTopLevelCompositor gfx/ipc/GPUProcessManager.cpp:866
3 libxul.so nsBaseWidget::CreateCompositor widget/nsBaseWidget.cpp:1304
4 libxul.so nsBaseWidget::GetWindowRenderer widget/nsBaseWidget.cpp:1370
5 libxul.so mozilla::layers::AnimationInfo::EnumerateGenerationOnFrame gfx/layers/AnimationInfo.cpp:196
6 libxul.so mozilla::RestyleManager::ProcessPostTraversal layout/base/RestyleManager.cpp:2844
7 libxul.so mozilla::RestyleManager::ProcessPostTraversal layout/base/RestyleManager.cpp:2864
8 libxul.so mozilla::RestyleManager::DoProcessPendingRestyles layout/base/RestyleManager.cpp:3071
9 libxul.so mozilla::PresShell::DoFlushPendingNotifications layout/base/PresShell.cpp:4258

The first crash I can see is in the 20211220215127 build. Here are the changesets in that range. It looks like this release assert was added by bug 1746423, so I'll mark that as the regressor.

Summary: Crash in [@ nsWindow::EnsureGdkWindow] → Crash in [@ nsWindow::EnsureGdkWindow] at "We're missing GdkWindow!"

Set release status flags based on info from the regressing bug 1746423

Yes, Bug 1746423 added a check there. We failed silently before Bug 1746423. We need to re-configure CompositorWidgetInitData() when mGdkWindow is available.
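To illustrate the difference described above, here is a minimal sketch, not the actual Firefox code. ToyWidget, ToyInitData and both functions are hypothetical names, a zero handle models a missing mGdkWindow, and a thrown exception stands in for the diagnostic-assert crash.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a widget whose native window may not exist yet.
struct ToyWidget {
  std::uintptr_t nativeWindow = 0;  // 0 models a missing GdkWindow
};

// Hypothetical stand-in for the data handed to the compositor.
struct ToyInitData {
  std::uintptr_t xWindow;
};

// Earlier behaviour as described in this comment: no check, so a missing
// window silently yields init data carrying an invalid (zero) handle.
ToyInitData GetInitDataSilently(const ToyWidget& w) {
  return ToyInitData{w.nativeWindow};
}

// Behaviour with a check: a missing window is reported immediately.
// The exception is an assumption standing in for the assert crash.
ToyInitData GetInitDataChecked(const ToyWidget& w) {
  if (w.nativeWindow == 0) {
    throw std::logic_error("We're missing GdkWindow!");
  }
  return ToyInitData{w.nativeWindow};
}
```

In this model the unchecked path hides the problem until the compositor uses the bad handle, while the checked path surfaces it at the call site.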

Martin, based on comment 2, should this be assigned to you?

Flags: needinfo?(stransky)

Yes.
This happens when GPU process is used - and that's disabled by default now.

Flags: needinfo?(stransky)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #4)

Yes.
This happens when GPU process is used - and that's disabled by default now.

Does this mean I enabled it somehow by myself? I think a third of these crashes are mine.

layers.gpu-process.force-enabled is false.

(In reply to gwarser from comment #5)
Please open about:support, click on "Copy text to clipboard" and paste it here. Thanks!

Attached file about:support
No longer blocks: gpu-process-linux-x11
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Summary: Crash in [@ nsWindow::EnsureGdkWindow] at "We're missing GdkWindow!" → EGL/KDE X11/AMD mesa 21 r600: Crash in [@ nsWindow::EnsureGdkWindow] at "We're missing GdkWindow!"

(In reply to gwarser from comment #5)
When does the crash occur? Do you know steps to reproduce?

Okay, it's something we should solve then.

Flags: needinfo?(stransky)

(In reply to Darkspirit from comment #8)

(In reply to gwarser from comment #5)
When does the crash occur? Do you know steps to reproduce?

It always happens when restoring a minimized Firefox window. I'm not sure how long it needs to be minimized or if I need to run some other app in the meantime. I will try to pay more attention to it.

KDE taskbar window previews cause bug 1723323 on Nvidia, it seems to be a general KDE bug. Maybe the crash is how it manifests on AMD or in general now? (Haven't tested it so far.)

I tried with taskbar tooltips disabled and it crashed anyway. Today it crashed twice with only the New Tab page open.

Has Regression Range: --- → yes

I'm confused: comment 4 suggests that this crash happens only in a non-default configuration, but is that really the case? Just trying to understand the relative severity of this issue for 97 given that it goes to RC next week.

Flags: needinfo?(jan)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
about:support from comment 7 shows an affected user with default config (GPU process disabled).
comment 4 does not seem to be true then.

(In reply to gwarser from comment #12)
Does this crash still occur for you?
Is your KDE compositor enabled or disabled?
(bug 1750017 was fixed in the meantime)

Flags: needinfo?(jan) → needinfo?(gwarser)

My last two crashes are these

They seem to be from January 10. (BTW they look like duplicates).

I believe I've had taskbar tooltips disabled all the time starting from January 9 (comment 12), so maybe it's related after all, but a restart is required to stop crashing?

I've just toggled tooltips back "on" and will see if it happens again.

Compositor is enabled.

Flags: needinfo?(gwarser)

Finally crashed again. I did not touch the taskbar this time. It happened after clicking the "restore" button in the top right to switch into "windowed" mode. https://crash-stats.mozilla.org/report/index/bcd55584-4a94-4fd4-a1fb-6f3270220128

Yes, it affects the default configuration too. We added an assertion that fires when GdkWindow/XWindow is missing at the point the compositor tries to get it - we failed silently before that.

Flags: needinfo?(stransky)

https://groups.google.com/a/mozilla.org/g/dev-platform/c/xns8e_n9f84/m/rCR4CC88AgAJ

MOZ_DIAGNOSTIC_ASSERT has recently changed from affecting devedition to
affecting early beta. This change applies to Firefox 97, so it affects current (early) betas.

(In reply to gwarser from comment #16)
Please test https://beta.mozilla.org. EARLY_BETA has ended: https://fx-trains.herokuapp.com/release/?version=97
Does Beta crash or hang or is it fine?

Flags: needinfo?(gwarser)

I will run beta for some time, but I don't think I'm a good test subject now - when the overall number of crashes started to increase, it decreased a lot for me. My last two crashes are from January 10 and 28.

Flags: needinfo?(gwarser)

(In reply to gwarser from comment #16)

https://crash-stats.mozilla.org/report/index/bcd55584-4a94-4fd4-a1fb-6f3270220128

Adapter Device ID Redwood XT [Radeon HD 5670/5690/5730] (0x68d8)
"driverVendor": "mesa/r600",

That GPU is from 2010: https://www.techpowerup.com/gpu-specs/ati-redwood.g75
bug 1673939 (https://gitlab.freedesktop.org/mesa/mesa/-/issues/3720) blocked hardware WebRender for some mesa/r600 devices.
Maybe that list needs to be expanded: Are you able to reproduce bug 1752197 comment 0?

See Also: → 1673939

But apart from that, this crash seems to occur across all gpu vendors and even with Nvidia driver 495.

(In reply to Darkspirit from comment #20)

That GPU is from 2010: https://www.techpowerup.com/gpu-specs/ati-redwood.g75

Yes, such times, I'm afraid. And I don't really need anything more powerful.

bug 1673939 (https://gitlab.freedesktop.org/mesa/mesa/-/issues/3720) blocked hardware WebRender for some mesa/r600 devices.
Maybe that list needs to be expanded: Are you able to reproduce bug 1752197 comment 0?

I don't have any issues with distorted graphics at all. Page from 1752197 STR is displayed completely fine.


BTW, 97.0b9 20220127193706 is running in the background for two hours now, without issues.

This is a race condition which does not depend on actual GPU but on Gtk and the init sequence.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #23)

This is a race condition which does not depend on actual GPU but on Gtk and the init sequence.

What would it cause in release, where diagnostic asserts do not seem to crash? A different crash, hang, empty window?

See Also: 1673939

(In reply to Darkspirit from comment #24)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #23)

This is a race condition which does not depend on actual GPU but on Gtk and the init sequence.

What would it cause in release, where diagnostic asserts do not seem to crash? A different crash, hang, empty window?

The answer is here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ad1d89626120f9ac2c8ad3abf050dc4d9c53aac

A compositor can be created before we get a gtk widget map event. That leads to a wrong XWindow reference being passed to GtkCompositorWidget.
In this patch we destroy the compositor on the gtk map event (if there is one) to make sure it's recreated with the correct XWindow reference.
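The approach described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the patch itself: ToyWindow, ToyCompositor, GetCompositor and OnMap are hypothetical names, and a zero handle models a window that has not been mapped yet.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical compositor that captures the X window handle at creation.
struct ToyCompositor {
  std::uintptr_t xWindow;
};

// Hypothetical widget; names are illustrative, not Firefox's actual classes.
class ToyWindow {
 public:
  // Lazily creates a compositor using whatever handle is known right now.
  ToyCompositor& GetCompositor() {
    if (!mCompositor) {
      mCompositor = std::make_unique<ToyCompositor>(ToyCompositor{mXWindow});
    }
    return *mCompositor;
  }

  // Models the gtk map event: the real handle becomes available, and any
  // compositor created earlier is dropped so the next request recreates it.
  void OnMap(std::uintptr_t realXWindow) {
    mXWindow = realXWindow;
    mCompositor.reset();
  }

 private:
  std::uintptr_t mXWindow = 0;  // 0 models "not mapped yet"
  std::unique_ptr<ToyCompositor> mCompositor;
};
```

A compositor requested before OnMap holds the stale zero handle; after OnMap the next GetCompositor call builds a fresh one with the real handle.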

Assignee: nobody → stransky
Status: NEW → ASSIGNED

Hm, the try doesn't look healthy.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #25)

The answer is here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6ad1d89626120f9ac2c8ad3abf050dc4d9c53aac

Looks like debug leaks/asserts? The opt test runs seem OK at least.

I think we can remove the assertion to revert to the state before Bug 1746423 for now and investigate/fix that later.

Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/8b04d03ac19e
[Linux] Don't assert/crash if we're missing mGdkWindow r=emilio
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch

Comment on attachment 9261663 [details]
Bug 1747475 [Linux] Destroy compositor on gtk map event, r?emilio

Revision D137513 was moved to bug 1754711. Setting attachment 9261663 [details] to obsolete.

Attachment #9261663 - Attachment is obsolete: true
Crash Signature: [@ nsWindow::EnsureGdkWindow] → [@ nsWindow::EnsureGdkWindow] [@ _gdk_window_has_impl] [@ gdk_x11_window_get_xid]

This seems like a safe ride-along for a dot release given that it appears to be hitting in the wild a bit. Can you please nominate the patch for release approval, Martin?

Crash Signature: [@ nsWindow::EnsureGdkWindow] [@ _gdk_window_has_impl] [@ gdk_x11_window_get_xid] → [@ nsWindow::EnsureGdkWindow] [@ _gdk_window_has_impl] [@ gdk_x11_window_get_xid]
Flags: needinfo?(stransky)

Comment on attachment 9261742 [details]
Bug 1747475 [Linux] Don't assert/crash if we're missing mGdkWindow r?emilio

Beta/Release Uplift Approval Request

  • User impact if declined: crashes
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): null-check, effectively
  • String changes made/needed: none
Attachment #9261742 - Flags: approval-mozilla-release?
Flags: needinfo?(stransky)

Comment on attachment 9261742 [details]
Bug 1747475 [Linux] Don't assert/crash if we're missing mGdkWindow r?emilio

Approved for 97.0.1.

Attachment #9261742 - Flags: approval-mozilla-release? → approval-mozilla-release+