Closed Bug 1882779 Opened 1 year ago Closed 1 year ago

[Linux] Avoid compositor pause and make it thread safe

Categories

(Core :: Widget: Gtk, defect, P2)

defect

Tracking

()

RESOLVED FIXED
126 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox124 --- unaffected
firefox125 --- unaffected
firefox126 --- fixed

People

(Reporter: stransky, Assigned: stransky)

References

Details

(Keywords: topcrash, Whiteboard: [stockwell disable-recommended])

Crash Data

Attachments

(9 files, 1 obsolete file)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
  • We should avoid compositor pause but rather render to offscreen surface after XWindow hide.
  • We need to make compositor/nsWindow calls thread safe
Flags: needinfo?(stransky)

We need a bit rework how we manage popup close and sync compositor shutdown for it.
Beta will go with hotfix in Bug 1882021 where we force-update XWindow after popup open.
Nightly is partially covered by Bug 1882462 (destroy layer manager on popup close).

To correctly fix it we should update GtkCompositorWidget to be thread safe, fix rendering offscreen to EGLWindow and make sure XWindow is updated by GtkCompositorWidget::EnableRendering() wihout previous GtkCompositorWidget::DisableRendering() call.

We should not use GtkCompositorWidget::DisableRendering() to hold rendering to invalid XWindow as it leads to UI freezes/timeouts.

Duplicate of this bug: 1883250
Flags: needinfo?(stransky)
See Also: → 1883729
Duplicate of this bug: 1882538
Flags: needinfo?(stransky)
Assignee: nobody → stransky
Status: NEW → ASSIGNED
  • Merge nsWindow::DisableRendering() and nsWindow::OnUnmap() as it handles the same event (widget unmap)
  • Remove nsWindow map/unmap mContainer handlers and call them directly from mContainer. It ensures correct order
    where we stop rendering / clear resources and then release them.

Depends on D204066

  • Clean up native rendering resources stored at GtkCompositorWidget and mSurfaceProvider
  • Call SendResume() to all rendering backends, it leads to nullptr result from nsWindow::GetNativeData(NS_NATIVE_EGL_WINDOW) from compositor
    thread (as nsWindow is unmapped) and rendering backend will create offscreen rendering buffer then.

Depends on D204067

  • Rename mDestroyMutex to mWindowVisibilityMutex to make clear it protects window visibility changes
  • Make mIsMapped atomic - we want to read it from rendering thread. We don't protect nsWindow::IsMapped() getter by mutex
    as we know backend code can render to hidden window safely.

Depends on D204068

It makes sure underlying native surface (XWindow/wl_surface) stays allocated until compositor thread finishes rendering and returns
drawing target to us.

Depends on D204072

Attachment #9390223 - Attachment description: WIP: Bug 1882779 [Linux] remove disable rendering, remove map/unmap handlers → Bug 1882779 [Linux] remove disable rendering, remove map/unmap handlers r?emilio
Flags: needinfo?(stransky)
Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/ae204d74a9f7 [Linux] Always create offscreen rendering surface if we're missing native rendering window r=lsalzman,jgilbert https://hg.mozilla.org/integration/autoland/rev/5692d5a1b237 [Linux] remove disable rendering, remove map/unmap handlers r=emilio https://hg.mozilla.org/integration/autoland/rev/c621729acdc4 [Linux] Simplify OnUnmap handler r=emilio https://hg.mozilla.org/integration/autoland/rev/f36a7bafc019 [Linux] Remove nsWindow destroy mutex and protect window visibility state instead r=emilio https://hg.mozilla.org/integration/autoland/rev/dcc81c544c72 [Linux] Rename CompositorWidgetChild::DisableRendering() to CompositorWidgetChild::CleanupResources() and CompositorWidgetChild::EnableRendering() to CompositorWidgetChild::SetRenderingSurface() r=emilio https://hg.mozilla.org/integration/autoland/rev/d343fe633520 [Linux] Don't suspend rendering to hidden widget as backend code can handle it r=emilio https://hg.mozilla.org/integration/autoland/rev/c7d6f5ba608c [Linux] Lock WindowSurfaceProvider between StartRemoteDrawingInRegion() and EndRemoteDrawingInRegion() calls r=emilio
See Also: → 1885831

I can reproduce the layout/reftests/margin-collapsing/inline-horizontal-1.html failure, thanks.

Attachment #9390225 - Attachment description: Bug 1882779 [Linux] Remove nsWindow destroy mutex and protect window visibility state instead r?emilio → Bug 1882779 [Linux] Remove nsWindow destroy mutex r?emilio
Flags: needinfo?(stransky)
Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/c48c0bac61e6 [Linux] Always create offscreen rendering surface if we're missing native rendering window r=lsalzman,jgilbert https://hg.mozilla.org/integration/autoland/rev/69d389710699 [Linux] remove disable rendering, remove map/unmap handlers r=emilio https://hg.mozilla.org/integration/autoland/rev/69770ebd8903 [Linux] Simplify OnUnmap handler r=emilio https://hg.mozilla.org/integration/autoland/rev/8a52d2e3479d [Linux] Remove nsWindow destroy mutex r=emilio https://hg.mozilla.org/integration/autoland/rev/54a6067c4d01 [Linux] Rename CompositorWidgetChild::DisableRendering() to CompositorWidgetChild::CleanupResources() and CompositorWidgetChild::EnableRendering() to CompositorWidgetChild::SetRenderingSurface() r=emilio https://hg.mozilla.org/integration/autoland/rev/5550caf7701f [Linux] Don't suspend rendering to hidden widget as backend code can handle it r=emilio https://hg.mozilla.org/integration/autoland/rev/b6316d0df80a [Linux] Lock WindowSurfaceProvider between StartRemoteDrawingInRegion() and EndRemoteDrawingInRegion() calls r=emilio

The assertion failure is still happening as you can see here.
Should we reopen this bug or file a new one?

Flags: needinfo?(stransky)

Will look at it ASAP, Thanks.

Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/81b0c51c60e1 [Wayland] Ensure we create EGLWindow for visible MozContainer r=emilio
No longer duplicate of this bug: 1882538

(In reply to Sandor Molnar[:smolnar] from comment #25)

The assertion failure is still happening as you can see here.
Should we reopen this bug or file a new one?

Let's track it at Bug 1882538.

Flags: needinfo?(stransky)

Backed out the latest commit because many crashes with the signature [@ moz_container_wayland_get_egl_window] have been reported after this landed: https://hg.mozilla.org/mozilla-central/rev/e1c37d37e8e47757a40b72a59eca6d4c7415acfa

Crash report: https://crash-stats.mozilla.org/report/index/99f132fc-9ff0-4303-bc97-aaf840240324

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(wl_container->ready_to_draw) (We can't draw to surface!)

Top 10 frames of crashing thread:

0  libxul.so  moz_container_wayland_get_egl_window  widget/gtk/MozContainerWayland.cpp:726
1  libxul.so  nsWindow::GetNativeData  widget/gtk/nsWindow.cpp:3433
2  libxul.so  mozilla::widget::GtkCompositorWidget::GetEGLNativeWindow  widget/gtk/GtkCompositorWidget.cpp:120
2  libxul.so  mozilla::gl::GLContextEGL::CreateEGLSurfaceForCompositorWidget  gfx/gl/GLContextProviderEGL.cpp:340
2  libxul.so  mozilla::wr::RenderCompositorEGL::CreateEGLSurface  gfx/webrender_bindings/RenderCompositorEGL.cpp:58
2  libxul.so  mozilla::wr::RenderCompositorEGL::Resume  gfx/webrender_bindings/RenderCompositorEGL.cpp:203
3  libxul.so  mozilla::wr::RendererOGL::Resume  gfx/webrender_bindings/RendererOGL.cpp:294
3  libxul.so  mozilla::wr::RenderThread::Resume  gfx/webrender_bindings/RenderThread.cpp:884
3  libxul.so  mozilla::wr::WebRenderAPI::Resume  gfx/webrender_bindings/WebRenderAPI.cpp:774
4  libxul.so  mozilla::wr::RenderThread::RunEvent  gfx/webrender_bindings/RenderThread.cpp:715
Crash Signature: [@ moz_container_wayland_get_egl_window]
Flags: needinfo?(stransky)
Status: RESOLVED → REOPENED
Flags: needinfo?(stransky)
Resolution: FIXED → ---

Looks like we need more patches here, Thanks.

Status: REOPENED → ASSIGNED
Duplicate of this bug: 1887610
Flags: needinfo?(stransky)
Flags: needinfo?(stransky)

Split window configuration (mGdkWindow setup) and compositor resume and use mutex to protect window config only.
That ensures we always have valid mGdkWindow but we don't deadlock if compositor resume is called from map/unmap during
forced EGLSurface update.

Attachment #9392538 - Attachment is obsolete: true

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 desktop browser crashes on nightly

For more information, please visit BugBot documentation.

Keywords: topcrash
Duplicate of this bug: 1851858
Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/79a66bf59a9a [Linux] Use mutex to protect nsWindow::GetNativeData(NS_NATIVE_EGL_WINDOW) r=emilio https://hg.mozilla.org/integration/autoland/rev/ba04f6471cad [Linux] Don't try to get NS_NATIVE_EGL_WINDOW for destroyed windows r=emilio
Status: ASSIGNED → RESOLVED
Closed: 1 year ago1 year ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: