Closed Bug 1677314 Opened 5 years ago Closed 4 years ago

MOZ_X11_EGL/Mate (no compositor)/Nvidia: Dragging tabs leads to EGL_BAD_MATCH (Successful alternative STR: GTK_CSD=1/enabled compositor/Nvidia -> Fallback to SW WR, WebGL still works)

Categories

(Core :: Widget: Gtk, defect, P2)

Firefox 84
x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1702546
Tracking Status
firefox84 --- disabled

People

(Reporter: alex, Assigned: rmader)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

Attached file about:support.txt

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Steps to reproduce:

WebRender force-enabled, Nvidia proprietary driver, Linux X11 MATE (no compositing).

  • export MOZ_X11_EGL=1
  • ./firefox
  • Maximize the window
  • open 2 tabs
  • drag first tab to re-order it, now 2nd
  • restore window

Actual results:

There's a short grey flash and the following is printed in the terminal:

[GFX1-]: Failed to create EGLSurface!: 0x3009
[GFX1-]: Failed to create EGLSurface!: 0x3009
[GFX1-]: Failed GL context creation for WebRender: 0
[GFX1-]: FEATURE_FAILTURE_WEBRENDER_INITIALIZE_UNSPECIFIED
[GFX1-]: Failed to connect WebRenderBridgeChild.
[GFX1-]: Compositors might be mixed (5,2)

After this, the rendering becomes corrupted. This is easiest to see when scrolling with middle-click autoscroll or by clicking and dragging the scrollbar (the scroll wheel doesn't show it to the same extent).

example: https://i.abaines.me.uk/moz_egl.webm

Expected results:

The graphics should not become corrupted.

Mate desktop with disabled compositing, Debian Testing, GTX1060, same driver version: Unfortunately I wasn't able to reproduce this yet.

This may well be related to visuals.
Let's see if fixing bug 1663273 helps, otherwise bug 1669275 would switch non-Mesa over to the new GLContextEGL::FindVisual.

Blocks: wr-nv-linux
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
See Also: → 1663273, 1650583
Summary: MOZ_X11_EGL Linux Nvidia proprietary graphical corruption → WebRender/MOZ_X11_EGL/Mate (no compositor)/proprietary Nvidia: Dragging tabs leads to EGL_BAD_MATCH with fallback to glitchy OpenGL
Severity: -- → S3

So far I haven't been able to reproduce this with disabled compositor.

In the meantime:

  • NVIDIA 450.80.02

    Later Nvidia drivers allowed more concurrent GL contexts.


Flutter had a similar EGL_BAD_MATCH problem with gtk_window_set_titlebar:
https://github.com/flutter/flutter/issues/59960


bug 1663273 comment 80: Easily reproducible with enabled compositor and GTK_CSD=1 environment variable:

Ubuntu 21.04, GTX 1060, Nvidia driver 470, MATE desktop, default==compositor enabled (System > Preferences > Look and Feel > Windows)

build from comment 0:
GTK_CSD=1 MOZ_X11_EGL=1 mozregression --launch 20201114094625 --pref gfx.webrender.all:true -a about:support

Compositing OpenGL
(#0) Error Failed to create EGLSurface!: 0x3009
(#1) Error Failed to create EGLSurface!: 0x3009
(#2) Error Failed GL context creation for WebRender: 0
(#3) Error FEATURE_FAILTURE_WEBRENDER_INITIALIZE_UNSPECIFIED
(#4) Error Failed to connect WebRenderBridgeChild.

today:
GTK_CSD=1 MOZ_X11_EGL=1 mozregression --launch 2021-09-18 --pref gfx.webrender.all:true -a about:support

Compositing WebRender (Software)
Failure Log
(#0) Error Failed to create EGLSurface!: 0x3009
(#1) Error Failed to create EGLSurface
(#2) Error Fallback WR to SW-WR
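
To compare the two failure logs above mechanically, here is a minimal Python sketch. It assumes the log lines have been copied out of about:support as plain text. The function name `summarize_failure_log` and the returned labels are made up for illustration and are not part of Firefox or mozregression.

```python
import re

# Matches the EGL surface failure line and captures the hex error code.
EGL_SURFACE_RE = re.compile(r"Failed to create EGLSurface!: (0x[0-9A-Fa-f]+)")


def summarize_failure_log(lines):
    """Return (set of EGL error codes, fallback label) for failure-log lines.

    The labels are illustrative only:
      "software-webrender"    -> log contains "Fallback WR to SW-WR"
      "webrender-init-failed" -> WebRender GL context creation failed, no SW-WR line
      "none"                  -> neither message was seen
    """
    codes = set()
    fallback = "none"
    for line in lines:
        match = EGL_SURFACE_RE.search(line)
        if match:
            codes.add(int(match.group(1), 16))
        if "Fallback WR to SW-WR" in line:
            fallback = "software-webrender"
        elif ("Failed GL context creation for WebRender" in line
              and fallback == "none"):
            fallback = "webrender-init-failed"
    return codes, fallback


old_build = [
    "(#0) Error Failed to create EGLSurface!: 0x3009",
    "(#1) Error Failed to create EGLSurface!: 0x3009",
    "(#2) Error Failed GL context creation for WebRender: 0",
    "(#3) Error FEATURE_FAILTURE_WEBRENDER_INITIALIZE_UNSPECIFIED",
    "(#4) Error Failed to connect WebRenderBridgeChild.",
]
new_build = [
    "(#0) Error Failed to create EGLSurface!: 0x3009",
    "(#1) Error Failed to create EGLSurface",
    "(#2) Error Fallback WR to SW-WR",
]

for name, log in (("20201114094625", old_build), ("2021-09-18", new_build)):
    codes, fallback = summarize_failure_log(log)
    print(name, sorted(hex(c) for c in codes), fallback)
```

Both builds report the same EGL error code. Only the fallback path differs between them.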

See Also: → 1731125
Summary: WebRender/MOZ_X11_EGL/Mate (no compositor)/proprietary Nvidia: Dragging tabs leads to EGL_BAD_MATCH with fallback to glitchy OpenGL → MOZ_X11_EGL/Mate (no compositor)/Nvidia: Dragging tabs leads to EGL_BAD_MATCH (Successful alternative STR: GTK_CSD=1/enabled compositor/Nvidia)
Summary: MOZ_X11_EGL/Mate (no compositor)/Nvidia: Dragging tabs leads to EGL_BAD_MATCH (Successful alternative STR: GTK_CSD=1/enabled compositor/Nvidia) → MOZ_X11_EGL/Mate (no compositor)/Nvidia: Dragging tabs leads to EGL_BAD_MATCH (Successful alternative STR: GTK_CSD=1/enabled compositor/Nvidia -> Fallback to SW WR, WebGL still works)

This should be fixed by Bug 1730533 - we need to correctly re-create XWindow when GdkWindow is mapped/unmapped.

Gnome X11, Ubuntu 21.04, Nvidia GTX 1060, driver 470.63.01
At the moment, the try build (bug 1730533 comment 2) crashes as soon as I hover the window. (Also reproducible on Gnome Xwayland, Intel.)
$ GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 7cebf0fe98b2a412308b0690351cb118b3057595 --pref gfx.webrender.all:true -a about:support -P stdout

**********
You should use a config file. Please use the --write-config command line flag to help you create one.
**********

 0:01.13 INFO: 7cebf0fe98b2a412308b0690351cb118b3057595 is not a release, assuming it's a hash...
 0:05.12 INFO: Downloading build from: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/K8IZDE9KQKW85E-AMSrbOg/runs/0/artifacts/public%2Fbuild%2Ftarget.tar.bz2
===== Downloaded 100% =====
 0:18.26 INFO: Running try build built on 2021-09-28 22:24:33.688000, revision 7cebf0fe
 0:27.36 INFO: Launching /tmp/tmpopfvt9bn/firefox/firefox
 0:27.36 INFO: Application command: /tmp/tmpopfvt9bn/firefox/firefox about:support -profile /tmp/tmpn7k577i_.mozrunner
 0:27.36 INFO: application_buildid: 20210928220623
 0:27.37 INFO: application_changeset: 7cebf0fe98b2a412308b0690351cb118b3057595
 0:27.37 INFO: application_name: Firefox
 0:27.37 INFO: application_repository: https://hg.mozilla.org/try
 0:27.37 INFO: application_version: 94.0a1
 0:27.82 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
 0:27.82 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
 0:28.26 INFO: b'[GFX1-]: Failed to create EGLSurface!: 0x3009'
 0:28.26 INFO: b'[GFX1-]: Failed to create EGLSurface'
 0:28.28 INFO: b'[GFX1-]: Fallback WR to SW-WR'
 0:50.71 INFO: b''
 0:50.71 INFO: b"(firefox:11255): Gdk-CRITICAL **: 03:02:36.121: gdk_x11_get_server_time: assertion 'GDK_IS_WINDOW (window)' failed"
 0:50.71 INFO: b''
 0:50.71 INFO: b"(firefox:11255): Gdk-CRITICAL **: 03:02:36.121: gdk_window_get_display: assertion 'GDK_IS_WINDOW (window)' failed"
 0:50.71 INFO: b''
 0:50.72 INFO: b"(firefox:11255): Gdk-CRITICAL **: 03:02:36.121: gdk_x11_display_get_xdisplay: assertion 'GDK_IS_DISPLAY (display)' failed"
 0:50.72 INFO: b'ExceptionHandler::GenerateDump cloned child 11473'
 0:50.72 INFO: b'ExceptionHandler::SendContinueSignalToChild sent continue signal to child'
 0:50.72 INFO: b'ExceptionHandler::WaitForContinueSignal waiting for continue signal...'
 0:50.92 INFO: b'Exiting due to channel error.'
 0:50.92 INFO: b'Exiting due to channel error.'
 0:50.92 INFO: b'Exiting due to channel error.'
Component: Graphics: WebRender → Widget: Gtk
Flags: needinfo?(stransky)

Also if it crashes for you, please run with G_DEBUG=fatal-criticals env variable and try to get a backtrace of the crash.

Priority: -- → P2

old try build (bug 1730533 comment 2) on Gnome Xwayland, Debian Testing, Intel:
$ G_DEBUG=fatal-criticals GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 7cebf0fe98b2a412308b0690351cb118b3057595 --pref gfx.webrender.all:true -a about:support -P stdout --command 'gdb {binary}'

**********
You should use a config file. Please use the --write-config command line flag to help you create one.
**********

 0:01.53 INFO: 7cebf0fe98b2a412308b0690351cb118b3057595 is not a release, assuming it's a hash...
 0:05.35 INFO: Downloading build from: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/K8IZDE9KQKW85E-AMSrbOg/runs/0/artifacts/public%2Fbuild%2Ftarget.tar.bz2
===== Downloaded 100% =====
 0:16.78 INFO: Running try build built on 2021-09-28 22:24:33.688000, revision 7cebf0fe
 0:26.15 INFO: application_buildid: 20210928220623
 0:26.15 INFO: application_changeset: 7cebf0fe98b2a412308b0690351cb118b3057595
 0:26.15 INFO: application_name: Firefox
 0:26.15 INFO: application_repository: https://hg.mozilla.org/try
 0:26.15 INFO: application_version: 94.0a1
 0:26.15 INFO: Running test command: `gdb /tmp/tmpg5_lhp7p/firefox/firefox`
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /tmp/tmpg5_lhp7p/firefox/firefox...
(No debugging symbols found in /tmp/tmpg5_lhp7p/firefox/firefox)
(gdb) run
Starting program: /tmp/tmpg5_lhp7p/firefox/firefox 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 7012 is executing new program: /tmp/tmpg5_lhp7p/firefox/firefox-bin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff777e640 (LWP 7016)]
[Thread 0x7ffff777e640 (LWP 7016) exited]
[Detaching after fork from child process 7017]
[Detaching after fork from child process 7019]
[New Thread 0x7ffff777e640 (LWP 7020)]
[New Thread 0x7fffeb961640 (LWP 7021)]
[New Thread 0x7fffeae96640 (LWP 7024)]
[New Thread 0x7fffe66ff640 (LWP 7025)]
[New Thread 0x7fffe66be640 (LWP 7026)]
[New Thread 0x7fffe667d640 (LWP 7027)]
[New Thread 0x7fffe663c640 (LWP 7028)]
[New Thread 0x7fffe65fb640 (LWP 7029)]
[New Thread 0x7fffe65ba640 (LWP 7030)]
[New Thread 0x7fffe63ff640 (LWP 7031)]
[Detaching after fork from child process 7032]
[New Thread 0x7fffeb160640 (LWP 7033)]
[New Thread 0x7fffe5bfe640 (LWP 7034)]
[New Thread 0x7fffe6534640 (LWP 7035)]
[New Thread 0x7fffe5bbd640 (LWP 7036)]
[New Thread 0x7fffe51ff640 (LWP 7039)]
[New Thread 0x7fffe5000640 (LWP 7040)]
[New Thread 0x7fffe4e01640 (LWP 7041)]
[New Thread 0x7fffe4c02640 (LWP 7042)]
[New Thread 0x7fffe5b7c640 (LWP 7044)]
[Thread 0x7fffe63ff640 (LWP 7031) exited]
[New Thread 0x7fffe46ff640 (LWP 7045)]
[Thread 0x7fffe5bfe640 (LWP 7034) exited]
[New Thread 0x7fffe63ff640 (LWP 7046)]
[New Thread 0x7fffe5bfe640 (LWP 7047)]
[New Thread 0x7fffe46be640 (LWP 7048)]
[Detaching after fork from child process 7049]
[New Thread 0x7fffe467d640 (LWP 7051)]
[New Thread 0x7fffe3fff640 (LWP 7057)]
[New Thread 0x7fffe3fbe640 (LWP 7059)]
[New Thread 0x7fffe3cff640 (LWP 7060)]
[New Thread 0x7fffe3f7d640 (LWP 7061)]
[New Thread 0x7fffe3998640 (LWP 7062)]
[New Thread 0x7fffe3957640 (LWP 7063)]
[New Thread 0x7fffe0b32640 (LWP 7064)]
[Thread 0x7fffe0b32640 (LWP 7064) exited]
[New Thread 0x7fffe0b32640 (LWP 7065)]
[New Thread 0x7fffdff66640 (LWP 7066)]
[Thread 0x7fffdff66640 (LWP 7066) exited]
[New Thread 0x7fffdff66640 (LWP 7067)]
[Thread 0x7fffe0b32640 (LWP 7065) exited]
[New Thread 0x7fffe0b32640 (LWP 7068)]
[Thread 0x7fffdff66640 (LWP 7067) exited]
[New Thread 0x7fffdff66640 (LWP 7069)]
[Thread 0x7fffe0b32640 (LWP 7068) exited]
ATTENTION: default value of option mesa_glthread overridden by environment.
[New Thread 0x7fffe0b32640 (LWP 7070)]
ATTENTION: default value of option mesa_glthread overridden by environment.
[New Thread 0x7fffd6110640 (LWP 7071)]
[New Thread 0x7fffd590f640 (LWP 7072)]
ATTENTION: default value of option mesa_glthread overridden by environment.
[New Thread 0x7fffd578d640 (LWP 7073)]
[New Thread 0x7fffd4f5c640 (LWP 7074)]
[New Thread 0x7fffd475b640 (LWP 7075)]
[New Thread 0x7fffd21ff640 (LWP 7076)]
[New Thread 0x7fffd1ffe640 (LWP 7077)]
[New Thread 0x7fffd1bff640 (LWP 7078)]
[New Thread 0x7fffd17ff640 (LWP 7079)]
[New Thread 0x7fffd12ff640 (LWP 7080)]
[New Thread 0x7fffd10fe640 (LWP 7081)]
[New Thread 0x7fffd0bff640 (LWP 7082)]
[New Thread 0x7fffd09fe640 (LWP 7083)]
[New Thread 0x7fffd1dfd640 (LWP 7084)]
ATTENTION: default value of option mesa_glthread overridden by environment.
[New Thread 0x7fffd01be640 (LWP 7085)]
[New Thread 0x7fffcf7ff640 (LWP 7086)]
[New Thread 0x7fffd19fe640 (LWP 7087)]
[New Thread 0x7fffce7ff640 (LWP 7088)]
[New Thread 0x7fffcdffe640 (LWP 7089)]
[Thread 0x7fffce7ff640 (LWP 7088) exited]
[Thread 0x7fffcdffe640 (LWP 7089) exited]
[New Thread 0x7fffcd5bc640 (LWP 7090)]
[New Thread 0x7fffcd57b640 (LWP 7091)]
[Thread 0x7fffcd5bc640 (LWP 7090) exited]
[New Thread 0x7fffcd5bc640 (LWP 7092)]
[New Thread 0x7fffe39ff640 (LWP 7093)]
[New Thread 0x7fffccf88640 (LWP 7094)]
[New Thread 0x7fffcc0ff640 (LWP 7095)]
[New Thread 0x7fffcce9e640 (LWP 7096)]
[New Thread 0x7fffcce5d640 (LWP 7097)]
[New Thread 0x7fffcbbff640 (LWP 7098)]
[Detaching after fork from child process 7099]
[New Thread 0x7fffcbb61640 (LWP 7100)]
[New Thread 0x7fffcb8ff640 (LWP 7102)]
[New Thread 0x7fffcb8be640 (LWP 7103)]
[New Thread 0x7fffcb87d640 (LWP 7104)]
[New Thread 0x7fffcb6ff640 (LWP 7106)]
[New Thread 0x7fffcdffe640 (LWP 7118)]
[New Thread 0x7fffce7ff640 (LWP 7121)]
[New Thread 0x7fffcadff640 (LWP 7122)]
[Detaching after fork from child process 7123]
[New Thread 0x7fffcb6be640 (LWP 7125)]
[New Thread 0x7fffccfff640 (LWP 7126)]
[New Thread 0x7fffca7ff640 (LWP 7133)]
[New Thread 0x7fffca5fe640 (LWP 7135)]
[New Thread 0x7fffca1ff640 (LWP 7144)]
[New Thread 0x7fffc9981640 (LWP 7152)]
[New Thread 0x7fffc9940640 (LWP 7178)]
[New Thread 0x7fffc0c1f640 (LWP 7180)]
[New Thread 0x7fffc02fe640 (LWP 7181)]
[New Thread 0x7fffc02bd640 (LWP 7182)]
[New Thread 0x7fffc027c640 (LWP 7183)]
[New Thread 0x7fffc0077640 (LWP 7184)]
[Detaching after fork from child process 7185]
[New Thread 0x7fffbfbff640 (LWP 7193)]
[New Thread 0x7fffbfbbe640 (LWP 7200)]
[Thread 0x7fffc02fe640 (LWP 7181) exited]

(firefox:7012): Gdk-CRITICAL **: 13:16:27.199: gdk_x11_get_server_time: assertion 'GDK_IS_WINDOW (window)' failed
--Type <RET> for more, q to quit, c to continue without paging--bt full

Thread 1 "GeckoMain" received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff5b59e52 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
(gdb) bt full
#0  0x00007ffff5b59e52 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#1  0x00007ffff5b5a0bf in g_log () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#2  0x00007ffff63d572a in gdk_x11_get_server_time () from /lib/x86_64-linux-gnu/libgdk-3.so.0
No symbol table info available.
#3  0x00007ffff15a0ed1 in mozilla::TimeStamp mozilla::SystemTimeConverter<unsigned int, mozilla::TimeStamp>::GetTimeStampFromSystemTime<mozilla::CurrentX11TimeGetter>(unsigned int, mozilla::CurrentX11TimeGetter&) () from /tmp/tmpg5_lhp7p/firefox/libxul.so
No symbol table info available.
#4  0x00007ffff159ec20 in nsWindow::OnEnterNotifyEvent(_GdkEventCrossing*) () from /tmp/tmpg5_lhp7p/firefox/libxul.so
No symbol table info available.
#5  0x00007ffff15a67e2 in enter_notify_event_cb(_GtkWidget*, _GdkEventCrossing*) () from /tmp/tmpg5_lhp7p/firefox/libxul.so
No symbol table info available.
#6  0x00007ffff6844f54 in ?? () from /lib/x86_64-linux-gnu/libgtk-3.so.0
No symbol table info available.
#7  0x00007ffff5c46889 in ?? () from /lib/x86_64-linux-gnu/libgobject-2.0.so.0
No symbol table info available.
#8  0x00007ffff5c5e394 in g_signal_emit_valist () from /lib/x86_64-linux-gnu/libgobject-2.0.so.0
No symbol table info available.
#9  0x00007ffff5c5f1df in g_signal_emit () from /lib/x86_64-linux-gnu/libgobject-2.0.so.0
No symbol table info available.
#10 0x00007ffff67eea14 in ?? () from /lib/x86_64-linux-gnu/libgtk-3.so.0
No symbol table info available.
#11 0x00007ffff66a3b49 in gtk_main_do_event () from /lib/x86_64-linux-gnu/libgtk-3.so.0
No symbol table info available.
#12 0x00007ffff638b7a5 in ?? () from /lib/x86_64-linux-gnu/libgdk-3.so.0
No symbol table info available.
#13 0x00007ffff63bf2b2 in ?? () from /lib/x86_64-linux-gnu/libgdk-3.so.0
No symbol table info available.
#14 0x00007ffff5b5285b in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#15 0x00007ffff5b52b08 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#16 0x00007ffff5b52bbf in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
--Type <RET> for more, q to quit, c to continue without paging--

new try build (comment 5) on Gnome Xwayland, Debian Testing, Intel:

  • no crash anymore
  • new glitches: a new incarnation of bug 1630251?

$ GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 5d2f2e7daeb2cddb8d7ccbb0f68d6f8f984087b1 --pref gfx.webrender.all:true -a about:support -P stdout


**********
You should use a config file. Please use the --write-config command line flag to help you create one.
**********


0:01.10 INFO: 5d2f2e7daeb2cddb8d7ccbb0f68d6f8f984087b1 is not a release, assuming it's a hash...
0:05.07 INFO: Downloading build from: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QdjVZvMHQWidJO5TsM7K8g/runs/0/artifacts/public%2Fbuild%2Ftarget.tar.bz2
===== Downloaded 100% =====
0:16.16 INFO: Running try build built on 2021-09-29 07:56:21.410000, revision 5d2f2e7d
0:30.13 INFO: Launching /tmp/tmp2i0pinid/firefox/firefox
0:30.13 INFO: Application command: /tmp/tmp2i0pinid/firefox/firefox about:support -profile /tmp/tmp0bswhqjw.mozrunner
0:30.13 INFO: application_buildid: 20210929073307
0:30.13 INFO: application_changeset: 5d2f2e7daeb2cddb8d7ccbb0f68d6f8f984087b1
0:30.13 INFO: application_name: Firefox
0:30.13 INFO: application_repository: https://hg.mozilla.org/try
0:30.13 INFO: application_version: 94.0a1
0:31.01 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
0:31.02 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
0:31.07 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
0:31.14 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'

Old try build (bug 1730533 comment 2) on Gnome X11, Ubuntu 21.04, Nvidia GTX 1060, driver 470.63.01.

Noticed: On Nvidia I need to hover the window to cause the crash. On Intel the crash happens on startup while my mouse is on the task bar.

$ G_DEBUG=fatal-criticals GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 7cebf0fe98b2a412308b0690351cb118b3057595 --pref gfx.webrender.all:true -a about:support -P stdout --command 'gdb {binary}'

**********
You should use a config file. Please use the --write-config command line flag to help you create one.
**********

 0:01.82 INFO: 7cebf0fe98b2a412308b0690351cb118b3057595 is not a release, assuming it's a hash...
 0:06.24 INFO: Downloading build from: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/K8IZDE9KQKW85E-AMSrbOg/runs/0/artifacts/public%2Fbuild%2Ftarget.tar.bz2
===== Downloaded 100% =====
 0:18.99 INFO: Running try build built on 2021-09-28 22:24:33.688000, revision 7cebf0fe
 0:28.01 INFO: application_buildid: 20210928220623
 0:28.01 INFO: application_changeset: 7cebf0fe98b2a412308b0690351cb118b3057595
 0:28.01 INFO: application_name: Firefox
 0:28.01 INFO: application_repository: https://hg.mozilla.org/try
 0:28.01 INFO: application_version: 94.0a1
 0:28.01 INFO: Running test command: `gdb /tmp/tmpu7qldr6b/firefox/firefox`
GNU gdb (Ubuntu 10.1-2ubuntu2) 10.1.90.20210411-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /tmp/tmpu7qldr6b/firefox/firefox...
(No debugging symbols found in /tmp/tmpu7qldr6b/firefox/firefox)
(gdb) run
Starting program: /tmp/tmpu7qldr6b/firefox/firefox 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 4059 is executing new program: /tmp/tmpu7qldr6b/firefox/firefox-bin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff777e640 (LWP 4063)]
[Thread 0x7ffff777e640 (LWP 4063) exited]
[Detaching after fork from child process 4064]
[Detaching after fork from child process 4065]
[New Thread 0x7ffff777e640 (LWP 4066)]
[New Thread 0x7fffebdbd640 (LWP 4067)]
[New Thread 0x7fffeb2ff640 (LWP 4068)]
[New Thread 0x7fffe6c6d640 (LWP 4069)]
[New Thread 0x7fffe6aff640 (LWP 4070)]
[New Thread 0x7fffe6abe640 (LWP 4071)]
[New Thread 0x7fffe6a7d640 (LWP 4072)]
[New Thread 0x7fffe68ff640 (LWP 4073)]
[New Thread 0x7fffe68be640 (LWP 4074)]
[New Thread 0x7fffe687d640 (LWP 4075)]
[New Thread 0x7fffec585640 (LWP 4076)]
[Detaching after fork from child process 4077]
[New Thread 0x7fffe59ff640 (LWP 4078)]
[New Thread 0x7fffeb52b640 (LWP 4079)]
[New Thread 0x7fffe59be640 (LWP 4080)]
[Thread 0x7fffe687d640 (LWP 4075) exited]
[New Thread 0x7fffe55ff640 (LWP 4081)]
[New Thread 0x7fffe5400640 (LWP 4082)]
[New Thread 0x7fffe5201640 (LWP 4083)]
[New Thread 0x7fffe5002640 (LWP 4084)]
[New Thread 0x7fffe597d640 (LWP 4085)]
[New Thread 0x7fffe4aff640 (LWP 4086)]
[Thread 0x7fffe59ff640 (LWP 4078) exited]
[New Thread 0x7fffe4abe640 (LWP 4087)]
[New Thread 0x7fffe59ff640 (LWP 4088)]
[New Thread 0x7fffe687d640 (LWP 4089)]
[New Thread 0x7fffe4a7d640 (LWP 4090)]
[New Thread 0x7fffe41ff640 (LWP 4091)]
[Detaching after fork from child process 4092]
[New Thread 0x7fffe41be640 (LWP 4093)]
[New Thread 0x7fffe3fbf640 (LWP 4095)]
[New Thread 0x7fffe3dff640 (LWP 4098)]
[New Thread 0x7fffe3dbe640 (LWP 4103)]
[New Thread 0x7fffe3d7d640 (LWP 4104)]
[New Thread 0x7fffe0d9e640 (LWP 4105)]
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
[New Thread 0x7fffd91ed640 (LWP 4106)]
[New Thread 0x7fffd89ec640 (LWP 4107)]
[New Thread 0x7fffd81eb640 (LWP 4108)]
[New Thread 0x7fffd79ea640 (LWP 4109)]
[New Thread 0x7fffd71e9640 (LWP 4110)]
[New Thread 0x7fffd69e8640 (LWP 4111)]
[New Thread 0x7fffd61e7640 (LWP 4112)]
[New Thread 0x7fffd59e6640 (LWP 4113)]
[New Thread 0x7fffd50a4640 (LWP 4114)]
[New Thread 0x7fffd48a3640 (LWP 4115)]
[New Thread 0x7fffd40a2640 (LWP 4116)]
[New Thread 0x7fffd38a1640 (LWP 4117)]
[New Thread 0x7fffd30a0640 (LWP 4118)]
[New Thread 0x7fffd305f640 (LWP 4119)]
[New Thread 0x7fffd301e640 (LWP 4120)]
[New Thread 0x7fffd2e1d640 (LWP 4121)]
[New Thread 0x7fffd2aff640 (LWP 4122)]
[New Thread 0x7fffd26ff640 (LWP 4123)]
[New Thread 0x7fffd21ff640 (LWP 4124)]
[New Thread 0x7fffd1ffe640 (LWP 4125)]
[New Thread 0x7fffd1aff640 (LWP 4126)]
[New Thread 0x7fffd18fe640 (LWP 4127)]
[New Thread 0x7fffd28fe640 (LWP 4128)]
[New Thread 0x7fffd1dba640 (LWP 4129)]
[New Thread 0x7fffcd87b640 (LWP 4131)]
[New Thread 0x7fffcd8bc640 (LWP 4130)]
[New Thread 0x7fffe3f7e640 (LWP 4132)]
[New Thread 0x7fffcd5ff640 (LWP 4133)]
[New Thread 0x7fffcc3ff640 (LWP 4134)]
[New Thread 0x7fffcc788640 (LWP 4135)]
[New Thread 0x7fffcc747640 (LWP 4136)]
[New Thread 0x7fffcbcff640 (LWP 4137)]
[New Thread 0x7fffcbcbe640 (LWP 4138)]
[New Thread 0x7fffcb6ff640 (LWP 4139)]
[New Thread 0x7fffcb4fe640 (LWP 4140)]
[New Thread 0x7fffcb2fd640 (LWP 4141)]
[Detaching after fork from child process 4142]
[New Thread 0x7fffcbc7d640 (LWP 4143)]
[New Thread 0x7fffcd547640 (LWP 4145)]
[New Thread 0x7fffcc7ff640 (LWP 4146)]
[Detaching after fork from child process 4156]
[New Thread 0x7fffcb07e640 (LWP 4157)]
[New Thread 0x7fffcabff640 (LWP 4165)]
[New Thread 0x7fffca9fe640 (LWP 4166)]
[GFX1-]: Failed to create EGLSurface!: 0x3009
[GFX1-]: Failed to create EGLSurface
[GFX1-]: Fallback WR to SW-WR
[Thread 0x7fffcb4fe640 (LWP 4140) exited]
[Thread 0x7fffcb6ff640 (LWP 4139) exited]
[Thread 0x7fffcb2fd640 (LWP 4141) exited]
[New Thread 0x7fffe3f36640 (LWP 4181)]
[New Thread 0x7fffcb2fd640 (LWP 4182)]
[New Thread 0x7fffcb6ff640 (LWP 4183)]
[New Thread 0x7fffcb4fe640 (LWP 4184)]
[New Thread 0x7fffcad6f640 (LWP 4188)]
[New Thread 0x7fffc467f640 (LWP 4196)]
[New Thread 0x7fffc6599640 (LWP 4197)]
[New Thread 0x7fffc6558640 (LWP 4198)]
[New Thread 0x7fffc3bff640 (LWP 4199)]
[New Thread 0x7fffc3bbe640 (LWP 4200)]
[Detaching after fork from child process 4201]
[New Thread 0x7fffc3b7d640 (LWP 4210)]
[New Thread 0x7fffc36ff640 (LWP 4213)]
[Thread 0x7fffe4aff640 (LWP 4086) exited]
[Detaching after fork from child process 4219]
[New Thread 0x7fffc32ff640 (LWP 4220)]
[New Thread 0x7fffc32be640 (LWP 4223)]
[New Thread 0x7fffe4aff640 (LWP 4224)]
[New Thread 0x7fffc327d640 (LWP 4228)]
[New Thread 0x7fffc2fff640 (LWP 4229)]
[New Thread 0x7fffc2fbe640 (LWP 4237)]
[New Thread 0x7fffcdb22640 (LWP 4242)]
[New Thread 0x7fffc1aff640 (LWP 4243)]
[New Thread 0x7fffc18fe640 (LWP 4245)]
[New Thread 0x7fffc16fd640 (LWP 4246)]
[New Thread 0x7fffc48ff640 (LWP 4251)]
[New Thread 0x7fffbcbff640 (LWP 4252)]
[New Thread 0x7fffb00ff640 (LWP 4253)]
[New Thread 0x7fffafefe640 (LWP 4254)]
[New Thread 0x7fffafcfd640 (LWP 4262)]
[Thread 0x7fffafcfd640 (LWP 4262) exited]
[New Thread 0x7fffc48be640 (LWP 4263)]
[New Thread 0x7fffafcfd640 (LWP 4264)]
[New Thread 0x7fffaf7ff640 (LWP 4266)]
[Thread 0x7fffafefe640 (LWP 4254) exited]
[New Thread 0x7fffc487d640 (LWP 4267)]
[New Thread 0x7fffc46ff640 (LWP 4268)]
[Thread 0x7fffe3dbe640 (LWP 4103) exited]
[New Thread 0x7fffe3dbe640 (LWP 4269)]
[New Thread 0x7fffc2f7d640 (LWP 4270)]
[Detaching after fork from child process 4272]
[New Thread 0x7fffcd5be640 (LWP 4275)]
[New Thread 0x7fffafefe640 (LWP 4295)]
[New Thread 0x7fffc1482640 (LWP 4306)]
[New Thread 0x7fffc1441640 (LWP 4307)]
[New Thread 0x7fffc0deb640 (LWP 4310)]
[New Thread 0x7fffc0daa640 (LWP 4311)]
[New Thread 0x7fffad1ff640 (LWP 4316)]
[New Thread 0x7fffad1be640 (LWP 4317)]
[Thread 0x7fffaf7ff640 (LWP 4266) exited]
[New Thread 0x7fffad17d640 (LWP 4318)]
[New Thread 0x7fffa9ab7640 (LWP 4319)]
[New Thread 0x7fffa9a76640 (LWP 4320)]
[Thread 0x7fffc48be640 (LWP 4263) exited]
[New Thread 0x7fffc48be640 (LWP 4321)]
[New Thread 0x7fffa96ff640 (LWP 4322)]
[New Thread 0x7fffc14fc640 (LWP 4323)]
[Thread 0x7fffc6599640 (LWP 4197) exited]
[New Thread 0x7fffc6599640 (LWP 4324)]
[Thread 0x7fffc14fc640 (LWP 4323) exited]
[New Thread 0x7fffc14fc640 (LWP 4325)]
[Thread 0x7fffc6599640 (LWP 4324) exited]
[New Thread 0x7fffc6599640 (LWP 4326)]
[Thread 0x7fffc14fc640 (LWP 4325) exited]
[New Thread 0x7fffc14fc640 (LWP 4327)]
[Thread 0x7fffc6599640 (LWP 4326) exited]
[New Thread 0x7fffc6599640 (LWP 4328)]
[Thread 0x7fffc14fc640 (LWP 4327) exited]

(firefox:4059): Gdk-CRITICAL **: 13:46:32.492: gdk_x11_get_server_time: assertion 'GDK_IS_WINDOW (window)' failed
--Type <RET> for more, q to quit, c to continue without paging--bt full

Thread 1 "GeckoMain" received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff5ba0b24 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
(gdb) bt full
#0  0x00007ffff5ba0b24 in g_logv () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1  0x00007ffff5ba0db3 in g_log () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007ffff64053da in gdk_x11_get_server_time () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#3  0x00007ffff1c7ded1 in mozilla::TimeStamp mozilla::SystemTimeConverter<unsigned int, mozilla::TimeStamp>::GetTimeStampFromSystemTime<mozilla::CurrentX11TimeGetter>(unsigned int, mozilla::CurrentX11TimeGetter&) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#4  0x00007ffff1c7bc20 in nsWindow::OnEnterNotifyEvent(_GdkEventCrossing*) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#5  0x00007ffff1c837e2 in enter_notify_event_cb(_GtkWidget*, _GdkEventCrossing*) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#6  0x00007ffff68754a8 in  () at /lib/x86_64-linux-gnu/libgtk-3.so.0
#7  0x00007ffff5caa724 in g_signal_emit_valist () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#8  0x00007ffff5caa893 in g_signal_emit () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#9  0x00007ffff683c754 in  () at /lib/x86_64-linux-gnu/libgtk-3.so.0
#10 0x00007ffff66df09f in gtk_main_do_event () at /lib/x86_64-linux-gnu/libgtk-3.so.0
#11 0x00007ffff63c0733 in  () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#12 0x00007ffff63f7e36 in  () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#13 0x00007ffff5b988eb in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x00007ffff5bebd28 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x00007ffff5b96023 in g_main_context_iteration () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16 0x00007ffff1cb7b90 in nsAppShell::ProcessNextNativeEvent(bool) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#17 0x00007ffff1c422a8 in nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#18 0x00007ffff1c4238d in non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#19 0x00007fffef466d2f in nsThread::ProcessNextEvent(bool, bool*) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#20 0x00007fffef46b388 in NS_ProcessNextEvent(nsIThread*, bool) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#21 0x00007fffefa2e34e in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#22 0x00007fffef9de916 in MessageLoop::Run() () at /tmp/tmpu7qldr6b/firefox/libxul.so
#23 0x00007ffff1c41fa9 in nsBaseAppShell::Run() () at /tmp/tmpu7qldr6b/firefox/libxul.so
#24 0x00007ffff2ed8ea4 in nsAppStartup::Run() () at /tmp/tmpu7qldr6b/firefox/libxul.so
#25 0x00007ffff2faa312 in XREMain::XRE_mainRun() () at /tmp/tmpu7qldr6b/firefox/libxul.so
#26 0x00007ffff2faac9a in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#27 0x00007ffff2faafea in XRE_main(int, char**, mozilla::BootstrapConfig const&) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#28 0x000055555557cb94 in main ()
(gdb) q
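
Both backtraces reach the failing gdk_x11_get_server_time call through nsWindow::OnEnterNotifyEvent. For triage, a small Python sketch like the following can pull the first frame belonging to a given module out of pasted `bt` output. The regular expression only targets the frame format shown in this bug (`#N 0xADDR in FUNC () from|at MODULE`), and the helper name `first_frame_in` is hypothetical.

```python
import re

# One gdb frame line: "#3  0x... in some::function(args) () at /path/to/lib.so"
# The module part is optional (e.g. "#28 0x... in main ()").
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+0x[0-9a-fA-F]+ in (?P<func>.*?) \(\)"
    r"(?: (?:from|at) (?P<module>\S+))?\s*$"
)


def first_frame_in(backtrace, module_suffix):
    """Return (frame index, function) of the first frame in the given module."""
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip "No symbol table info available." and similar lines
        module = match.group("module")
        if module and module.endswith(module_suffix):
            return int(match.group("index")), match.group("func")
    return None


SAMPLE = """\
#0  0x00007ffff5ba0b24 in g_logv () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1  0x00007ffff5ba0db3 in g_log () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007ffff64053da in gdk_x11_get_server_time () at /lib/x86_64-linux-gnu/libgdk-3.so.0
#3  0x00007ffff1c7ded1 in mozilla::TimeStamp mozilla::SystemTimeConverter<unsigned int, mozilla::TimeStamp>::GetTimeStampFromSystemTime<mozilla::CurrentX11TimeGetter>(unsigned int, mozilla::CurrentX11TimeGetter&) () at /tmp/tmpu7qldr6b/firefox/libxul.so
#4  0x00007ffff1c7bc20 in nsWindow::OnEnterNotifyEvent(_GdkEventCrossing*) () at /tmp/tmpu7qldr6b/firefox/libxul.so
"""

index, func = first_frame_in(SAMPLE, "libxul.so")
print(index, func)
```

The same helper works on the Intel backtrace, which uses `from` instead of `at` before the module path.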

New try build (comment 5) on Gnome X11, Ubuntu 21.04, Nvidia GTX 1060, driver 470.63.01.

  • apparently no glitches
  • still a fallback to SW WR
  • no crash anymore when hovering the window

GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 5d2f2e7daeb2cddb8d7ccbb0f68d6f8f984087b1 --pref gfx.webrender.all:true -a about:support -P stdout

0:28.29 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
0:28.29 INFO: b'ATTENTION: default value of option mesa_glthread overridden by environment.'
0:28.73 INFO: b'[GFX1-]: Failed to create EGLSurface!: 0x3009'
0:28.73 INFO: b'[GFX1-]: Failed to create EGLSurface'
0:28.74 INFO: b'[GFX1-]: Fallback WR to SW-WR'

I had to correct a few comments:

  • Nvidia is Ubuntu 21.04, not Debian Testing. (Muscle memory, sorry)
  • comment 8: I replaced incorrect "no fallback to sw wr anymore" with "no crash anymore" because it crashed on startup on Intel, but now it doesn't crash anymore.

Nvidia:
comment 0 got 0x3009 (EGL_BAD_MATCH) when dragging tabs.
I could not reproduce it, but I can get 0x3009 (EGL_BAD_MATCH) when starting with GTK_CSD=1 env var.
I am not testing tab dragging, I am testing what I can reproduce (GTK_CSD).
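
For readers decoding the hex value in the logs: the codes below are the standard EGL error constants from the Khronos EGL headers. The lookup helper `egl_error_name` is just an illustrative Python sketch, not something Firefox provides.

```python
# EGL error codes as defined in the Khronos EGL headers (egl.h).
EGL_ERROR_NAMES = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(code):
    """Map a numeric EGL error code to its symbolic name."""
    return EGL_ERROR_NAMES.get(code, f"unknown EGL error 0x{code:04X}")


# The value printed after "Failed to create EGLSurface!:" in the logs.
print(egl_error_name(int("0x3009", 16)))
```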

The GTK_CSD should be more tested, we use that on KDE only, Gnome runs without it by default.
We may run more tests with that, perhaps mochitest/reftest suite.

elementary OS seemed to use it, it ran into bug 1683341.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #13)

The GTK_CSD should be more tested, we use that on KDE only, Gnome runs without it by default.
We may run more tests with that, perhaps mochitest/reftest suite.

Martin, IIUC GTK_CSD implies GTK_DECORATION_CLIENT, which is used unconditionally on Wayland and by roughly half of all X11 DEs[1], including Mutter-based ones (popOS). In order to reduce testing overhead and complexity, do you think it would make sense to simply make it the default (and focus on fixing bugs with it)? Some considerations:

  1. alpha visuals are used by default anyway, especially with HW-WR
  2. modern compositors likely benefit from not having to draw decorations themselves, compensating for overhead in Firefox
  3. using shaped textures can be very slow[2][3], e.g. on Mutter (where we don't use it[4])
  4. GTK >= 3 apps usually use CSD AFAIK

Do you know any cases where we definitely need GTK_DECORATION_SYSTEM?

1: https://searchfox.org/mozilla-central/source/widget/gtk/nsWindow.cpp#9077-9120
2: https://bugs.chromium.org/p/chromium/issues/detail?id=1198080
3: https://gitlab.gnome.org/GNOME/mutter/-/issues/1754
4: https://searchfox.org/mozilla-central/source/widget/gtk/nsWindow.cpp#9131-9140

Flags: needinfo?(stransky)

From my experience Mutter/X11 does not work well with GTK_DECORATION_CLIENT, not sure if that was XWindow or X.Org session.

Flags: needinfo?(stransky)

If the EGL_BAD_MATCH only happens on nvidia+EGL I think I know what's causing it. Will write a patch.

gdk_screen_get_rgba_visual internally uses GLX on X11, causing
errors when using the nvidia driver on EGL.

Use our own FindVisual() instead and stop using it on Wayland -
it's not needed there.

Depends on D126922

Assignee: nobody → robert.mader

Independently of bug 1730533 the patch above might help with the EGL_BAD_MATCH errors. Jan, could you give https://treeherder.mozilla.org/jobs?repo=try&revision=86f8f75c5413d7588e3d274adafaa14d8bb53d88 a try?

Mixing GLX/EGL on Nvidia is only a problem with GLX_SGI_video_sync.
gdk_screen_get_rgba_visual definitely chooses the correct xvisual.
Then your fix seems to be "choosing the wrong xvisual", but I can test this try build later.

There seem to be two steps:

  1. We need to find the correct (transparent/opaque) xvisual for the widget and set it on the widget.
  • Mesa: gdk_screen_get_rgba_visual or GLX::FindVisual must be used on Mesa to set transparent Xvisual on the widget because Mesa lies about EGL_NATIVE_VISUAL_ID (it always wants opaque Xvisuals). If GLX is the only way to check visual compatibility between GL and X, then it's the solution until Mesa's EGL is fixed.
  • Nvidia:
    • a) gdk_screen_get_rgba_visual or GLX::FindVisual can be used here as well without problem.
    • b) Use EGL to do the same as gdk_screen_get_rgba_visual:
      I assume it would be: Get suitable egl fb configs, query each EGL_NATIVE_VISUAL_ID for XVisualInfo, use the fb config and xvisual if XVisualInfo.depth is desired depth (32 or 24). This egl fb config must be used in step two.
  2. Use remembered egl fb config or detect it:
    Query widget for XVisualID. Get suitable egl fb configs, compare EGL_NATIVE_VISUAL_ID with widget visual, choose egl fb config if they match. (Nvidia: First egl fb config wants opaque xvisual, second one wants transparent xvisual and matches.) If none matched (Mesa), choose the first egl fb config.
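The two steps above can be sketched as follows. This is only a minimal model, not the code in GLContextProviderEGL.cpp: the FbConfig struct and both function names are hypothetical, and in real code the visual ID would be read per config via eglGetConfigAttrib with EGL_NATIVE_VISUAL_ID and the depth from the XVisualInfo it resolves to.

```cpp
#include <cassert>  // only needed for the usage checks in the note below
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one EGL framebuffer config.
struct FbConfig {
  uint32_t configId;
  uint32_t nativeVisualId;  // what EGL_NATIVE_VISUAL_ID would report
  int visualDepth;          // depth of that X visual: 32 (alpha) or 24 (opaque)
};

// Step 1, option b): return the first config whose X visual has the desired
// depth. The caller would set that visual on the widget and remember the config.
std::optional<FbConfig> FindConfigByDepth(const std::vector<FbConfig>& configs,
                                          int desiredDepth) {
  for (const auto& c : configs) {
    if (c.visualDepth == desiredDepth) return c;
  }
  return std::nullopt;
}

// Step 2: return the config whose native visual equals the widget's visual.
// If none matches (the Mesa case described above), fall back to the first one.
std::optional<FbConfig> FindConfigForWidget(const std::vector<FbConfig>& configs,
                                            uint32_t widgetVisualId) {
  if (configs.empty()) return std::nullopt;
  for (const auto& c : configs) {
    if (c.nativeVisualId == widgetVisualId) return c;
  }
  return configs.front();
}
```

With two configs where only the second one has a depth-32 visual, FindConfigByDepth(configs, 32) and FindConfigForWidget(configs, <that visual's ID>) both return the second config, which is the Nvidia behaviour assumed in the steps above.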

gdk_screen_get_rgba_visual definitely chooses the correct xvisual.
...
gdk_screen_get_rgba_visual or GLX::FindVisual can be used here as well without problem.

Err, how do you know? I'm pretty sure it didn't work before bug 1717816

(In reply to Robert Mader [:rmader] from comment #21)

gdk_screen_get_rgba_visual definitely chooses the correct xvisual.
...
gdk_screen_get_rgba_visual or GLX::FindVisual can be used here as well without problem.

Err, how do you know?

Screencast + comment in bug 1731125 comment 19:
I manually tested each suitable Nvidia EGL framebuffer config and found out that the fbconfig/xvisual combination it wants when transparency should work caused EGL_BAD_MATCH because setting the Xvisual on the widget didn't work. Then I set the xvisual earlier, not in the realize_callback anymore, and then transparency worked.
I am using gdk_screen_get_rgba_visual in bug 1731125 comment 23 (visualtest.c) for a transparent EGL window on Nvidia. It is anyway required on EGL/Mesa.
My example app doesn't even need bug 1646135 comment 19.

I'm pretty sure it didn't work before bug 1717816

Transparency on EGL/X11/Nvidia+Mesa was fixed by bug 1663003 comment 17.

  • 2020-09-10: Black window corners and popup borders.
  • 2020-09-11: Transparent window corners + popup borders with WR/EGL/X11/Nvidia.

bug 1663152 comment 11 fixed WebGL2/EGL/X11/Nvidia by requesting OpenGL 3.2 instead of 3.1.

Some change after this broke EGL/Nvidia in general.

It was bug 1717816 which broke transparency on EGL/X11/Nvidia. I just didn't complain because it created a need to properly fix the EGL codepath.
You did it at the same time when rightfully disabling GLX vsync for EGL/Nvidia.
Before and after bug 1717816, EGL/Nvidia was broken because of other reasons. Bug 1646135 and IIRC another patch fixed EGL/Nvidia again.

(In reply to Darkspirit from comment #22)

It was bug 1717816 which broke transparency on EGL/X11/Nvidia. I just didn't complain because it created a need to properly fix the EGL codepath.
You did it at the same time when rightfully disabling GLX vsync for EGL/Nvidia.
Before and after bug 1717816, EGL/Nvidia was broken because of other reasons. Bug 1646135 and IIRC another patch fixed EGL/Nvidia again.

Hm, that's very interesting. Could you give the following try build a go on nvidia and confirm it fixes the transparency issues? https://treeherder.mozilla.org/jobs?repo=try&revision=6e407dcfed4992e9b6d728a7a6483be33f128d42

Forgot to :ni

Flags: needinfo?(jan)

Jan, any news here? Did you get around to give the two builds above a quick test run?

I'll test it today.

I need to make my own Firefox build and test a bit around.

  • comment 19: "Gnome X11 with GTK_CSD=1/Nvidia" is still 0x3009 (EGL_BAD_MATCH). Gnome X11/Nvidia works, but has no transparency. This EGL code is not able to select correct visual, it makes no sense to use it. It would need to select the second egl fb config's xvisual.
  • comment 23 (gdk_screen_get_rgba_visual) works with GLX, but breaks EGL in general (EGL_BAD_MATCH). I need to find the bug with printf debugging.
    • I know from my test app that the visual from gdk_screen_get_rgba_visual is correct/desired.
    • The visual might be set to the wrong widget, CreateConfig might have a bug, it might not get a visual passed in or the visual of the wrong widget.

Btw, is it desired that only 1 of 3 swapInterval(0) that are set on Wayland is also set on X11?
https://searchfox.org/mozilla-central/search?q=fSwapInterval%280%29&path=&case=false&regexp=false

  1. CreateGLContextEGL calls gl::GLContextProviderEGL::CreateForCompositorWidget with nullptr as the widget.

  2. visual is 0 because there is no widget.

  3. CreateConfig:

    • aEnableDepthBuffer is false.
    • First fb config is chosen, likely RGBA8888 with opaque xVisual. The second would likely be the correct transparent one.
  4. RenderCompositor::Create has a widget.

https://searchfox.org/mozilla-central/rev/d37daf2f82ed22b6a2a5cbbb975423825dfd69fa/gfx/webrender_bindings/RenderThread.cpp#1164
CreateGLContextEGL
CreateForCompositorWidget
no aCompositorWidget
no window
GLContextEGLFactory::CreateImpl
no aWindow
calling CreateConfig with visualID 0
CreateConfig: aVisual: 0
want 32 depth
got 33 egl fb configs
config[0]
no aEnableDepthBuffer
found egl fb config
RenderCompositor::Create
RenderCompositor::Create: have aWidget
have eglCompositor
CreateEGLSurfaceForCompositorWidget
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=7.61261) [GFX1-]: Failed to create EGLSurface!: 0x3009
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=7.61261) |[1][GFX1-]: Failed to create EGLSurface (t=7.61264) [GFX1-]: Failed to create EGLSurface
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=7.61261) |[1][GFX1-]: Failed to create EGLSurface (t=7.61264) |[2][GFX1-]: Fallback WR to SW-WR (t=8.11584) [GFX1-]: Fallback WR to SW-WR

(In reply to Darkspirit from comment #27)

I need to make my own Firefox build and test a bit around.

  • comment 19: "Gnome X11 with GTK_CSD=1/Nvidia" is still 0x3009 (EGL_BAD_MATCH). Gnome X11/Nvidia works, but has no transparency. This EGL code is not able to select correct visual, it makes no sense to use it. It would need to select the second egl fb config's xvisual.

Ok, good to know that that one doesn't help.

  • comment 23 (gdk_screen_get_rgba_visual) works with GLX, but breaks EGL in general (EGL_BAD_MATCH). I need to find the bug with printf debugging.
    • I know from my test app that the visual from gdk_screen_get_rgba_visual is correct/desired.

The difference might be that in your test app AFAICS you use gdk_screen_get_rgba_visual() before creating the EGL context. Normally we already have our global one (the one with no window). gdk_screen_get_rgba_visual() will then create a GLX context when we already have an EGL context and that appears to be what breaks on the Nvidia driver. That's the reason why we need to use EGL to select the visual. And why it needs to finally get fixed (likely by somehow specing how to select a visual with alpha support - random orderings are not really a good solution).

Btw, is it desired that only 1 of 3 swapInterval(0) that are set on Wayland is also set on X11?
https://searchfox.org/mozilla-central/search?q=fSwapInterval%280%29&path=&case=false&regexp=false

Hm, that indeed looks like an oversight to me. However, AFAICS the two cases are only hit when using the opengl layers - which I think only happens when running with gfx.webrender.software.opengl enabled. I don't think we want to officially use/support that on Linux, as it would increase our complexity matrix quite a bit again. On Android it makes sense, as they don't need the pure software backend.

(In reply to Robert Mader [:rmader] from comment #29)

  • comment 23 (gdk_screen_get_rgba_visual) works with GLX, but breaks EGL in general (EGL_BAD_MATCH). I need to find the bug with printf debugging.
    • I know from my test app that the visual from gdk_screen_get_rgba_visual is correct/desired.

The difference might be that in your test app AFAICS you use gdk_screen_get_rgba_visual() before creating the EGL context. Normally we already have our global one (the one with no window). gdk_screen_get_rgba_visual() will then create a GLX context when we already have an EGL context and that appears to be what breaks on the Nvidia driver. That's the reason why we need to use EGL to select the visual. And why it needs to finally get fixed (likely by somehow specing how to select a visual with alpha support - random orderings are not really a good solution).

I am confused or you are confused or we both are confused in different aspects:

EGL_BAD_MATCH is caused by trying to call eglCreateWindowSurface with an EGLconfig that is incompatible to the Xwindow.
https://searchfox.org/mozilla-central/rev/b822a27de3947d3f4898defac6164e52caf1451b/gfx/gl/GLContextProviderEGL.cpp#194

Therefore CreateConfig must choose the fb config that wants the xvisual that our Xwindow has or will have,
but aVisual is always 0 because nullptr is passed as widget. Because aVisual is 0, the code to exclude incompatible egl configs is never used.

It's EGL CreateConfig which is broken: aVisual must not be 0. There must be aVisual so that the code can skip the first fb config on Nvidia to select the second which is compatible.
Can the visual be remembered somewhere or can gdk_screen_get_rgba_visual also be called when creating the shared context? Because the shared context needs an EGLconfig that is compatible with all windows it is used with. (IIUC: Either all windows are transparent, or none can be.)

I think I have read somewhere that GdkVisual is per-screen:
Does that mean that if the shared context is used for all windows and if a user has two Xscreens because he has two graphics cards with each one monitor, then a Firefox window on the second Xscreen/graphics card would need a different Xvisual, EGLconfig and EGLSurface than the other Firefox window on the other graphics card/Xscreen? But wouldn't that mean that at some point the window would be half visible on one graphics card/screen/monitor and on the other, which could never be compatible? Would it need to fall back to SW WR?
Or is it impossible to move a window between two Xscreens?

What I also did not understand when comparing Firefox to my demo app:
It might be required to listen for Gtk screen-changed events (IIUC: switching between virtual terminals (Ctrl+Alt+F<x>), suspend&resume?) and then to set new visuals on all widgets, to create a new compatible EGLconfig and to create a new EGLSurface from EGLconfig and Xwindow who are compatible to each other. Couldn't that be the underlying problem of bug 1279309?

Flags: needinfo?(jan)

Or another thought: Could eglCreateContext keep using its wrong EGLconfig, but eglCreateWindowSurface get a different EGLconfig that is compatible with its Xwindow? CreateSurfaceFromNativeWindow gets passed in a widget, so it might be possible to create a compatible EGLconfig at this point for its window?

(In reply to Darkspirit from comment #31)

Or another thought: Could eglCreateContext keep using its wrong EGLconfig, but eglCreateWindowSurface get a different EGLconfig that is compatible with its Xwindow? CreateSurfaceFromNativeWindow gets passed in a widget, so it might be possible to create a compatible EGLconfig at this point for its window?

printf debugging:

CreateGLContextEGL
CreateForCompositorWidget
no aCompositorWidget
no window
GLContextEGLFactory::CreateImpl
no aWindow
calling CreateConfig with visualID 0
CreateConfig: aVisual: 0
want 32 depth
got 33 egl fb configs
config[0]
no aEnableDepthBuffer
EGL_CONFIG_ID: 0x7 <---------------------- CreateConfig in GLContextProviderEGL.cpp
found egl fb config
x_visual_ptr->visualid: 0x82 <------------- gdk_screen_get_rgba_visual in nsWindow.cpp
RenderCompositor::Create
RenderCompositor::Create: have aWidget
have eglCompositor
CreateEGLSurfaceForCompositorWidget
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=24.8237) [GFX1-]: Failed to create EGLSurface!: 0x3009
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=24.8237) |[1][GFX1-]: Failed to create EGLSurface (t=24.8238) [GFX1-]: Failed to create EGLSurface
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to create EGLSurface!: 0x3009 (t=24.8237) |[1][GFX1-]: Failed to create EGLSurface (t=24.8238) |[2][GFX1-]: Fallback WR to SW-WR (t=25.4984) [GFX1-]: Fallback WR to SW-WR
RenderCompositor::Create
RenderCompositor::Create

meaning:

  1. Because aVisual is 0, CreateConfig selects first suitable egl fb config 0x007 which wants opaque xvisual 0x02c.
  2. gdk_screen_get_rgba_visual sets transparent xvisual 0x082 on the widget which would be wanted by egl fb config 0x008.
  3. CreateSurfaceFromNativeWindow then fails to combine egl fb config 0x007 (which wants opaque xvisual 0x02c) with the Xwindow with transparent xvisual 0x082 (which would be supported by egl fb config 0x008).
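To make the mismatch concrete, here is a small model of the three points above, using the config and visual IDs from this comment. It is a sketch under assumptions: CreateConfigModel and CreateWindowSurfaceModel are hypothetical helpers that only imitate the behaviour described here (the visual filter is skipped when aVisual is 0, and the Nvidia driver rejects a config whose native visual differs from the window's visual). They are not the real Firefox or EGL functions.

```cpp
#include <cassert>  // only needed for the usage checks in the note below
#include <cstdint>
#include <vector>

constexpr uint32_t kEglSuccess = 0x3000;   // numeric value of EGL_SUCCESS
constexpr uint32_t kEglBadMatch = 0x3009;  // numeric value of EGL_BAD_MATCH

// Hypothetical stand-in for one EGL framebuffer config.
struct FbConfig {
  uint32_t configId;
  uint32_t nativeVisualId;
};

// The two Nvidia configs discussed above: 0x007 wants opaque xvisual 0x02c,
// 0x008 wants transparent xvisual 0x082.
const std::vector<FbConfig> kNvidiaConfigs = {{0x007, 0x02c}, {0x008, 0x082}};

// Model of the selection: the visual filter only runs when aVisual is nonzero,
// so aVisual == 0 always yields the first config.
const FbConfig* CreateConfigModel(const std::vector<FbConfig>& configs,
                                  uint32_t aVisual) {
  for (const auto& c : configs) {
    if (aVisual != 0 && c.nativeVisualId != aVisual) continue;
    return &c;
  }
  return nullptr;  // no config matched the requested visual
}

// Model of surface creation on Nvidia as described in this thread: the
// config's native visual has to equal the window's visual.
uint32_t CreateWindowSurfaceModel(const FbConfig& config, uint32_t windowVisual) {
  return config.nativeVisualId == windowVisual ? kEglSuccess : kEglBadMatch;
}
```

In this model CreateConfigModel(kNvidiaConfigs, 0) returns config 0x007, and combining it with a window that has xvisual 0x082 yields 0x3009. Passing 0x082 as aVisual instead selects config 0x008, and surface creation succeeds.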

Idea from comment 31: Could CreateSurfaceFromNativeWindow query its widget's xvisual to call CreateConfig for itself to get a compatible EGLconfig for itself?
Then, both egl fb configs, the one for the EGLContext (0x007) and the one for the EGLSurface (0x008) would have the same contents except that they want different xvisuals:

nvidia-settings --eglinfo
--fc- --vi- --vt-- buf lv rgb colorbuffer am lm dp st -bind cfrm sb sm cav -----pbuffer----- swapin nv   rn   su -transparent--
  id    id         siz l  lum  r  g  b  a sz sz th en  -  a            eat widt hght max-pxs  mx mn rd   ty   ty typ  r  g  b  
-------------------------------------------------------------------------------------------------------------------------------

0x007 0x02c 0x8002  32  0 rgb  8  8  8  8  0  0  0  0  .  . 0x4D  0  0   . 8000 8000 40000000  8  0  .   4d  807   .  0  0  0
0x008 0x082 0x8002  32  0 rgb  8  8  8  8  0  0  0  0  .  . 0x4D  0  0   . 8000 8000 40000000  8  0  .   4d  807   .  0  0  0

$ glxinfo
132 GLX Visuals
    visual  x   bf lv rg d st  colorbuffer  sr ax dp st accumbuffer  ms  cav
  id dep cl sp  sz l  ci b ro  r  g  b  a F gb bf th cl  r  g  b  a ns b eat
----------------------------------------------------------------------------
0x02c 24 tc  0  32  0 r  y .   8  8  8  8 .  s  4  0  0 16 16 16 16  0 0 None
0x082 32 tc  0  32  0 r  y .   8  8  8  8 .  s  4  0  0 16 16 16 16  0 0 None

IIUC we generally don't want to create EGL contexts/configs for the visual, but rather the other way around: we create one global EGL context and then query the appropriate visual (and set it for each window via gtk_widget_set_visual()). From what you posted, I think what we get wrong is the initial EGL context creation: since switching to a global context, AFAICS the visualID argument for CreateConfig is simply dead[1].
Given that we now always want EGL configs with alpha and also always want visuals with alpha when using EGL, I think we need to hardcode choosing a config for the global context that matches a visual with alpha (and then make sure we always query that visual again).
I'll check that out.

1: https://searchfox.org/mozilla-central/source/gfx/gl/GLContextProviderEGL.cpp#252-266

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
Attachment #9244368 - Attachment is obsolete: true