Closed Bug 1693472 Opened 4 years ago Closed 4 years ago

Menu items under the cursor usually aren't highlighted in Firefox 86.0a1-87.0a1 on Wayland in Plasma

Categories

(Core :: Widget: Gtk, defect, P2)

Firefox 87
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
89 Branch
Tracking Status
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- wontfix
firefox89 --- fixed

People

(Reporter: matt.fagnani, Assigned: stransky)

References

(Blocks 1 open bug, )

Details

Attachments

(2 files)

Description:

I'm using Firefox Nightly 87.0a1 on Wayland with WebRender enabled in a Fedora 34 KDE Plasma installation. When I click on menus in the menu bar of Firefox Nightly 86.0a1 (2021-1-15) - 87.0a1 (2021-2-17), the menu items under the cursor usually aren't highlighted. Sometimes just the first item in the menu was highlighted. Occasionally multiple items in the menu were highlighted at the same time. The menus shown when right-clicking on a page are also affected by this issue. This problem started with 86.0a1 (2021-1-15). I ran the following to look for the commit involved
MOZ_ENABLE_WAYLAND=1 mozregression --good 2021-1-13 --bad 2021-1-18

The end of the process showed
13:17.81 INFO: Narrowed integration regression window from [25680895, 8a9e98e5] (4 builds) to [376c2ff6, 8a9e98e5] (2 builds) (~1 steps left)
13:17.81 INFO: No more integration revisions, bisection finished.
13:17.81 INFO: Last good revision: 376c2ff62872fbf7cc25803f9cbdcd42f8c1fe6c
13:17.81 INFO: First bad revision: 8a9e98e50d031c7a72429a08aad7325f50e34d2e
13:17.81 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=376c2ff62872fbf7cc25803f9cbdcd42f8c1fe6c&tochange=8a9e98e50d031c7a72429a08aad7325f50e34d2e

The first bad revision was https://hg.mozilla.org/integration/autoland/rev/8a9e98e50d031c7a72429a08aad7325f50e34d2e

Bug 1686703 [Wayland] Merge GetWaylandBufferWithSwitch()/GetWaylandBufferRecent() to GetWaylandBuffer(), r=jhorak

Depends on D101746

Differential Revision: https://phabricator.services.mozilla.com/D101747

The problem doesn't affect 86.0a1-87.0a1 on X in Plasma on Wayland with the same Firefox settings and system versions.

Steps to reproduce:

  1. Boot a Fedora 34 KDE Plasma installation updated to 2021-2-17
  2. Log in to Plasma 5.21.0 on Wayland
  3. Install or update to Firefox Nightly 87.0a1 (2021-2-17)
  4. Start Nightly on Wayland with WebRender enabled
    MOZ_ENABLE_WAYLAND=1 ./firefox
  5. Select the Bookmarks menu
  6. Move the cursor down over the Bookmarks menu
  7. Repeat 5-6 for other menus in the menu bar
  8. Go to a site such as https://bugzilla.mozilla.org
  9. Right-click somewhere on the page
  10. Move the cursor up and down the right-click menu

Versions:
Firefox Nightly 86.0a1 (2021-1-15) - 87.0a1 (2021-2-17)
gtk3-3.24.25-2.fc34.x86_64
plasma-desktop-5.21.0-2.fc34.x86_64
qt5-qtbase-5.15.2-13.fc34.x86_64
kf5-plasma-5.79.0-2.fc34.x86_64
mesa-dri-drivers-21.0.0~rc4-1.fc34.x86_64

Blocks: wayland-kde
Priority: -- → P2

Can you try to set widget.wayland-smooth-rendering to true at about:config and restart the browser?
Thanks.

Flags: needinfo?(matthew.fagnani)

No, setting widget.wayland-smooth-rendering doesn't resolve the issue. It's not a frame callback issue at quick glance. Firefox receives done events for all committed frame callbacks.

For some reason, Firefox just stops committing new buffers.

If I run Firefox with MOZ_LOG="WidgetWayland:5", I see lots of these log messages while hovering menu items:

[Parent 8601: Compositor]: D/WidgetWayland WindowSurfaceWayland::LockWaylandBuffer [0x7fad3e6ce800] Requesting buffer 490 x 728
[Parent 8601: Compositor]: D/WidgetWayland WindowSurfaceWayland::GetWaylandBuffer [0x7fad3e6ce800] Requested buffer [490 x 728] can switch 0
[Parent 8601: Compositor]: D/WidgetWayland     Recent WindowBackBuffer [0x7fad3d3cf9c0]
[Parent 8601: Compositor]: D/WidgetWayland         WindowBackBuffer [0][0x7fad3d3cf9c0] width 490 height 728 attached 1
[Parent 8601: Compositor]: D/WidgetWayland         WindowBackBuffer [1][0x7fad2c12fb40] width 490 height 728 attached 0
[Parent 8601: Compositor]: D/WidgetWayland         WindowBackBuffer [2] null
[Parent 8601: Compositor]: D/WidgetWayland     Buffer is attached and we can't switch, return null
[Parent 8601: Compositor]: D/WidgetWayland WindowSurfaceWayland::LockWaylandBuffer [0x7fad3e6ce800] Got buffer (nil)
[Parent 8601: Compositor]: D/WidgetWayland     Quit - no pending commit.

Note that contrary to other compositors, kwin doesn't release shared memory client buffers after uploading data to opengl texture, which is totally legit. This might be a culprit.

(In reply to Martin Stránský [:stransky] from comment #1)

Can you try to set widget.wayland-smooth-rendering to true at about:config and restart the browser?
Thanks.

Martin, there was no change in this menu highlighting problem in 87.0a1 (2021-2-19) on Wayland after I set widget.wayland-smooth-rendering to true in about:config and restarted. Should I make a report about this issue for kwin on bugs.kde.org given what Vlad mentioned? Thanks.

Flags: needinfo?(matthew.fagnani)

This highlighting problem still happens with Firefox Nightly 88.0a1 (2021-3-6) and earlier and 86.0 on Wayland with WebRender enabled in a Fedora 34 KDE Plasma installation and Plasma 5.21.2 and earlier. I reported this problem at https://bugs.kde.org/show_bug.cgi?id=434056

I haven't seen this problem in 88.0a1 (2021-3-7) and 86.0 on Wayland in GNOME 40 beta on Wayland in Fedora 34 when I tried to reproduce it a few times. Thanks.

Summary: Menu items under the cursor usually aren't highlighted in Firefox 86.0a1-87.0a1 on Wayland → Menu items under the cursor usually aren't highlighted in Firefox 86.0a1-87.0a1 on Wayland in Plasma

Other compositors handle shared memory client buffers differently, you may not hit this bug on GNOME or other desktop environments. But still, kwin does nothing wrong as far as I can tell.

89.0a1 (2021-3-26) and earlier on Wayland in Plasma 5.21.3 is still affected by this problem in Fedora 34. Thunderbird Daily 89.0a1 (2021-3-26) and earlier on Wayland in Plasma has had the same issue. Could Vlad's idea at https://bugs.kde.org/show_bug.cgi?id=434056#c4 be tried? Thanks.

See Also: → 1692748

(In reply to Vlad Zahorodnii [:zzag] from comment #5)

Note that contrary to other compositors, kwin doesn't release shared memory client buffers after uploading data to opengl texture, which is totally legit. This might be a culprit.

(In reply to Vlad Zahorodnii [:zzag] from comment #9)

Other compositors handle shared memory client buffers differently, you may not hit this bug on GNOME or other desktop environments. But still, kwin does nothing wrong as far as I can tell.

Yes, this sounds quite likely. If KWin doesn't release the shm-buffer early, as most other compositors do, we'd need to mirror EGL behaviour which IIRC uses three buffers (even though two might be sufficient here, not exactly sure). Martin, this is probably something you want to look into?

FTR: there's a new option called gfx.webrender.software.opengl (introduced in bug 1673342) which according to bug 1692748 comment 15 works around the problem in nightly - as we can normally rely on EGL being present on Wayland it could become the new default at some point. But as shm is the only required Wayland buffer format, we should still make sure it works everywhere according to spec, as fallback.

Forgot to NI - see above :)

Flags: needinfo?(stransky)

For the record: the Weston demo clients appear to use double-buffering[1]. According to the docs[2] WindowSurfaceWayland does the same - so in theory things should work IIUC. So this is maybe just a hickup in the logic somewhere :/

1: https://gitlab.freedesktop.org/wayland/weston/-/blob/8df8532ee5ccfdbf807073139b06bc44733bb170/clients/simple-damage.c#L81
2: https://searchfox.org/mozilla-central/source/widget/gtk/WindowSurfaceWayland.cpp#123-131

(In reply to Robert Mader [:rmader] from comment #16)

(In reply to W.Pelser from comment #14)

If I do in a terminal
:~> MOZ_LOG="WidgetWayland:5" /home/walther/.firefox/nightly/firefox/firefox
There is sometimes a endless amount of rows with the same contents, which do not stop. The lines are always like this:

Err yes, that's how it should be :)
This is verbose debug output which produces a lot of lines on every repaint.

Comment #14 is obsolete, as I found the old entry webgl.force-enabled true after having written comment #15. Since having deleted this one, I have similar output to comment #4

Matt, in the comment 0 you said that you observe the effect with WR enabled - however the bisected commit and the following discussion was about WindowSurfaceWayland which we don't use with HW-WR. WindowSurfaceWayland and all the custom buffer management is only used on the basic or SW-WR ("Webrender (software)" in about:support). Could you shortly clarify if you see the issue on HW-WR as well (" WebRender" in about:config)?

P.S.: HW-WR requires OpenGL 3.2, any hardware not supporting that should fall back to software-WR or basic

Flags: needinfo?(matthew.fagnani)

(In reply to Robert Mader [:rmader] from comment #18)

Matt, in the comment 0 you said that you observe the effect with WR enabled - however the bisected commit and the following discussion was about WindowSurfaceWayland which we don't use with HW-WR. WindowSurfaceWayland and all the custom buffer management is only used on the basic or SW-WR ("Webrender (software)" in about:support). Could you shortly clarify if you see the issue on HW-WR as well (" WebRender" in about:config)?

P.S.: HW-WR requires OpenGL 3.2, any hardware not supporting that should fall back to software-WR or basic

This problem has happened with WebRender (HW-WR) as I originally reported, OpenGL, and Basic (in Troubleshoot Mode) compositors in Nightly 89.0a1 and earlier on Wayland in Plasma. I set gfx.webrender.software.opengl=true and restarted Firefox, and the problem still happened. Compositing still showed just WebRender in Troubleshooting information with gfx.webrender.software.opengl=true though so it might've still been using HW-WR. The integrated Radeon R5 GPU in my laptop uses OpenGL 4.6 with the radeonsi driver from Mesa 21.0.1 according to Troubleshooting information. 89.0a1 and recent Nightly versions have defaulted to using WebRender HW-WR on my system.

The first bad commit in 86.0a1 (2021-1-15) "Bug 1686703 [Wayland] Merge GetWaylandBufferWithSwitch()/GetWaylandBufferRecent() to GetWaylandBuffer" appeared to reduce the number of Wayland buffer functions by one https://hg.mozilla.org/integration/autoland/rev/8a9e98e50d031c7a72429a08aad7325f50e34d2e Before 86.0a1 (2021-1-15), the right-click menus were frequently flickering with multiple simultaneously highlighted entries as I reported at https://bugzilla.mozilla.org/show_bug.cgi?id=1660567 Vlad wrote "Well, Firefox needs to create a new buffer if the existing one is held by the compositor." https://bugs.kde.org/show_bug.cgi?id=434056#c4 Could that approach be tried to address this issue? Thanks.

Flags: needinfo?(matthew.fagnani)

Matt: Thanks for clarifying. Again: if you use HW-WR, then your bisect is most likely wrong and the comments about new buffers are unrelated - WindowSurfaceWayland is simply not used with HW-WR.

Can I ask you for one more step? Running with MOZ_LOG="WidgetWayland:5" - if you use HW-WR, WindowSurfaceWayland should never show up in the log.

Flags: needinfo?(matthew.fagnani)

I ran 89.0a1 with MOZ_LOG="WidgetWayland:5" MOZ_ENABLE_WAYLAND=1 environment variables in Konsole with WebRender (HW-WR) compositing according to Troubleshooting information in Plasma 5.21.3 in Fedora 34. gfx.webrender.software.opengl was false. The attached log had 1321 lines containing WindowSurfaceWayland as calculated by grep WindowSurfaceWayland firefox-wayland-moz-log-menus-89.0a1-1.txt | wc -l

Flags: needinfo?(matthew.fagnani)

(In reply to Robert Mader [:rmader] from comment #20)

Matt: Thanks for clarifying. Again: if you use HW-WR, then your bisect is most likely wrong and the comments about new buffers are unrelated - WindowSurfaceWayland is simply not used with HW-WR.

Can I ask you for one more step? Running with MOZ_LOG="WidgetWayland:5" - if you use HW-WR, WindowSurfaceWayland should never show up in the log.

When I started 89.0a1 with MOZ_LOG="WidgetWayland:5" MOZ_ENABLE_WAYLAND=1 in the run I attached the attached log for, I selected the Help menu and then Troubleshooting information. I then clicked on the Bookmarks and Tools menus and moved the cursor over them and the items were usually not highlighted as reported.

Thanks, that's super helpful and I was able to verify this now.

So it looks like we are using either basic or SW-WR (both use WindowSurfaceWayland) for certain widgets, regardless of whether we use HW-WR or SW-WR+GL for the rest. That means the discussion above is indeed the right direction and also explains other bugs where people see similar behaviour for the whole browser window when not using HW-WR.

(In reply to Robert Mader [:rmader] from comment #23)

Thanks, that's super helpful and I was able to verify this now.

So it looks like we are using either basic or SW-WR (both use WindowSurfaceWayland) for certain widgets, regardless of whether we use HW-WR or SW-WR+GL for the rest. That means the discussion above is indeed the right direction and also explains other bugs where people see similar behaviour for the whole browser window when not using HW-WR.

I've never seen "Webrender (software)" in the Compositing field of the Troubleshooting information, and I've looked at that page many times in recent months. The software WebRender mentioned at https://bugzilla.mozilla.org/show_bug.cgi?id=1673342 looked like it was merged to Nightly around 2021-3-1. This problem with the menu items not being usually highlighted started with 86.0a1 (2021-1-15). This problem might be more likely with the Basic compositing of menus than SW-WR based on those observations and your message. No problem. Thanks.

If "gfx.webrender.software.unaccelerated-widget.force" is changed to true
The highlighting of the items in the sub windows of the Menu Bar is working. All other sub windows seem to work too.
CodeName=Firefox Nightly
Version=89.0a1
BuildID=20210328095625
SourceRepository=https://hg.mozilla.org/mozilla-central
SourceStamp=7798a3347ef9b969e74e86d1f57b322f18d6db0a
ID={ec8030f7-c20a-464f-9b0e-13a3a9e97384}

(In reply to W.Pelser from comment #25)

If "gfx.webrender.software.unaccelerated-widget.force" is changed to true
The highlighting of the items in the sub windows of the Menu Bar is working. All other sub windows seem to work too.
CodeName=Firefox Nightly
Version=89.0a1
BuildID=20210328095625
SourceRepository=https://hg.mozilla.org/mozilla-central
SourceStamp=7798a3347ef9b969e74e86d1f57b322f18d6db0a
ID={ec8030f7-c20a-464f-9b0e-13a3a9e97384}

This works not completely.
If I open a site with downloadable videos the highlighting of the right click items works from up to down, but sometimes the first items from above are not highlighted. F.I. on this site: https://www.3sat.de/kultur.
Same happens with right click on this (https://bugzilla.mozilla.org/show_bug.cgi?id=1693472) site: the items above Select All are not highlighted.
Same with the sub window History of the Menu Bar: Sometimes all items are highlighted, then move the cursor up and down and the effect is, that some items above and down are not highlighted any more.

I see, yes. That's definitely a bug at Firefox side then. It's interesting that KWin does not release the buffers - I wonder why it keeps it?
But well, there's perhaps no point to optimize the SW backend much as we prefer to use WebRender anyway so let's fix that.

Assignee: nobody → stransky
Flags: needinfo?(stransky)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #27)

I see, yes. That's definitely a bug at Firefox side then. It's interesting that KWin does not release the buffers - I wonder why it keeps it?

With the current kwin architecture, compositing can be restarted at any moment. This, in its turn, means that kwin has to be able to upload data from shm buffers to opengl texture whenever it feels like it.

The broken scenario here is:

  1. shm backbuffer is updated and sent to compositor
  2. small screen area is updated, painting is cached
  3. Firefox is waiting for buffer return to flush cached painting

Robert, when we fail at 3) (buffer won't be returned), we need to allocate a new shm buffer, copy content from 1) to a new shm buffer + cached painting from 2) and submit as a new buffer, right? As the damage region is only optional we always need to submit fully updated wl_buffer, right?

Flags: needinfo?(robert.mader)

(In reply to Vlad Zahorodnii [:zzag] from comment #28)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #27)

I see, yes. That's definitely a bug at Firefox side then. It's interesting that KWin does not release the buffers - I wonder why it keeps it?

With the current kwin architecture, compositing can be restarted at any moment. This, in its turn, means that kwin has to be able to upload data from shm buffers to opengl texture whenever it feels like it.

Well, I'd say keeping wl_buffers for that reason indefinitely is a bit "creative" but it isn't a protocol violation for sure :)

(In reply to Martin Stránský [:stransky] (ni? me) from comment #29)

Robert, when we fail at 3) (buffer won't be returned), we need to allocate a new shm buffer, copy content from 1) to a new shm buffer + cached painting from 2) and submit as a new buffer, right? As the damage region is only optional we always need to submit fully updated wl_buffer, right?

Correct, AFAIK.

When using HW-WR, we usually have two to three buffer and use buffer age to detect which areas need to get updated - not sure if it makes sense to try to copy that approach here, or keep the current one and add a slow fallback case.

Flags: needinfo?(robert.mader)

Let's try the direct draw on KWim first and we can fallback to wl_buffer copy later.

Vlad, does KWim always honor the damaged area (set by wl_surface_damage_buffer()) or does it also show / update screen from wl_buffer areas outside?
Thanks.

Flags: needinfo?(vlad.zahorodnii)

If a surface is damaged, kwin will try to repaint only the dirty region. If a surface is committed several times within a single vblank interval, kwin will repaint the accumulated dirty region.

In general, kwin tries not to go beyond the damaged region, but in rare cases it can do otherwise. For example, if the buffer transform or the buffer size changes, etc. In those cases, the old texture will be discarded and the data from the attached shm buffer will be re-uploaded to a new texture.

Flags: needinfo?(vlad.zahorodnii)
Attachment #9212839 - Attachment description: Bug 1693472 [Wayland] Always use direct drawing on KWim, r?jhorak → Bug 1693472 [Wayland] Always use direct drawing on KWin, r?jhorak
Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/6410ed97be3d [Wayland] Always use direct drawing on KWin, r=rmader
Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch

Eh. This will also make Firefox mis-behave when run under a a Wayland compositor that uses a software renderer, such as Weston with its Pixman renderer.

Let's try the direct draw on KWim first and we can fallback to wl_buffer copy later.

Can we open a new bug for this?

The current fix is not a fix at all. It's just a hacky workaround.

Just as a heads-up, wlroots (and probably other compositors as well) will at some point do the same as KWin. As stated elsewhere, wlroots will not set XDG_CURRENT_DESKTOP, specifically to prevent clients from doing gross workarounds like this.

Firefox just needs to have proper shm buffer management.

https://github.com/swaywm/wlroots/issues/2705

(In reply to emersion from comment #39)

Just as a heads-up, wlroots (and probably other compositors as well) will at some point do the same as KWin. As stated elsewhere, wlroots will not set XDG_CURRENT_DESKTOP, specifically to prevent clients from doing gross workarounds like this.

Firefox just needs to have proper shm buffer management.

https://github.com/swaywm/wlroots/issues/2705

But that "proper shm buffer management" means peformance impact. For instance for non-fullscreen video playback we would need to copy whole screen for every video frame. (copy whole wl_buffer to another one and then copy video frame into it). Is that the price you want to pay?

Actually I'm considering to enable such "hack" for all compositors.

"Proper buffer management" means having multiple buffers and alternate between them, waiting for them to be released before re-using them. I don't understand why fullscreen video playback needs to perform any additional copy.

"Proper buffer management" is actually implemented by all wl_shm Wayland clients. Except Firefox.

(In reply to Martin Stránský [:stransky] (ni? me) from comment #40)

But that "proper shm buffer management" means peformance impact. For instance for non-fullscreen video playback we would need to copy whole screen for every video frame. (copy whole wl_buffer to another one and then copy video frame into it). Is that the price you want to pay?

I don't think that's really the case. What we'd need to do is make it behave like GL buffers with partial damage. So let WR always directly paint to the buffer we want to commit next. Usually we'd have double-buffering I assume. So it would be:

  • full damage on buffer 1
  • full damage on buffer 2
  • only videa area damage on buffer 1
  • only videa area damage on buffer 2

In cases were we have different damage regions, we need to ask WR to update the union of the damage regions since the buffer was last used. We have the API for that already for GL (ShouldDrawPreviousPartialPresentRegions()).

Yes, that sounds correct to me if you want to implement damage tracking.

See https://bugzilla.mozilla.org/show_bug.cgi?id=1693472#c5

KWin does not release wl_buffer unless a new one is supplied and Firefox Basic compositor (which is default right now) does not do double buffering - only single buffer is used.

If damage region hint is only optional to fully comply with Wayland spec means to always submit valid wl_buffer (emulate double-buffering by buffer copy).

Firefox Basic compositor (which is default right now) does not do double buffering - only single buffer is used.

Hm, that's indeed a problem. However, the graphics team is pushing hard to use SW-WR instead of basic (and want to remove basic ASAP). So if we can make double buffering work with SW-WR, we can simply switch it on for Wayland (and make basic use the slow path as long as it still exists).

The only thing required for Wayland clients is to implement multiple buffering to ensure buffers are not re-used prior to wl_buffer.release. The damage region is optional.

See wl_buffer::release at https://wayland.freedesktop.org/docs/html/apa.html

It directly imply the possibility to re-use wl_buffers to optimize drawing. I think mutter/kwin already do their best to release wl_buffers asap or at least honor the damage area to improve performance.

If you believe 100% compatibility is the better way Firefox can surely imitate double buffering and do the wl_buffer copy for wlroots (when XDG_CURRENT_DESKTOP is not set) for Basic compositor.

I think mutter/kwin already do their best to release wl_buffers asap

The problem here is that in order to survive stuff like GPU hangs etc., most compositors, including Gnome, will most likely stop doing so eventually. I don't think we really have a choice here in the long run.

I opened bug 1708416 now with a RFC patch

See Also: → 1708416

Great, thanks!

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: