Closed Bug 1650246 Opened 4 years ago Closed 3 years ago

[X11][EGL] Incorrectly sized library and account menu until hovering them

Categories

(Core :: Graphics: WebRender, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox77 --- unaffected
firefox78 --- unaffected
firefox79 --- unaffected
firefox80 --- disabled
firefox81 --- disabled
firefox82 --- disabled
firefox83 --- disabled

People

(Reporter: jan, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: correctness, nightly-community)

Attachments

(2 files)

Gnome X11, Debian Testing, Intel HD Graphics 630 (KBL GT2), Mesa 20.1.2

Ctrl+Alt+Shift+R Screencast:
$ pip3 install --upgrade mozregression
$ MOZ_X11_EGL=1 mozregression --repo autoland --launch 26aa0ccc6c5e --pref gfx.webrender.all:true layers.gpu-process.enabled:false -a about:blank

With GLX, the top-left quarter of these menus is sometimes cut off (transparent), but with EGL they seem to be incorrectly sized until you hover them.

(Unrelated: On XWayland (GLX and EGL) these menus need about two seconds before they begin to react to hover actions.)

Can confirm, Mutter/GS/xserver master.

Similar glitches on Intel(R) HD Graphics 530 (SKL GT2) with Mesa 20.1.2 on X.Org 1.20.8 (with XFCE as a window manager/DE).
Visual glitches go away when I try with LIBGL_ALWAYS_SOFTWARE=1 though.

I see it with llvmpipe as well. This is on Gnome Xwayland; Xwayland shows a black background instead of transparency when the bug occurs.

LIBGL_ALWAYS_SOFTWARE=1 MOZ_X11_EGL=1 mozregression --repo autoland --launch 26aa0ccc6c5e --pref gfx.webrender.all:true layers.gpu-process.enabled:false -a about:blank

(For these tests I disabled the gpu process to not run into bug 1572625 comment 10, but actually it doesn't seem to matter whether it is enabled or not.)

Severity: -- → S3
Priority: -- → P3

(In reply to Rinat from comment #2)

Similar glitches on Intel(R) HD Graphics 530 (SKL GT2) with Mesa 20.1.2 on X.Org 1.20.8 (with XFCE as a window manager/DE).
Visual glitches go away when I try with LIBGL_ALWAYS_SOFTWARE=1 though.

Hm, running with LIBGL_ALWAYS_SOFTWARE=1 does result in falling back to the software renderer here - does WebRender work for you with LIBGL_ALWAYS_SOFTWARE=1 / llvmpipe?

Neither the software/basic renderer nor the OpenGL renderer shows the issue here. So this seems to be a WebRender-only issue - can you confirm?

It happens to me in the close windows; in the rest I don't know, because I have not had enough time to test. Mesa and amdgpu on Manjaro testing.

I wouldn't be surprised if this was somehow related to bug 1502519 which occurs with GLX.
(Non-default) EGL/Win10 also has/had such a problem: bug 1517472.

(In reply to Robert Mader [:rmader] from comment #4)

Hm, running with LIBGL_ALWAYS_SOFTWARE=1 does result in falling back to the software renderer here

Yes, I forgot about that. It did fall back to Basic with LIBGL_ALWAYS_SOFTWARE=1. And when I force WebRender, the same glitches appear.

See Also: → 1652310
See Also: → 1653711

Might be related:
(Nicolas Silva [:nical] from bug 1652743 comment 10)

This bug is frustrating. I can't actually reproduce this one but I can fairly easily reproduce bug 1653711 which I am pretty sure is the same thing. I tried implementing gdk_window_get_origin in terms of gdk_window_get_position as suggested by Rinat, and I still get the same issue happening about 50% of the time. It's interesting because that new version doesn't do any caching, so it isn't a classical cache invalidation bug.

My current theory is that there is something very racy happening towards the creation of widgets, and this raciness is hidden by the many synchronous queries we are issuing via gdk_window_get_origin. I'm going to be away for two weeks starting next, so I'll probably have to back out the window origin caching and reevaluate later.

(Nicolas Silva [:nical] from bug 1652743 comment 11)

In fact it was pretty easy to confirm that the problem is caused by the absence of xserver synchronization rather than how the window origin is calculated: I tried first calling gdk_window_get_origin, then throwing away the result and computing the window origin by traversing the parent hierarchy and accumulating gdk_window_get_position, and the bug doesn't reproduce anymore.

See Also: → 1656211

Trying this on mesa-master, this appears to look better: the content is no longer displaced on initial rendering. Instead, the popups now get updated very slowly, as if rendering were capped to 1 fps or so.

(In reply to Robert Mader [:rmader] from comment #9)
Does Firefox also slow down as long as the main menu is opened? But bug 1657597 uses Mesa 20.0.8.

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #10)

Does Firefox also slow down as long as the main menu is opened? But bug 1657597 uses Mesa 20.0.8.

Nope, can't confirm that. However, the slow update appears to be some kind of freeze - it also affects website content. But if there are animations in the website content, things "unfreeze" after around a second and everything becomes smooth. Furthermore, it also affects GLX, and bug 1656211 is now way more visible to me, making the EGL backend now more correct than the GLX one for me. I'll check if disabling (GLX based) vsync works around the issue.

Please try again when Bug 1650583 lands.
Thanks.

Flags: needinfo?(jan)

Debian Testing, Gnome Wayland
Still easily reproducible with software Mesa. My impression is that it has become less likely with hardware Mesa.
Click into the page to unfocus the address bar, then open the library panel.
LIBGL_ALWAYS_SOFTWARE=1 MOZ_X11_EGL=1 mozregression --launch 20200930092918 --pref gfx.webrender.all:true -a about:blank

Flags: needinfo?(jan)

FTR, the slow updating I've experienced appears to be triggered by partial present. Setting gfx.webrender.max-partial-present-rects to 0 makes it smooth again for me.
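
The workaround above could be tried from the command line roughly as follows. This is a minimal sketch, not something posted in this bug: the helper name partial_present_cmd is hypothetical, the build ID and the other pref are copied from the earlier reproduction command, and passing the integer pref in mozregression's name:value form is assumed to work the same way as the boolean prefs used above. The script only prints the command so it can be inspected before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the bug report): print a mozregression
# command that launches the given build with partial present disabled.
partial_present_cmd() {
    build="$1"
    # gfx.webrender.max-partial-present-rects:0 is the workaround from the comment above.
    printf '%s\n' "MOZ_X11_EGL=1 mozregression --launch $build --pref gfx.webrender.all:true gfx.webrender.max-partial-present-rects:0 -a about:blank"
}

# Build ID taken from the reproduction steps earlier in this bug.
partial_present_cmd 20200930092918
```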

See Also: → 1677892

I can't reproduce this any more, neither on hardware nor on software renderer. Rinat, can you confirm that?

Flags: needinfo?(ibragimovrinat)

can you confirm that?

I confirm. No visual artifacts with either the hardware or the software renderer. By software I mean using the LIBGL_ALWAYS_SOFTWARE=1 environment variable. That's on Intel graphics. I used a release build from a snapshot dated 2021-02-27.

Also tried on a snapshot from around September 2020. No glitches with the hardware renderer there either, but I can definitely see incomplete rendering when forcing Mesa's software mode. So the issue's definitely been fixed at some point. Haven't found the exact commit yet.
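
One way to narrow down the fixing commit would be mozregression's --find-fix mode. The sketch below only prints a candidate command. The helper name find_fix_cmd and the exact dates are assumptions (the comments only mention "around September 2020" and 2021-02-27), and it assumes that in --find-fix mode the older, still-broken build is passed as --bad and the newer, working one as --good.

```shell
#!/bin/sh
# Hypothetical helper: print a mozregression command that searches for the
# change that fixed the issue.
#   $1 = date of the last known broken build (YYYY-MM-DD)
#   $2 = date of the first known working build (YYYY-MM-DD)
find_fix_cmd() {
    printf '%s\n' "LIBGL_ALWAYS_SOFTWARE=1 MOZ_X11_EGL=1 mozregression --find-fix --bad $1 --good $2 --pref gfx.webrender.all:true -a about:blank"
}

# Dates are assumptions based on the snapshots mentioned above.
find_fix_cmd 2020-09-30 2021-02-27
```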

Flags: needinfo?(ibragimovrinat)

(In reply to Rinat from comment #16)

Also tried on a snapshot from around September 2020. ... So the issue's definitely been fixed at some point. Haven't found the exact commit yet.

Thanks for testing and verifying that! Will close then.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME

https://github.com/mozilla/gecko-dev/commit/08bc6ad13e073c038e4b04a2a972de564fc2c105

This is the commit that made the menu issues appear a lot less often. Before it, the last frame of the menu opening animation seemed to be missing. That issue was extremely easy to reproduce with forced software rendering: on the first try the "account" menu opened correctly, but after that there were visual issues 100 out of 100 times. After commit 08bc6ad1, the menu almost always opens correctly. But if you try long enough, the same issue can still be seen. I usually see it after about 20 tries or so.

At repo tip (dated 2021-02-27), the menu animation is played only once, so there is no way to replay it a couple of dozen times. So I'm not sure whether the issue is actually fixed, but the changes at least made it a lot less likely to appear.
