Closed Bug 1514148 Opened 5 years ago Closed 5 years ago

WebRender seems to ignore Expose events on X11

Categories

(Core :: Graphics: WebRender, defect, P3)

66 Branch
Desktop
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- disabled
firefox-esr68 --- disabled
firefox68 --- disabled
firefox69 --- disabled
firefox70 --- disabled

People

(Reporter: streetwalkermc, Assigned: sotaro)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0

Steps to reproduce:

- Turn on gfx.webrender.all
- Make sure no window repaints are necessary (throbber animations, blinking cursors, etc)
- Spawn/drag windows over Firefox or switch workspaces


Actual results:

Window contents are damaged and no repaint is attempted.


Expected results:

Window contents should be repainted upon X11 Expose events which are being sent under these circumstances.
Component: Untriaged → Graphics: WebRender
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
This sounds like a duplicate of bug 1479018. Do you have a similar setup? (non-compositing window manager)
Please open about:support, click on the "Copy text to clipboard" button, paste it into a text file and upload it here (Attach file). Thanks
Blocks: wr-linux
I can reproduce both with and without a compositor, but I don't usually use one. Unlike that report, layers.acceleration.force-enabled doesn't affect the issue for me. I'm using an RX 580 with the open source amdgpu driver stack.
Attached file about_support.txt (obsolete)
(In reply to Dan Elkouby from comment #3)
> Created attachment 9031557 [details]
> about_support.txt

You are not using WebRender here. You need to set gfx.webrender.all to true and restart Nightly.
(It's the same as layers.acceleration.force-enabled;true + gfx.webrender.enabled;true.)
Which Linux distribution and desktop environment are you using?
Attached file about_support.txt
Oh right, I forgot to restart.
I'm using i3 on Arch Linux.

One more detail: switching focus through window manager actions or mousing over Firefox does not cause a redraw; only things like clicking on the window or mousing over an element that has a hover style will actually trigger one.
Attachment #9031557 - Attachment is obsolete: true
Priority: -- → P3
See Also: → 1479018
Summary: WebRender seems to ignore Expose events on X11 → WebRender seems to ignore Expose events on X11 (i3 on Arch Linux, Radeon RX 580)
I can also reproduce this on my laptop (similar setup, but Kaby Lake integrated graphics). Again, no problem with plain opengl compositing, but webrender.all fails to redraw.
Can confirm this with Nightly & Mesa RX 580. With WebRender: The window's content turns completely black after Xscreensaver was enabled and needs some input to reappear again.

Same on Nightly as of today's update, with RX 580 and WebRender. In my case, I run a compositor (compton) and focus-follows-mouse on bspwm. On switching to a desktop with a Nightly window, the window will be transparent/invisible until the mouse is moved over it, focusing it.

The problems I was having on nvidia seem to have been fixed by the patch from bug 1524168; people with AMD GPUs might want to try again with a more recent Nightly.

Still broken on today's nightly on AMD (mesa master) and Intel (mesa 18.3.4).

It seems the issue is gone for me with recent Nightlies.

Still not fixed for me. Here's a quick and consistent repro:

  • Open an empty Firefox window (ctrl+n)
  • Move it to an empty workspace
  • Open another, non-Firefox window on the same workspace and focus it (it's important that Firefox itself isn't focused)
  • Switch to another workspace, and back to the one with Firefox
    At this point, the Firefox window will show whatever the other workspace had there instead of its own contents (at least without a compositor running).

I think Firefox' window content also showed garbage for me when moving other windows on top of it with WR and without compositor, but I can't reproduce this anymore with 66 final.
Also no anomalies show up when moving the window to another workspace and trying various things to provoke the issue (KDE Plasma, compositor off).

Oops, I accidentally didn't actually enable WR by confusing one of its config entries.

Ok, I can totally confirm Dan's results. It's just important to restart Firefox after disabling/killing the Xorg compositor.

I can easily reproduce the issue as follows:

  1. Turn off the Xorg compositor
  2. Start FF with WR
  3. Start "xscreensaver-demo" (part of xscreensaver)
  4. In xscreensaver-demo, click "File" and "Blacken screen now" (might be named slightly differently, as I had to translate it loosely from German to English)
  5. After returning from the screensaver, the Firefox WR window is completely black until it receives some input or enough cursor movement.
  6. This doesn't happen without WR.

But: I can't reproduce it with an Xorg compositor enabled, be it KWin or Compton. That at least has changed for whatever reason.

> But: I can't reproduce it with an Xorg compositor enabled, be it KWin or Compton. That at least has changed for whatever reason.

That's because xcomposite redirects painting to an off-screen buffer, so obscuring the window does not cause its contents to be damaged. Unmapping the window (by iconizing it if your DE supports it, or switching to another workspace, for example) will cause the associated pixmap to be lost which should make compositors show garbage until the contents are redrawn.
The X server normally sends an Expose event under those circumstances (when parts of the window that were previously obscured become visible, or when the window is mapped again), so that applications get the chance to redraw their contents. Handling those events is what I believe is currently broken, hence the ticket title. I'm pretty sure it used to work fine though.

Thanks for the explanation. I can also reproduce what you've just described by

  1. Open FF with WR
  2. Put a window on top of it, e.g. Dolphin file manager
  3. Switch to a different KDE Plasma Xorg virtual desktop and back to the previous with the FF window
  4. Now the window content of FF is broken as well.

This is quite an annoyance if this is your typical workflow.

Flags: needinfo?(jmuizelaar)

I can also reproduce this on NVIDIA Proprietary (418.56 with a GTX 1060 6GB) using i3 on Arch Linux - so it's not driver related at all.

Perhaps the status could be changed to confirmed?

I'm experiencing similar symptoms, but it only happens when I change my display configuration, e.g. by attaching monitors or changing the screen resolution.

I use i3 as a WM, and have "Intel Open Source Technology Center -- Mesa DRI Intel(R) UHD Graphics 620 (Kabylake GT2)" as my WebGL 1 Driver Renderer

Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: WebRender seems to ignore Expose events on X11 (i3 on Arch Linux, Radeon RX 580) → WebRender seems to ignore Expose events on X11 (i3 window manager)

If no one beats me to it, I'll investigate this sometime in the next week.

Flags: needinfo?(aosmond)

Something I haven't seen mentioned yet, which I described in bug 1553171 (the latest duplicate): the window does refresh when the website itself updates, e.g. while a YouTube video is playing.
It may be redundant to mention, but the behaviour is the same with multiple Firefox windows open: none of them is refreshed.

Yes indeed, pretty much anything that changes the window contents will force a refresh.
I've noticed that the latest nightlies enabled webrender by default on both of my machines. I had to force it off to work around this bug, because it's otherwise extremely annoying to use.

In the latest nightlies, this is now reproducible with opengl compositing without webrender (layers.acceleration.force-enabled=true, gfx.webrender.all=false, gfx.webrender.force-disabled=true).

Is this problem fixed by starting Firefox with MOZ_GTK_TITLEBAR_DECORATION=none /path/to/firefox?

I still encounter the problem when starting with MOZ_GTK_TITLEBAR_DECORATION=none

I did some testing with this last weekend (first time using the i3 window manager), but did not get to a proper investigation. Still on my mind.

Assignee: nobody → aosmond

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #27)

> Is this problem fixed by starting Firefox with MOZ_GTK_TITLEBAR_DECORATION=none /path/to/firefox?

(In reply to maciej from comment #28)

> I still encounter the problem when starting with MOZ_GTK_TITLEBAR_DECORATION=none

Same here, bug is still present with that variable set.

Here's what xev sees on the Firefox window when switching back from another workspace:

PropertyNotify event, serial 18, synthetic NO, window 0x800003,
    atom 0x15b (WM_STATE), time 4636930, state PropertyNewValue

VisibilityNotify event, serial 18, synthetic NO, window 0x800003,
    state VisibilityUnobscured

Expose event, serial 18, synthetic NO, window 0x800003,
    (0,0), width 2565, height 2126, count 0

And on the parent window (window manager borders):

MapNotify event, serial 18, synthetic NO, window 0xc0006e,
    event 0xc0006e, window 0xc0006e, override YES

VisibilityNotify event, serial 18, synthetic NO, window 0xc0006e,
    state VisibilityUnobscured

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,0), width 2573, height 4, count 3

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,4), width 4, height 2126, count 2

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (2569,4), width 4, height 2126, count 1

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,2130), width 2573, height 4, count 0

With WR enabled, no events are generated at all when another window is dragged over Firefox, and the contents are apparently undamaged. Contents are always lost when the window (or at least its parent) gets unmapped and remapped.
With OpenGL compositing enabled and WR disabled, dragging something over Firefox does generate expose events like in the first log; sometimes the window will refresh correctly and sometimes you can see a semi-persistent trail (a la Windows XP and older). Contents aren't always lost when the window's mapping gets toggled; the behavior seems to vary depending on what I do.
Everything behaves as it should when using the old software compositor as far as I can tell.

At any rate my observation is that Firefox is being sent the appropriate events from the X server, but it's not actually updating its window contents. I'll have to try compiling from source to know what it actually sees, though. It could be a race condition, but I haven't seen this happen in any other application besides games, which swap buffers on their own schedule and sometimes stop doing so, e.g. on loading screens.
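As a rough illustration of how the xev log above can be read: in the X11 protocol, the count field of an Expose event is the number of further Expose events that follow in the same batch, so a client can collect rectangles until count reaches 0 and then repaint once. The Python sketch below parses the Expose lines copied from the log and computes one bounding box per batch. The names (Expose, parse_exposes, damage_batches) are made up for this example, and the regular expression only assumes the xev output format shown above; this is not how Firefox or GDK process these events.

```python
import re
from dataclasses import dataclass

# Expose lines copied from the xev output quoted in this comment.
SAMPLE = """
Expose event, serial 18, synthetic NO, window 0x800003,
    (0,0), width 2565, height 2126, count 0

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,0), width 2573, height 4, count 3

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,4), width 4, height 2126, count 2

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (2569,4), width 4, height 2126, count 1

Expose event, serial 18, synthetic NO, window 0xc0006e,
    (0,2130), width 2573, height 4, count 0
"""


@dataclass(frozen=True)
class Expose:
    """One Expose event as printed by xev (hypothetical helper type)."""
    window: int
    x: int
    y: int
    width: int
    height: int
    count: int  # number of Expose events still to follow in this batch


# Assumes the two-line layout xev uses in the log above.
EXPOSE_RE = re.compile(
    r"Expose event, serial \d+, synthetic \w+, window (0x[0-9a-f]+),\s*"
    r"\((\d+),(\d+)\), width (\d+), height (\d+), count (\d+)"
)


def parse_exposes(log):
    """Return every Expose event found in an xev log, in order."""
    return [
        Expose(int(win, 16), int(x), int(y), int(w), int(h), int(count))
        for win, x, y, w, h, count in EXPOSE_RE.findall(log)
    ]


def damage_batches(events):
    """Group events per window until count == 0, then emit one bounding box.

    Returns a list of (window, (x, y, width, height)) tuples.
    """
    pending = {}
    batches = []
    for ev in events:
        pending.setdefault(ev.window, []).append(ev)
        if ev.count == 0:  # last event of the batch: time to repaint
            rects = pending.pop(ev.window)
            left = min(r.x for r in rects)
            top = min(r.y for r in rects)
            right = max(r.x + r.width for r in rects)
            bottom = max(r.y + r.height for r in rects)
            batches.append((ev.window, (left, top, right - left, bottom - top)))
    return batches
```

Calling damage_batches(parse_exposes(SAMPLE)) yields one batch for the Firefox window covering its full 2565x2126 area, and one for the parent window whose four border strips together span 2573x2134.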

Just compiled my own build, I can reproduce this issue right out of the box in the fresh profile it created (webrender is enabled by default). I will try to investigate this later today.

(In reply to Andrew Osmond [:aosmond] from comment #31)

> We already seem to handle something like the expose event here, which triggers on draw:
>
> https://searchfox.org/mozilla-central/rev/da14c413ef663eb1ba246799e94a240f81c42488/widget/gtk/nsWindow.cpp#3873

Alright, initial investigation shows that this function doesn't always get called with webrender enabled, especially when switching workspaces. The callback does fire consistently with software compositing, and disabling it does break redrawing in the same way as with webrender. GDK_DEBUG=all shows that the expose events from X are being received, so I'm a bit puzzled as to why they don't propagate all the way down.

Ok so after fruitless attempts at finding the bug in the source or with a debugger, and a moment of suspicion towards GTK/GDK, I went and checked my chat logs. First instance of me complaining about this issue is on the 16th of October 2018. A quick look at my package manager logs reveals that there was no GTK update in October, so it can't be a GTK regression. Downloading old nightlies finally confirmed it: last good build is 2018-10-15-10-01-28, first broken build is 2018-10-15-22-33-36.
Now I need to figure out which commits these correspond to, and bisect the source code repo to finally pinpoint it. Hopefully the 12-hour window narrows it down enough to make this quick.

Thank you!
20181015100128 = 4a230b07f0cbf48e87dcb4265ea2d00893bb1b62
20181015223336 = 4c11ab0cd98950983cfc957f579ace6c3e918a43
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4a230b07f0cbf48e87dcb4265ea2d00893bb1b62&tochange=4c11ab0cd98950983cfc957f579ace6c3e918a43
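For anyone repeating this, the mapping from Nightly build IDs to a pushlog range can be sketched in Python. The build ID layout (YYYYMMDDHHMMSS) is inferred from the two IDs above, and the helper names are hypothetical; the URL pattern is simply the one from the link in this comment.

```python
from datetime import datetime
from urllib.parse import urlencode

# Base of the pushlog link quoted above.
PUSHLOG = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def parse_build_id(build_id):
    """Interpret a build ID such as 20181015100128 as a timestamp.

    Assumes the ID is YYYYMMDDHHMMSS, as the IDs in this thread suggest.
    """
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


def pushlog_url(good_rev, bad_rev):
    """Build a pushlog link covering the changes between two revisions."""
    query = urlencode({"fromchange": good_rev, "tochange": bad_rev})
    return PUSHLOG + "?" + query


# Values taken from this comment.
GOOD_ID, GOOD_REV = "20181015100128", "4a230b07f0cbf48e87dcb4265ea2d00893bb1b62"
BAD_ID, BAD_REV = "20181015223336", "4c11ab0cd98950983cfc957f579ace6c3e918a43"

# Time between the last good and first broken Nightly.
window = parse_build_id(BAD_ID) - parse_build_id(GOOD_ID)
```

Here window works out to 12 hours, 32 minutes and 8 seconds, and pushlog_url(GOOD_REV, BAD_REV) reproduces the link above.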

To me, as a non-programmer, it looks like bug 1498092 is the cause.

(Sotaro Ikeda [:sotaro] from bug 1500520 comment 5)

Since bug 1498092, RendererOGL::UpdateAndRender() is not called during handling new frame, instead just RendererOGL::Update() is called.

When the UpdateAndRender() was not called during resume from standby, DoNotifyWebRenderContextPurge() was not called since then. It seemed that glGetGraphicsResetStatus() was already cleared for the GLContext.
https://dxr.mozilla.org/mozilla-central/source/gfx/webrender_bindings/RendererOGL.cpp#165
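To make the distinction in that quote concrete, here is a deliberately simplified toy model in Python. It is not Firefox code: ToyRenderer, ToyWindow, scene_dirty and force_render_on_expose are all invented for illustration, and the real RendererOGL logic is certainly more involved. The sketch only shows why a path that updates state without rendering would leave an exposed window unpainted unless the frame is forced or the content changes.

```python
class ToyRenderer:
    """Invented stand-in for a renderer; NOT Firefox's RendererOGL."""

    def __init__(self):
        self.frames_drawn = 0

    def update(self):
        """Process pending state without drawing anything."""

    def update_and_render(self):
        """Process pending state and draw a frame to the window."""
        self.update()
        self.frames_drawn += 1


class ToyWindow:
    """Invented model of a window that decides when to actually draw."""

    def __init__(self, force_render_on_expose):
        self.renderer = ToyRenderer()
        self.scene_dirty = False  # True when page content has changed
        self.force_render_on_expose = force_render_on_expose

    def on_new_frame(self, forced=False):
        # Only draw if content changed or the caller insists on a frame.
        if self.scene_dirty or forced:
            self.renderer.update_and_render()
            self.scene_dirty = False
        else:
            self.renderer.update()

    def on_expose(self):
        # The window contents were damaged, but the scene itself is unchanged.
        self.on_new_frame(forced=self.force_render_on_expose)


# Without forcing, an expose draws nothing; with forcing, it repaints.
broken = ToyWindow(force_render_on_expose=False)
broken.on_expose()
fixed = ToyWindow(force_render_on_expose=True)
fixed.on_expose()
```

In this model broken.renderer.frames_drawn stays at 0 after the expose while fixed.renderer.frames_drawn becomes 1. Setting scene_dirty before calling on_new_frame() draws a frame in both, which mirrors the observation in this thread that anything changing the window contents triggers a repaint.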

Just to mention it: Commands to narrow it down with pre-built binaries would be:
sudo pacman -S python2-pip
sudo pip2 install -U mozregression
mozregression --good 2018-10-14 --bad 2018-10-15 --pref gfx.webrender.all:true
https://mozilla.github.io/mozregression/install.html
https://wiki.mozilla.org/Release_Management/Calendar

Thanks for your reply, everything looks very helpful.
That comment does sound like it's related.
I've already figured out the revisions for those builds and have been trying to compile them from source, but multiple things are failing and I don't think I have the patience to fix them. The best I can do is scrutinize the changelog and see if I can get something out of it.
I hope I can get to the bottom of this because this bug is extremely annoying and webrender offers a huge performance improvement in some scenarios.

Sotaro, it looks like your patch in bug 1498092 has caused this bug. Could you have a look?
Bug 1515253 also looks similar to bug 1500520, which was a regression from bug 1498092.

Flags: needinfo?(sotaro.ikeda.g)
See Also: → 1515253

I can't make sense of this myself. I see two problems:

  • in recent builds, expose_event_cb doesn't always get called, if at all, but in the first broken build from October, it does (probably worth bisecting as well); basic compositing is also fine on that front
  • when the callback is triggered (with webrender), the window doesn't actually get redrawn

I've tried replacing all calls to ScheduleGenerateFrameAllRenderRoots and ScheduleGenerateFrame with ScheduleForcedGenerateFrame, but it doesn't seem to make a difference.

I think I'm going to leave this to someone who's more familiar with the code base, because I don't have a lot of time to dedicate to this.

KDE with disabled compositor:

mozregression --good 2018-10-14 --bad 2018-10-15 --pref gfx.webrender.all:true

7:06.27 INFO: Last good revision: 7035f4405e02b2dbcea1934aef9efe534e9ad549
7:06.27 INFO: First bad revision: 3ad88a9f8f35bbb61b1b6063a3e39450f8b78d43
7:06.27 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=7035f4405e02b2dbcea1934aef9efe534e9ad549&tochange=3ad88a9f8f35bbb61b1b6063a3e39450f8b78d43

3ad88a9f8f35bbb61b1b6063a3e39450f8b78d43 sotaro — Bug 1498092 - Add necessary forced frame rendering r=nical

Summary: WebRender seems to ignore Expose events on X11 (i3 window manager) → WebRender seems to ignore Expose events on X11
Depends on: 1571331

KDE with disabled compositor:
I have verified that bug 1571331 comment 3 fixes workspace switching. Thank you!

Assignee: aosmond → sotaro.ikeda.g
Has Regression Range: --- → yes
Has STR: --- → yes
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(aosmond)
Hardware: x86_64 → Desktop
Target Milestone: --- → mozilla70

Good! Thanks.

As I feared, it's still broken for me. This might be the same problem that's breaking things on opengl compositing. I'll try bisecting that one as well when I have a better internet connection.

Attached video 2019-08-07_19-16-11.mp4

Oh, i3 is quite nice. It has the same regression range as comment 40, but unlike KDE without a compositor, it is not fixed for i3. Window contents only reappear when the window gets focused.

I was searching for "expose event", landed on https://developer.gnome.org/gtk2/stable/chap-drawing-model.html and asked myself whether remote drawing by the GPU process could play a role on i3. It seems so: so far I have not been able to reproduce this problem with layers.gpu-process.enabled;false.

Bisected the issue with opengl compositing.

$ mozregression --good 2018-10-14 --bad 2019-08-01 --pref gfx.webrender.force-disabled:true layers.acceleration.force-enabled:true 
...
 9:50.55 INFO: Last good revision: a4daa44cdb9cd0ab8a1870a4105ff8f9103c193e
 9:50.55 INFO: First bad revision: 284dca344fcc2736acc3c2d8bc54befea0a8ce73
 9:50.55 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a4daa44cdb9cd0ab8a1870a4105ff8f9103c193e&tochange=284dca344fcc2736acc3c2d8bc54befea0a8ce73

Narrows it down to one commit, which seems to confirm your suspicions. I can also confirm that the latest nightly works fine for me with layers.gpu-process.enabled;false.
As far as I can tell the bug as it was originally reported is fixed, despite finding another one with similar visible symptoms. Should we close this ticket and open another one against the right component?

> Should we close this ticket and open another one against the right component?

Yes, the original bug was fixed. It is better to handle the other problem in a new bug :)

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

I've just opened bug 1572625.

Thank you!

Hi, I reported bug 1553171, which was later marked as a duplicate of this bug. Unfortunately, I still get the same behaviour using the i3 WM. I tried setting "layers.gpu-process.enabled;false" but this did not solve the problem. How was this bug fixed after all?

There are multiple causes for Firefox not repainting. I've been experiencing a new one this month which I haven't been able to bisect yet because it's pretty difficult to reproduce.

I see that too with 72.0b10 and WebRender: the window content randomly disappears and shows the underlying content. Moving the cursor or scrolling with the mouse wheel brings the website content back. Sometimes, though, other UI elements like the tab bar remain broken and I need to e.g. resize the window to "fix" it. I've also seen this with KWin and Mutter compositing enabled.
Too bad that the X11 implementation starts regressing so much before it was even fixed and before the Wayland implementation can be considered complete. :(

Finally managed to bisect this one. See bug 1606224.
