Open Bug 1592786 Opened 5 years ago Updated 2 years ago

XWayland: Compositing gets capped at 30fps after extended uptime on Linux

Categories

(Core :: Graphics, defect, P3)

Unspecified
Linux
defect

Tracking Status
firefox72 --- wontfix

People

(Reporter: TD-Linux, Unassigned)

References

(Blocks 1 open bug)

Details

After my computer has been on for a couple of days, Firefox suddenly can no longer render at 60fps and delivers frames at only 30fps. Restarting Firefox does not fix the problem.

The WebRender profiler shows frames rendered in under 1ms, and the GPU is under very light load. Disabling WebRender (but still using the OpenGL compositor) gives the same result.

I'm using gnome-shell and mesa amdgpu, with Firefox running under X.Org.

It seems like this is some bad interaction for frame timing between gnome-shell and Firefox. Not sure who is to blame yet, but filing the bug here. Note that other applications (like Blender) still happily render at 60fps.

Bug 1561120 was my previous report of this, but was somewhat misguided.

Next time you run into this issue, could you capture a profile using the extension at https://profiler.firefox.com/ ? It will tell us where Firefox is spending its time, which may help narrow the problem down.

Flags: needinfo?(tdaede)
Priority: -- → P3

Here it is: https://perfht.ml/2WxthVS

I can definitely see the Vsync events in the compositor thread arriving every 33ms, which is half as often as they should.
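For anyone who wants to check the same thing from a list of vsync timestamps, here is a minimal sketch of the arithmetic. It assumes the timestamps have already been pulled out of a profile by hand as milliseconds; the helper names are made up for illustration and are not part of Firefox or the profiler.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: average spacing between consecutive vsync
// timestamps, in milliseconds. Needs at least two samples.
double average_interval_ms(const std::vector<double>& timestamps_ms) {
    assert(timestamps_ms.size() >= 2);
    const double span = timestamps_ms.back() - timestamps_ms.front();
    return span / static_cast<double>(timestamps_ms.size() - 1);
}

// Hypothetical helper: vsync events per second implied by that spacing.
double events_per_second(const std::vector<double>& timestamps_ms) {
    return 1000.0 / average_interval_ms(timestamps_ms);
}
```

Timestamps spaced 33ms apart work out to roughly 30 events per second, where a 60Hz display should produce a spacing of about 16.7ms.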

Flags: needinfo?(tdaede)

After turning on gfx debugging, I got this telltale message:

[GFX2-]: glXWaitVideoSync failed to increment the sync counter.

The corresponding code uses this extension to get the sync counter: https://www.khronos.org/registry/OpenGL/extensions/SGI/GLX_SGI_video_sync.txt

Among the many problems with this extension is that it doesn't specify how rollover works. Likely either my driver is buggy and stopped incrementing the sync counter, or the counter rolled over and the code treated that as a failure. Either way, the code then adds a 16ms software delay on top of the existing 16ms sync wait, which caps the framerate at 30fps.
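A rough model of why the two delays add up, as a sketch only. This is not the code from gfxPlatformGtk.cpp: the struct, its fields, and the function names are invented for illustration, and it assumes a 60Hz display and that the software fallback sleeps for a full frame interval.

```cpp
#include <cassert>

// Hypothetical model of one iteration of a vsync loop.
struct PacingModel {
    double refresh_hz;          // display refresh rate, e.g. 60
    bool counter_check_failed;  // true when the software fallback also runs
};

// Time between two vsync notifications under this model, in milliseconds.
double notification_interval_ms(const PacingModel& m) {
    assert(m.refresh_hz > 0.0);
    const double frame_ms = 1000.0 / m.refresh_hz;
    // The hardware wait costs one refresh period; when the counter check
    // fails, the software delay adds a second one on top of it.
    return m.counter_check_failed ? 2.0 * frame_ms : frame_ms;
}

// Notifications per second implied by that interval.
double notifications_per_second(const PacingModel& m) {
    return 1000.0 / notification_interval_ms(m);
}
```

Under these assumptions the healthy path gives about 60 notifications per second and the fallback path about 30, which matches the 33ms spacing in the profile.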

I added a printf of the vsync counter, and when I get capped at 30fps, it's always incrementing by 2. So it's not actually a rollover bug, but I'm not sure what it is :(
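A sketch of the kind of check that separates these cases, assuming the counter is an unsigned int as in the snippet from gfxPlatformGtk.cpp. The enum and function names are hypothetical. Unsigned subtraction is modular in C++, so the delta stays correct even across a rollover.

```cpp
#include <cstdio>

// Hypothetical classification of how the sync counter moved between
// two consecutive reads.
enum class CounterStep { Stalled, Normal, Skipped };

CounterStep classify_step(unsigned int previous, unsigned int current) {
    // Modular arithmetic: a rollover from the maximum value to 0 still
    // yields a delta of 1 rather than a huge or negative number.
    const unsigned int delta = current - previous;
    if (delta == 0) return CounterStep::Stalled;
    if (delta == 1) return CounterStep::Normal;
    return CounterStep::Skipped;
}

// Debug print in the spirit of the printf mentioned above.
void log_step(unsigned int previous, unsigned int current) {
    std::printf("vsync counter %u -> %u (delta %u)\n",
                previous, current, current - previous);
}
```

With this check, a counter that goes up by 2 on every wait is classified as Skipped, while a genuine rollover from the maximum value to 0 is classified as Normal.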

Changing
unsigned int nextSync = syncCounter + 1;
to
unsigned int nextSync = syncCounter;

in gfxPlatformGtk.cpp fixes my issues. I don't really understand why, though: according to the GLX_SGI_video_sync documentation, the +1 should be correct.

I built a custom mesa and traced everything through the X server. I discovered that GLX_SGI_video_sync is implemented on top of the Present extension. The code in mesa locks the dri3 drawable before issuing the Present request. If another Present request is already outstanding on the same drawable, even from another thread, that one has to return before ours does. One example is the Present caused by swapping buffers....

Commenting out the locking in mesa makes it work at 60fps on about 50% of startups.

I think the idea of doing GLX calls on a separate thread with the same drawable is just untenable. Either we have to figure out how to get a separate drawable, or generate vsync events from our Present / swap buffer calls. Actually it does that even without the locking patched out. Hmm....

I should have tried this sooner, but this was all running under XWayland. Running under plain X11 makes the problem disappear entirely. This looks like an XWayland bug, so I filed https://gitlab.freedesktop.org/xorg/xserver/-/issues/1025

Blocks: vsync
No longer blocks: gfx-driver-bug
Summary: Compositing gets capped at 30fps after extended uptime on Linux → XWayland: Compositing gets capped at 30fps after extended uptime on Linux
Severity: normal → S3