XSync can hang indefinitely with OMTC enabled

RESOLVED FIXED in mozilla26

Status

()

Core
Graphics
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: billm, Assigned: billm)

Tracking

unspecified
mozilla26
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

Created attachment 796958 [details] [diff] [review]
xsync-fix

I'm using desktop Firefox on Linux with OMTC and e10s. I don't know if this is related to e10s or not.

The symptom of the problem is that we occasionally will hang in an XSync call from the compositor thread. The XSync call only returns when some event happens, like moving the mouse or hitting a key. It causes the browser to feel very laggy.

This patch seems to fix the problem. It avoids some X calls that maybe were causing trouble. Matt also found this bug, which might be related:
https://bugs.freedesktop.org/show_bug.cgi?id=41870
Attachment #796958 - Flags: review?(matt.woodrow)
Interestingly, this patch also seems to make my red/blue problems in bug 895954 go away!
Another odd thing is that, without this patch, gDumpCompositorTree doesn't work quite right. When it tries to dump out the textures for the CompositableHosts, it always gets 0 for the texture size here:

http://mxr.mozilla.org/mozilla-central/source/gfx/gl/GLContext.cpp#1806

However, with the patch, the textures seem to get dumped correctly.
Comment on attachment 796958 [details] [diff] [review]
xsync-fix

Switching the review over to bjacob.

Benoit, the issue here is that we don't have X11 surfaces enabled for the IPDL layer (because the MOZ_LAYERS_ENABLE_XLIB_SURFACES env var is never set), but we're still trying to use them when creating TextureImageGLX's.

So we're using shmem to share surfaces, then on the compositor side we're allocating an Xlib pixman (via TextureImageGLX), copying into it (DirectUpdate), and then using texture_from_pixmap to draw it.

It also appears that Xlib pixmaps might have different byte orderings and we're not taking that into account, giving us reversed r/b channels in some cases.

This seems pretty bad, as we're risking all the driver bugs with texture_from_pixmap, without any of the gains (of drawing directly into an X pixmap).

I think we should take this patch (rather than invest too much effort for the time being into allocating pixmaps on the content side and sharing those through IPDL) for the time being at least.

Though as it stands, it will also change non-OMTC accelerated layers to do the same, which we may not want. You might have some ideas on how much we care about this, and how to avoid it.
Attachment #796958 - Flags: review?(matt.woodrow) → review?(bjacob)
Just wanted to mention that the r/b swapping in bug 895954 happens without e10s, so this is a general problem with Linux and OMTC.
It looks like TextureImageGLX reports its format as FORMAT_R8G8B8{A|X}8, which is the same as OSX (decided here - http://mxr.mozilla.org/mozilla-central/source/gfx/gl/GLContext.cpp#2403), but reversed from 'normal'.

When drawing component alpha layers (http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/CompositorOGL.cpp#1222 or http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/ThebesLayerOGL.cpp#169) we check the function that decided it for OSX, rather than just using the format of the texture object.

This means that we're picking the wrong component alpha shader program, and drawing the channels reversed.
Comment on attachment 796958 [details] [diff] [review]
xsync-fix

Review of attachment 796958 [details] [diff] [review]:
-----------------------------------------------------------------

I actually like that! Generally speaking we want to move towards not using X11 drawing anymore. I just hope that there won't bee any too bad short-term performance regression for anyone, but there's only one way to find out.
Attachment #796958 - Flags: review?(bjacob) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/605119645626

Yay! I assume that we'll be able to remove some code if this sticks?
Likely yes, but I am not the best person to tell --- Matt and Karl Tomlinson are the two people who've worked with that code the most.
https://hg.mozilla.org/mozilla-central/rev/605119645626
Assignee: nobody → wmccloskey
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
XSync doesn't do anything more than other requests that wait for a reply, so if XSync was hanging, the bug has probably just moved somewhere else.
I wonder whether this is related to bug 889869.

If https://bugs.freedesktop.org/show_bug.cgi?id=41870 is related, then the bug would be using XNextEvent from more than one thread, but I can't think why that would be happening.

Please don't remove the tfp code.
https://groups.google.com/d/msg/mozilla.dev.platform/LuOWo2P7yxM/DiTU7Htxeq0J
Depends on: 889869
You need to log in before you can comment on or make changes to this bug.