Link list races with GDK X11 error handler and off main thread Xlib

NEW
Unassigned

Status

()

Core
Graphics
P3
normal
2 years ago
9 months ago

People

(Reporter: karlt, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: gfx-noted)

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
_gdk_x11_display_error_event walks a linked list on the thread that the
error handler is called.
https://git.gnome.org/browse/gtk+/tree/gdk/x11/gdkdisplay-x11.c?h=3.20.4#n2531

I don't think anything prevents that list being manipulated on the main thread
at the same time.

Using GDKs Display on another thread for a method waiting in _XReply()
is a problem, but it goes beyond that.
When _XGetRequest() finds the send queue full, it calls _XFlush() which calls
_XEventsQueued(dpy, QueuedAfterReading).  That means that any request, not
just those that wait, may trigger processing of error events.

We have the ability to hook the XSetErrorHandler dynamic lookup, but I don't
know that that helps.  GDK calls delete_outdated_error_traps() outside of the
calls to XSetErrorHandler() during push and pop.

Using async_handlers on the Display to simply intercept and ignore the error
would mean that GDK wouldn't find out whether the request succeeded.
I haven't looked into what problems that may cause.

I wonder how much synchronization would be required to use a separate display
on the compositor thread.
(Reporter)

Updated

2 years ago
See Also: → bug 1260559
Whiteboard: gfx-noted
Created attachment 8765078 [details] [diff] [review]
0001-atomicxerrorhandler.patch

Here's a draft implementation of an idea I had to solve this by blocking the X11 display during error handling. Ideally, we would use a separate display on the compositor thread and multiplex error events to either GDK (on the main thread's X display) or to our own scoped error handlers through this class.
(Reporter)

Comment 2

2 years ago
If separate display's are used on each thread, then these races with GDK's linked lists are avoided.  Using async_handlers as in [1] will avoid the problem with races on XSetErrorHandler callers.

When GDK's display is used on another thread, IIUC the approach in attachment 8765078 [details] [diff] [review] would only avoid races on GDK's linked lists if GDK could be persuaded to use AtomicXErrorHandler::PushTrap() when sending a request that might generate an error.

GDK calls XSetErrorHandler() on pop too early to know that any error has already been received.  However, it will have called XSync() if it needs the error code, and so perhaps that XSetErrorHandler() could still be intercepted to add an async_handler to ignore errors up to Display::request.

[1] http://searchfox.org/mozilla-central/rev/970569ad57ac4436ff31aa2ac63f38ed3ee2932d/media/webrtc/trunk/webrtc/modules/desktop_capture/x11/x_error_trap.cc#39
(Reporter)

Comment 3

2 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #2)
> http://searchfox.org/mozilla-central/rev/
> 970569ad57ac4436ff31aa2ac63f38ed3ee2932d/media/webrtc/trunk/webrtc/modules/
> desktop_capture/x11/x_error_trap.cc#39

I suspect this approach could be extended to support the same display on multiple threads by using XLockDisplay on install and XUnlockDisplay on uninstall.

That's a solution to XSetErrorHandler races only, not to the GDK linked list races tracked here.
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.