Closed Bug 1045482 Opened 10 years ago Closed 10 years ago

Error in X11 if sharing a window via getUserMedia and you close it

Categories

(Core :: WebRTC: Audio/Video, defect)

x86_64
Linux
defect
Not set
minor

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox32 --- unaffected
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: jesup, Assigned: jesup)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 3 obsolete files)

If you're sharing a window under X11, and close it, you get a BadWindow error from X.  This will cause GDB to break, but it's non-fatal and everything seems to recover well.  It may be that given asynchronicity, it's impossible to avoid in all cases.

Low-priority bug given that it works mostly as expected.
A larger concern was that after it recovered from the error, it seemed impossible to get rid of the sharing indicator or request another share.
Blocks: 1040061
Bug 1042628 may be related, although the reporter said Firefox quits.
Attachment #8464873 - Flags: review?(karlt)
Attachment #8464873 - Attachment is obsolete: true
Attachment #8464873 - Flags: review?(karlt)
Comment on attachment 8464997 [details] [diff] [review]
Wrap X11 calls for window lists and XQueryPointer in error trapping to avoid asserts on X11 errors

See also https://code.google.com/p/webrtc/source/detail?spec=svn6802&r=6359
and https://code.google.com/p/webrtc/issues/detail?id=2022
Attachment #8464997 - Flags: review?(karlt)
XInitThreads() is only called when NIGHTLY_BUILD is set, so Xlib docs say not
to use Xlib on multiple threads, but for most calls on Displays, it should be
safe to use different Displays on different threads.

Even if XInitThreads() were called, it wouldn't help with the lack of thread safety around XSetErrorHandler() and in gdk_error_trap_push().
In the testcase of bug 1042628, I get an assertion failure at
http://hg.mozilla.org/mozilla-central/annotate/08c23f12a43e/media/webrtc/trunk/webrtc/modules/desktop_capture/x11/x_error_trap.cc#l30

This means the thread safety bug is being hit and so our crash reporter is not
reporting.

Interference of threads seems likely, not just unlucky:

Thread 1 installs error handler and records old handler.
Thread 1 sends request and waits for reply.
Thread 2 installs error handler and records thread 1's error handler.
Thread 2 sends request and waits for reply.
Thread 1 receives reply and installs old handler.
Thread 2 receives reply and installs thread 1's handler.

We need a different approach that doesn't use the global error handler
controlled by XSetErrorHandler.
To fully reproduce all consequences of failing to catch the error, remove
libgnomeui-2.so.0 because it loads libbonoboui, which suppresses uncaught
BadWindow errors.

With libgnomeui installed, the errors reduce to things like random widgets failing to draw natively themed in certain states.  This is because nsNativeThemeGTK::DrawWidgetBackground() assumes that any error which
occurred during its drawing was with the native theme, even though the error
was on a different thread and Display.
Error handling is now applied to the Display using async_handlers, instead of
replacing and trying to reinstate the XSetErrorHandler() global handler for
all Xlib Displays.

Based on code at
https://git.gnome.org/browse/gtk+/tree/gdk/x11/gdkasync.c?id=0e1a4248#n252
https://git.gnome.org/browse/gtk+/tree/gdk/x11/gdkasync.c?id=0e1a4248#n150
and use of _XAsyncErrorHandler in libX11.
Attachment #8465992 - Flags: review?(rjesup)
Attachment #8465992 - Attachment is obsolete: true
Attachment #8465992 - Flags: review?(rjesup)
Fixed an arithmetic error.

display->request is NextRequest(display) - 1.

This is the only way I know how to handle Xlib protocol errors in a
thread-safe manner.  I don't know of any reviewers who will know the Xlib
code here.  Perhaps jrmuizel might, or perhaps Sergey could review.
Attachment #8465997 - Flags: review?(rjesup)
Comment on attachment 8464997 [details] [diff] [review]
Wrap X11 calls for window lists and XQueryPointer in error trapping to avoid asserts on X11 errors

We can and should do this if XErrorTrap is fixed.  (Otherwise, this could cause more problems that it solves.)

XFixesSelectCursorInput() also needs protecting.  That is asynchronous and so
XErrorTrap won't catch the error unless an explicit XSync is added before
the trap is destroyed.

>     Bool result = XQueryPointer(display(), window_, &root_window, &child_window,
>                                 &root_x, &root_y, &win_x, &win_y, &mask);
>     CursorState state;
>-    if (!result) {
>+    if (!result || error_trap.GetLastErrorAndDisable() != 0) {

Your previous version was fine here.  GetLastErrorAndDisable() is unnecessary
because XQueryPointer will return False if its request generates a protocol
error.  Return values of synchronous requests (getting something from the
server) will reflect errors in the return code; async requests will not.

However, the extra code is mostly harmless if we fix XErrorTrap, so it is fine
to leave it there if you'd like to maintain consistency with upstream.

>+  XErrorTrap error_trap(options.x_display()->display());
>   window = GetTopLevelWindow(options.x_display()->display(), window);

I would put this in GetTopLevelWindow() to make it clearer exactly which
requests are being protected.

>+  XErrorTrap error_trap(options.x_display()->display());
>   return new MouseCursorMonitorX11(
>       options, DefaultRootWindow(options.x_display()->display()));

The MouseCursorMonitorX11 constructor doesn't do anything, so this XErrorTrap
is unnecessary.

r+ if the XErrorTrap is fixed and XErrorTrap is removed from CreateForScreen().  Other comments for your discretion.
Attachment #8464997 - Flags: review?(karlt) → review+
Comment on attachment 8465997 [details] [diff] [review]
make XErrorTrap installation and removal thread-safe

Review of attachment 8465997 [details] [diff] [review]:
-----------------------------------------------------------------

Make the origin of the code more explicit.

::: media/webrtc/trunk/webrtc/modules/desktop_capture/x11/x_error_trap.cc
@@ +35,2 @@
>  XErrorTrap::XErrorTrap(Display* display)
> +    : display_(display),

Perhaps comment on the thread-safety aspect being avoided here for future readers
Attachment #8465997 - Flags: review?(rjesup) → review+
Comment on attachment 8466034 [details] [diff] [review]
make XErrorTrap installation and removal thread-safe v1.2

https://hg.mozilla.org/integration/mozilla-inbound/rev/42f62e9467f4
Attachment #8466034 - Flags: checkin+
Attachment #8465997 - Attachment is obsolete: true
Flags: in-testsuite?
Keywords: leave-open
Assignee: nobody → rjesup
Keywords: leave-open
Whiteboard: [screensharing-uplift]
Comment on attachment 8466034 [details] [diff] [review]
make XErrorTrap installation and removal thread-safe v1.2

Approval Request Comment
[Feature/regressing bug #]: screensharing

[User impact if declined]: browser crash/exit if shared window is closed (linux only)

[Describe test coverage new/current, TBPL]: Manual testing - we have no way to programmatically target a window for sharing in tbpl and then close it.  Also, mirrors a change made upstream in webrtc.org - but we handled the threading issue, which they decided to ignore for now.

[Risks and why]: Low risk - wraps X11 calls with error traps.  Likely worst case problem would be missing an X11 call and not fully fixing the problem.

[String/UUID change made/needed]: none
Attachment #8466034 - Flags: review+
Attachment #8466034 - Flags: approval-mozilla-aurora?
Comment on attachment 8464997 [details] [diff] [review]
Wrap X11 calls for window lists and XQueryPointer in error trapping to avoid asserts on X11 errors

Approval Request Comment
see other patch
Attachment #8464997 - Flags: approval-mozilla-aurora?
(In reply to Karl Tomlinson (:karlt) from comment #12)
> XFixesSelectCursorInput() also needs protecting.  That is asynchronous and so
> XErrorTrap won't catch the error unless an explicit XSync is added before
> the trap is destroyed.

Calling CaptureCursor() before the XErrorTrap is destroyed makes XSync() unnecessary because XFixesGetCursorImage() is synchronous.

So, yes, looks good.
https://hg.mozilla.org/mozilla-central/rev/73f32f0b9296
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Attachment #8464997 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8466034 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
See Also: → 1147777
We should investigate upstreaming these changes. If they're not relevant any longer, we can remove them from our local copy of webrtc.org.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: