_XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed with Firefox 40.0.2 (and also latest aurora build); older libX11?

RESOLVED FIXED in Firefox 44

Status: RESOLVED FIXED (defect, critical)
People: Reporter: tmstaedt, Assigned: lsalzman
Tracking: Keywords: crash; Version: 40 Branch; Target Milestone: mozilla44; Hardware: x86; OS: Linux
Firefox Tracking Flags: firefox44 fixed
Attachments: 3 attachments, 1 obsolete attachment

User Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Firefox/38.0
Build ID: 20150806103657

Steps to reproduce:

upgrade from version 39.0.3. Normal browsing, mouse movement on page


Actual results:

firefox suddenly crashes: Crash-ID: 853efa39-0142-4d62-bdc1-269802150819
Signature: libc-2.11.1.so@0x2d811
Also happens with latest aurora build, 41.0b2
Severity: normal → critical
OS: Unspecified → Linux
Hardware: Unspecified → x86
https://crash-stats.mozilla.com/report/index/853efa39-0142-4d62-bdc1-269802150819

You have a bunch of add-ons, are you able to reproduce the crash in safe mode or with a clean profile?

https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode

https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles
Crash Signature: [@ libc-2.11.1.so@0x2d811 ]
Flags: needinfo?(tmstaedt)
Keywords: crash
Component: General → Widget: Gtk
Would be good to compare after restart with layers.offmainthreadcomposition.enabled set to false in about:config.
(In reply to Loic from comment #2)
> https://crash-stats.mozilla.com/report/index/853efa39-0142-4d62-bdc1-
> 269802150819
> 
> You have a bunch of add-ons, are you able to reproduce the crash in safe
> mode or with a clean profile?
> 
> https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-
> mode
> 
> https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-
> firefox-profiles

I'll try that, then may re-enable one-by-one the ones I really need.
But note that these frequent, intermittent, "unprovoked" instabilities started with version 40. Before that, even with all those add-ons, Firefox ran stable. Also note that the ESR version, which I had switched to, did not show these problems either. So there may have been some "risky" changes of late.
Flags: needinfo?(tmstaedt)
(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> Would be good to compare after restart with
> layers.offmainthreadcomposition.enabled set to false in about:config.

All right, changed that in version 41.0b2.
(In reply to Thomas Mittelstaedt from comment #4)
> (In reply to Loic from comment #2)
> > https://crash-stats.mozilla.com/report/index/853efa39-0142-4d62-bdc1-
> > 269802150819
> > 
> > You have a bunch of add-ons, are you able to reproduce the crash in safe
> > mode or with a clean profile?
> > 
> > https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-
> > mode
> > 
> > https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-
> > firefox-profiles
> 
> I'll try that, then may re-enable one-by-one the ones I really need.

Ok, I have cleaned up my add-ons somewhat to keep only those I really (want to) use.
Just got another one with ID: afd4b87a-870c-4859-b2dd-cd14d2150821, using 41.0b2.
Had turned layers.offmainthreadcomposition.enabled back on. Will set it to false now and try to reproduce.
(In reply to Thomas Mittelstaedt from comment #7)
> Just got another one with ID: afd4b87a-870c-4859-b2dd-cd14d2150821, using
> 41.0b2.
> Had turned layers.offmainthreadcomposition.enabled back on. Will set it to
> false now and try to reproduce.

Tried the same window configuration, after having restarted the browser, but could not reproduce.
Now here is some more information which might help somebody. I could reproduce the crash, again with 41.0b2, having turned layers.offmainthreadcomposition.enabled back on.

Started firefox in the terminal via:

MOZ_CRASHREPORTER_DISABLE=1 XRE_NO_WINDOWS_CRASH_DIALOG=1 XPCOM_DEBUG_BREAK=stack-and-abort ./firefox

Opened a lot of pages pressing the "Enable temporary restrictions" for the page with the NoScript add-on and got a core dump:

Core was generated by `./firefox'.
Program terminated with signal 6, Aborted.
#0  0x00987422 in __kernel_vsyscall ()
(gdb) bt
#0  0x00987422 in __kernel_vsyscall ()
#1  0x00bae230 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#2  0x02f3b3bb in ?? () from /home/tom/firefox-41.0b2/libxul.so
#3  <signal handler called>
#4  0x00987422 in __kernel_vsyscall ()
#5  0x0022f3e1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#6  0x00232812 in *__GI_abort () at abort.c:92
#7  0x002284a8 in *__GI___assert_fail (assertion=0x542e228 "dpy->synchandler == _XPrivSyncFunction", file=0x542e2ef "../../src/XlibInt.c", line=595, function=0x542e2dc "_XPrivSyncFunction") at assert.c:81
#8  0x053b77fa in _XPrivSyncFunction (dpy=0xb75a4000) at ../../src/XlibInt.c:595
#9  0x0539c2ba in XGetWindowProperty (dpy=0xb75a4000, w=81805417, property=291, offset=0, length=4, delete=0, req_type=6, actual_type=0xbf8823ac, actual_format=0xbf8823a8, nitems=0xbf8823a4, bytesafter=0xbf8823a0, prop=0xbf88239c) at ../../src/GetProp.c:146
#10 0x00e4bd83 in IA__gdk_property_get (window=0x7391d0e0, property=0x50, type=0x6, offset=0, length=16, pdelete=0, actual_property_type=0xbf882410, actual_format_type=0xbf882414, actual_length=0xbf882418, data=0xbf88241c) at /home/tom/src/gtk+2.0-2.20.1/gdk/x11/gdkproperty-x11.c:593
#11 0x0290c42d in ?? () from /home/tom/firefox-41.0b2/libxul.so
#12 0x0290c4ca in ?? () from /home/tom/firefox-41.0b2/libxul.so
#13 0xbf8824e0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 

I'll attach the full backtrace as well
Posted file bt.txt
backtrace from gdb firefox corefile
(In reply to Thomas Mittelstaedt from comment #9)
> Now here is some more information which might help somebody. I could
> reproduce the crash, again with 41.0b2, having turned 
> layers.offmainthreadcomposition.enabled back on.

> 
> Opened a lot of pages pressing the "Enable temporary restrictions" for the
> page with the NoScript add-on and got a core dump:
> 

Correction: I mean, temporarily disabled restrictions for the page, which made a lot of advertisements load and consume CPU!
(In reply to Thomas Mittelstaedt from comment #10)
> Created attachment 8650872 [details]
> bt.txt
> 
> backtrace from gdb firefox corefile

Googled for that assert and found the following:
http://lists.ximian.com/pipermail/gtk-sharp-list/2009-April/009601.html

"GTK+ isn't threadsafe; you can't safely touch GTK objects from other
threads."
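
To illustrate the rule quoted above: a common way to respect a single-threaded toolkit is to let other threads post work to a queue that only the toolkit's own thread drains. The sketch below is a toolkit-free illustration in standard C++. `MainThreadQueue`, `Post`, and `Drain` are hypothetical names, not GTK or Gecko API; in real GTK code the analogous tool would typically be `g_idle_add()`, which schedules a callback on the main loop.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a toolkit main loop: any thread may post work,
// but only the thread that created the queue is allowed to run it.
class MainThreadQueue {
public:
  MainThreadQueue() : mOwner(std::this_thread::get_id()) {}

  // Safe to call from any thread.
  void Post(std::function<void()> aTask) {
    std::lock_guard<std::mutex> lock(mMutex);
    mTasks.push(std::move(aTask));
  }

  // Must be called on the owning thread. Returns the number of tasks run.
  int Drain() {
    assert(std::this_thread::get_id() == mOwner);
    int ran = 0;
    for (;;) {
      std::function<void()> task;
      {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mTasks.empty()) {
          break;
        }
        task = std::move(mTasks.front());
        mTasks.pop();
      }
      task();  // run outside the lock so a task may call Post() again
      ++ran;
    }
    return ran;
  }

private:
  const std::thread::id mOwner;
  std::mutex mMutex;
  std::queue<std::function<void()>> mTasks;
};
```

A worker thread would call `Post()` with a closure that touches the toolkit objects, and the owning thread would call `Drain()` from its event loop, so the toolkit is only ever touched from one thread.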
I was also having these random crashes since I updated to 40.0. But only on a rather old installation of Linux (Ubuntu 10.04). On my other machines (Ubuntu 12.04 and 14.04) Firefox doesn't crash (at least, not this way).

There was also an "Assertion ... failed" message in the terminal right before each crash, with two different types of assertions:

bp-ccf66a05-9fe5-494c-b9c7-e34a92150813: firefox: ../../src/xcb_io.c:386: _XAllocID: Assertion `ret != inval_id' failed.
bp-ee92ccaa-dda9-46f5-a235-cc0b72150813: firefox: ../../src/xcb_io.c:386: _XAllocID: Assertion `ret != inval_id' failed.
bp-59942c70-0460-4bea-95a6-3c2132150813: firefox: ../../src/XlibInt.c:595: _XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed.
bp-69f47732-3e93-4cf3-9404-7c7c52150818: firefox: ../../src/XlibInt.c:595: _XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed.
bp-42b81f60-f825-46b6-9c23-3ffa42150818: firefox: ../../src/XlibInt.c:595: _XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed.
bp-56881781-4a8e-44d1-aaf1-dd7772150821: firefox: ../../src/xcb_io.c:386: _XAllocID: Assertion `ret != inval_id' failed.

(the last crash, 56881781-..., is from a clean profile + safe mode).

After I've set html5.offmainthread = false and layers.offmainthreadcomposition.enabled = false, there were no more crashes.
Blocks: 994541
Summary: Firefox 40.0.2 (and also latest aurora build) intermittently crashes without specific reason, problem with older libc6? → _XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed with Firefox 40.0.2 (and also latest aurora build); older libX11?
Also having the same issue on Ubuntu 10.04. The only installed extension is Adblock Plus. I have set all plugins to 'Ask to Activate'. I have tested with a new clean profile and in safe mode, with the same crash.

Changed html5.offmainthread = false and layers.offmainthreadcomposition.enabled = false as tried by Roman Kozlov and will see if it crashes again.
I have been experiencing this issue since I updated from 39.x to 40.0.3 on Ubuntu 10.04. I've changed html5.offmainthread = false and layers.offmainthreadcomposition.enabled = false and just got the same crash (bp-db2739b1-401d-47de-a494-324382150921). Last time I also got

(firefox:3757): Gtk-CRITICAL **: gtk_clipboard_set_with_data: assertion `targets != NULL' failed
(firefox:3757): Gtk-CRITICAL **: gtk_clipboard_set_with_data: assertion `targets != NULL' failed

before the usual
firefox: ../../src/XlibInt.c:595: _XPrivSyncFunction: Assertion `dpy->synchandler == _XPrivSyncFunction' failed.

I don't know if they are related to the crash, because it had crashed before without showing the Gtk error.
(In reply to Mariano Marziali Bermúdez from comment #15)
> I have been experiencing this issue since I updated from 39.x to 40.0.3 on
> Ubuntu 10.04. I've changed html5.offmainthread = false and
> layers.offmainthreadcomposition.enabled = false and just got the same crash
> (bp-db2739b1-401d-47de-a494-324382150921). Last time I also got
> 
> (firefox:3757): Gtk-CRITICAL **: gtk_clipboard_set_with_data: assertion
> `targets != NULL' failed
> (firefox:3757): Gtk-CRITICAL **: gtk_clipboard_set_with_data: assertion
> `targets != NULL' failed
> 
> before the usual
> firefox: ../../src/XlibInt.c:595: _XPrivSyncFunction: Assertion
> `dpy->synchandler == _XPrivSyncFunction' failed.
> 
> I don't know if they are related to the crash, because it had crashed before
> without showing the Gtk error.

I guess last night I was too sleepy and I forgot to restart after changing about:config. So far it ran smoothly after disabling offmainthreads.
Just to clarify, html5.offmainthread setting has nothing to do with this crash. I tried it, because it had "offmainthread" in its name, but since then I was able to reproduce this crash with either value (true/false).
Same problem here. Trying the "layers.offmainthreadcomposition.enabled = false" workaround...
This attempts to get the unsafe gdk_property_get call onto the main thread only and only when we've actually gotten a property notify event telling us that the property has changed. Currently, we're both calling GetClientOffset off the main thread and querying the property far more than necessary.
Attachment #8669136 - Flags: review?(karlt)
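
A rough sketch of the approach described in this comment, assuming nothing about the actual patch beyond what is stated there: the expensive property query runs only on the main thread, and only in response to a notify for `_NET_FRAME_EXTENTS`, while other threads read a cached value. `CachedOffsetWindow`, `Offset`, and the `aQueryServer` callback (a stand-in for the `gdk_property_get` round-trip) are hypothetical names; the real nsWindow code may be organized differently.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

struct Offset {
  int x;
  int y;
};

// Hypothetical window wrapper that caches the client offset.
class CachedOffsetWindow {
public:
  // aQueryServer stands in for the costly, non-thread-safe property query.
  explicit CachedOffsetWindow(std::function<Offset()> aQueryServer)
    : mMainThread(std::this_thread::get_id()),
      mQueryServer(std::move(aQueryServer)) {}

  // Main thread only: refresh the cache when the relevant property changes.
  void OnPropertyNotify(const std::string& aAtomName) {
    assert(std::this_thread::get_id() == mMainThread);
    if (aAtomName != "_NET_FRAME_EXTENTS") {
      return;  // unrelated property, no round-trip needed
    }
    Offset fresh = mQueryServer();
    std::lock_guard<std::mutex> lock(mMutex);
    mClientOffset = fresh;
  }

  // Any thread: returns the cached value and never talks to the server.
  Offset GetClientOffset() const {
    std::lock_guard<std::mutex> lock(mMutex);
    return mClientOffset;
  }

private:
  const std::thread::id mMainThread;
  std::function<Offset()> mQueryServer;
  mutable std::mutex mMutex;
  Offset mClientOffset{0, 0};
};
```

With this shape, a compositor thread calling `GetClientOffset()` only copies two integers under a lock, and the number of server round-trips is bounded by the number of actual property changes.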
In the process of tracking down GetClientOffset users, I found this strange GetClientBounds call sitting here doing nothing.

Some years ago, CompositorParent actually used the result to set mWidgetSize inside CompositorParent. Then, in bug 827844, mWidgetSize was removed, leaving the call to GetClientBounds behind. This patch did the removal: https://bug827844.bmoattachments.org/attachment.cgi?id=699891

As far as I can see, this performs no useful side effect either, other than causing unnecessary round-trips to the windowing system. But I don't think that was the intention...

So for the good of the kittens, and to allay future confusion in the code, let's just kill this call.
Assignee: nobody → lsalzman
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #8669313 - Flags: review?(jmuizelaar)
Attachment #8669313 - Flags: review?(jmuizelaar) → review+
Blocks: 1097114
(In reply to Lee Salzman [:eihrul] from comment #20)
> Created attachment 8669313 [details] [diff] [review]
> remove unnecessary GetClientBounds call in CompositorParent
> 
> In the process of tracking down GetClientOffset users, I found this strange
> GetClientBounds call sitting here doing nothing.
> 
> Some years ago, CompositorParent actually used the result to set mWidgetSize
> inside CompositorParent. Then, in bug 827844, mWidgetSize was removed,
> leaving the call to GetClientBounds. This patch did the removal:
> https://bug827844.bmoattachments.org/attachment.cgi?id=699891
> 
> As far as I can see, this performs no useful side effect either, other than
> causing unnecessary round-trips to the windowing system. But I don't think
> that was the intention...
> 
> So for the good of the kittens, and to allay future confusion in the code,
> let's just kill this call.

Another finding which seems worth noting is that this particular caller was running off the main thread, so in combination with the property calls in GetClientOffset, it looks like a possible gremlin that could have triggered this.
Comment on attachment 8669313 [details] [diff] [review]
remove unnecessary GetClientBounds call in CompositorParent

nsIWidgets are positioned and call nsIWidgetListener::WindowMoved with the
top-left of the window manager decorations.  GetBounds() and GetScreenBounds()
both return this same position.  Inside these bounds is the native window for
drawing and events, the position of which is given by WidgetToScreenOffset().
These reference points changed for bug 668437.

GetClientOffset() seems to be named and documented wrt window origin.

GetClientBounds() existed before GetClientOffset().  It is similarly named and
not clearly documented, but one would assume from the existing API at the time
that the top left is relative to nsIWidget top left (including decorations),
and that is what was implemented for bug 668437.

On X11, an offset of 0 would be consistent with GetClientOffset()
documentation, and would provide the correct offset if used for event and
drawing positions in the native window.  However, a different offset is
required for event coordinates relative to the nsIWidget.  I suspect this may
be used to translate popup coords into "owner" window coords for situations
like in bug 668437.  As GetClientOffset() is currently implemented, drawing
code should not be using this method.  Similarly with the top-left from
GetClientBounds().

So thanks for finding this.
Comment on attachment 8669136 [details] [diff] [review]
only update nsWindow client offset when _NET_FRAME_EXTENTS property actually changes

This looks like it should provide a nice optimization.  There may be a
risk of changing the order of when GetClientOffset() starts returning sane
values.  However, this already required waiting for the window manager to set
the property, so if that delay is a problem, then it is probably an existing
bug that can be handled separately.

>+  if (aEvent->atom == gdk_atom_intern("_NET_FRAME_EXTENTS", FALSE)) {
>+    UpdateClientOffset();
>+    return TRUE;
>+  }

Please return FALSE here, as other listeners may want to know about changes in this non-Gecko-specific property.  GtkWindow currently doesn't listen for this property, but it could quite reasonably do so.
Attachment #8669136 - Flags: review?(karlt) → review+
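
To make the review point concrete, here is a minimal sketch of the convention involved, assuming the usual GTK rule that an event handler returning TRUE stops further handlers while FALSE lets the event continue. `Handler` and `Dispatch` are hypothetical names for illustration only, not GTK or Gecko API.

```cpp
#include <functional>
#include <string>
#include <vector>

// A handler returns true to consume the event, false to let it propagate.
using Handler = std::function<bool(const std::string& aAtomName)>;

// Hypothetical dispatcher: calls handlers in order until one consumes the
// event. Returns how many handlers were invoked.
int Dispatch(const std::vector<Handler>& aHandlers,
             const std::string& aAtomName) {
  int invoked = 0;
  for (const Handler& handler : aHandlers) {
    ++invoked;
    if (handler(aAtomName)) {
      break;  // event consumed, later listeners never see it
    }
  }
  return invoked;
}
```

If the first handler updates its cached offset and returns true for `_NET_FRAME_EXTENTS`, any later listener for the same property is silently skipped; returning false keeps the update and still lets the others run.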
Only changed it so that it returns FALSE to not eat the property notify event as Karl requested.

Carried over Karl's r+.
Attachment #8671880 - Flags: review+
Attachment #8669136 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/29e5d93f022c
https://hg.mozilla.org/mozilla-central/rev/651b3818a851
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
I still have the very same issue in Firefox 42. Waiting for Firefox 44 to solve this, 3 months away yet :-(
(In reply to Jesus Cea from comment #28)
> I still have the very same issue in Firefox 42. Waiting for Firefox 44 to
> solve this, 3 months away yet :-(

You could try using the developer or beta editions until such time as 44 becomes stable.