Closed Bug 1263515 Opened 8 years ago Closed 8 years ago

crash in mozalloc_abort | NS_DebugBreak | X11Error

Categories: Core :: Graphics (defect)
Platform: x86, Linux
Priority: Not set
Severity: critical
Status: RESOLVED FIXED
Target Milestone: mozilla48
Tracking Status: firefox48 --- fixed
People: Reporter: n.nethercote, Assigned: nical
Keywords: crash
Attachments: 1 file

This bug was filed from the Socorro interface and is 
report bp-d891782d-13ac-49ba-9c7a-773042160408.
=============================================================

This signature started in Nightly 20160408030212 and there have been over 100 crash reports since, which is very high for Linux.

It's an abort that includes the text "X_GLXMakeCurrent: GLXBadDrawable".
karlt, any ideas what might have landed on March 7th to cause this?
Flags: needinfo?(karlt)
Most of these have "GL Layers!" in the App Notes and an error related to GL.  I assume that means that the user has forced on GL layers through a pref in about:config.  There does seem to be more than one user having this trouble.

There are a couple of different errors here that do seem more likely in the default config, but they are in the minority.

X_ChangeProperty: BadWindow
https://crash-stats.mozilla.com/report/index/8774d6dd-7e8c-4925-9aae-afbbd2160409

X_ShmAttach: BadAccess
https://crash-stats.mozilla.com/report/index/c461ad88-640f-4a46-b9d6-406f42160410
https://crash-stats.mozilla.com/report/index/6cb2aa02-fc0d-4b0d-a194-5dc922160407
Flags: needinfo?(karlt)
I don't know what might have changed on March 7, sorry.
It could even be a few users changing their prefs, but that seems less likely.
I can reproduce this 100% of the time when trying to download a file, and I do have:

layers.acceleration.force-enabled;true

Let me know what you need to debug this.
Here's a backtrace from a debug build:

(gdb) bt
#0  0x00007f13a4f3df2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f13a4f3ddc4 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007f1398d3fd67 in ah_crap_handler(int) (signum=11) at /home/morbo/hg/firefox/toolkit/xre/nsSigHandlers.cpp:103
#3  0x00007f1398d28466 in nsProfileLock::FatalSignalHandler(int, siginfo_t*, void*) (signo=11, info=0x7f137f2f5070, context=0x7f137f2f4f40) at /home/morbo/hg/firefox/toolkit/profile/nsProfileLock.cpp:191
#4  0x00007f1399d578d7 in AsmJSFaultHandler(int, siginfo_t*, void*) (signum=<optimized out>, info=0x7f137f2f5070, context=0x7f137f2f4f40) at /home/morbo/hg/firefox/js/src/asmjs/WasmSignalHandlers.cpp:1175
#5  0x00007f13a5c648d0 in <signal handler called> () at /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x0000000000405d7c in mozalloc_abort(char const*) (msg=msg@entry=0x7f137f2f5520 "[Parent 2264] ###!!! ABORT: X_GLXMakeCurrent: GLXBadDrawable; id=0x0\nRe-running with MOZ_X_SYNC=1 in the environment may give a more helpful backtrace.: file /home/morbo/hg/firefox/toolkit/xre/nsX11Er"...) at /home/morbo/hg/firefox/memory/mozalloc/mozalloc_abort.cpp:33
#7  0x00007f1395824f20 in NS_DebugBreak(uint32_t, char const*, char const*, char const*, int32_t) (aMsg=0x7f137f2f5520 "[Parent 2264] ###!!! ABORT: X_GLXMakeCurrent: GLXBadDrawable; id=0x0\nRe-running with MOZ_X_SYNC=1 in the environment may give a more helpful backtrace.: file /home/morbo/hg/firefox/toolkit/xre/nsX11Er"...)
    at /home/morbo/hg/firefox/xpcom/base/nsDebugImpl.cpp:447
#8  0x00007f1395824f20 in NS_DebugBreak(uint32_t, char const*, char const*, char const*, int32_t) (aSeverity=<optimized out>, aStr=0x7f137af0f1a8 "X_GLXMakeCurrent: GLXBadDrawable; id=0x0\nRe-running with MOZ_X_SYNC=1 in the environment may give a more helpful backtrace.", aExpr=0x0, aFile=0x7f139a219920 "/home/morbo/hg/firefox/toolkit/xre/nsX11ErrorHandler.cpp", aLine=157) at /home/morbo/hg/firefox/xpcom/base/nsDebugImpl.cpp:403
#9  0x00007f1398d41790 in X11Error(Display*, XErrorEvent*) (display=<optimized out>, event=event@entry=0x7f137f2f6310)
    at /home/morbo/hg/firefox/toolkit/xre/nsX11ErrorHandler.cpp:157
#10 0x00007f1398d41bf1 in GdkErrorHandler(gchar const*, GLogLevelFlags, gchar const*, gpointer) (log_domain=log_domain@entry=0x7f13a35cface "Gdk", log_level=log_level@entry=6, message=<optimized out>, 
    message@entry=0x7f1375046800 "The program 'firefox' received an X Window System error.\nThis probably reflects a bug in the program.\nThe error was 'GLXBadDrawable'.\n  (Details: serial 31140 error_code 171 request_code 155 (GLX) min"..., user_data=user_data@entry=0x0) at /home/morbo/hg/firefox/toolkit/xre/nsGDKErrorHandler.cpp:83
#11 0x00007f139fd1a7d4 in g_logv (log_domain=0x7f13a35cface "Gdk", log_level=G_LOG_LEVEL_ERROR, format=<optimized out>, args=args@entry=0x7f137f2f6450) at /build/glib2.0-BMd9vh/glib2.0-2.46.0/./glib/gmessages.c:1060
#12 0x00007f139fd1a9ff in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>)
    at /build/glib2.0-BMd9vh/glib2.0-2.46.0/./glib/gmessages.c:1119
#13 0x00007f13a359fcd3 in  () at /usr/lib/x86_64-linux-gnu/libgdk-3.so.0
#14 0x00007f13a35aac59 in  () at /usr/lib/x86_64-linux-gnu/libgdk-3.so.0
#15 0x00007f13a2e4045d in _XError () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#16 0x00007f13a2e3d3c7 in  () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#17 0x00007f13a2e3e535 in _XReply () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#18 0x00007f1379f53178 in  () at /usr/lib/x86_64-linux-gnu/libGL.so.1
#19 0x00007f1379f53c79 in  () at /usr/lib/x86_64-linux-gnu/libGL.so.1
#20 0x00007f13966f29db in mozilla::gl::GLXLibrary::xMakeCurrent(_XDisplay*, unsigned long, __GLXcontextRec*) (this=0x7f139b9fec80 <mozilla::gl::sGLXLibrary>, display=0x7f13a4cf5000, drawable=31457358, context=0x7f136fb98140)
    at /home/morbo/hg/firefox/gfx/gl/GLContextProviderGLX.cpp:531
#21 0x00007f13966f2a6f in mozilla::gl::GLContextGLX::MakeCurrentImpl(bool) (this=0x7f137c985000, aForce=<optimized out>) at /home/morbo/hg/firefox/gfx/gl/GLContextProviderGLX.cpp:897
#22 0x00007f13968472ec in mozilla::layers::CompositorOGL::CleanupResources() (aForce=false, this=0x7f137c985000)
    at /home/morbo/hg/firefox/objdir-desktop/dist/include/GLContext.h:3219
#23 0x00007f13968472ec in mozilla::layers::CompositorOGL::CleanupResources() (this=this@entry=0x7f136b603200)
    at /home/morbo/hg/firefox/gfx/layers/opengl/CompositorOGL.cpp:209
#24 0x00007f13968474d5 in mozilla::layers::CompositorOGL::Destroy() (this=0x7f136b603200)
    at /home/morbo/hg/firefox/gfx/layers/opengl/CompositorOGL.cpp:165
#25 0x00007f139682b352 in mozilla::layers::CompositorBridgeParent::ActorDestroy(mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::ActorDestroyReason) (this=0x7f137ab7c800, why=<optimized out>)
    at /home/morbo/hg/firefox/gfx/layers/ipc/CompositorBridgeParent.cpp:967
#26 0x00007f1396259fdb in mozilla::layers::PCompositorBridgeParent::DestroySubtree(mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::ActorDestroyReason) (this=this@entry=0x7f137ab7c800, why=why@entry=mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::NormalShutdown)
    at /home/morbo/hg/firefox/objdir-desktop/ipc/ipdl/PCompositorBridgeParent.cpp:1400
#27 0x00007f139625a00a in mozilla::layers::PCompositorBridgeParent::OnChannelClose() (this=0x7f137ab7c800)
    at /home/morbo/hg/firefox/objdir-desktop/ipc/ipdl/PCompositorBridgeParent.cpp:1305
#28 0x00007f1395e56f10 in mozilla::ipc::MessageChannel::NotifyChannelClosed() (this=this@entry=0x7f137ab7c868)
    at /home/morbo/hg/firefox/ipc/glue/MessageChannel.cpp:2214
#29 0x00007f1395e56f79 in mozilla::ipc::MessageChannel::NotifyMaybeChannelError() (this=this@entry=0x7f137ab7c868)
    at /home/morbo/hg/firefox/ipc/glue/MessageChannel.cpp:2055
#30 0x00007f1395e57102 in mozilla::ipc::MessageChannel::OnNotifyMaybeChannelError() (this=0x7f137ab7c868)
    at /home/morbo/hg/firefox/ipc/glue/MessageChannel.cpp:2090
#31 0x00007f1395e5b974 in RunnableMethod<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)(), mozilla::Tuple<> >::Run() (arg=..., method=<optimized out>, obj=<optimized out>)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/task.h:28
#32 0x00007f1395e5b974 in RunnableMethod<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)(), mozilla::Tuple<> >::Run() (arg=..., method=<optimized out>, obj=<optimized out>)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/task.h:46
#33 0x00007f1395e5b974 in RunnableMethod<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)(), mozilla::Tuple<> >::Run() (this=<optimized out>) at /home/morbo/hg/firefox/ipc/chromium/src/base/task.h:289
#34 0x00007f1395dee288 in MessageLoop::RunTask(Task*) (this=0x7f137f2f6d40, task=0x7f136ba655c0)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:349
#35 0x00007f1395df0fd6 in MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (this=this@entry=0x7f137f2f6d40, pending_task=...) at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:357
#36 0x00007f1395df11df in MessageLoop::DoWork() (this=0x7f137f2f6d40)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:444
#37 0x00007f1395deda65 in base::MessagePumpDefault::Run(base::MessagePump::Delegate*) (this=0x7f13834349d0, delegate=0x7f137f2f6d40) at /home/morbo/hg/firefox/ipc/chromium/src/base/message_pump_default.cc:34
#38 0x00007f1395dee0cc in MessageLoop::RunInternal() (this=this@entry=0x7f137f2f6d40)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:230
#39 0x00007f1395dee458 in MessageLoop::Run() (this=0x7f137f2f6d40)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:223
#40 0x00007f1395dee458 in MessageLoop::Run() (this=this@entry=0x7f137f2f6d40)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/message_loop.cc:203
#41 0x00007f1395dfe7d8 in base::Thread::ThreadMain() (this=0x7f13834348e0)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/thread.cc:174
#42 0x00007f1395df7252 in ThreadFunc(void*) (closure=<optimized out>)
    at /home/morbo/hg/firefox/ipc/chromium/src/base/platform_thread_posix.cc:36
#43 0x00007f13a5c5d0a4 in start_thread (arg=0x7f137f2f7700) at pthread_create.c:309
#44 0x00007f13a4f6c87d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


#20 0x00007f13966f29db in mozilla::gl::GLXLibrary::xMakeCurrent (this=0x7f139b9fec80 <mozilla::gl::sGLXLibrary>, 
    display=0x7f13a4cf5000, drawable=31457358, context=0x7f136fb98140)
    at /home/morbo/hg/firefox/gfx/gl/GLContextProviderGLX.cpp:531
531         Bool result = xMakeCurrentInternal(display, drawable, context);
(gdb) 
#21 0x00007f13966f2a6f in mozilla::gl::GLContextGLX::MakeCurrentImpl (this=0x7f137c985000, aForce=<optimized out>)
    at /home/morbo/hg/firefox/gfx/gl/GLContextProviderGLX.cpp:897
897             succeeded = mGLX->xMakeCurrent(mDisplay, mDrawable, mContext);
(gdb) 
#22 0x00007f13968472ec in MakeCurrent (aForce=false, this=0x7f137c985000)
    at /home/morbo/hg/firefox/objdir-desktop/dist/include/GLContext.h:3219
3219            return MakeCurrentImpl(aForce);
(gdb) 
#23 mozilla::layers::CompositorOGL::CleanupResources (this=this@entry=0x7f136b603200)
    at /home/morbo/hg/firefox/gfx/layers/opengl/CompositorOGL.cpp:209
209       mGLContext->MakeCurrent();
(gdb) 
#24 0x00007f13968474d5 in mozilla::layers::CompositorOGL::Destroy (this=0x7f136b603200)
    at /home/morbo/hg/firefox/gfx/layers/opengl/CompositorOGL.cpp:165
165         CleanupResources();
(gdb) 
#25 0x00007f139682b352 in mozilla::layers::CompositorBridgeParent::ActorDestroy (this=0x7f137ab7c800, 
    why=<optimized out>) at /home/morbo/hg/firefox/gfx/layers/ipc/CompositorBridgeParent.cpp:967
967         mCompositor->Destroy();
The first bad revision is:
changeset:   322237:577472ad5c38
user:        Nicolas Silva <nsilva@mozilla.com>
date:        Tue Nov 24 14:50:51 2015 +1300
summary:     Bug 1215265 - Shut PCompositorBridge down properly. r=sotaro
Blocks: 1215265
nical, can you take a look? Thank you.
Flags: needinfo?(nical.bugzilla)
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
The widget (and its Drawable) seems to be already dead when we call MakeCurrent on the GLContext, which tries to access that drawable. Ideally MakeCurrent would just return false without causing the X11 error, but I am not sure we can easily get this guarantee. The fix will probably be to make sure we synchronously call mCompositor->Destroy() on the compositor thread before nsBaseWidget::DestroyCompositor returns on the main thread.

Also, we use a GL compositor for small widgets like drop-down menus (these short-lived widgets are what's crashing when we destroy them), which is overkill. It's certainly worth having these small widgets use a basic compositor instead, to avoid the performance and reliability issues of switching between too many GL contexts.
Calling Destroy earlier fixes the issue as far as I can reproduce it. "Earlier" here means at the end of RecvWillClose.

RecvWillClose is called when the widget starts shutting down. It is a synchronous message, which ensures the widget doesn't complete its shutdown before the message is received and handled by the CompositorParent. The gtk widgetry makes sure this is called before destroying the gtk Drawable and friends, so from RecvWillClose we are safe as far as the widget's lifetime goes.
Currently we call Compositor::Destroy() from the ActorDestroy handler, which happens later (hence the error, because we refer to already-dead X11 resources). I thought that closing the channel was synchronous and would cause ActorDestroy to run before the widget is completely destroyed, but that was a false assumption.

It is still a good thing to keep the call to Compositor::Destroy in ActorDestroy as well, because in case of abnormal shutdown, ActorDestroy will run but not RecvWillClose, and when this happens the widget should still be alive.

Note that it is important to destroy the compositor at the end of RecvWillClose, because RecvWillClose also destroys things like the layer tree, which generates some calls to the compositor.
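
To make the ordering argument concrete, here is a minimal standalone sketch of the shutdown sequence. It is not the actual patch: FakeWidget, FakeCompositor, FakeBridgeParent and SimulateShutdown are hypothetical stand-ins, and the "bad drawable" log entry merely models the GLXBadDrawable abort. The assumptions are that Destroy is safe to call twice and that the widget's drawable dies between the synchronous RecvWillClose and the later ActorDestroy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the real Gecko classes.
struct FakeWidget {
  bool drawableAlive = true;  // models the X11 drawable owned by the widget
};

class FakeCompositor {
 public:
  FakeCompositor(const FakeWidget* aWidget, std::vector<std::string>* aLog)
      : mWidget(aWidget), mLog(aLog) {}

  // Idempotent: only the first call touches the drawable.
  void Destroy() {
    if (mDestroyed) {
      mLog->push_back("Destroy: no-op");
      return;
    }
    mDestroyed = true;
    // Stands in for MakeCurrent(), which needs a live drawable.
    mLog->push_back(mWidget->drawableAlive ? "Destroy: ok"
                                           : "Destroy: bad drawable");
  }

 private:
  const FakeWidget* mWidget;
  std::vector<std::string>* mLog;
  bool mDestroyed = false;
};

class FakeBridgeParent {
 public:
  FakeBridgeParent(const FakeWidget* aWidget, std::vector<std::string>* aLog,
                   bool aDestroyInWillClose)
      : mCompositor(aWidget, aLog),
        mLog(aLog),
        mDestroyInWillClose(aDestroyInWillClose) {}

  // Synchronous message: the widget is still alive while this runs.
  void RecvWillClose() {
    // Clearing the layer tree may still call into the compositor,
    // so the compositor is destroyed only afterwards.
    mLog->push_back("layer tree cleared");
    if (mDestroyInWillClose) {
      mCompositor.Destroy();
    }
  }

  // Runs later; also the only hook that runs on abnormal shutdown.
  void ActorDestroy() { mCompositor.Destroy(); }

 private:
  FakeCompositor mCompositor;
  std::vector<std::string>* mLog;
  bool mDestroyInWillClose;
};

// Replays one shutdown and returns the sequence of events.
std::vector<std::string> SimulateShutdown(bool aNormalShutdown,
                                          bool aDestroyInWillClose) {
  std::vector<std::string> log;
  FakeWidget widget;
  FakeBridgeParent bridge(&widget, &log, aDestroyInWillClose);
  if (aNormalShutdown) {
    bridge.RecvWillClose();        // widget waits for this to return
    widget.drawableAlive = false;  // then finishes tearing itself down
  }
  bridge.ActorDestroy();
  return log;
}
```

In this model, SimulateShutdown(true, false) reproduces the old ordering, where the first Destroy happens after the drawable is gone. SimulateShutdown(true, true) destroys the compositor while the drawable is alive and turns the later ActorDestroy call into a no-op.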
Attachment #8739946 - Flags: review?(jnicol)
Attachment #8739946 - Flags: review?(jnicol) → review+
https://hg.mozilla.org/mozilla-central/rev/0cfe55a2eb1f
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48