Open
Bug 1069523
Opened 10 years ago
Updated 2 years ago
Frequent hang in compositor thread with DRI3 drivers
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
NEW
Tracking | Status | |
---|---|---|
e10s | - | --- |
People
(Reporter: johns, Assigned: handyman)
Details
(Whiteboard: [upstream DRI3 bug])
For the last two weeks or so I've been hitting this hang fairly frequently with hardware acceleration turned on in Linux:
Parent:
> #0 0x00007ffff7bc9b2f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
> #1 0x00007ffff7e979ae in PR_WaitCondVar (cvar=0x7fffd4021280, timeout=<optimized out>) at nsprpub/pr/src/pthreads/ptsynch.c:385
> #2 0x00007ffff22b3a26 in Wait (this=<optimized out>, aInterval=<optimized out>) at objdir/ipc/glue/../../dist/include/mozilla/CondVar.h:79
> #3 operator-> (this=<optimized out>) at ../../dist/include/mozilla/Monitor.h:40
> #4 WaitForSyncNotify (this=<optimized out>, this=<optimized out>) at ipc/glue/MessageChannel.cpp:1431
> #5 mozilla::ipc::MessageChannel::SendAndWait (this=0x7fffd400b060, aMsg=<optimized out>, aReply=0x7fffffff5a08) at ipc/glue/MessageChannel.cpp:723
> #6 0x00007ffff22b3446 in mozilla::ipc::MessageChannel::Send (this=0x7fffd400b060, aMsg=0x7fffbe07bc40, aReply=0x7fffffff5a08) at ipc/glue/MessageChannel.cpp:630
> #7 0x00007ffff242bd17 in mozilla::layers::PLayerTransactionChild::SendUpdate (this=0x7fffd4387df0, cset=..., id=<optimized out>, targetConfig=..., isFirstPaint=<optimized out>, scheduleComposite=<optimized out>, paintSequenceNumber=<optimized out>, isRepeatTransaction=<optimized out>, transactionStart=..., reply=<optimized out>) at objdir/ipc/ipdl/./PLayerTransactionChild.cpp:244
> #8 0x00007ffff288c535 in mozilla::layers::ClientLayerManager::ForwardTransaction (this=0x7fffd4a3ef80, aScheduleComposite=<optimized out>) at gfx/layers/ipc/ShadowLayers.cpp:650
> #9 0x00007ffff288b41c in mozilla::layers::ClientLayerManager::EndTransaction (this=0x7fffd4a3ef80, aCallback=0x7ffff3c7e7c0 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*)>, aCallbackData=0x7fffffff7b60, aFlags=mozilla::layers::LayerManager::END_DEFAULT) at gfx/layers/client/ClientLayerManager.cpp:292
> #10 0x00007ffff3cd67ad in nsDisplayList::PaintForFrame (this=<optimized out>, aBuilder=0x7fffffff7b60, aCtx=<optimized out>, aForFrame=<optimized out>, aFlags=<optimized out>) at layout/base/nsDisplayList.cpp:1352
> #11 0x00007ffff3cf0f21 in nsLayoutUtils::PaintFrame (aRenderingContext=0x0, aFrame=0x7fffdc9bd4e0, aDirtyRegion=..., aBackstop=<optimized out>, aFlags=<optimized out>) at layout/base/nsDisplayList.cpp:1198
> #12 0x00007ffff3c57cd3 in PresShell::Paint (this=0x7fffdc735800, aViewToPaint=<optimized out>, aDirtyRegion=..., aFlags=1) at layout/base/nsPresShell.cpp:6230
> #13 0x00007ffff35caea7 in GetViewManager (aWidget=0x7fffde5078f0, this=<optimized out>) at view/nsViewManager.cpp:443
> #14 nsViewManager::ProcessPendingUpdatesForView (this=0x7fffdd816d00, aView=<optimized out>, aFlushDirtyRegion=<optimized out>) at view/nsViewManager.cpp:384
> #15 0x00007ffff3c69353 in nsRefreshDriver::Tick (this=0x7fffdc735000, aNowEpoch=<optimized out>, aNowTime=...) at layout/base/nsRefreshDriver.cpp:1341
> #16 0x00007ffff3c6af09 in TickDriver (this=0x7fffb36c6cc0, driver=<optimized out>, jsnow=<optimized out>, driver=<optimized out>, jsnow=<optimized out>, now=...) at layout/base/nsRefreshDriver.cpp:173
> #17 Tick (this=<optimized out>) at layout/base/nsRefreshDriver.cpp:164
> #18 mozilla::RefreshDriverTimer::TimerTick (aTimer=0x7fffd402128c, aClosure=<optimized out>) at layout/base/nsRefreshDriver.cpp:190
> #19 0x00007ffff1fe5087 in nsTimerEvent::Run (this=0x7fffc126c7f0) at xpcom/threads/nsTimerImpl.cpp:618
> #20 0x00007ffff1fe2811 in nsThread::ProcessNextEvent (this=0x7ffff6c4d140, aMayWait=<optimized out>, aResult=0x7fffffff9587) at xpcom/threads/nsThread.cpp:823
> #21 0x00007ffff22b6e97 in mozilla::ipc::MessagePump::Run (this=0x7fffe74674c0, aDelegate=0x7ffff6c92500) at xpcom/glue/nsThreadUtils.cpp:265
> #22 0x00007ffff35df1cc in nsBaseAppShell::Run (this=0x7fffe3cc7400) at ipc/chromium/src/base/message_loop.cc:234
> #23 0x00007ffff42dfc0e in nsAppStartup::Run (this=0x7fffe1a47060) at toolkit/components/startup/nsAppStartup.cpp:280
> #24 0x00007ffff4334abc in XREMain::XRE_main (this=0x7fffffff9a60, argc=<optimized out>, argv=<optimized out>, aAppData=<optimized out>) at toolkit/xre/nsAppRunner.cpp:4123
> #25 0x00007ffff4334f19 in XRE_main (argc=128, argv=0x17f, aAppData=0xffffffffffffffff, aFlags=<optimized out>) at toolkit/xre/nsAppRunner.cpp:4408
> #26 0x00000000004049e3 in do_main (argc=<optimized out>, argv=<optimized out>, xreDirectory=0x7ffff6c4c780) at browser/app/nsBrowserApp.cpp:282
> #27 main (argc=<optimized out>, argv=<optimized out>) at browser/app/nsBrowserApp.cpp:643
Child:
> #0 0x00007ffff1e1581d in poll () from /usr/lib/libc.so.6
> #1 0x00007ffff4a053d9 in PollWrapper (ufds=0x7fffdc715560, nfsd=4, timeout_=-1) at widget/gtk/nsAppShell.cpp:44
> #2 0x00007fffefaebf04 in ?? () from /usr/lib/libglib-2.0.so.0
> #3 0x00007fffefaec01c in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
> #4 0x00007ffff4a0537c in nsAppShell::ProcessNextNativeEvent (this=<optimized out>, mayWait=<optimized out>) at widget/gtk/nsAppShell.cpp:156
> #5 0x00007ffff49e6597 in nsBaseAppShell::OnProcessNextEvent (this=0x7fffe1e1ea90, thr=0x7fffe8357a80, mayWait=true, recursionDepth=<optimized out>) at widget/xpwidgets/nsBaseAppShell.cpp:140
> #6 0x00007ffff49e66dd in non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool, unsigned int) () at Unified_cpp_widget_xpwidgets0.cpp:315
> #7 0x00007ffff33e96e0 in nsThread::ProcessNextEvent (this=0x7fffe8357a80, aMayWait=true, aResult=<optimized out>) at xpcom/threads/nsThread.cpp:794
> #8 0x00007ffff36bde97 in mozilla::ipc::MessagePump::Run (this=0x7fffe83ac240, aDelegate=0x7fffffffca40) at xpcom/glue/nsThreadUtils.cpp:265
> #9 0x00007ffff49e61cc in nsBaseAppShell::Run (this=0x7fffe1e1ea90) at ipc/chromium/src/base/message_loop.cc:234
> #10 0x00007ffff36be51e in mozilla::ipc::MessagePumpForChildProcess::Run (this=<optimized out>, aDelegate=<optimized out>) at toolkit/xre/nsEmbedFunctions.cpp:713
> #11 0x00007ffff573f399 in XRE_InitChildProcess (aArgc=<optimized out>, aArgv=<optimized out>) at ipc/chromium/src/base/message_loop.cc:234
> #12 0x00000000004034f8 in content_process_main (argc=<optimized out>, argv=<optimized out>) at ipc/app/../contentproc/plugin-container.cpp:158
> #13 main (argc=-596552352, argv=0x7fffffffde38) at ipc/app/MozillaRuntimeMain.cpp:11
Reporter
Comment 1•10 years ago
billm suggested I grab the compositor stack as well:
> #0 0x00007ffff7bc9b2f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
> #1 0x00007fffed721c49 in ?? () from /usr/lib/libxcb.so.1
> #2 0x00007fffed722ea9 in xcb_wait_for_special_event () from /usr/lib/libxcb.so.1
> #3 0x00007fffeb081254 in ?? () from /usr/lib/libGL.so.1
> #4 0x00007fffeb081855 in ?? () from /usr/lib/libGL.so.1
> #5 0x00007fffeb082245 in ?? () from /usr/lib/libGL.so.1
> #6 0x00007fffd32ac187 in ?? () from /usr/lib/xorg/modules/dri/i965_dri.so
> #7 0x00007fffd32ac4b5 in ?? () from /usr/lib/xorg/modules/dri/i965_dri.so
> #8 0x00007fffd32a0f9d in ?? () from /usr/lib/xorg/modules/dri/i965_dri.so
> #9 0x00007ffff28e26aa in raw_fClear (this=<optimized out>, mask=<optimized out>, this=<optimized out>, mask=<optimized out>) at /home/nephyrin/moz/ff-neph-custom-refox/gfx/layers/../../dist/include/GLContext.h:938
> #10 operator-> (this=<optimized out>, this=<optimized out>, mask=16640, this=<optimized out>) at ../../dist/include/GLContext.h:945
> #11 mozilla::layers::CompositorOGL::BeginFrame (this=0x7fffd5d98bc0, aInvalidRegion=..., aClipRectIn=<optimized out>, aRenderBounds=..., aClipRectOut=<optimized out>, aRenderBoundsOut=<optimized out>) at /home/nephyrin/moz/moz-git-build-refox/gfx/layers/opengl/CompositorOGL.cpp:776
> #12 0x00007ffff28b4568 in mozilla::layers::LayerManagerComposite::EndTransaction (this=0x7fffd68222d0, aCallback=<optimized out>, aCallbackData=<optimized out>, aFlags=<optimized out>) at /home/nephyrin/moz/moz-git-build-refox/gfx/layers/composite/LayerManagerComposite.cpp:650
> #13 0x00007ffff28cca77 in operator-> (this=<optimized out>, aFlags=mozilla::layers::LayerManager::END_DEFAULT, this=<optimized out>) at /home/nephyrin/moz/moz-git-build-refox/gfx/layers/composite/LayerManagerComposite.cpp:210
> #14 mozilla::layers::CompositorParent::CompositeToTarget (this=0x7fffd6aac800, aTarget=0x0, aRect=<optimized out>) at /home/nephyrin/moz/moz-git-build-refox/gfx/layers/ipc/CompositorParent.cpp:706
> #15 0x00007ffff22a4437 in MessageLoop::DeferOrRunPendingTask (this=0x7fffdb0fdd28, pending_task=...) at /home/nephyrin/moz/moz-git-build-refox/ipc/chromium/src/base/message_loop.cc:362
> #16 0x00007ffff22a4b67 in MessageLoop::DoDelayedWork (this=0x7fffdb0fdd28, next_delayed_work_time=<optimized out>) at /home/nephyrin/moz/moz-git-build-refox/ipc/chromium/src/base/message_loop.cc:475
> #17 0x00007ffff22a560c in base::MessagePumpDefault::Run (this=0x7fffda594ce0, delegate=0x7fffdb0fdd28) at /home/nephyrin/moz/moz-git-build-refox/ipc/chromium/src/base/message_pump_default.cc:39
> #18 0x00007ffff22a86a7 in base::Thread::ThreadMain (this=0x7fffdbb46b80) at /home/nephyrin/moz/moz-git-build-refox/ipc/chromium/src/base/message_loop.cc:234
> #19 0x00007ffff2298207 in ThreadFunc (closure=0x7fffd0f80e4c) at /home/nephyrin/moz/moz-git-build-refox/ipc/chromium/src/base/platform_thread_posix.cc:39
> #20 0x00007ffff7bc5124 in start_thread () from /usr/lib/libpthread.so.0
> #21 0x00007ffff6ecc4bd in clone () from /usr/lib/libc.so.6
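Read together with the parent stack in the description, this suggests a chain of waits: the main thread is blocked in a synchronous send (MessageChannel::Send → WaitForSyncNotify) while the compositor thread is itself blocked inside the GL driver. The following is a minimal Python sketch of that shape, not Gecko code. `sync_send`, `compositor_loop`, and `driver_ready` are hypothetical names, and the timeout exists only so the demo can return instead of hanging.

```python
import queue
import threading


def sync_send(requests, message, timeout=None):
    """Post a message and block until the receiver signals a reply.

    Toy stand-in for a synchronous IPC send. timeout=None blocks
    indefinitely, which is the situation the stacks above show.
    Returns True if a reply arrived, False if the timeout expired.
    """
    replied = threading.Event()
    requests.put((message, replied))
    return replied.wait(timeout)


def compositor_loop(requests, driver_ready):
    """Handle requests one at a time; each one first waits on the driver."""
    while True:
        message, replied = requests.get()
        if message is None:       # hypothetical shutdown marker
            replied.set()
            return
        driver_ready.wait()       # stands in for the blocked GL call
        replied.set()
```

As long as `driver_ready` is never set, every `sync_send` caller stays blocked too, so a single stuck driver call is enough to freeze the thread that paints the UI.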
Reporter
Comment 2•10 years ago
The DRI function that hangs is dri3_find_back, which carries this comment:
> Find an idle back buffer. If there isn't one, then
> wait for a present idle notify event from the X server
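The behaviour that comment describes can be modelled as a condition-variable wait. This is a toy Python sketch under that assumption, not Mesa's implementation: `BackBuffers`, `find_back`, and `idle_notify` are made-up names, and the real code waits on an X server event rather than on a Python condition.

```python
import threading


class BackBuffers:
    """Toy model of a drawable's back buffers (hypothetical structure)."""

    def __init__(self, count):
        self.busy = [False] * count
        self.cond = threading.Condition()

    def find_back(self, timeout=None):
        """Return the index of an idle buffer and mark it busy.

        If every buffer is busy, wait for an idle notification.
        timeout=None waits without bound; None is returned only
        when a finite timeout expires first.
        """
        with self.cond:
            while True:
                for i, busy in enumerate(self.busy):
                    if not busy:
                        self.busy[i] = True
                        return i
                if not self.cond.wait(timeout):
                    return None

    def idle_notify(self, index):
        """Mark a buffer idle again and wake any waiter."""
        with self.cond:
            self.busy[index] = False
            self.cond.notify_all()
```

With `timeout=None`, a caller that finds no idle buffer stays blocked until `idle_notify` runs. If that notification is lost or never sent, the wait never ends, which is consistent with the stack above bottoming out in pthread_cond_wait.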
Assignee
Updated•10 years ago
Assignee: nobody → davidp99
Comment 3•10 years ago
I seem to have the same issue, but it isn't e10s-related for me: browser.tabs.remote.autostart is set to false.
This happens constantly, about every 5 minutes, on Fedora 21 x86_64 with a mobile i5 and Intel HD 3000 graphics. Firefox is effectively unusable. I have layers acceleration force-enabled, which may or may not be related.
Sometimes it hangs the whole GNOME desktop, and if I don't kill Firefox from a TTY, it locks up the entire system. So this is probably a graphics driver issue and not (or not only) a Firefox bug.
FWIW, here is the backtrace:
> (gdb) bt
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1 0x00007fffecc2a399 in _xcb_conn_wait (c=0x7ffff6ae6000, cond=<optimized out>, vector=0x0, count=0x0)
> at xcb_conn.c:415
> #2 0x00007fffecc2b609 in xcb_wait_for_special_event (c=0x7fffcbef35ec, c@entry=0x7ffff6ae6000, se=0x80)
> at xcb_in.c:715
> #3 0x00007fffea7a5e14 in dri3_find_back (c=c@entry=0x7ffff6ae6000, priv=priv@entry=0x7fffcbeee180)
> at dri3_glx.c:1191
>
> #4 0x00007fffea7a64ac in dri3_get_buffer (format=format@entry=4107,
> buffer_type=buffer_type@entry=dri3_buffer_back, loaderPrivate=loaderPrivate@entry=0x7fffcbeee180,
> driDrawable=<optimized out>) at dri3_glx.c:1217
> #5 0x00007fffea7a6ff2 in dri3_get_buffers (driDrawable=<optimized out>, format=4107, stamp=0x7fffcba2c770,
> loaderPrivate=0x7fffcbeee180, buffer_mask=<optimized out>, buffers=0x7fffd40fd8b0) at dri3_glx.c:1394
> #6 0x00007fffcdbd0d77 in intel_update_image_buffers (drawable=<optimized out>, brw=<optimized out>)
> at brw_context.c:1452
> #7 intel_update_renderbuffers (context=0x7fffcbef35ec, context@entry=0x7fffcbef5070, drawable=0x7fffcba2c740)
> at brw_context.c:1144
> #8 0x00007fffcdbd10a5 in intel_prepare_render (brw=brw@entry=0x7fffcba02028) at brw_context.c:1165
> #9 0x00007fffcdbc5add in brw_clear (ctx=0x7fffcba02028, mask=18) at brw_clear.c:234
> #10 0x00007ffff1f2a4f9 in mozilla::layers::CompositorOGL::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) ()
> from /home/jonas/firefox/libxul.so
> #11 0x00007ffff1f12611 in mozilla::layers::LayerManagerComposite::Render() () from /home/jonas/firefox/libxul.so
> #12 0x00007ffff1f12835 in mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) () from /home/jonas/firefox/libxul.so
> #13 0x00007ffff1f128ed in mozilla::layers::LayerManagerComposite::EndEmptyTransaction(mozilla::layers::LayerManager::EndTransactionFlags) () from /home/jonas/firefox/libxul.so
> #14 0x00007ffff1f22af4 in mozilla::layers::CompositorParent::CompositeToTarget(mozilla::gfx::DrawTarget*, nsIntRect const*) () from /home/jonas/firefox/libxul.so
> #15 0x00007ffff1b82385 in MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) ()
> from /home/jonas/firefox/libxul.so
> #16 0x00007ffff175b933 in MessageLoop::DoDelayedWork(base::TimeTicks*) () from /home/jonas/firefox/libxul.so
> #17 0x00007ffff1b82536 in base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ()
> from /home/jonas/firefox/libxul.so
> #18 0x00007ffff1b826f7 in MessageLoop::Run() () from /home/jonas/firefox/libxul.so
> #19 0x00007ffff1b865af in base::Thread::ThreadMain() () from /home/jonas/firefox/libxul.so
> #20 0x00007ffff1b77a0a in ThreadFunc(void*) () from /home/jonas/firefox/libxul.so
> #21 0x00007ffff7bc657a in start_thread (arg=0x7fffd40fe700) at pthread_create.c:310
> #22 0x00007ffff6cc853d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> (gdb)
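For anyone comparing their own hang against this one, the telltale frames are xcb_wait_for_special_event called from a dri3_* function. Below is a small Python sketch that checks pasted `bt` output for that pair. The helper names are hypothetical and the regular expression is a rough assumption about gdb's frame format, so treat it as a convenience rather than a reliable parser.

```python
import re

# "#N", an optional "0x... in " address prefix, then the function name.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-f]+ in )?([\w:@.~?]+)")


def frame_functions(backtrace):
    """Return the function names from gdb 'bt' output, in frame order."""
    return [m.group(2) for m in FRAME_RE.finditer(backtrace)]


def looks_like_dri3_hang(backtrace):
    """True if the trace waits on an XCB special event below a dri3_* frame."""
    names = frame_functions(backtrace)
    return "xcb_wait_for_special_event" in names and any(
        name.startswith("dri3_") for name in names
    )
```

A trace taken without libGL debug symbols shows `??` in place of the dri3_* frames, so this check reports False for it even when the underlying hang is the same.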
Reporter
Comment 4•10 years ago
As a workaround, an Intel driver compiled with --disable-dri3 avoids this code path and the hang, and --disable-dri3 appears to be the default on at least Arch Linux. DRI3 is likely to become the default at some point in the future, however.
Comment 5•10 years ago
This seems to be the corresponding upstream bug in the Intel graphics driver: https://bugs.freedesktop.org/show_bug.cgi?id=84252
Reporter
Comment 6•10 years ago
(In reply to Jonas Thiem from comment #5)
> This seems to be the corresponding upstream bug in the Intel graphics driver:
> https://bugs.freedesktop.org/show_bug.cgi?id=84252
Yes, that's definitely the issue I was seeing.
Whiteboard: [upstream DRI3 bug]
Reporter
Comment 7•10 years ago
I can also confirm comment 3: this is not e10s-specific and likely just needs OMTC (off-main-thread compositing).
Summary: [e10s] Frequent hang in PLayerTransactionChild::SendUpdate → Frequent hang in PLayerTransactionChild::SendUpdate
Reporter
Updated•10 years ago
Summary: Frequent hang in PLayerTransactionChild::SendUpdate → Frequent hang in compositor thread with DRI3 drivers
Updated•8 years ago
Updated•2 years ago
Severity: normal → S3