[Parent 2439] ###!!! ABORT: actor has been |delete|d: file /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp, line 373 [Parent 2439] ###!!! ABORT: actor has been |delete|d: file /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp, line 373 Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 2439.2461] 0x410c984c in mozalloc_abort (msg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/memory/mozalloc/mozalloc_abort.cpp:23 23 MOZ_CRASH(); (gdb) bt 20 #0 0x410c984c in mozalloc_abort (msg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/memory/mozalloc/mozalloc_abort.cpp:23 #1 0x40050404 in _normal_unlock (mutex=0x3) at bionic/libc/bionic/pthread.c:973 #2 pthread_mutex_unlock (mutex=0x3) at bionic/libc/bionic/pthread.c:1120 #3 0x21212122 in ?? () #4 0x21212122 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb)
STR: open NYT.com in otoro and click on the first ad that appears (MarcJacobs.com).
This is breaking on the first abort (that for some reason didn't abort us): #0 mozilla::layers::PGrallocBufferParent::Write (this=0x3593950, __v=0x3593950, __msg=0x28d2010, __nullable=false) at /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp:373 #1 0x40c303f4 in mozilla::layers::PGrallocBufferParent::Send__delete__ (actor=0x3593950) at /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp:70 #2 0x40d2ad94 in mozilla::layers::ShadowLayerManager::PlatformDestroySharedSurface (this=<value optimized out>, aSurface=0x3427fc4) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:200 #3 0x40d28af0 in mozilla::layers::ShadowLayerManager::DestroySharedSurface (this=0x3593950, aSurface=0x3593950, aDeallocator=0x28d2010) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/ShadowLayers.cpp:552 #4 0x40d29fb2 in mozilla::layers::ShadowLayersParent::DestroySharedSurface (this=0x28d2010, aSurface=0x3593950) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:503 #5 0x40d1daae in mozilla::layers::ShadowThebesLayerOGL::DestroyFrontBuffer (this=0x3427dc0) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ThebesLayerOGL.cpp:1067 #6 0x40d1d8fc in mozilla::layers::ShadowThebesLayerOGL::Destroy (this=0x3593950) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ThebesLayerOGL.cpp:1082 #7 0x40d18486 in ContainerDestroy<mozilla::layers::ShadowContainerLayerOGL> (this=0x34ff448) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ContainerLayerOGL.cpp:97 #8 mozilla::layers::ShadowContainerLayerOGL::Destroy (this=0x34ff448) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ContainerLayerOGL.cpp:365 #9 0x40d18740 in ~ShadowContainerLayerOGL (this=0x3593950, __in_chrg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ContainerLayerOGL.cpp:347 #10 0x40d187a0 in ~ShadowContainerLayerOGL (this=0x3593950, __in_chrg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/gfx/layers/opengl/ContainerLayerOGL.cpp:348 #11 0x40620210 in gfxPattern::Release (this=0x3593950) at /Users/gal/workspace/B2G/gecko/image/src/Decoder.h:83 #12 0x40d2a30c in ~nsRefPtr (this=0x1f42758, __in_chrg=<value optimized out>) at ../../dist/include/nsAutoPtr.h:874 #13 ~ShadowLayersParent (this=0x1f42758, __in_chrg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:106 #14 0x40d2a338 in ~ShadowLayersParent (this=0x3593950, __in_chrg=<value optimized out>) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:106 #15 0x40d26f3c in mozilla::layers::CrossProcessCompositorParent::DeallocPLayers (this=<value optimized out>, aLayers=0x1f42758) at /Users/gal/workspace/B2G/gecko/gfx/layers/ipc/CompositorParent.cpp:995 #16 0x40c2fbd6 in mozilla::layers::PCompositorParent::DeallocSubtree (this=0x3661170) at /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:752 #17 0x40c2fce4 in mozilla::layers::PCompositorParent::OnChannelError (this=0x3661170) at /Users/gal/workspace/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:592 #18 0x40c07a64 in mozilla::ipc::AsyncChannel::NotifyMaybeChannelError (this=0x3661178) at /Users/gal/workspace/B2G/gecko/ipc/glue/AsyncChannel.cpp:549 #19 0x40c07bec in mozilla::ipc::AsyncChannel::OnNotifyMaybeChannelError (this=0x3661178) at /Users/gal/workspace/B2G/gecko/ipc/glue/AsyncChannel.cpp:514 #20 0x40bf4762 in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/tuple.h:383 #21 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/task.h:307 #22 0x40cd19f4 in MessageLoop::RunTask (this=0x45979df4, task=0x0) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:326 #23 0x40cd2842 in MessageLoop::DeferOrRunPendingTask (this=0x3661178, pending_task=<value optimized out>) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:334 #24 0x40cd3420 in MessageLoop::DoWork (this=0x45979df4) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:434 #25 0x40cd3690 in base::MessagePumpDefault::Run (this=0x249f3b8, delegate=0x45979df4) ---Type <return> to continue, or q <return> to quit--- at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_pump_default.cc:23 #26 0x40cd19a4 in MessageLoop::RunInternal (this=0x4c595) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208 #27 0x40cd1a5a in MessageLoop::RunHandler (this=0x45979df4) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:201 #28 MessageLoop::Run (this=0x45979df4) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/message_loop.cc:175 #29 0x40cd9d44 in base::Thread::ThreadMain (this=0x249eeb8) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/thread.cc:156 #30 0x40ce3bb8 in ThreadFunc (closure=0x1) at /Users/gal/workspace/B2G/gecko/ipc/chromium/src/base/platform_thread_posix.cc:31 #31 0x400d90d8 in __thread_entry (func=0x40ce3bb1 <ThreadFunc>, arg=0x249eeb8, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217 #32 0x400d8c2c in pthread_create (thread_out=<value optimized out>, attr=0xbecf3488, start_routine=0x40ce3bb1 <ThreadFunc>, arg=0x249eeb8) at bionic/libc/bionic/pthread.c:357 #33 0x00000000 in ?? ()
Created attachment 645337 [details] [diff] [review] patch Very unlikely to be the right fix, but presented here for general amusement, and so that my demo doesn't crash.
Still crashes with the patch.
Created attachment 645341 [details] [diff] [review] another attempt This is definitely not the right approach. I think the content process dies and we fail. Just a guess.
I can reproduce.
The problem here is that we have a dtor called during an unexpected time, which is trying to do nontrivial cleanup. As PGrallocBuffer is part of the protocol tree, it's already been cleaned up and marked dead, automatically. That wouldn't have happened for shmem because they're basically ignored during shutdown. This is kind of a latent bug.
Created attachment 645365 [details] [diff] [review] Correct-ish patch Needs to be implemented more widely, but this is what we want to do.
Rebuilding to double-check it works atm.
Yes, this fixes the ThebesLayerOGL crash.
The underlying bug here is that the destruction protocol for basic layers is different than GL layers. The GL layers call Destroy() always from dtors, but shadow layers need to skip that during emergency shutdown, because it's not strictly safe to free surfaces etc. I think right solution here is to "fix" the shadow GL layers to behave like shadow basic layers.
Created attachment 645578 [details] [diff] [review] Fix shutdown protocol of shadow OGL layers I think the last person to review shadow-layers destruction code was vlad, 1.5 or 2 years ago, so I'm not sure who to ping for review anymore. Sorry. We would never run into this case with omtc because there's no such thing as "emergency shutdown" cross-thread. We should have hit this if we had GL enabled for xul-fennec tests, which I guess we don't.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #12) > I think the last person to review shadow-layers destruction code was vlad, > 1.5 or 2 years ago, so I'm not sure who to ping for review anymore. Sorry. How about ... vlad?
I don't think he's seen this code for 9 months or so. But sure.
And, thanks for reviewing anyway.
This appears to be causing Layer leaking on OSX bots: https://tbpl.mozilla.org/php/getParsedLog.php?id=13825210&tree=Mozilla-Inbound
Should have backed this out with great prejudice!
Created attachment 645673 [details] [diff] [review] Fix shutdown protocol of shadow OGL layers, now with less leaky
I suspect this also caused Android R1 orange: https://tbpl.mozilla.org/php/getParsedLog.php?id=13826523&tree=Mozilla-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=13827014&tree=Mozilla-Inbound
Comment on attachment 645673 [details] [diff] [review] Fix shutdown protocol of shadow OGL layers, now with less leaky I like less leaky!