Crash [@ ShadowLayerParent::ActorDestroy] with OGL layers

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
9 years ago
7 years ago

People

(Reporter: cjones, Assigned: cjones)

Tracking

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(fennec2.0b4+)

Details

Attachments

(1 attachment, 2 obsolete attachments)

Tricky one here.  Here's the relevant info from valgrind

==1712== Invalid read of size 8
==1712==    at 0x716EB82: mozilla::gl::GLContext::IsDestroyed() (GLContext.h:477)
==1712==    by 0x7169F6D: mozilla::gl::BasicTextureImage::~BasicTextureImage() (GLContext.cpp:537)
==1712==    by 0x716E4BF: mozilla::gl::TextureImage::Release() (GLContext.h:152)
==1712==    by 0x71A136A: nsRefPtr<mozilla::gl::TextureImage>::~nsRefPtr() (nsAutoPtr.h:969)
==1712==    by 0x71B270D: mozilla::layers::ThebesLayerBufferOGL::~ThebesLayerBufferOGL() (ThebesLayerOGL.cpp:155)
==1712==    by 0x71B3213: mozilla::layers::ShadowBufferOGL::~ShadowBufferOGL() (ThebesLayerOGL.cpp:584)
==1712==    by 0x71B266F: mozilla::layers::ThebesLayerBufferOGL::Release() (ThebesLayerOGL.cpp:146)
==1712==    by 0x71B31A4: nsRefPtr<mozilla::layers::ShadowBufferOGL>::assign_assuming_AddRef(mozilla::layers::ShadowBufferOGL*) (nsAutoPtr.h:957)
==1712==    by 0x71B312D: nsRefPtr<mozilla::layers::ShadowBufferOGL>::assign_with_AddRef(mozilla::layers::ShadowBufferOGL*) (nsAutoPtr.h:941)
==1712==    by 0x71B3016: nsRefPtr<mozilla::layers::ShadowBufferOGL>::operator=(mozilla::layers::ShadowBufferOGL*) (nsAutoPtr.h:1025)
==1712==    by 0x71B21FA: mozilla::layers::ShadowThebesLayerOGL::Destroy() (ThebesLayerOGL.cpp:717)
==1712==    by 0x71A35A0: void mozilla::layers::ContainerDestroy<mozilla::layers::ShadowContainerLayerOGL>(mozilla::layers::ShadowContainerLayerOGL*) (ContainerLayerOGL.cpp:125)
==1712==    by 0x71A245B: mozilla::layers::ShadowContainerLayerOGL::Destroy() (ContainerLayerOGL.cpp:375)
==1712==    by 0x71A2399: mozilla::layers::ShadowContainerLayerOGL::~ShadowContainerLayerOGL() (ContainerLayerOGL.cpp:357)
==1712==    by 0x5A7341B: mozilla::layers::Layer::Release() (Layers.h:473)
==1712==    by 0x5C02ED4: nsRefPtr<mozilla::layers::Layer>::assign_assuming_AddRef(mozilla::layers::Layer*) (nsAutoPtr.h:957)
==1712==    by 0x719D251: nsRefPtr<mozilla::layers::Layer>::assign_with_AddRef(mozilla::layers::Layer*) (nsAutoPtr.h:941)
==1712==    by 0x719C79E: nsRefPtr<mozilla::layers::Layer>::operator=(mozilla::layers::Layer*) (nsAutoPtr.h:1025)
==1712==    by 0x71BA807: mozilla::layers::ShadowLayerParent::ActorDestroy(mozilla::ipc::IProtocolManager<mozilla::ipc::RPCChannel::RPCListener>::ActorDestroyReason) (ShadowLayerParent.cpp:89)
==1712==    by 0x6E3F724: mozilla::layers::PLayerParent::DestroySubtree(mozilla::ipc::IProtocolManager<mozilla::ipc::RPCChannel::RPCListener>::ActorDestroyReason) (PLayerParent.cpp:310)
==1712==    by 0x6E3EEA7: mozilla::layers::PLayerParent::OnMessageReceived(IPC::Message const&) (PLayerParent.cpp:88)
==1712==    by 0x6E24E9F: mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) (PContentParent.cpp:628)
==1712==    by 0x6DD062E: mozilla::ipc::AsyncChannel::OnDispatchMessage(IPC::Message const&) (AsyncChannel.cpp:262)
==1712==    by 0x6DD971B: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:438)
==1712==    by 0x6DDF5FF: void DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, bool (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383)
==1712==    by 0x6DDF54F: RunnableMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (task.h:307)
==1712==    by 0x6DDB0CE: mozilla::ipc::RPCChannel::RefCountedTask::Run() (RPCChannel.h:450)
==1712==    by 0x6DDB1D1: mozilla::ipc::RPCChannel::DequeueTask::Run() (RPCChannel.h:475)
==1712==    by 0x70A6541: MessageLoop::RunTask(Task*) (message_loop.cc:343)
==1712==    by 0x70A65B1: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (message_loop.cc:351)
==1712==  Address 0x1cd19620 is 832 bytes inside a block of size 1,480 free'd
==1712==    at 0x4C270BD: free (vg_replace_malloc.c:366)
==1712==    by 0x706D33B: free (nsTraceMalloc.c:1303)
==1712==    by 0x9207033: moz_free (mozalloc.cpp:92)
==1712==    by 0x718946D: mozilla::gl::GLContextGLX::~GLContextGLX() (mozalloc.h:253)
==1712==    by 0x5F1933D: mozilla::gl::GLContext::Release() (GLContext.h:393)
==1712==    by 0x5F1A6AA: nsRefPtr<mozilla::gl::GLContext>::~nsRefPtr() (nsAutoPtr.h:969)
==1712==    by 0x6C16EF2: nsWindow::Destroy() (nsWindow.cpp:719)
==1712==    by 0x6170918: nsView::~nsView() (nsView.cpp:289)
==1712==    by 0x6170AF7: nsIView::Destroy() (nsView.cpp:337)
==1712==    by 0x5B952E0: nsFrame::DestroyFrom(nsIFrame*) (nsFrame.cpp:462)
==1712==    by 0x5C21098: nsSplittableFrame::DestroyFrom(nsIFrame*) (nsSplittableFrame.cpp:75)
==1712==    by 0x5B8D475: nsContainerFrame::DestroyFrom(nsIFrame*) (nsContainerFrame.cpp:292)
==1712==    by 0x5C45793: ViewportFrame::DestroyFrom(nsIFrame*) (nsViewportFrame.cpp:73)
==1712==    by 0x5A991F5: nsIFrame::Destroy() (nsIFrame.h:553)
==1712==    by 0x5AE2770: nsFrameManager::Destroy() (nsFrameManager.cpp:257)
==1712==    by 0x5B0DD92: PresShell::Destroy() (nsPresShell.cpp:2012)
==1712==    by 0x5ADACB4: DocumentViewerImpl::DestroyPresShell() (nsDocumentViewer.cpp:4279)
==1712==    by 0x5AD0A21: DocumentViewerImpl::Destroy() (nsDocumentViewer.cpp:1622)
==1712==    by 0x687B2FC: nsDocShell::Destroy() (nsDocShell.cpp:4525)
==1712==    by 0x694BE37: nsXULWindow::Destroy() (nsXULWindow.cpp:528)
==1712==    by 0x695AC75: nsWebShellWindow::Destroy() (nsWebShellWindow.cpp:832)
==1712==    by 0x69438A2: nsChromeTreeOwner::Destroy() (nsChromeTreeOwner.cpp:382)
==1712==    by 0x61C1D40: nsGlobalWindow::ReallyCloseWindow() (nsGlobalWindow.cpp:6250)
==1712==    by 0x61C19D3: nsGlobalWindow::FinalClose() (nsGlobalWindow.cpp:6196)
==1712==    by 0x61C187B: nsGlobalWindow::ForceClose() (nsGlobalWindow.cpp:6148)
==1712==    by 0x69724C6: nsAppStartup::CloseAllWindows() (nsAppStartup.cpp:380)
==1712==    by 0x6971EE5: nsAppStartup::Quit(unsigned int) (nsAppStartup.cpp:284)
==1712==    by 0x7053DB6: NS_InvokeByIndex_P (xptcinvoke_x86_64_unix.cpp:208)
==1712==    by 0x678982B: CallMethodHelper::Invoke() (xpcwrappednative.cpp:3064)
==1712==    by 0x67876C0: CallMethodHelper::Call() (xpcwrappednative.cpp:2326)


What's happening is that the chrome process starts shutting down and destroys its top-level window.  This gets us into widget code where it calls layerManager->Destroy(), then destroys the GLContext.  That layerManager->Destroy() is supposed to end up calling Destroy() on all live OGL layers, which causes them to drop their GL resources while the GLContext is still alive.  However, this doesn't work out of the box for layers that are newly disconnected from the layer tree, because they were removed but something else holds a ref keeping them alive.

For same-process layers, I suspect that the only other code keeping "orphaned" layers alive is FrameLayerBuilder, and that's destroyed in the same task with the GLContext so all is well.

For orphaned shadow layers, however, we get the crash above.  The reason is that shadow layers are only deleted outside of transactions, so there's a small window in which live but orphaned.  They end up being deleted when their owning IPDL actor is deleted, PLayer:__delete__.  In the crash above, this __delete__ was coming in after the task that deleted the GLContext, and ~ShadowThebesLayer blew up.

The underlying problem here is that we don't properly LayerOGL::Destroy() shadow layers then the GLContext is destroyed.  Luckily, we have a notification nsFrameLoader::Destroy() in the same task, which ends up sending PBrowser:Destroy to the content process.  Inside TabParent::Destroy, we have the ability to enumerate all live shadow layers.  So, my patch enumerates them and calls Destroy() on all.
Nasty use-after-free crash involving GL resources.  Needs to block.
tracking-fennec: --- → ?
There's a fair amount of goop in this patch, but the idea is pretty simple --- enumerate all shadow layers and Destroy() them in the same task that ends up deleting the GLContext.
Assignee: nobody → jones.chris.g
Attachment #501352 - Flags: review?(vladimir)
Comment on attachment 501352 [details] [diff] [review]
Ensure LayerOGL::Destroy is called for "orphaned" shadow OGL layers before the GLContext is deleted

Crud sorry, forgot canvas/image layers.
Attachment #501352 - Attachment is obsolete: true
Attachment #501352 - Flags: review?(vladimir)
tracking-fennec: ? → 2.0b4+
There was an additional edge case here causing null derefs.
Attachment #501355 - Attachment is obsolete: true
Attachment #501411 - Flags: review?(vladimir)
Attachment #501355 - Flags: review?(vladimir)
Sigh.  I found one more issue but it's minor enough that I won't repost.  Exercise for the reviewer ;).
Comment on attachment 501411 [details] [diff] [review]
Ensure LayerOGL::Destroy is called for "orphaned" shadow OGL layers before the GLContext is deleted, v3

looks fine, but I didn't find your exercise for the reviewer..
Attachment #501411 - Flags: review?(vladimir) → review+
Oh sorry, patch didn't have enough context.  There's another |if (mLayer)| guard now needed in ShadowLayerParent::ActorDestroy, for the same reason as in ShadowLayerParent::Destroy.
http://hg.mozilla.org/mozilla-central/rev/9a10911e5b29
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.