Last Comment Bug 696013 - Nightly crashes @ mozilla::layers::BasicShadowCanvasLayer::DestroyFrontBuffer
: Nightly crashes @ mozilla::layers::BasicShadowCanvasLayer::DestroyFrontBuffer
Status: VERIFIED FIXED
[mobile-crash], [inbound]
: crash, reproducible
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: ARM Android
: -- critical (vote)
: mozilla10
Assigned To: Oleg Romashin (:romaxa)
:
Mentors:
http://randomibis.com/coolclock/
: 698462 (view as bug list)
Depends on:
Blocks: 689045
  Show dependency treegraph
 
Reported: 2011-10-20 02:52 PDT by Cristian Nicolae (:xti)
Modified: 2013-12-10 10:00 PST (History)
15 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed


Attachments
Fix proposal, for canvas buffer leak, assume that disconnect call is expected in normal situation (1.10 KB, patch)
2011-10-20 14:20 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Fix proposal, for canvas and other shadow layers buffer leak (2.56 KB, patch)
2011-10-20 15:52 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Fix for canvas and other shadow layers buffer leak (2.33 KB, patch)
2011-10-25 19:26 PDT, Oleg Romashin (:romaxa)
cjones.bugs: review+
Details | Diff | Splinter Review
Fix for canvas and other shadow layers buffer leak. to push (2.05 KB, patch)
2011-10-25 20:05 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review

Description Cristian Nicolae (:xti) 2011-10-20 02:52:16 PDT
Build ID: Mozilla/5.0 (Android;Linux armv7l;rv:10.0a1)Gecko/20111019
Firefox/10.0a1 Fennec/10.0a1
Devices: Samsung Galaxy S, HTC Desire Z
OS: Android 2.2, Android 2.3.3

Steps to reproduce:
1. Open Fennec App
2. Browse to http://randomibis.com/coolclock/
3. Let the page opened for a couple of minutes

Expected result:
After step 3, nothing should happen.

Actual result:
After step 3, Fennec will crash.

Notes:
- this issue is reproducible always only if JavaScript is enabled.
- if sync is connected before step 2, it will be disconnected when the app will reopen back after the crash.
- here are some crash reports:

https://crash-stats.mozilla.com/report/index/bp-dcf9a9c0-6566-4b03-876a-c0f2c2111020
https://crash-stats.mozilla.com/report/index/bp-f5f86f15-efae-4790-8857-add622111020
https://crash-stats.mozilla.com/report/index/bp-f5f324d6-7b32-4ff6-9b46-116372111020
https://crash-stats.mozilla.com/report/index/bp-43c057b8-6dda-4dab-860e-1e1892111020
Comment 1 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-20 05:29:40 PDT
Crashing Thread

Frame 	Module 	Signature [Expand] 	Source
0 	libxul.so 	mozilla::layers::BasicShadowCanvasLayer::DestroyFrontBuffer 	gfx/layers/basic/BasicLayers.cpp:3025
1 	libxul.so 	mozilla::layers::BasicShadowCanvasLayer::~BasicShadowCanvasLayer 	gfx/layers/basic/BasicLayers.cpp:3011
2 	libxul.so 	mozilla::layers::BasicShadowCanvasLayer::~BasicShadowCanvasLayer 	mozalloc.h:253
3 	libxul.so 	mozilla::layers::Layer::Release 	Layers.h:544
4 	libxul.so 	mozilla::layers::ContainerRemoveChild<mozilla::layers::BasicShadowContainerLayer> 	gfx/layers/basic/BasicLayers.cpp:403
5 	libxul.so 	mozilla::layers::BasicShadowContainerLayer::~BasicShadowContainerLayer 	gfx/layers/basic/BasicLayers.cpp:2852
6 	libxul.so 	mozilla::layers::BasicShadowContainerLayer::~BasicShadowContainerLayer 	mozalloc.h:253
7 	libxul.so 	mozilla::layers::Layer::Release 	Layers.h:544
8 	libxul.so 	mozilla::layers::ContainerRemoveChild<mozilla::layers::BasicContainerLayer> 	gfx/layers/basic/BasicLayers.cpp:403
9 	libxul.so 	mozilla::layers::BasicContainerLayer::~BasicContainerLayer 	gfx/layers/basic/BasicLayers.cpp:290
10 	libxul.so 	mozilla::layers::BasicShadowableContainerLayer::~BasicShadowableContainerLayer 	gfx/layers/basic/BasicLayers.cpp:2111
11 	libxul.so 	mozilla::layers::BasicShadowableContainerLayer::~BasicShadowableContainerLayer 	mozalloc.h:253
12 	libxul.so 	mozilla::layers::Layer::Release 	Layers.h:544
13 	libxul.so 	mozilla::layers::ContainerRemoveChild<mozilla::layers::BasicContainerLayer> 	gfx/layers/basic/BasicLayers.cpp:403
14 	libxul.so 	mozilla::layers::BasicShadowableContainerLayer::RemoveChild 	gfx/layers/basic/BasicLayers.cpp:2153
15 	libxul.so 	mozilla::FrameLayerBuilder::BuildContainerLayerFor 	layout/base/FrameLayerBuilder.cpp:1630
16 	libxul.so 	nsDisplayList::PaintForFrame 	layout/base/nsDisplayList.cpp:598
17 	libxul.so 	nsDisplayList::PaintRoot 	layout/base/nsDisplayList.cpp:539
18 	libxul.so 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1701
19 	libxul.so 	PresShell::Paint 	layout/base/nsPresShell.cpp:5390
20 	libxul.so 	nsViewManager::RenderViews 	view/src/nsViewManager.cpp:417
21 	libxul.so 	nsViewManager::Refresh 	view/src/nsViewManager.h:238
22 	libxul.so 	nsViewManager::DispatchEvent 	nsCOMPtr.h:515
23 	libxul.so 	HandleEvent 	nsCOMPtr.h:515
24 	libxul.so 	nsWindow::DispatchEvent 	widget/src/android/nsWindow.cpp:632
25 	libxul.so 	nsWindow::DrawTo 	widget/src/android/nsWindow.cpp:933
26 	libxul.so 	nsWindow::DrawTo 	widget/src/android/nsWindow.cpp:970
27 	libxul.so 	nsWindow::OnDraw 	widget/src/android/nsWindow.cpp:1077
28 	libxul.so 	nsWindow::OnGlobalAndroidEvent 	widget/src/android/nsWindow.cpp:837
29 	libxul.so 	nsAppShell::ProcessNextNativeEvent 	widget/src/android/nsAppShell.cpp:408
30 	libxul.so 	nsBaseAppShell::DoProcessNextNativeEvent 	widget/src/xpwidgets/nsBaseAppShell.cpp:172
31 	libxul.so 	nsBaseAppShell::OnProcessNextEvent 	widget/src/xpwidgets/nsBaseAppShell.cpp:312
32 	libxul.so 	mozilla::dom::ContentParent::OnProcessNextEvent 	dom/ipc/ContentParent.cpp:1144
33 	libxul.so 	nsThread::ProcessNextEvent 	nsTArray.h:170
34 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
35 	libxul.so 	nsXMLHttpRequest::Send 	content/base/src/nsXMLHttpRequest.cpp:2559
36 	libxul.so 	nsIXMLHttpRequest_Send 	obj-firefox/js/src/xpconnect/src/dom_quickstubs.cpp:28459
37 	libxul.so 	js::Interpret 	js/src/jscntxtinlines.h:298
38 	libxul.so 	js::RunScript 	js/src/jsinterp.cpp:585
39 	libxul.so 	js::InvokeKernel 	js/src/vm/Stack.h:984
40 	libxul.so 	js_fun_apply 	js/src/vm/Stack.h:289
41 	libxul.so 	js::Interpret 	js/src/jscntxtinlines.h:298
42 	libxul.so 	js::RunScript 	js/src/jsinterp.cpp:585
43 	libxul.so 	js::Invoke 	js/src/vm/Stack.h:984
44 	libxul.so 	JS_CallFunctionValue 	js/src/jscntxt.h:1242
45 	libxul.so 	nsXPCWrappedJSClass::CallMethod 	js/src/xpconnect/src/xpcwrappedjsclass.cpp:1662
46 	libxul.so 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:586
47 	libxul.so 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:133
48 	libxul.so 	libxul.so@0x96330c 	
49 	libxul.so 	NS_InvokeByIndex_P 	xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp:195
50 	libxul.so 	XPCConvert::JSData2Native 	js/src/xpconnect/src/xpcconvert.cpp:634
51 	libxul.so 	XPCWrappedNative::CallMethod 	js/src/xpconnect/src/xpcwrappednative.cpp:3148
52 	libxul.so 	XPC_WN_CallMethod 	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1629
53 	libxul.so 	js::Interpret 	js/src/jscntxtinlines.h:298
54 	libxul.so 	js::RunScript 	js/src/jsinterp.cpp:585
55 	libxul.so 	js::Invoke 	js/src/vm/Stack.h:984
56 	libxul.so 	JS_CallFunctionValue 	js/src/jscntxt.h:1242
57 	libxul.so 	nsJSContext::CallEventHandler 	dom/base/nsJSEnvironment.cpp:1947
58 	libxul.so 	nsGlobalWindow::RunTimeout 	nsCOMPtr.h:896
59 	libxul.so 	nsGlobalWindow::TimerCallback 	nsAutoPtr.h:907
60 	libxul.so 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:425
61 	libxul.so 	nsTimerEvent::Run 	nsAutoPtr.h:907
62 	libxul.so 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
63 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
64 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:111
65 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:209
66 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:487
67 	libxul.so 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:191
68 	libxul.so 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:229
69 	libxul.so 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3583
70 	libxul.so 	Java_org_mozilla_gecko_GeckoAppShell_nativeRun 	toolkit/xre/nsAndroidStartup.cpp:132
71 	libmozutils.so 	Java_org_mozilla_gecko_GeckoAppShell_nativeRun 	other-licenses/android/APKOpen.cpp:232
72 	libdvm.so 	dvmPlatformInvoke 	
73 	libdvm.so 	dvmCallJNIMethod_general 	
74 	libdvm.so 	dvmResolveNativeMethod 	
75 	libdvm.so 	dvmAsmSisterStart 	
76 	libdvm.so 	dvmMterpStd 	
77 	libdvm.so 	dvmInterpret 	
78 	libdvm.so 	dvmCallMethodV 	
79 	libdvm.so 	dvmCallMethod 	
80 	libdvm.so 	dvmDetachCurrentThread 	
81 	libc.so 	__thread_entry 	
82 	libc.so 	pthread_create
Comment 2 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-20 09:27:29 PDT
Starting from the 27th: does not crash for the 9/27th build.
crash-stats indicates that there was a crash for 10/05th build.

28th is when the switch over from Fennec 9 and Fennec 10 occured.
9/28th does not crash; 10/05th build does crash.  Still need to find regression range between them.
Comment 3 Oleg Romashin (:romaxa) 2011-10-20 10:09:00 PDT
Could be bug 689045, but need to double check
Comment 4 Oleg Romashin (:romaxa) 2011-10-20 11:35:44 PDT
hmm, tested it on maemo with latest inbound, and don't see any crash, also tested by moving fennec to background (release layers tree) and still cannot reproduce
Comment 5 Oleg Romashin (:romaxa) 2011-10-20 13:17:23 PDT
Ok, got it reproducible..
We are leaking SurfaceDescriptors
Comment 6 Oleg Romashin (:romaxa) 2011-10-20 14:16:35 PDT
Ok, found where is the problem...
seems after each canvas update we do destroy layers tree...
#0  mozilla::layers::BasicShadowCanvasLayer::Disconnect (this=0x3af17510)
    at gfx/layers/basic/BasicLayers.cpp:3033
#1  0x3c2a4c24 in mozilla::layers::ShadowLayerParent::ActorDestroy (this=0x40652180, 
    why=mozilla::ipc::IProtocolManager<mozilla::ipc::RPCChannel::RPCListener>::Deletion)
    at gfx/layers/ipc/ShadowLayerParent.cpp:93
#2  0x3c17161c in mozilla::layers::PLayerParent::DestroySubtree (this=0x40652180, 
    why=mozilla::ipc::IProtocolManager<mozilla::ipc::RPCChannel::RPCListener>::Deletion)
    at obj-build/ipc/ipdl/PLayerParent.cpp:315

And that is calling Disconnect before destructor.

Each paint has next func calls order:
BasicShadowableCanvasLayer::BasicShadowableCanvasLayer(mozilla::layers::BasicShadowLayerManager*)::2580
BasicShadowableCanvasLayer::Paint(gfxContext*)::2666 Alloc Surf:[300,300]
BasicShadowCanvasLayer::BasicShadowCanvasLayer(mozilla::layers::BasicShadowLayerManager*)::3023
BasicShadowCanvasLayer::Swap(const mozilla::layers::CanvasSurface&, bool, mozilla::layers::CanvasSurface*)::3081 sz[300,300]->[0,0]
BasicShadowCanvasLayer::DestroyFrontBuffer()::3050 front:0
BasicShadowLayerManager::ForwardTransaction()::3357
BasicShadowableCanvasLayer::SetBackBuffer(const mozilla::layers::SurfaceDescriptor&)::2603 SetBack:[300,300]
BasicShadowableCanvasLayer::~BasicShadowableCanvasLayer()::2585
BasicShadowableCanvasLayer::DestroyBackBuffer()::2621 No Destroy Back:[300,300], type:0
BasicShadowCanvasLayer::Disconnect()::3035
BasicShadowCanvasLayer::DestroyFrontBuffer()::3046
BasicShadowCanvasLayer::~BasicShadowCanvasLayer()::3028
BasicShadowCanvasLayer::DestroyFrontBuffer()::3050 front:0


Before it was not reproducible because we were destroying front buffer from 
BasicShadowableCanvasLayer::~BasicShadowableCanvasLayer


IIRC cjones was saying that Disconnect happen only in extra cases... when everything is dying... but it seems not
Comment 7 Oleg Romashin (:romaxa) 2011-10-20 14:20:35 PDT
Created attachment 568517 [details] [diff] [review]
Fix proposal, for canvas buffer leak, assume that disconnect call is expected in normal situation
Comment 8 Oleg Romashin (:romaxa) 2011-10-20 14:23:33 PDT
another question is why we destroy CanvasLayer after each paint (clock tik) here?
Comment 9 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-20 15:21:57 PDT
I don't know. Is it only this testcase?
Comment 10 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-10-20 15:30:36 PDT
On the compositor side (Shadow*Layer), Disconnect() is normal.  On the content side (Shadowable*Layer), Disconnect() is not normal.  Should be documented in BasicLayers.cpp.
Comment 11 Oleg Romashin (:romaxa) 2011-10-20 15:47:16 PDT
I tried some other clock example and same happening there
http://www.neilwallis.com/projects/html5/clock/index.php
Comment 12 Oleg Romashin (:romaxa) 2011-10-20 15:52:17 PDT
Created attachment 568550 [details] [diff] [review]
Fix proposal, for canvas and other shadow layers buffer leak

Ok, then I guess we should also destroy Front buffers for all Shadow disconnects.
Comment 13 Cristian Nicolae (:xti) 2011-10-21 03:05:31 PDT
This issue doesn't occur on:
Build ID : Mozilla /5.0 (Android;Linux armv7l;rv:10.0a1) Gecko/20110929 Firefox/10.0a1 Fennec/10.0a1 
http://hg.mozilla.org/mozilla-central/rev/cb4b93331e4f

but it occurs on:
Build ID : Mozilla /5.0 (Android;Linux armv7l;rv:10.0a1) Gecko/20110930 Firefox/10.0a1 Fennec/10.0a1
http://hg.mozilla.org/mozilla-central/rev/b5b082d183d0

Possible range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cb4b93331e4f&tochange=b5b082d183d0
Comment 14 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-10-25 18:41:41 PDT
Comment on attachment 568550 [details] [diff] [review]
Fix proposal, for canvas and other shadow layers buffer leak

>   virtual ~BasicShadowThebesLayer()
>   {
>     // If Disconnect() wasn't called on us, then we assume that the
>     // remote side shut down and IPC is disconnected, so we let IPDL
>     // clean up our front surface Shmem.
>+    DestroyFrontBuffer();

As the comment says, Disconnect() is the normal path.  If we get to
the dtor with a valid buffer, then it's because the content process
crashed and it's somewhat dangerous to try to free the buffer here.
IPDL will do it automatically as part of the cleanup path.  As the
comment says.  Remove the DestroyFrontBuffer() call here.

>@@ -2900,32 +2901,34 @@ class BasicShadowImageLayer : public Sha
> public:
>   BasicShadowImageLayer(BasicShadowLayerManager* aLayerManager) :
>     ShadowImageLayer(aLayerManager, static_cast<BasicImplData*>(this))
>   {
>     MOZ_COUNT_CTOR(BasicShadowImageLayer);
>   }
>   virtual ~BasicShadowImageLayer()
>   {
>+    DestroyFrontBuffer();

Remote this too.

>   virtual void DestroyFrontBuffer()
>   {
>     if (mAllocator && IsSurfaceDescriptorValid(mFrontBuffer)) {
>       mAllocator->DestroySharedSurface(&mFrontBuffer);
>+      mFrontBuffer = SurfaceDescriptor();

This is unnecessary; DestroySharedSurface() "nulls" this descriptor.

>   virtual void DestroyFrontBuffer()
>   {
>     if (IsSurfaceDescriptorValid(mFrontSurface)) {
>       mAllocator->DestroySharedSurface(&mFrontSurface);
>+      mFrontSurface = SurfaceDescriptor();

Same here.
Comment 15 Oleg Romashin (:romaxa) 2011-10-25 19:26:19 PDT
Created attachment 569589 [details] [diff] [review]
Fix for canvas and other shadow layers buffer leak
Comment 16 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-10-25 19:34:24 PDT
Comment on attachment 569589 [details] [diff] [review]
Fix for canvas and other shadow layers buffer leak

>   virtual void DestroyFrontBuffer()
>   {
>     if (mAllocator && IsSurfaceDescriptorValid(mFrontBuffer)) {
>       mAllocator->DestroySharedSurface(&mFrontBuffer);
>+      mFrontBuffer = SurfaceDescriptor();

This isn't necessary.

r=me with nitfix.
Comment 17 Oleg Romashin (:romaxa) 2011-10-25 20:05:51 PDT
Created attachment 569596 [details] [diff] [review]
Fix for canvas and other shadow layers buffer leak. to push

Fixed comment
Comment 19 Ed Morley [:emorley] 2011-10-28 04:41:14 PDT
https://hg.mozilla.org/mozilla-central/rev/e2aac74cb870
Comment 20 Cristian Nicolae (:xti) 2011-10-31 05:29:03 PDT
Verified fixed on:
Mozilla/5.0 (Android;Linux armv7l;rv:10.0a1)Gecko/20111031
Firefox/10.0a1 Fennec/10.0a1
Devices: LG Optimus 2X
OS: Android 2.2
Comment 21 Cristian Nicolae (:xti) 2011-10-31 09:44:57 PDT
*** Bug 698462 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.