Closed Bug 1017781 Opened 10 years ago Closed 10 years ago

B2G Shutdown crash in layers::ISurfaceAllocator::~ISurfaceAllocator

Categories

(Core :: Graphics: Layers, defect)

x86
macOS
defect
Not set
normal

RESOLVED DUPLICATE of bug 1016538

People

(Reporter: gwagner, Assigned: nical)

References

Details

For example: https://tbpl.mozilla.org/php/getParsedLog.php?id=40584221&tree=Pine&full=1#error0

16:01:37  WARNING -  PROCESS-CRASH | Shutdown | application crashed [@ mozilla::layers::ISurfaceAllocator::~ISurfaceAllocator]
16:01:37     INFO -  Crash dump filename: /tmp/tmprMoPPI/0bcb9110-8dba-f57c-46268d1b-22860c92.dmp
16:01:37     INFO -  Operating system: Android
16:01:37     INFO -                    0.0.0 Linux 2.6.29-00302-g586075d #31 Mon Feb 24 10:28:23 PST 2014 armv7l Android/full/generic:4.0.4.0.4.0.4/OPENMASTER/eng.cltbld.20140528.161950:eng/test-keys
16:01:37     INFO -  CPU: arm
16:01:37     INFO -       0 CPUs
16:01:37     INFO -  Crash reason:  SIGSEGV
16:01:37     INFO -  Crash address: 0x0
16:01:37     INFO -  Thread 0 (crashed)
16:01:37     INFO -   0  libxul.so!mozilla::layers::ISurfaceAllocator::~ISurfaceAllocator [ISurfaceAllocator.cpp:236aa6fb7a14 : 53 + 0x4]
16:01:37     INFO -       r4 = 0x446c1b20    r5 = 0x446c1b50    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f38c8    lr = 0x40b9d8e1    pc = 0x40b9d8e4
16:01:37     INFO -      Found by: given as instruction pointer in context
16:01:37     INFO -   1  libxul.so!mozilla::layers::CompositableForwarder::~CompositableForwarder [CompositableForwarder.h : 45 + 0x5]
16:01:37     INFO -       r4 = 0x446c1b20    r5 = 0x446c1b50    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f38d8    pc = 0x40b9fc5b
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   2  libxul.so!mozilla::layers::ShadowLayerForwarder::~ShadowLayerForwarder [ShadowLayers.cpp:236aa6fb7a14 : 183 + 0x5]
16:01:37     INFO -       r4 = 0x446c1b20    r5 = 0x446c1b80    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f38e8    pc = 0x40ba6e55
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   3  libxul.so!mozilla::layers::ShadowLayerForwarder::~ShadowLayerForwarder [ShadowLayers.cpp:236aa6fb7a14 : 183 + 0x3]
16:01:37     INFO -       r4 = 0x446c1b20    r5 = 0x446c1b24    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3900    pc = 0x40ba6e71
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   4  libxul.so!mozilla::AtomicRefCountedWithFinalize<mozilla::layers::ISurfaceAllocator>::Release() [AtomicRefCountedWithFinalize.h : 46 + 0x9]
16:01:37     INFO -       r4 = 0x446c1b20    r5 = 0x446c1b24    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3908    pc = 0x40b7b9bb
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   5  libxul.so!mozilla::layers::TextureChild::~TextureChild [RefPtr.h : 281 + 0x5]
16:01:37     INFO -       r4 = 0x461f1940    r5 = 0x00000000    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3920    pc = 0x40b83ebf
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   6  libxul.so!mozilla::layers::TextureChild::~TextureChild [TextureClient.cpp:236aa6fb7a14 : 80 + 0x5]
16:01:37     INFO -       r4 = 0x461f1940    r5 = 0x00000000    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3928    pc = 0x40b83ed5
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   7  libxul.so!mozilla::layers::TextureChild::Release() [TextureClient.cpp:236aa6fb7a14 : 82 + 0x7]
16:01:37     INFO -       r4 = 0x461f1940    r5 = 0x00000000    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3930    pc = 0x40b83e07
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   8  libxul.so!mozilla::layers::TextureClient::DestroyIPDLActor(mozilla::layers::PTextureChild*) [TextureClient.cpp:236aa6fb7a14 : 129 + 0x3]
16:01:37     INFO -       r4 = 0x00000001    r5 = 0x4022d4a4    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3948    pc = 0x40b83fab
16:01:37     INFO -      Found by: call frame info
16:01:37     INFO -   9  libxul.so!mozilla::layers::ImageBridgeChild::DeallocPTextureChild(mozilla::layers::PTextureChild*) [ImageBridgeChild.cpp:236aa6fb7a14 : 801 + 0x5]
16:01:37     INFO -       r4 = 0x4022d480    r5 = 0x4022d4a4    r6 = 0x00000000    r7 = 0x00000000
16:01:37     INFO -       r8 = 0x00000001    r9 = 0x00000000   r10 = 0x441cfae0    fp = 0x00000001
16:01:37     INFO -       sp = 0xbe9f3958    pc = 0x40b9ea59
Blocks: 999215
Slowly getting through the shutdown issues.  Bug 924622 was awesomely resolved, which took care of bug 986113, so this is now the relevant one to look at.
Assignee: nobody → nical.bugzilla
I updated gecko from https://github.com/mozilla/gecko-dev.git; my latest commit is d29716084b018ff6d0c2c38b8ea7c41bcc8bb820.
I then built a debug version and ran "./mach mochitest-remote layout/forms/test", and I get the same error. I ran it twice and hit the problem both times.

Here is the call stack.
#0  0x40b8a988 in ~ISurfaceAllocator (this=0x446f9580, __in_chrg=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/ipc/ISurfaceAllocator.cpp:53
#1  0x40b9a586 in ~CompositableForwarder (this=0x446f9580, __in_chrg=<value optimized out>)
    at ../../dist/include/mozilla/layers/CompositableForwarder.h:45
#2  0x40b9a5e6 in ~ShadowLayerForwarder (this=0x446f9580, __in_chrg=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/ipc/ShadowLayers.cpp:183
#3  0x40b9a604 in ~ShadowLayerForwarder (this=0x446f9580, __in_chrg=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/ipc/ShadowLayers.cpp:183
#4  0x40b7018c in mozilla::AtomicRefCountedWithFinalize<mozilla::layers::ISurfaceAllocator>::Release (this=0x446f9584)
    at ../../dist/include/mozilla/layers/AtomicRefCountedWithFinalize.h:65
#5  0x40b7023e in mozilla::RefPtr<mozilla::layers::CompositableForwarder>::unref (this=0x43ffeeb0, __in_chrg=<value optimized out>)
    at ../../dist/include/mozilla/RefPtr.h:301
#6  ~RefPtr (this=0x43ffeeb0, __in_chrg=<value optimized out>) at ../../dist/include/mozilla/RefPtr.h:242
#7  ~TextureChild (this=0x43ffeeb0, __in_chrg=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/client/TextureClient.cpp:83
#8  0x40b70254 in ~TextureChild (this=0x446f9580, __in_chrg=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/client/TextureClient.cpp:83
#9  0x40b6f164 in mozilla::layers::TextureChild::Release (this=0x43ffeeb0)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/client/TextureClient.cpp:85
#10 0x40b6f21a in mozilla::layers::TextureChild::ReleaseIPDLReference (actor=0x43ffeeb0)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/client/TextureClient.cpp:132
#11 mozilla::layers::TextureClient::DestroyIPDLActor (actor=0x43ffeeb0)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/client/TextureClient.cpp:171
#12 0x40b91ec8 in mozilla::layers::LayerTransactionChild::DeallocPTextureChild (this=<value optimized out>, actor=0x43ffeeb0)
    at /home/steven/workspace/b2g/gecko-dev/gfx/layers/ipc/LayerTransactionChild.cpp:145
#13 0x4091a956 in mozilla::layers::PLayerTransactionChild::RemoveManagee (this=0x446f9520, aProtocolId=<value optimized out>, 
    aListener=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/obj-emulator-debug/ipc/ipdl/PLayerTransactionChild.cpp:645
#14 0x409722ae in mozilla::layers::PTextureChild::OnMessageReceived (this=0x43ffeeb0, __msg=...)
    at /home/steven/workspace/b2g/gecko-dev/obj-emulator-debug/ipc/ipdl/PTextureChild.cpp:233
#15 0x40885e22 in mozilla::layers::PCompositorChild::OnMessageReceived (this=0x4408eff0, __msg=...)
    at /home/steven/workspace/b2g/gecko-dev/obj-emulator-debug/ipc/ipdl/PCompositorChild.cpp:698
#16 0x4085d6ea in mozilla::ipc::MessageChannel::DispatchAsyncMessage (this=0x4408f020, aMsg=...)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessageChannel.cpp:1152
#17 0x40863b36 in mozilla::ipc::MessageChannel::DispatchMessage (this=0x4408f020, aMsg=...)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessageChannel.cpp:1066
#18 0x40863c0c in mozilla::ipc::MessageChannel::OnMaybeDequeueOne (this=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessageChannel.cpp:1049
#19 0x40640832 in DispatchToMethod<FdWatcher, void (FdWatcher::*)()> (this=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/tuple.h:383
#20 RunnableMethod<FdWatcher, void (FdWatcher::*)(), Tuple0>::Run (this=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/task.h:307
#21 0x4085b0ea in mozilla::ipc::MessageChannel::RefCountedTask::Run (this=0x440601c0)
    at ../../dist/include/mozilla/ipc/MessageChannel.h:390
#22 mozilla::ipc::MessageChannel::DequeueTask::Run (this=0x440601c0) at ../../dist/include/mozilla/ipc/MessageChannel.h:407
#23 0x4084dd20 in MessageLoop::RunTask (this=0xbeae772c, task=0x440601c0)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:357
#24 0x40851352 in MessageLoop::DeferOrRunPendingTask (this=0x440601c0, pending_task=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:365
#25 0x40851fa4 in MessageLoop::DoWork (this=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:443
#26 0x4085dc38 in mozilla::ipc::DoWorkRunnable::Run (this=<value optimized out>)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessagePump.cpp:233
#27 0x40682f26 in nsThread::ProcessNextEvent (this=0x40244880, aMayWait=false, aResult=0xbeae6c6f)
    at /home/steven/workspace/b2g/gecko-dev/xpcom/threads/nsThread.cpp:766
#28 0x406308ac in NS_ProcessNextEvent (aThread=0x40244880, aMayWait=false)
    at /home/steven/workspace/b2g/gecko-dev/xpcom/glue/nsThreadUtils.cpp:256
#29 0x4085dd0a in mozilla::ipc::MessagePump::Run (this=0x40201bb0, aDelegate=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessagePump.cpp:99
#30 0x4085de92 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x40201bb0, aDelegate=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessagePump.cpp:302
#31 0x40850eaa in MessageLoop::RunInternal (this=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:229
#32 0x40850ec2 in MessageLoop::RunHandler (this=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:222
#33 MessageLoop::Run (this=0xbeae772c) at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:196
#34 0x40f80eee in nsBaseAppShell::Run (this=0x446136a0)
    at /home/steven/workspace/b2g/gecko-dev/widget/xpwidgets/nsBaseAppShell.cpp:164
#35 0x4192008e in XRE_RunAppShell () at /home/steven/workspace/b2g/gecko-dev/toolkit/xre/nsEmbedFunctions.cpp:693
#36 0x4085de9e in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x40201bb0, aDelegate=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/glue/MessagePump.cpp:272
#37 0x40850eaa in MessageLoop::RunInternal (this=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:229
#38 0x40850ec2 in MessageLoop::RunHandler (this=0xbeae772c)
    at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:222
#39 MessageLoop::Run (this=0xbeae772c) at /home/steven/workspace/b2g/gecko-dev/ipc/chromium/src/base/message_loop.cc:196
#40 0x41920922 in XRE_InitChildProcess (aArgc=2, aArgv=0xbeae7848, aProcess=3199105084)
    at /home/steven/workspace/b2g/gecko-dev/toolkit/xre/nsEmbedFunctions.cpp:530
#41 0x00008894 in main (argc=7, argv=0xbeae78c4) at /home/steven/workspace/b2g/gecko-dev/ipc/app/MozillaRuntimeMain.cpp:149
Notice that this is an assertion failure:

https://tbpl.mozilla.org/php/getParsedLog.php?id=43740325&tree=Try&full=1#error4

05:50:40     INFO -  07-14 12:41:55.552 F/MOZ_Assert(  753): Assertion failure: mUsedShmems.empty(), at ../../../gecko/gfx/layers/ipc/ISurfaceAllocator.cpp:53


I don't understand this assertion or the mUsedShmems that it's asserting should be empty at this point, but that was added recently, in

http://hg.mozilla.org/mozilla-central/rev/193df828b5a2

Bug 981315: Add ShmemSection and use it for gfxShmSharedReadLock. r=gal
Flags: needinfo?(bas)
Also note that the present crash is very different from the other shutdown crashes that we recently fixed. In particular, bug 924622 is fixed for good, but that doesn't help us here. The present bug is happening at the level of LayerTransactionParent and Textures, not at the level of PCompositor/PImageBridge toplevel protocol shutdown as were the other shutdown bugs that we recently looked into.
Depends on: 981315
(In reply to Benoit Jacob [:bjacob] from comment #4)
> Notice that this is an assertion failure:
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=43740325&tree=Try&full=1#error4
> 
> 05:50:40     INFO -  07-14 12:41:55.552 F/MOZ_Assert(  753): Assertion
> failure: mUsedShmems.empty(), at
> ../../../gecko/gfx/layers/ipc/ISurfaceAllocator.cpp:53
> 
> 
> I don't understand this assertion or the mUsedShmems that it's asserting
> should be empty at this point, but that was added recently, in
> 
> http://hg.mozilla.org/mozilla-central/rev/193df828b5a2
> 
> Bug 981315: Add ShmemSection and use it for gfxShmSharedReadLock. r=gal

This basically just checks whether all the tiles have been deallocated by the time the ISurfaceAllocator is shut down. A failure of this assertion indicates that something was not cleaned up properly at shutdown.
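To make the check concrete, here is a minimal sketch of the invariant, not the actual Gecko code. SurfaceAllocatorModel, ShmemModel, AllocSection, FreeSection and SafeToDestroy are hypothetical names invented for illustration; only mUsedShmems and the emptiness check come from the assertion message above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of one shared-memory segment carved into sections.
struct ShmemModel {
  int id;                    // illustrative handle, not a real Shmem
  std::size_t liveSections;  // sections handed out and not yet freed
};

// Hypothetical stand-in for the allocator; not the real ISurfaceAllocator.
class SurfaceAllocatorModel {
public:
  // Hands out a section. For simplicity every section gets its own segment.
  int AllocSection() {
    mUsedShmems.push_back(ShmemModel{mNextId, 1});
    return mNextId++;
  }

  // Returns a section; the segment is dropped once nothing in it is live.
  void FreeSection(int id) {
    auto it = std::find_if(mUsedShmems.begin(), mUsedShmems.end(),
                           [id](const ShmemModel& s) { return s.id == id; });
    if (it == mUsedShmems.end()) return;
    if (--it->liveSections == 0) mUsedShmems.erase(it);
  }

  // The condition the logged assertion checks at destruction time:
  // every section must have been returned already.
  bool SafeToDestroy() const { return mUsedShmems.empty(); }

private:
  std::vector<ShmemModel> mUsedShmems;
  int mNextId = 0;
};
```

In this model, destroying the allocator while SafeToDestroy() is still false corresponds to the crash in the log: some tile section was never returned.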
Flags: needinfo?(bas)
Well, it was added at a time when gfx ipc shutdown was a giant bag of undefined behavior and race conditions. Now it's in a more reasonable shape (since bug 774388 landed), but we're still getting this crash.
(In reply to Benoit Jacob [:bjacob] from comment #7)
> Well, it was added at a time when gfx ipc shutdown was a giant bag of
> undefined behavior and race conditions. Now it's in a more reasonable shape
> (since bug 774388 landed), but we're still getting this crash.

Someone will have to spend time figuring out -why- there are tiles still around when this is destructed :).
This might be addressed by Bug 1039883.
(In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #9)
> This might be addressed by Bug 1039883.

Sorry, I confirmed that Bug 1039883 does not address the problem :-(
What could be happening here is that TileBuffers are being destroyed asynchronously, and the assertion is triggered while the host side hasn't yet released all of the locks:

- (client) destroy TileContentClient
  -> ReadUnlock the client tiles, but some are still held by the host side, so their shmems aren't destroyed
...
- (client) destroy the ISurfaceAllocator (the assertion is hit here)
- (host) destroy TileContentHost
  -> ReadUnlock the host tiles; too late, though, since we already crashed on the assertion.
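The ordering problem above can be sketched as a small single-threaded simulation. This is only a model under stated assumptions: TileLockModel, AllocatorModel, ReadUnlock as a free function and AllocatorCleanAtDestruction are hypothetical names, and the real client and host run in different processes rather than in one function.

```cpp
#include <set>

// Hypothetical model of a tile's shared read lock. The backing shmem is
// released only when the last holder (client or host) unlocks.
struct TileLockModel {
  int shmemId;
  int readCount;
};

// Hypothetical model of the client-side allocator's bookkeeping.
struct AllocatorModel {
  std::set<int> usedShmems;
};

// Drops one reference; frees the shmem when nobody holds the lock anymore.
void ReadUnlock(TileLockModel& lock, AllocatorModel& alloc) {
  if (--lock.readCount == 0) {
    alloc.usedShmems.erase(lock.shmemId);
  }
}

// Returns whether usedShmems is empty at the point where the client
// destroys its allocator, for the two possible orderings.
bool AllocatorCleanAtDestruction(bool hostUnlocksBeforeClientTeardown) {
  AllocatorModel alloc;
  alloc.usedShmems.insert(1);
  TileLockModel lock{1, 2};  // one reference for the client, one for the host

  if (hostUnlocksBeforeClientTeardown) {
    ReadUnlock(lock, alloc);  // (host) releases its tiles in time
  }
  ReadUnlock(lock, alloc);    // (client) destroys its tiles

  // (client) the allocator is destroyed here; the assertion checks this.
  bool clean = alloc.usedShmems.empty();

  if (!hostUnlocksBeforeClientTeardown) {
    ReadUnlock(lock, alloc);  // (host) unlocks, but too late
  }
  return clean;
}
```

Calling AllocatorCleanAtDestruction(false) models the suspected crash ordering, while AllocatorCleanAtDestruction(true) models a shutdown in which the host releases its locks first.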
Blocks: 1039883
I confirmed the problem on a Flame running master, using the following STR:

[1] Show the homescreen.
[2] Start the "Settings" app.
[3] Push the home button. This moves to the cards view.
[4] Kill the Settings app by swiping its card.
(In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #12)
> I confirmed the problem by using master flame by the following STR
> 
> [1] Show homescreen
> [2] Start "Setting" app
> [3] push home button. By this, move to cards view.
> [4] Kill Setting app by swiping the Setting app's card.

This is with the Bug 1039883 fix applied.
See the discussion in bug 1039883 - we need that one in order to reduce the memory usage, and it can't land until this one is resolved somehow.
blocking-b2g: --- → 2.0+
(In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #13)
> (In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #12)
> > I confirmed the problem by using master flame by the following STR
> > 
> > [1] Show homescreen
> > [2] Start "Setting" app
> > [3] push home button. By this, move to cards view.
> > [4] Kill Setting app by swiping the Setting app's card.
> 
> It is with Bug 1039883 fix case.

About Bug 1039883: I found a problem in the patch. It frees TileHosts without unlocking them.
The other problem seems to be that when ISurfaceAllocator::~ISurfaceAllocator() is called, some gfxMemorySharedReadLocks are still alive in the b2g process.
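One common way to avoid freeing a tile without unlocking it is to tie the unlock to the tile's destructor. The following is a hedged sketch of that idea only, not the patch from Bug 1039883; ReadLockModel and TileHostModel are hypothetical names.

```cpp
#include <memory>
#include <utility>

// Hypothetical counted read lock shared between tile holders.
struct ReadLockModel {
  int readCount = 0;
  void ReadLock() { ++readCount; }
  void ReadUnlock() { --readCount; }
};

// Hypothetical tile host that always unlocks when it is destroyed, so a
// tile cannot be freed while still holding its lock.
class TileHostModel {
public:
  explicit TileHostModel(std::shared_ptr<ReadLockModel> lock)
      : mLock(std::move(lock)) {
    if (mLock) mLock->ReadLock();
  }

  ~TileHostModel() {
    if (mLock) mLock->ReadUnlock();
  }

  // Non-copyable, so each tile host accounts for exactly one lock reference.
  TileHostModel(const TileHostModel&) = delete;
  TileHostModel& operator=(const TileHostModel&) = delete;

private:
  std::shared_ptr<ReadLockModel> mLock;
};
```

With this shape, dropping a TileHostModel for any reason returns its lock reference, so the lock count seen by the other side reaches zero once all tiles are gone.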
(In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #13)
> (In reply to Sotaro Ikeda [:sotaro PTO July/25 - Aug/3] from comment #12)
> > I confirmed the problem by using master flame by the following STR
> > 
> > [1] Show homescreen
> > [2] Start "Setting" app
> > [3] push home button. By this, move to cards view.
> > [4] Kill Setting app by swiping the Setting app's card.
> 
> It is with Bug 1039883 fix case.

This was a problem in the Bug 1039883 patch. Since I fixed the patch, the problem seems to be gone on my Flame. If Bug 1039883 can be checked in, its dependency on this bug could be removed.
OK, this was cleared up as not blocking bug 1039883, thus it doesn't need to be 2.0+ or block anything else.
No longer blocks: CAF-v2.0-FC-metabug, 1039883
blocking-b2g: 2.0+ → ---
I have a fix for this in bug 1016538
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE