Crash in CompositorChild::ActorDestroy while receiving calls and SMS continuously

RESOLVED WORKSFORME

Status

Core
Graphics: Layers
--
critical
RESOLVED WORKSFORME
4 years ago
4 years ago

People

(Reporter: Greg Grisco, Assigned: sotaro)

Tracking

({crash})

unspecified
ARM
Gonk (Firefox OS)
crash
Points:
---

Firefox Tracking Flags

(blocking-b2g:2.0+)

Details

(Whiteboard: [caf-crash 48][caf priority: p2][b2g-crash][CR 584125], crash signature)

Attachments

(2 attachments)

(Reporter)

Description

4 years ago
[Blocking Requested - why for this release]:

1. Enable auto answer.
2. Receive MT call and MT SMS continuously.
3. After a few hours of running, minidumps are generated on the phone.

This is really a duplicate of bug 945992, but I am opening it for 2.0.

[@ mozalloc_abort(char const*) | NS_DebugBreak | mozilla::layers::CompositorChild::ActorDestroy(mozilla::ipc::IProtocolManager::ActorDestroyReason) | mozilla::layers::PCompositorChild::DestroySubtree(mozilla::ipc::IProtocolManager::ActorDestroyReason) ]
Component: IPC → Graphics: Layers
(Assignee)

Comment 1

4 years ago
Greg, why is this bug flagged for 2.0? Does the crash happen on b2g-2.0? If it is a dup of bug 945992, the crash seems to happen on b2g-v1.3.
(Reporter)

Comment 2

4 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #1)
> Greg, why is this bug flagged for 2.0? Does the crash happen on b2g-2.0? If
> it is a dup of bug 945992, the crash seems to happen on b2g-v1.3.

Hi Sotaro.  I'll upload logs in a second, but this bug is crashing on 2.0 now.
Created attachment 8476747 [details]
EXTRA file attachment
Created attachment 8476748 [details]
decoded minidump

Updated

4 years ago
Keywords: crash
Flags: needinfo?(nical.bugzilla)
Can we confirm why this is a CAF blocker?
(Assignee)

Comment 6

4 years ago
>AbortMessage=[Child 3653] ###!!! ABORT: ActorDestroy by IPC channel failure at CompositorChild: file ../../../../../../../../gecko/gfx/layers/ipc/CompositorChild.cpp, line 169

The log above was triggered by the following code. It only says that CompositorChild is being closed because of an IPC error. The log has no information about why the IPC error happened.

void
CompositorChild::ActorDestroy(ActorDestroyReason aWhy)
{
  MOZ_ASSERT(sCompositor == this);

#ifdef MOZ_B2G
  // Due to poor lifetime management of gralloc (and possibly shmems) we will
  // crash at some point in the future when we get destroyed due to abnormal
  // shutdown. Its better just to crash here. On desktop though, we have a chance
  // of recovering.
  if (aWhy == AbnormalShutdown) {
    NS_RUNTIMEABORT("ActorDestroy by IPC channel failure at CompositorChild");
  }
#endif
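
The fail-fast policy in the excerpt above can be sketched in isolation as follows. This is only an illustration, not the Gecko code: `DestroyReason`, `ShouldAbortOnDestroy`, and `OnActorDestroy` are hypothetical names, and the `MOZ_B2G` build flag is modeled as a runtime boolean so the decision can be exercised without killing the process.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the real ActorDestroyReason enum; only the two
// values needed for the illustration are modeled.
enum class DestroyReason { NormalShutdown, AbnormalShutdown };

// Decide whether teardown should abort immediately.
// `failFastPlatform` models the MOZ_B2G build flag as a runtime value
// (an assumption made for testability).
bool ShouldAbortOnDestroy(DestroyReason why, bool failFastPlatform) {
  return failFastPlatform && why == DestroyReason::AbnormalShutdown;
}

// Sketch of the teardown hook: abort with a message when the policy says so,
// otherwise fall through to normal cleanup.
void OnActorDestroy(DestroyReason why, bool failFastPlatform) {
  if (ShouldAbortOnDestroy(why, failFastPlatform)) {
    std::fprintf(stderr, "ABORT: ActorDestroy by IPC channel failure\n");
    std::abort();
  }
  // Normal cleanup would go here.
}
```

Note that the abort message only records that the channel failed; as with the real log, it carries nothing about what caused the failure.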
(Assignee)

Updated

4 years ago
Assignee: nobody → sotaro.ikeda.g
(Assignee)

Comment 7

4 years ago
Greg, can you re-test again with IPC log enabled?
Flags: needinfo?(ggrisco)
(Assignee)

Comment 8

4 years ago
To analyze the problem, we need a log captured with IPC logging enabled, together with the crash log.
(Reporter)

Comment 9

4 years ago
(In reply to Milan Sreckovic [:milan] from comment #5)
> Can we confirm why this is a CAF blocker?

Hi Milan, this is affecting our stability milestone for this release. We've seen this crash 3 times on the latest AU.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 10

4 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #7)
> Greg, can you re-test again with IPC log enabled?

Greg, is it possible?
Without a log captured with IPC logging enabled, it is not possible to investigate the problem.
Flags: needinfo?(ggrisco)

Updated

4 years ago
Whiteboard: [b2g-crash][CR 584125] → [caf priority: p2][b2g-crash][CR 584125]

Updated

4 years ago
Whiteboard: [caf priority: p2][b2g-crash][CR 584125] → [caf-crash 48][caf priority: p2][b2g-crash][CR 584125]
Observed on: 

Device: 
Gonk Version: AU_LINUX_GECKO_B2G_JB_3.2.01.03.00.112.255
Moz BuildID: 20140226004002
B2G Version: 2.0
Gecko Version: 28.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8039a5cb7519adfa81677df577f494c6a4de6140
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=7599b5ccc906f556076798fc062a9e51e2c0eece
The bot appears to be reporting this on 1.3 ...
(Reporter)

Comment 13

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #12)
> The bot appears to be reporting this on 1.3 ...

Yeah, right now the bot doesn't discriminate between versions.  We're tracking this locally as a single issue, so it will log here whenever it witnesses the crash on any version.  I'll see if we can improve this but it might take a little time.  In the meantime, I'll try to force the bot to comment on 2.0 version here.
Flags: needinfo?(ggrisco)
(Assignee)

Updated

4 years ago
Flags: needinfo?(ggrisco)
Observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.060
Moz BuildID: 20140810000201
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2b27becae85092d46bfadcd4fb5605e82e1e1093
(Assignee)

Comment 15

4 years ago
(In reply to cafbot (PoC: ggrisco) from comment #14)
> Observed on: 
> 
> Device: msm8226
> Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.060
> Moz BuildID: 20140810000201
> B2G Version: 2.0
> Gecko Version: 32.0
> Gaia: 
> http://git.mozilla.org/?p=releases/gaia.git;a=commit;
> h=de28796a8956a48bb98ca67df6a33e0622d642d1
> Gecko:
> http://git.mozilla.org/?p=releases/gecko.git;a=commit;
> h=2b27becae85092d46bfadcd4fb5605e82e1e1093

This gecko does not include Bug 1053204 fix.
(Reporter)

Comment 16

4 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #15)
> This gecko does not include Bug 1053204 fix.

Right.  According to bug 1053204, that fix should be built into AU63 and later.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 17

4 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
> (In reply to Sotaro Ikeda [:sotaro] from comment #7)
> > Greg, can you re-test again with IPC log enabled?
> 
> Greg, is it possible?
> Without a log captured with IPC logging enabled, it is not possible to
> investigate the problem.

Greg, Tapas, is it possible to re-test with Gecko IPC logging enabled?
Flags: needinfo?(tkundu)
Flags: needinfo?(ggrisco)
(In reply to Sotaro Ikeda [:sotaro] from comment #17)
> (In reply to Sotaro Ikeda [:sotaro] from comment #10)
> > (In reply to Sotaro Ikeda [:sotaro] from comment #7)
> > > Greg, can you re-test again with IPC log enabled?
> > 
> > Greg, is it possible?
> > Without a log captured with IPC logging enabled, it is not possible to
> > investigate the problem.
> 
> Greg, Tapas, is it possible to re-test with Gecko IPC logging enabled?

We asked our test folks to reproduce it with IPC logs. We will get back to you ASAP.
Is it OK if we enable only the minimal Gecko IPC traces from bug 1046084 comment 35, or do we still need the full IPC log here? I am asking again because our test team will try to reproduce both bug 1056893 and bug 1046084 on the same build.
Flags: needinfo?(tkundu)
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(khuey)
I don't think so.  But we might be able to do targeted logging here too.  Sotaro what information do you actually need here?
Flags: needinfo?(khuey)
(Assignee)

Comment 21

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #20)
> I don't think so.  But we might be able to do targeted logging here too. 
> Sotaro what information do you actually need here?

I have no idea what triggers the IPC error, so it would be nice if we could get a full Gecko IPC log. But the logging from bug 1046084 comment 35 seems OK to me for now.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Sotaro Ikeda [:sotaro] from comment #21)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #20)
> > I don't think so.  But we might be able to do targeted logging here too. 
> > Sotaro what information do you actually need here?
> 
> I have no idea what triggers the IPC error, so it would be nice if we
> could get a full Gecko IPC log. But the logging from bug 1046084 comment 35
> seems OK to me for now.

That code only logs errors that go through MessageChannel::MaybeHandleError.  It doesn't look like this does.
bent, can we end up in OnChannelError for something other than our pipe being screwed up?
Flags: needinfo?(bent.mozilla)
(Assignee)

Comment 24

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #22)
> 
> That code only logs errors that go through MessageChannel::MaybeHandleError.
> It doesn't look like this does.

Hmm, if so, we need a full Gecko IPC log.
(In reply to Sotaro Ikeda [:sotaro] from comment #15)
> (In reply to cafbot (PoC: ggrisco) from comment #14)
> > Observed on: 
> > 
> > Device: msm8226
> > Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.060
> > Moz BuildID: 20140810000201
> > B2G Version: 2.0
> > Gecko Version: 32.0
> > Gaia: 
> > http://git.mozilla.org/?p=releases/gaia.git;a=commit;
> > h=de28796a8956a48bb98ca67df6a33e0622d642d1
> > Gecko:
> > http://git.mozilla.org/?p=releases/gecko.git;a=commit;
> > h=2b27becae85092d46bfadcd4fb5605e82e1e1093
> 
> This gecko does not include Bug 1053204 fix.

Just wanted to be explicit that the build in this crash report filed by CAF does not include the Gecko fix from bug 1053204.

(If we think this is an independent issue with no connection to bug 1053204, we should continue investigating. They have picked up the patch from bug 1053204 in their AU63, which is currently undergoing testing.)
If we're randomly stomping on fds because bug 1053204 is not fixed yet that could explain this bug ...
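
To make the "stomping on fds" scenario concrete, here is a minimal sketch of how a stale second `close()` can silently close an unrelated descriptor. It assumes a POSIX system and a single-threaded process; `StaleCloseStompsNewFd` is a hypothetical name, and nothing here is taken from the patches in bug 1053204.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Returns true if a stale second close() of a descriptor number ended up
// closing an unrelated descriptor that had reused the same number.
bool StaleCloseStompsNewFd() {
  int pipeFds[2];
  if (pipe(pipeFds) != 0) return false;

  int staleFd = pipeFds[0];
  close(staleFd);  // First, legitimate close by the original owner.

  // dup() returns the lowest free descriptor number, so in a single-threaded
  // process it is expected to reuse the number that was just released.
  int newFd = dup(pipeFds[1]);
  bool reused = (newFd == staleFd);

  // Buggy second close through the stale number. If the number was reused,
  // this succeeds and closes the new owner's descriptor instead.
  int rc = close(staleFd);
  bool newFdNowInvalid = (fcntl(newFd, F_GETFD) == -1 && errno == EBADF);

  // Clean up whatever is still open.
  if (!reused && newFd >= 0) close(newFd);
  close(pipeFds[1]);
  return reused && rc == 0 && newFdNowInvalid;
}
```

In a multi-process IPC setup the victim could just as well be the channel's own pipe, in which case the failure surfaces far from the code that made the bad call.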
My recommendation would be to take the logging for bug 1046084 and not turn on full IPC logging.  If this is not fixed by bug 1053204 we at least know what subsystem it is in, whereas we don't even know where to start looking for bug 1046084.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #27)
> My recommendation would be to take the logging for bug 1046084 and not turn
> on full IPC logging.  If this is not fixed by bug 1053204 we at least know
> what subsystem it is in, whereas we don't even know where to start looking
> for bug 1046084.

Done. We are waiting for logs from the test team and will update ASAP.

Updated

4 years ago
blocking-b2g: 2.0? → 2.0+

Updated

4 years ago
Flags: needinfo?(nical.bugzilla)
(Assignee)

Comment 29

4 years ago
Bug 1054929 involved an invalid fd close. It could be one possible cause.
Depends on: 1054929
No longer depends on: 1054929
Depends on: 1054929
(Reporter)

Comment 30

4 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #29)
> Bug 1054929 involved an invalid fd close. It could be one possible cause.

We applied the patches from bug 1057220; maybe that will help catch an invalid fd close if that's the cause.
Flags: needinfo?(ggrisco)
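
For readers unfamiliar with the idea of catching an invalid fd close, a checked wrapper around `close()` might look roughly like this. It is a sketch under assumptions and not the content of the patches from bug 1057220: `CheckedClose` and its `site` parameter are hypothetical, and it relies only on the POSIX behavior that `close()` on a descriptor that is not open fails with `EBADF`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Hypothetical helper: close `fd` and report failures, tagging the report
// with a caller-supplied `site` string so the offending call can be located.
// Returns true only if the descriptor was actually closed.
bool CheckedClose(int fd, const char* site) {
  if (close(fd) == 0) return true;

  int err = errno;  // Save errno before any other call can change it.
  if (err == EBADF) {
    std::fprintf(stderr, "invalid fd close: fd=%d at %s\n", fd, site);
  } else {
    std::fprintf(stderr, "close failed: fd=%d at %s: %s\n", fd, site,
                 std::strerror(err));
  }
  return false;
}
```

A check like this only catches a second close while the number is still free; once another descriptor has reused the number, the bad close succeeds and goes unreported.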
(Reporter)

Comment 31

4 years ago
We're still not seeing this reproduce, closing.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME