content process crashes during IPC

RESOLVED DUPLICATE of bug 1054929

Status


Core
DOM: Content Processes
P1
blocker
RESOLVED DUPLICATE of bug 1054929
4 years ago
4 years ago

People

(Reporter: tkundu, Assigned: khuey)

Tracking

(Blocks: 1 bug, {crash})

unspecified
ARM
Gonk (Firefox OS)
crash
Points:
---
Bug Flags:
in-moztrap -

Firefox Tracking Flags

(blocking-b2g:2.0+, b2g-v1.3T affected, b2g-v1.4 affected, b2g-v2.0 affected)

Details

(Whiteboard: [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932][sprd296788][sprd337499], crash signature)

Attachments

(5 attachments)

Created attachment 8464642 [details]
stacktrace.txt

Test steps:
1. Run a script with call, SMS, camera, camcorder, music, and video test cases.
2. After an overnight run, minidumps are generated on the phone.
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.0?
Crash Signature: [@ mozalloc_abort(char const*) | NS_DebugBreak | mozilla::dom::ContentChild::ProcessingError(mozilla::ipc::HasResultCodes::Result) ]

Updated

4 years ago
Whiteboard: [CR 687932] → [CR 687932][CR 687932]

Updated

4 years ago
Whiteboard: [CR 687932][CR 687932] → [caf priority: p1][CR 687932][CR 687932]

Updated

4 years ago
Whiteboard: [caf priority: p1][CR 687932][CR 687932] → [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932]

Updated

4 years ago
Keywords: crash
Observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.043
Moz BuildID: 20140721000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cb1a949f2e9650bb2c5598e78a6f24a58bbaf97
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=5f27d3ee3ccf01ac91a3efacb5e3e22ea62fd73c

Updated

4 years ago
Component: Gaia::Homescreen → DOM: Content Processes
Product: Firefox OS → Core
So we got MsgRouteError.  What does that mean?
Flags: needinfo?(bent.mozilla)
The main process is still alive.  It's not clear from the log which app this is though.
blocking-b2g: 2.0? → 2.0+
Summary: b2g process crashes during IPC → content process crashes during IPC
We discussed this offline.
Flags: needinfo?(bent.mozilla)
Kyle, would you be the right assignee here?
Flags: needinfo?(khuey)
I can write a patch that enables the logging that we'll need to go further here but we'll need more data to actually solve this.
Flags: needinfo?(khuey)
Blocks: 961343
status-b2g-v1.3T: --- → affected
status-b2g-v1.4: --- → affected
status-b2g-v2.0: --- → affected
Whiteboard: [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932] → [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932][sprd296788][sprd337499]
Flags: needinfo?(ryang)
Passing this to kyle to get the debug patch which may give us more info here.
Assignee: nobody → khuey
Tracked by bug 961343. We need Spreadtrum to get a log with IPC logging enabled. Thanks!
Flags: needinfo?(ryang)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> I can write a patch that enables the logging that we'll need to go further
> here but we'll need more data to actually solve this.

Kyle, does CAF need to cherry-pick the patch in bug 961343, which may help with additional logging to get more data here?
Flags: needinfo?(khuey)
No, they don't need that patch.  I'll prepare what they need.
Flags: needinfo?(khuey)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #12)
> No, they don't need that patch.  I'll prepare what they need.

:khuey, if we can help with a patch ASAP, CAF will be able to take it in their stability test run that kicks off tonight. Thanks for the help!
Created attachment 8469546 [details] [diff] [review]
Diagnostic Patch

Tapas, if you test with this it will dump logging messages for IPC to logcat.  If you can reproduce the crash and provide the logcat around the time of the crash we can see which IPC protocol is causing the crash and we can either audit the relevant code or add more targeted logging to find the root cause of this issue.
Flags: needinfo?(tkundu)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #14)
> Created attachment 8469546 [details] [diff] [review]
> Diagnostic Patch
> 
> Tapas, if you test with this it will dump logging messages for IPC to
> logcat.  If you can reproduce the crash and provide the logcat around the
> time of the crash we can see which IPC protocol is causing the crash and we
> can either audit the relevant code or add more targeted logging to find the
> root cause of this issue.

We will run with this patch and confirm soon.
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.058
Moz BuildID: 20140807000201
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cc28fd31905a0ea2b2e15d13e80a0eab2feb1ba
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f7bd772b1e42774708a4ede13b149a1706a59b25
comment #16 is invalid here and CAF is still not seeing this after the diagnostic patch landed in comment #34

Comment 18

4 years ago
Hi Bhavna, I think comment #16 is valid.  Since you mention comment #34, maybe you intended to comment another bug?

Comment 19

4 years ago
Oh, talked to Tapas.  AU58 doesn't have the patch since he added it only to an engineering build.  We have yet to reproduce it with the engineering build but will continue trying.
(In reply to bhavana bajaj [:bajaj] from comment #17)
> comment #16 is invalid here and CAF is still not seeing this after the
> diagnostic patch landed in comment #34

err, meant comment #14 here
So did we see this or not?  And if we did, was it in a build with the patch from comment 14?

Comment 22

4 years ago
We still don't have new logs on this. It's not reproducing in our setup with additional logging enabled.

Updated

4 years ago
QA Whiteboard: [2.0-signoff-need+]

Comment 23

4 years ago
Not enough steps are present to create a test case for this bug.
QA Whiteboard: [2.0-signoff-need+] → [2.0-signoff-need+][QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(rmitchell)
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage?] → [2.0-signoff-need+][QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(rmitchell)
Flags: in-moztrap-

Comment 24

4 years ago
Not reproducing anymore, closing. Will reopen if it comes back again.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Flags: needinfo?(tkundu)
Resolution: --- → WORKSFORME
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.066
Moz BuildID: 20140810160202
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2b27becae85092d46bfadcd4fb5605e82e1e1093

Updated

4 years ago
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
needz logz
Flags: needinfo?(ggrisco)
Created attachment 8476897 [details]
EXTRA file attachment
Created attachment 8476898 [details]
decoded minidump

Comment 29

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #26)
> needz logz

I just had cafbot upload the latest minidump/extra combo.  The extra file contains some portion of the log.
Flags: needinfo?(ggrisco)
These logs do not appear to have been taken with the provided patch.  There's no 'GeckoIPC' entries in the log.  So we still don't have anything to go on here.
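
A quick way to check whether a captured log was taken with the diagnostic patch is to look for entries tagged 'GeckoIPC'. The following is a minimal sketch, assuming the log is in logcat's "brief" format (`<priority>/<tag>(<pid>): <message>`); the function names and the sample lines are hypothetical and are not taken from the attached logs.

```python
import re

# Assumption: logcat "brief" format, e.g. "I/SomeTag( 1234): message text".
LINE_RE = re.compile(
    r"^(?P<prio>[VDIWEF])/(?P<tag>[^(]+?)\s*\(\s*(?P<pid>\d+)\):\s?(?P<msg>.*)$"
)


def ipc_entries(log_text, tag="GeckoIPC"):
    """Return (pid, message) pairs for every line logged under `tag`."""
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("tag") == tag:
            entries.append((int(match.group("pid")), match.group("msg")))
    return entries


def has_diagnostic_logging(log_text):
    """True if the log contains at least one GeckoIPC entry."""
    return bool(ipc_entries(log_text))


# Hypothetical sample lines, for illustration only.
SAMPLE = (
    "I/GeckoIPC( 1234): hypothetical message\n"
    "E/Gecko   ( 1234): unrelated line\n"
)
```

If `has_diagnostic_logging` returns False for the log around the crash, the build most likely did not include the logging patch.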

Comment 31

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #30)
> These logs do not appear to have been taken with the provided patch. 
> There's no 'GeckoIPC' entries in the log.  So we still don't have anything
> to go on here.

I think we had applied the patch in a test build but weren't able to repro the crash with that build.
Tapas, do you think we should try another eng build with this patch?
Flags: needinfo?(tkundu)
So you're saying it reproduces without the logging but not with it?

Comment 33

4 years ago
Kyle -- that's what it seems like. But, we are trying to reproduce it with the logs again and see if we get lucky. Can you analyze the shared patch and see if anything could cause the issue to not reproduce by altering the timing or anything?
Flags: needinfo?(tkundu)
Yeah, the logging could change the timing.  We could probably come up with something more targeted that would be less likely to change the timing.
Since you cannot reproduce with the full logging turned on, please grab this patch: https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/301f3d899f83.  It does not execute any code until after the error has already occurred so it should not disturb the timing.

We will also need either a link to an exact Gecko cset that we can build ourselves or certain files from your build directory.  We have to match the values printed by that patch to a series of enums that are generated during the build to determine what subsystem is responsible for this error.  If you are unable to provide a link to the exact sources used to build the Gecko you reproduce this issue on, I can provide you with instructions on determining what files from the build we need.
Flags: needinfo?(ggrisco)
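
To illustrate why the exact sources or build files matter: the printed numbers only mean something when read against the enum from that same build. Below is a rough sketch of the lookup, assuming a plain C++ enum without explicit values; the header text, enum name, and enumerator names are all hypothetical and do not reflect the actual generated files.

```python
import re


def enum_names(header_text, enum_name):
    """Return the enumerator names of a plain C/C++ enum in declaration order."""
    match = re.search(
        r"enum\s+" + re.escape(enum_name) + r"\s*\{(.*?)\}", header_text, re.S
    )
    if match is None:
        raise ValueError("enum %s not found" % enum_name)
    # Strip // and /* */ comments before splitting on commas.
    body = re.sub(r"//[^\n]*|/\*.*?\*/", "", match.group(1), flags=re.S)
    names = []
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            # This sketch only handles implicitly numbered enumerators.
            raise ValueError("explicit enumerator values are not handled")
        names.append(item)
    return names


def lookup(header_text, enum_name, value):
    """Map a printed numeric value back to its enumerator name, or None."""
    names = enum_names(header_text, enum_name)
    return names[value] if 0 <= value < len(names) else None


# Hypothetical generated header, for illustration only.
HEADER = """
enum ExampleProtocolId {
    PFooMsgStart,
    PBarMsgStart,  // second protocol
    LastMsgIndex
};
"""
```

With this sample, `lookup(HEADER, "ExampleProtocolId", 1)` gives `"PBarMsgStart"`. The same number could map to a different name in a build whose generated enum differs, which is presumably why the matching cset is requested.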

Updated

4 years ago
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+][lead-review+]
Bug 1054929 is about closing an invalid fd. It could be one possible cause.
Depends on: 1054929
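
As general background on why a stray close can break unrelated code (this is not a description of what bug 1054929 actually changed): the OS normally hands out the lowest free descriptor number, so a second close through a stale number can close a descriptor that some other code has just opened. A small sketch, assuming a single-threaded process; the function name is hypothetical.

```python
import os


def stale_close_demo():
    """Show a stale fd number closing an unrelated, newer descriptor."""
    r1, w1 = os.pipe()
    os.close(r1)                 # legitimate close; the number r1 is now free

    r2, w2 = os.pipe()           # new pipe; typically reuses the number r1
    reused = (r2 == r1)

    try:
        os.close(r1)             # buggy second close through the stale number
    except OSError:
        pass                     # number was not reused, so the close just fails

    try:
        os.fstat(r2)             # is the new pipe's read end still open?
        still_open = True
    except OSError:
        still_open = False

    # Clean up whatever is still open.
    for fd in (w1, w2):
        os.close(fd)
    if still_open:
        os.close(r2)
    return reused, still_open
```

When the number is reused, the second pipe's read end is gone even though its owner never closed it, and the failure shows up far from the code that made the bad close.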

Comment 38

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #36)
> You need to pick up 
> https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/d946233724d5 too.


We did pick up these two patches in an engineering build and are waiting a few days to see if it reproduces with the patches.
Flags: needinfo?(ggrisco)
No longer depends on: 1054929
Depends on: 1054929

Comment 39

4 years ago
Greg -- can you please cherry-pick the change from bug 1054929.
Flags: needinfo?(ggrisco)

Comment 40

4 years ago
Ok, the patch from bug 1054929 is now built into:
AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.074
Flags: needinfo?(ggrisco)

Comment 41

4 years ago
We haven't seen this issue repeat on AU 74 since applying the patch from bug 1054929.
(In reply to Greg Grisco from comment #41)
> We haven't seen this issue repeat on AU 74 since applying the patch from bug
> 1054929.

Can we close this one as a dupe of bug 1054929?
Flags: needinfo?(ggrisco)

Comment 43

4 years ago
Yes, I think so.  We have not seen this issue reproduce.  Thanks.
Flags: needinfo?(ggrisco)
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1054929