Closed Bug 1046084 Opened 10 years ago Closed 10 years ago

content process crashes during IPC

Categories

(Core :: DOM: Content Processes, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

RESOLVED DUPLICATE of bug 1054929
blocking-b2g 2.0+
Tracking Status
b2g-v1.3T --- affected
b2g-v1.4 --- affected
b2g-v2.0 --- affected

People

(Reporter: tkundu, Assigned: khuey)

Details

(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932][sprd296788][sprd337499])

Attachments

(5 files)

Attached file stacktrace.txt
Test steps:
1. Run a script with call, SMS, camera, camcorder, music, and video test cases.
2. After an overnight run, minidumps are generated on the phone.
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.0?
Crash Signature: [@ mozalloc_abort(char const*) | NS_DebugBreak | mozilla::dom::ContentChild::ProcessingError(mozilla::ipc::HasResultCodes::Result) ]
Whiteboard: [CR 687932] → [CR 687932][CR 687932]
Whiteboard: [CR 687932][CR 687932] → [caf priority: p1][CR 687932][CR 687932]
Whiteboard: [caf priority: p1][CR 687932][CR 687932] → [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932]
Keywords: crash
Observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.043
Moz BuildID: 20140721000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cb1a949f2e9650bb2c5598e78a6f24a58bbaf97
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=5f27d3ee3ccf01ac91a3efacb5e3e22ea62fd73c
Component: Gaia::Homescreen → DOM: Content Processes
Product: Firefox OS → Core
So we got MsgRouteError.  What does that mean?
Flags: needinfo?(bent.mozilla)
The main process is still alive.  It's not clear from the log which app this is though.
blocking-b2g: 2.0? → 2.0+
Summary: b2g process crashes during IPC → content process crashes during IPC
We discussed this offline.
Flags: needinfo?(bent.mozilla)
Kyle, would you be the right assignee here?
Flags: needinfo?(khuey)
I can write a patch that enables the logging that we'll need to go further here but we'll need more data to actually solve this.
Flags: needinfo?(khuey)
Blocks: 961343
Whiteboard: [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932] → [b2g-crash][caf-crash 4][caf priority: p1][CR 687932][CR 687932][sprd296788][sprd337499]
Flags: needinfo?(ryang)
Passing this to kyle to get the debug patch which may give us more info here.
Assignee: nobody → khuey
Tracked by bug 961343. We need Spreadtrum to get a log with IPC logging enabled. Thanks!
Flags: needinfo?(ryang)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> I can write a patch that enables the logging that we'll need to go further
> here but we'll need more data to actually solve this.

Kyle, does CAF need to cherry-pick the patch in bug 961343, which may help with additional logging to get more data here?
Flags: needinfo?(khuey)
No, they don't need that patch.  I'll prepare what they need.
Flags: needinfo?(khuey)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #12)
> No, they don't need that patch.  I'll prepare what they need.

:khuey, if we can help with a patch ASAP, CAF will be able to take it into their stability test run that kicks off tonight. Thanks for the help!
Attached patch Diagnostic Patch
Tapas, if you test with this it will dump logging messages for IPC to logcat.  If you can reproduce the crash and provide the logcat around the time of the crash we can see which IPC protocol is causing the crash and we can either audit the relevant code or add more targeted logging to find the root cause of this issue.
Flags: needinfo?(tkundu)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #14)
> Created attachment 8469546 [details] [diff] [review]
> Diagnostic Patch
> 
> Tapas, if you test with this it will dump logging messages for IPC to
> logcat.  If you can reproduce the crash and provide the logcat around the
> time of the crash we can see which IPC protocol is causing the crash and we
> can either audit the relevant code or add more targeted logging to find the
> root cause of this issue.

We will run with this patch and confirm soon.
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.058
Moz BuildID: 20140807000201
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cc28fd31905a0ea2b2e15d13e80a0eab2feb1ba
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f7bd772b1e42774708a4ede13b149a1706a59b25
comment #16 is invalid here and CAF is still not seeing this after the diagnostic patch landed in comment #34
Hi Bhavna, I think comment #16 is valid.  Since you mention comment #34, maybe you intended to comment on another bug?
Oh, talked to Tapas.  AU58 doesn't have the patch since he added it only to the engineering build.  We have yet to reproduce it with the engineering build but will continue trying.
(In reply to bhavana bajaj [:bajaj] from comment #17)
> comment #16 is invalid here and CAF is still not seeing this after the
> diagnostic patch landed in comment #34

err, meant comment #14 here
So did we see this or not?  And if we did, was it in a build with the patch from comment 14?
We still don't have new logs on this. It's not reproducing in our setup with additional logging enabled.
QA Whiteboard: [2.0-signoff-need+]
Not enough steps are present to create a test case to address this bug.
QA Whiteboard: [2.0-signoff-need+] → [2.0-signoff-need+][QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(rmitchell)
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage?] → [2.0-signoff-need+][QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(rmitchell)
Flags: in-moztrap-
Not reproducing anymore, closing. Will reopen if it comes back again.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(tkundu)
Resolution: --- → WORKSFORME
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.066
Moz BuildID: 20140810160202
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2b27becae85092d46bfadcd4fb5605e82e1e1093
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
needz logz
Flags: needinfo?(ggrisco)
Attached file decoded minidump
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #26)
> needz logz

I just had cafbot upload the latest minidump/extra combo.  The extra file contains some portion of the log.
Flags: needinfo?(ggrisco)
These logs do not appear to have been taken with the provided patch.  There's no 'GeckoIPC' entries in the log.  So we still don't have anything to go on here.
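A quick way to check whether a captured log was taken with the diagnostic patch is to look for lines carrying the 'GeckoIPC' tag mentioned above. The following is only a minimal sketch: the helper names and the sample log lines are made up for illustration, and it assumes the tag appears as plain text in each relevant logcat line.

```python
def tagged_lines(log_text, tag="GeckoIPC"):
    """Return the log lines that mention the given tag."""
    return [line for line in log_text.splitlines() if tag in line]


def captured_with_ipc_logging(log_text, tag="GeckoIPC"):
    """True if at least one line in the log carries the tag."""
    return bool(tagged_lines(log_text, tag))


# Made-up log excerpts, for illustration only (not taken from this bug).
log_without_patch = (
    "I/Gecko   (  123): some unrelated message\n"
    "E/Other   (  456): another line\n"
)
log_with_patch = log_without_patch + "I/GeckoIPC(  123): hypothetical IPC log entry\n"

print(captured_with_ipc_logging(log_without_patch))
print(captured_with_ipc_logging(log_with_patch))
```

Running a check like this over the log portion in the .extra file before uploading would show immediately whether the build under test included the logging.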
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #30)
> These logs do not appear to have been taken with the provided patch. 
> There's no 'GeckoIPC' entries in the log.  So we still don't have anything
> to go on here.

I think we had applied the patch in a test build but weren't able to repro the crash with that build.
Tapas, do you think we should try another eng build with this patch?
Flags: needinfo?(tkundu)
So you're saying it reproduces without the logging but not with it?
Kyle -- that's what it seems like. But, we are trying to reproduce it with the logs again and see if we get lucky. Can you analyze the shared patch and see if anything could cause the issue to not reproduce by altering the timing or anything?
Flags: needinfo?(tkundu)
Yeah, the logging could change the timing.  We could probably come up with something more targeted that would be less likely to change the timing.
Since you cannot reproduce with the full logging turned on, please grab this patch: https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/301f3d899f83.  It does not execute any code until after the error has already occurred so it should not disturb the timing.

We will also need either a link to an exact Gecko cset that we can build ourselves or certain files from your build directory.  We have to match the values printed by that patch to a series of enums that are generated during the build to determine what subsystem is responsible for this error.  If you are unable to provide a link to the exact sources used to build the Gecko you reproduce this issue on, I can provide you with instructions for determining which files from the build we need.
Flags: needinfo?(ggrisco)
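The matching step described above, going from a printed numeric value back to the name of a build-generated enum entry, could be sketched roughly as follows. This is an illustration under stated assumptions: the enum text, its entry names, and the helper functions are all hypothetical, and the real generated files and the exact values the patch prints are not shown in this bug.

```python
import re


def parse_enum(enum_text):
    """Map numeric values to names for a simple C-style enum definition.

    Handles implicit incrementing values and explicit `Name = N` entries.
    """
    body = enum_text[enum_text.index("{") + 1 : enum_text.rindex("}")]
    body = re.sub(r"//[^\n]*", "", body)  # drop line comments
    names_by_value = {}
    value = 0
    for entry in body.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "=" in entry:
            name, raw = (part.strip() for part in entry.split("=", 1))
            value = int(raw, 0)
        else:
            name = entry
        names_by_value[value] = name
        value += 1
    return names_by_value


def resolve(names_by_value, printed_value):
    """Look up the enum name for a value taken from the log."""
    return names_by_value.get(printed_value, "<unknown>")


# Hypothetical enum, standing in for a header generated during the build.
SAMPLE_ENUM = """
enum ExampleStart {
    PExampleAStart,   // hypothetical entry
    PExampleBStart,
    PExampleCStart = 10,
    PExampleDStart,
};
"""

names = parse_enum(SAMPLE_ENUM)
print(resolve(names, 1))
print(resolve(names, 11))
```

Because the enum values depend on the exact sources, a lookup like this is only meaningful when run against the files from the same build that produced the log, which is why the exact cset or the build directory files are needed.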
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+][lead-review+]
Bug 1054929 covers an invalid fd close. That could be one possible cause.
Depends on: 1054929
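As general background on why an invalid fd close can break something unrelated such as an IPC channel: file descriptor numbers are reused, so closing a number that was already closed can close a newer descriptor that happens to have the same number. The sketch below is a generic POSIX illustration with made-up names; it is not the code from bug 1054929, and this bug does not state the exact mechanism.

```python
import errno
import os


def stale_close_demo():
    """Show how closing an fd twice can hit an unrelated, newer descriptor.

    Returns (reused, broken): whether the old fd number was handed out again,
    and whether the second close invalidated the new descriptor.
    """
    read_end, write_end = os.pipe()
    stale = write_end          # a copy of the number kept past its lifetime
    os.close(write_end)        # the legitimate close

    # POSIX hands out the lowest free fd number, so the new pipe
    # usually reuses the number that was just released.
    new_read, new_write = os.pipe()
    reused = stale in (new_read, new_write)

    broken = False
    if reused:
        os.close(stale)        # the invalid second close: it closes the NEW fd
        try:
            os.fstat(stale)
        except OSError as exc:
            broken = exc.errno == errno.EBADF

    # Clean up whatever is still open.
    for fd in (read_end, new_read, new_write):
        try:
            os.close(fd)
        except OSError:
            pass
    return reused, broken


reused, broken = stale_close_demo()
print(reused, broken)
```

Whether the number is reused depends on what else the process has open at that moment, which fits a crash that only shows up intermittently in long stability runs.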
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #36)
> You need to pick up 
> https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/d946233724d5 too.


We did pick up these two patches in an engineering build and are waiting a few days to see if it reproduces with the patches.
Flags: needinfo?(ggrisco)
No longer depends on: 1054929
Depends on: 1054929
Greg -- can you please cherry-pick the change from bug 1054929?
Flags: needinfo?(ggrisco)
Ok, the patch from bug 1054929 is now built into:
AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.074
Flags: needinfo?(ggrisco)
We didn't see this issue repeat on AU 74 since applying the patch from bug 1054929.
(In reply to Greg Grisco from comment #41)
> We didn't see this issue repeat on AU 74 since applying the patch from bug
> 1054929.

Can we close this one as a dupe of bug 1054929?
Flags: needinfo?(ggrisco)
Yes, I think so.  We have not seen this issue reproduce.  Thanks.
Flags: needinfo?(ggrisco)
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE