Closed Bug 925057 Opened 12 years ago Closed 11 years ago

Get gecko stack traces from crashed b2g processes

Categories

(Remote Protocol :: Marionette, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: overholt, Unassigned)

Details

(Whiteboard: [runner])

In Marionette logs for crashes such as bug 919569 I can see that the b2g process died:
ERROR - 09-23 15:00:59.432 45 45 F libc : Fatal signal 11 (SIGSEGV) at
ERROR - This usually indicates the B2G process has crashed
but I don't see a gecko stack trace which would make it easier to track down a root cause. Is it possible to get stack traces when a gecko process crashes?
We do get stack traces sometimes. I believe that when we don't, it indicates that the stack originates in a system library that we don't have symbols for. cc'ing ted and ahal, who might know more.
(In reply to comment #1)
> We do get stack traces sometimes, I believe when we don't it indicates the
> stack originates in a system library that we don't have symbols for. cc'ing
> ted and ahal who might know more.
We should still get a stack trace in that case, I believe; it would just not be symbolicated for those frames. The problem here is that we don't get any stacks whatsoever!
The first thing I would check is that we actually have Breakpad enabled. If this is on a branch from before bug 717538 landed, then Breakpad might not be on by default and would depend on the harness setting MOZ_CRASHREPORTER=1.
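As a rough sketch of what the harness side of this could look like: the helper name crash_reporter_env below is hypothetical and is not taken from b2gautomation.py; only the MOZ_CRASHREPORTER=1 setting comes from the comment above.

```python
import os


def crash_reporter_env(base_env=None):
    """Return a copy of the environment with Breakpad crash reporting requested.

    Hypothetical helper for illustration; a real harness may set this elsewhere.
    """
    # Copy so the caller's mapping (or os.environ) is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    # The variable named in the comment above; the value is a string, as
    # environment values must be.
    env["MOZ_CRASHREPORTER"] = "1"
    return env
```

The returned mapping could then be passed as the env argument when the harness launches the process, assuming the harness starts it through something like subprocess.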
Yep, it should be getting set:
http://mxr.mozilla.org/mozilla-central/source/build/mobile/b2gautomation.py#85
I can say I've definitely seen stack traces in the past, but it's possible there was a regression at some point.
For reference, bug 843296 and its dependents implemented crash reporting for b2g. Bug 866937 is about missing symbol information. The latter was pseudo-blocked on bug 807792, which is now finished. Maybe we should try fixing bug 866937 and see if that solves the cases :overholt is seeing (actually, overholt's crash is in libc, which is specifically called out in bug 866937).
If we're getting the F/libc message and Breakpad was enabled, it means that Breakpad failed somehow: before re-raising the crash signal, it restores the previous handler on failure and sets the handler to SIG_DFL on success. There might be an instance of this or something similar buried in the comments on bug 929128.
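To make that reasoning concrete, here is a small model of it in Python. This is not Breakpad's code: both function names are hypothetical, and it assumes (as the comment above implies) that the previously installed handler is the one that emits the F/libc "Fatal signal" line.

```python
import signal


def handler_before_reraise(dump_succeeded, previous_handler):
    """Model which handler is in place when the crash signal is re-raised."""
    # Success: the default action is installed, so the process just dies.
    # Failure: the previous handler is restored and runs on the re-raise.
    return signal.SIG_DFL if dump_succeeded else previous_handler


def dump_probably_failed(log_lines, breakpad_enabled):
    """Heuristic: an F/libc fatal-signal line with Breakpad on suggests failure."""
    saw_fatal = any(
        "libc" in line and "Fatal signal" in line for line in log_lines
    )
    return breakpad_enabled and saw_fatal
```

Fed the log line quoted in the original report, dump_probably_failed returns True when Breakpad is enabled, which matches the diagnosis in the comment above.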
Whiteboard: [runner]
I think bug 976120 fixed this issue. I propose we resolve WFM and re-open/create a new bug if we see more instances of missing stacks.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
Product: Testing → Remote Protocol