Closed Bug 637243 Opened 14 years ago Closed 14 years ago

Android crash stacks are completely busted

Categories

(Toolkit :: Crash Reporting, defect)

ARM
Android
defect
Not set
critical

RESOLVED FIXED
mozilla2.0
Tracking Status
blocking2.0 --- final+

People

(Reporter: jdm, Assigned: glandium)

Attachments

(1 file)

Case in point: crash report 02f20a8d-630c-49a6-a908-fe6242110227. I triggered a content crash with the latest crashme, but the stack does not reflect that at all.
I recall the stacks for 4.0b5pre being either completely or mostly bogus, so I'm doing some searching to try to narrow the window for when this started happening. I'm seeing bizarro stacks being reported at least as far back as Feb 2.
Ok, so it looks like we have little or no crash information for the period between 1/26 and 1/28, which is when we switched from 4.0b4pre to 4.0b5pre (and stopped collecting data until crash-stats was updated) and, coincidentally, the closest regression window that I can find. Every stack I look at for 4.0b4pre leading up to that switch looks fine; in 4.0b5pre, there's a single nsScriptSecurityManager::doGetObjectPrincipal crash which has an intelligible stack and comes from build 20110127162904. I think we should investigate what landed after that build went out.
Likely suspect: bug 628233 in which elfhack is enabled on Android. Sigh.
Blocks: 628233
Summary: Crash stacks are completely busted → Android crash stacks are completely busted
I took a look at random Firefox 4.0b12 crash reports for Linux, and they seem to be busted too :( (if someone could take a closer look to validate...) It could be a problem with either the crash symbols creation or the dumping process, because with gdb and standard debugging symbols, I get proper stacks.
blocking2.0: --- → ?
(In reply to comment #6)
> It could be a problem with either the crash symbols creation or the dumping
> process, because with gdb and standard debugging symbols, I get proper stacks.
Obviously not crash symbols creation, since it takes symbols from files *before* elfhack.
(In reply to comment #7)
> Obviously not crash symbols creation, since it takes symbols from files
> *before* elfhack.
However, note that .dbg files are taken out of non-elfhack'ed binaries and thus don't correspond to elfhack'ed binaries. This shouldn't be a problem, though, as they are normally not used for crash reports. The .sym files are fine, since the .text addresses are the same between elfhack'ed and non-elfhack'ed binaries.
If the function at the top of the stack is sensible, but the stack after that is screwed, then it's possible that whatever elfhack is doing is screwing up the CFI data present in the .sym files. We parse the DWARF CFI and use that info to find caller frames. jimb wrote all that code, but he has a pretty good writeup of it here: http://code.google.com/p/google-breakpad/wiki/SymbolFiles#STACK_CFI_records
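To illustrate how CFI rules are used to find a caller frame, here is a minimal sketch of evaluating the postfix expressions found in a STACK CFI record. It is not Breakpad's code. The rule string, register values, and memory contents are invented for the example, the names `eval_postfix`, `parse_rules`, and `caller_frame` are hypothetical, and treating literals as decimal is an assumption.

```python
def eval_postfix(expr, regs, memory):
    """Evaluate a postfix expression such as 'sp 8 +' or '.cfa 4 - ^'."""
    stack = []
    for tok in expr.split():
        if tok == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "-":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif tok == "^":
            # Dereference: read the word stored at the address on the stack.
            stack.append(memory[stack.pop()])
        elif tok in regs:
            stack.append(regs[tok])
        else:
            stack.append(int(tok))  # assumption: literals are decimal
    return stack.pop()


def parse_rules(rule_text):
    """Split 'name: expr name: expr ...' into a {name: expr} dict."""
    rules = {}
    name = None
    for tok in rule_text.split():
        if tok.endswith(":"):
            name = tok[:-1]
            rules[name] = []
        else:
            rules[name].append(tok)
    return {k: " ".join(v) for k, v in rules.items()}


def caller_frame(rule_text, regs, memory):
    """Return (cfa, return_address) for the frame described by the rules."""
    rules = parse_rules(rule_text)
    cfa = eval_postfix(rules[".cfa"], regs, memory)
    scope = dict(regs)
    scope[".cfa"] = cfa  # later rules may refer to the computed CFA
    ra = eval_postfix(rules[".ra"], scope, memory)
    return cfa, ra


# Hypothetical rule and machine state, for illustration only.
RULE = ".cfa: sp 8 + .ra: .cfa 4 - ^"
REGS = {"sp": 0x1000}
MEMORY = {0x1004: 0x80001234}
```

If the rules applied to a frame belong to the wrong function (for example because the program counter was attributed to the wrong module or offset), the recovered return address is garbage, and every frame after it is garbage too.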
bz kept pointing out bad stacks over the past few days: bug 635901, bug 636052, etc.
This is essential. Are we pretty sure that elfhack caused this? If so, we should just disable elfhack for this release and revisit later. And boy howdy, we should probably come up with some kind of unit test to make sure stackwalking works sanely. ted/glandium, which one of you wants to take this?
blocking2.0: ? → final+
glandium is on this, but I agree that we should just disable it for this release. A stackwalking unit test would be great, although probably a PITA to set up just because of the need to build all the Breakpad processor code.
Assignee: nobody → mh+mozilla
The culprit is the minidump writer, which only writes out one mapping for the non-data part of libxul.so, while there are now two with elfhack. Let's disable elfhack for now (which is what the patch does); we'll fix the minidump writer, and the .dbg file generation too, later.
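The effect of recording only one mapping can be sketched as follows. This is not the minidump writer's code. The maps text, addresses, and path are invented, the helper names are hypothetical, and treating non-writable mappings as "not data" is an assumption made for the example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text):
    """Parse /proc/<pid>/maps-style lines, skipping anonymous mappings."""
    mappings = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # no backing file
        start_s, end_s = fields[0].split("-")
        mappings.append(
            Mapping(int(start_s, 16), int(end_s, 16), fields[1], fields[5])
        )
    return mappings


def non_data(mappings, path):
    """Mappings of `path` that are not writable (assumed to be 'not data')."""
    return [m for m in mappings if m.path == path and "w" not in m.perms]


def first_mapping_only(mappings, path):
    """Module range if only the first non-data mapping is recorded."""
    found = non_data(mappings, path)
    return (found[0].start, found[0].end) if found else None


def merged_mappings(mappings, path):
    """Module range covering every non-data mapping of the file."""
    found = non_data(mappings, path)
    if not found:
        return None
    return (min(m.start for m in found), max(m.end for m in found))


def contains(rng, addr):
    return rng is not None and rng[0] <= addr < rng[1]


# Hypothetical layout: two non-data mappings followed by one data mapping.
SAMPLE_MAPS = """
80000000-80010000 r-xp 00000000 08:01 1234 /data/lib/libxul.so
80010000-81000000 r-xp 00010000 08:01 1234 /data/lib/libxul.so
81000000-81100000 rw-p 01000000 08:01 1234 /data/lib/libxul.so
"""
```

With this layout, a program counter such as 0x80500000 falls outside the range returned by `first_mapping_only` but inside the one returned by `merged_mappings`, so a processor given only the first range could not attribute that address to libxul.so.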
Attachment #515610 - Flags: review?(ted.mielczarek)
Comment on attachment 515610: Disable elfhack by default
Please land this ASAP.
Attachment #515610 - Flags: review?(ted.mielczarek) → review+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0
(In reply to comment #13)
> we'll see
> later to fix minidump writer, and to fix the .dbg files generation, too.
Is there a bug on file for that part?
(In reply to comment #16)
> (In reply to comment #13)
> > we'll see
> > later to fix minidump writer, and to fix the .dbg files generation, too.
>
> Is there a bug on file for that part?
bug 637316 and bug 637341.
tracking-fennec: ? → ---