Closed
Bug 664510
Opened 12 years ago
Closed 12 years ago
Get valid crashreporter reports again
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: azakai, Assigned: ted)
References
Details
(Whiteboard: [mobile_dev_needed][android_tier_1])
We have some bugs where we crash on talos. With crashreporter working there, we could get useful data which otherwise is extremely difficult to acquire.
Reporter
Comment 1•12 years ago
To clarify, the issue is that our crash reports have no useful information for symbols in the OS libs (like libc). If we build our own libc etc. we can do that with debug symbols.
Reporter
Comment 2•12 years ago
Ignore comments 0 and 1. This is a broader issue than it seemed. We are getting corrupt stack traces basically all the time apparently. Stacks only show libc, and sometimes other system libraries, but never our own code. This happens both in crashreporter - almost all the recent top crashes are like this now - and when manually running a debugger on device. Example: https://crash-stats.mozilla.com/report/index/555d3584-6131-4aec-a9fc-8d4dc2110609 This seems to be a recent regression.
Summary: Get crashreporter reports from talos runs → Get valid crashreporter reports again
Comment 3•12 years ago
Any update on this? I assumed this would be fixed by now, since it is a regression and not getting proper crash reports is a pretty serious problem.
Comment 4•12 years ago
I do not believe that crash reports are broken in general. I tested crashing both content and chrome processes on the June 24 nightly, and they showed up with perfect stacks on socorro.
Comment 5•12 years ago
(In reply to comment #1)
> To clarify, the issue is that our crash reports have no useful information
> for symbols in the OS libs (like libc). If we build our own libc etc. we can
> do that with debug symbols.
Ted wanted to grab the symbols from the system libraries with an extension; unfortunately, our dynamic linker is broken and doesn't permit that (bug 647288).
Reporter
Comment 6•12 years ago
(In reply to comment #4)
> I do not believe that crash reports are broken in general. I tested crashing
> both content and chrome processes on the June 24 nightly, and they showed up
> with perfect stacks on socorro.
It isn't 100% broken, but we saw this both when debugging recently in bug 662936, and in most of the recent top crashers.
Comment 7•12 years ago
Who is working on this bug, and is there an ETA for fixing it? My understanding is that nobody is looking at bug 662936 until this is resolved.
Reporter
Comment 8•12 years ago
dougt may have already found part of the stack trace issue in general (something with the library loader), but I don't think it can explain problems with crashreporter (which doesn't depend on the library loader AFAIK). I can't make a guess as to ETA, but we are doing our best. I am also working on bug 662936 in parallel, on some ideas that do not depend on this bug.
Reporter
Comment 9•12 years ago
ted and jdm inform me on IRC that the issue is that we are 'stuck' inside libc calls, without a way to get a proper stack trace from there. Relevant bugs are bug 668210 and bug 644707. The latter bug has a potential partial solution; I will look into that. In general, though, there might not be a way to fix this for all cases.
Assignee
Comment 10•12 years ago
Specifically, these are almost certainly calls to libc!abort(). Since we don't have debug symbols for system libraries on crash-stats, and there's no frame pointer on ARM, the stack walker just scans the stack looking for possible return addresses. Clearly it gets lost and wanders off into the weeds. Bug 644707 would probably fix the abort() case; bug 668210 is a bit more work, but would help fix the general case by crowdsourcing symbol data. It would also get us function names for system library stack frames, which would be nice.
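To make the stack-scanning behavior described in this comment concrete, here is a minimal Python sketch. It is not the crash reporter's actual stack walker: the module names, address ranges, stack contents, and the helpers `find_module` and `scan_stack` are all invented for illustration. It only shows why a scan that accepts any stack word pointing into a loaded module can report a stale libc address as a frame.

```python
from typing import List, NamedTuple, Optional


class Module(NamedTuple):
    name: str
    start: int  # first address of the module's mapping (made-up value)
    end: int    # one past the last address of the mapping


def find_module(addr: int, modules: List[Module]) -> Optional[Module]:
    """Return the module whose address range contains addr, if any."""
    for m in modules:
        if m.start <= addr < m.end:
            return m
    return None


def scan_stack(stack_words: List[int], modules: List[Module]) -> List[str]:
    """Treat every stack word that points into a module as a possible return address.

    Without frame pointers or unwind info there is nothing to confirm
    that a candidate is a real return address, so stale values pass too.
    """
    frames = []
    for word in stack_words:
        m = find_module(word, modules)
        if m is not None:
            frames.append(f"{m.name}@{word - m.start:#x}")
    return frames


# Hypothetical memory layout and stack contents.
modules = [
    Module("libc.so", 0xAFD00000, 0xAFD40000),
    Module("libxul.so", 0x80000000, 0x81000000),
]
stack = [
    0x00000010,  # plain data, ignored
    0xAFD1C2A4,  # stale value left over from an earlier libc call
    0xBEFFF000,  # pointer into stack memory, ignored
    0x8012F3C8,  # genuine return address into libxul.so
]

frames = scan_stack(stack, modules)
print(frames)
```

In this toy case the stale libc value is reported ahead of the genuine libxul.so frame, which is the kind of misleading libc-only stack described above. Symbol and unwind data for system libraries would let a walker validate candidates instead of guessing.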
Assignee
Comment 12•12 years ago
bug 644707 should have fixed the majority of these.
Depends on: 644707
(In reply to comment #12)
> bug 644707 should have fixed the majority of these.
Was that only pushed to Nightly? This won't resolve Aurora or Beta crashes, would it?
Reporter
Comment 14•12 years ago
Only Nightly so far. If we see an improvement in Nightly crash report quality, and no new problems due to this patch, then we should ask for this to be in Aurora and Beta.
Updated•12 years ago
Whiteboard: [mobile_dev_needed]
Updated•12 years ago
Whiteboard: [mobile_dev_needed] → [mobile_dev_needed][android_tier_1]
Still seeing issues with crash reports:
https://crash-stats.mozilla.com/report/index/c2b7b34c-b663-4a18-93e8-d69272110812
https://crash-stats.mozilla.com/report/index/d736a1f9-f20c-45c6-88e9-bc01c2110809
https://crash-stats.mozilla.com/report/index/a6c24e71-d85d-4bd2-91b1-324602110809
Comment 16•12 years ago
I think that's unavoidable for any stack that dies inside of libc.so at this point.
Assignee
Comment 17•12 years ago
I'll try to get bug 668210 revived, that may be our only hope.
Updated•12 years ago
Assignee: nobody → ted.mielczarek
Comment 18•12 years ago
(In reply to Ted Mielczarek [:ted, :luser] from comment #17)
> I'll try to get bug 668210 revived, that may be our only hope.
No update here in over 3 weeks on this android_tier_1 bug. Any progress?
Assignee
Comment 19•12 years ago
Sorry, I've been working on bug 668210 but I apparently failed to update bugzilla. I have a working extension there; we'll need to get it installed on a variety of devices to get useful symbols into crash-stats.
Assignee
Comment 20•12 years ago
Preliminary results are encouraging. Looking at the last 4 hours of crash reports:
https://crash-stats.mozilla.com/query/query?product=Fennec&version=ALL%3AALL&range_value=4&range_unit=hours&date=09%2F08%2F2011+06%3A41%3A09&query_search=signature&query_type=contains&query=&reason=&build_id=&process_type=any&hang_type=any&do_query=1
The #1 topcrash is __libc_android_abort, which is probably a rollup of all those distinct libc.so@xxx crashes. We'll probably need to skiplist that signature to get more distinct crash reports out of it, but it looks like it's having a positive effect.
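As a rough illustration of what skiplisting a signature could mean here, the Python sketch below builds a signature from a list of frames and keeps appending frames for as long as the current one is on a skiplist. This is an assumption about the general idea, not the crash-stats implementation: `SKIPLIST`, `make_signature`, the " | " separator, and every frame name other than `__libc_android_abort` are hypothetical.

```python
from typing import Iterable, List

# Hypothetical skiplist: frames too generic to identify a crash on their own.
SKIPLIST = {"__libc_android_abort", "abort"}


def make_signature(frames: List[str], skiplist: Iterable[str] = SKIPLIST) -> str:
    """Join frames from the top of the stack until one is not on the skiplist."""
    skip = set(skiplist)
    parts = []
    for frame in frames:
        parts.append(frame)
        if frame not in skip:
            break  # first distinguishing frame reached
    return " | ".join(parts)


# Two made-up crashes that both end in abort() but come from different callers.
crash_a = ["__libc_android_abort", "abort", "example_caller_one", "main"]
crash_b = ["__libc_android_abort", "abort", "example_caller_two", "main"]

# Without a skiplist both crashes collapse into a single signature.
print(make_signature(crash_a, skiplist=()))
print(make_signature(crash_b, skiplist=()))

# With the skiplist, the signatures extend to the first distinguishing frame.
print(make_signature(crash_a))
print(make_signature(crash_b))
```

With an empty skiplist both crashes share one signature, matching the rollup effect noted in this comment; with the skiplist each crash gets a signature that reaches its own caller.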
Assignee
Comment 21•12 years ago
Filed bug 685888 on skiplisting that signature.
Assignee
Comment 22•12 years ago
I've done all I can do here, and I think the situation has improved. It's not perfect, but I don't think it ever will be.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED