Closed Bug 609166 Opened 14 years ago Closed 12 years ago

Crash [@ libc-2.5.so@0x2a548 ] on Maemo 5

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Maemo
defect
Not set
critical

Tracking

(fennec-)

RESOLVED WORKSFORME
Tracking Status
fennec - ---

People

(Reporter: ashah, Unassigned)

Details

(Keywords: crash, topcrash)

Crash Data

Mozilla/5.0(Maemo; Linux armv7l;rv2.0b8pre) Gecko/20101102 Firefox/4.0b8pre
Fennec/4.0b3pre

I don't know how I got into this crash. I hope we can get some information from the crash report.
The crash ID is as follows:

bp-cb3d732c-7a4c-414d-96e2-6cdf52101102
Summary: Crash → Crash@ libc-2.5.s0@0x2a548
Summary: Crash@ libc-2.5.s0@0x2a548 → Crash@ libc-2.5.so@0x2a548
tracking-fennec: --- → ?
Here are some questions that might help you figure out the crash:
1) Did you have sync running?
2) Were you logging into anything that required an SSL cert (such as the Mozilla Guest network)?
3) Were they going on concurrently (trying to log in at the same time as syncing)?
IRC convo with blassey:
blassey suspects that it may be an OOM issue.
There are not enough Mozilla symbols in the stack to know for certain.
tracking-fennec: ? → 2.0-
(In reply to comment #2)
> Here's some questions that might help you figure out the crash:
> 1) Did you have sync running?
I wasn't actively syncing, but yes, I was connected to Sync.
> 2) Were you logging into anything that required a SSL cert? (such as the
> Mozilla Guest network)?
Yes, I was on the Guest network.
> 3) Were they going on concurrently?  (trying to log in at the same time as
> syncing)
No.
It is the #13 top crasher in Fennec 4.0b3 over the last week.

More reports at:
http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=libc-2.5.so%400x2a548&product=Fennec
OS: Linux → Android
Summary: Crash@ libc-2.5.so@0x2a548 → Crash [@ libc-2.5.so@0x2a548 ]
Severity: normal → critical
OS: Android → Linux
OS: Linux → Maemo
As per discussion with crowder on IRC, this goes through ld-2.5.so code, so perhaps it is the lib-loading hack(s) that are actually causing this crash. Can someone please look into this?

It's the #4 top crash for Fennec 4.0b4pre.
(In reply to comment #6)
> As per discussion with crowder on IRC, this goes through ld-2.5.so code, so
> perhaps it might be the lib-loading hack(s) thats actually causing this crash.
> Can someone please look into this? 
> 
> Its #4 top crash for Fennec 4.0b4pre

The library loading hacks are on Android only.
There are abort()s for which we're not getting a full stack. This is likely a symptom of something going badly wrong and causing libxul to abort(). I'm not sure why we're not getting a full backtrace; it might be Breakpad's heuristics breaking down.
Since it has to crawl through a bunch of system library frames without symbols, I'm not surprised that it failed to get to a useful frame. Without symbols it resorts to stack scanning to recover anything, and it probably just failed to find anything via scanning at the end there.
Breakpad's stack scanning heuristics leave a lot to be desired. I have a local patch that improves them at the cost of increased (but filterable) garbage, if anyone cares to grab some of these dumps and have a go.
Sure, I'd like to see them.
OK. The patch is dumb:

Index: src/processor/stackwalker_x86.cc
===================================================================
--- src/processor/stackwalker_x86.cc	(revision 570)
+++ src/processor/stackwalker_x86.cc	(working copy)
@@ -576,10 +576,7 @@
 bool StackwalkerX86::ScanForReturnAddress(u_int32_t location_start,
                                           u_int32_t *location_found,
                                           u_int32_t *eip_found) {
-  const int kRASearchWords = 15;
-  for (u_int32_t location = location_start;
-       location <= location_start + kRASearchWords * 4;
-       location += 4) {
+  for (u_int32_t location = location_start; ; location += 4) {
     u_int32_t eip;
     if (!memory_->GetMemoryAtAddress(location, &eip))
       break;

It just has Breakpad walk until it runs out of addressable memory near the thread's stack. The motivation is that it's worse to miss a valid frame than to find a bunch of garbage ones. When I used this in the past, I filtered all the garbage manually because I had an approximate callgraph of the relevant code paged into my brain. More sophisticated, automated techniques are possible.

(This is a fairly old patch, maybe things have changed+improved in the meantime.)
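
To make the trade-off concrete, here is a minimal, self-contained sketch of the scanning loop with and without a word limit. It does not use Breakpad's real classes: `StackMemory`, `looks_like_code`, and `max_words` are hypothetical names introduced for illustration, and the test for "does this word look like a code address" is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the stack memory captured in a minidump.
struct StackMemory {
  uint32_t base;                // address of words[0]
  std::vector<uint32_t> words;  // consecutive 32-bit stack slots

  // Reads the word at `address`; fails if it is unaligned or out of range.
  bool Read(uint32_t address, uint32_t* value) const {
    if (address < base || (address - base) % 4 != 0) return false;
    const std::size_t index = (address - base) / 4;
    if (index >= words.size()) return false;
    *value = words[index];
    return true;
  }
};

// Scans upward from `location_start` for a word accepted by
// `looks_like_code`. `max_words` caps how many slots are examined;
// 0 means "no cap": keep going until the captured memory runs out.
bool ScanForReturnAddress(const StackMemory& stack,
                          uint32_t location_start,
                          std::size_t max_words,
                          const std::function<bool(uint32_t)>& looks_like_code,
                          uint32_t* location_found,
                          uint32_t* address_found) {
  assert(location_found != nullptr && address_found != nullptr);
  uint32_t location = location_start;
  for (std::size_t scanned = 0; max_words == 0 || scanned < max_words;
       ++scanned) {
    uint32_t word;
    if (!stack.Read(location, &word)) break;  // ran off the captured stack
    if (looks_like_code(word)) {
      *location_found = location;
      *address_found = word;
      return true;
    }
    location += 4;
  }
  return false;
}
```

With `max_words` set to a small number the scan gives up early, as the unpatched loop does; with `max_words` set to 0 it continues until the captured stack memory is exhausted, which is the behavior the patch aims for.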
Well, that's brutally simple. :)

From Socorro's point of view, whether the dump has a lot of garbage frames or a few doesn't really matter, does it? So there's really no harm in just scanning to the end of the stack memory available from the minidump.

Can we think of a good way to avoid generating massive meaningless stacks, though?
(In reply to comment #14)
> Can we think of a good way to avoid generating massive meaningless stacks,
> though?

Two filtering heuristics I know of both need a post-processing step:
 - Drop frames below main or thread-main. Requires platform-specific knowledge and/or binaries.
 - Drop "frames" whose "return addresses" don't point just after a call instruction (or jal). Requires binaries.

I don't know of anything general we could apply in the dump processor.
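
The second heuristic can be sketched for 32-bit x86 roughly as follows. This is an illustration under stated assumptions, not Breakpad code: `CodeImage` and `FollowsCallInstruction` are hypothetical names, and only two call encodings are recognized (`E8 rel32` and the two-byte register form `FF D0` through `FF D7`). A real filter would also need the remaining `FF /2` memory forms, and a different check on other architectures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one module's code bytes as loaded in memory.
struct CodeImage {
  uint32_t load_address;       // address of bytes[0]
  std::vector<uint8_t> bytes;  // raw machine code
};

// Returns true if the bytes immediately before `return_address` decode
// as one of the recognized x86 call instructions.
bool FollowsCallInstruction(const CodeImage& image, uint32_t return_address) {
  if (return_address < image.load_address) return false;
  const std::size_t offset = return_address - image.load_address;
  if (offset > image.bytes.size()) return false;

  // call rel32: opcode E8 followed by a 4-byte displacement (5 bytes).
  if (offset >= 5 && image.bytes[offset - 5] == 0xE8) return true;

  // call r32: opcode FF with ModRM mod=11, reg=010, i.e. FF D0..FF D7.
  if (offset >= 2 && image.bytes[offset - 2] == 0xFF &&
      (image.bytes[offset - 1] & 0xF8) == 0xD0) {
    return true;
  }
  return false;
}
```

Candidate frames whose return address fails this check can be dropped as likely garbage. The check can still produce false positives, since an arbitrary byte five positions back may happen to be `E8`.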
The counter-example here is bug 614547, where even our current scanning heuristic gets a bogus frame at frame 1, which winds up as the (crappy) signature.
Summary: Crash [@ libc-2.5.so@0x2a548 ] → Crash [@ libc-2.5.so@0x2a548 ] on Linux 2.6.28
A MozillaZine user reports seeing this crash reproducibly when zooming in a forum web site in Fennec 4.0 on Maemo: http://forums.mozillazine.org/viewtopic.php?f=47&t=2158445&sid=dfb66a304325b77d5124ad5b4b0180dd
It is the #1 top crasher in Fennec 5.0b2 (around 4000 ADU for 5.0), accounting for 93% of all crashes.

Is it possible to prevent Fennec from running on Linux 2.6.28?
tracking-fennec: - → ?
(In reply to comment #19)
> Is it possible to prevent Fennec running on Linux 2.6.28?
My bad. Indeed, it is the latest Linux kernel version for Maemo 5.
Summary: Crash [@ libc-2.5.so@0x2a548 ] on Linux 2.6.28 → Crash [@ libc-2.5.so@0x2a548 ] on Maemo 5
Crash Signature: [@ libc-2.5.so@0x2a548 ]
This is Maemo only and we can't reproduce it, so we are not tracking this. If we get STR, we can try to fix it.
tracking-fennec: ? → -
I've been using Nightly on Maemo 5 regularly, but I don't think I have ever seen this. Any clue as to which sites make it more likely to happen?
Looks like the signature may have changed some in 7.0a1?

0 	libc-2.5.so 	libc-2.5.so@0x2a548 	
1 	libc-2.5.so 	libc-2.5.so@0x121beb 	
2 	libc-2.5.so 	libc-2.5.so@0x2bb6b 	
3 	ld-2.5.so 	ld-2.5.so@0x9827 	
4 	libmozalloc.so 	moz_realloc 	memory/mozalloc/mozalloc.cpp:143
5 		@0xffff3ea7 	
6 	libc-2.5.so 	libc-2.5.so@0x1216ee 	
7 	libc-2.5.so 	libc-2.5.so@0x1216a7 	
8 	libc-2.5.so 	libc-2.5.so@0xb3773 	
9 	libc-2.5.so 	libc-2.5.so@0x1216ee 	
10 	libc-2.5.so 	libc-2.5.so@0xb378f 	
11 	libc-2.5.so 	libc-2.5.so@0x63c13 

More reports: 
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2011-07-17%2002%3A00%3A00&signature=libc-2.5.so%400x2a548&version=Fennec%3A7.0a1
There have been no crashes with signatures containing libc-2.5.so in Fennec for the last four weeks.
In addition, Maemo is no longer supported.
I am closing this as WFM.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME