Closed Bug 1158664 Opened 9 years ago Closed 8 years ago

Crash [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] Android 4.3/4.4 webappstartup crash

Categories

(Firefox for Android Graveyard :: General, defect)

40 Branch
ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: bc, Assigned: glandium)

References

(Blocks 1 open bug)

Details

(4 keywords)

Crash Data

Attachments

(2 files)

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=d3fd099cb346&exclusion_profile=false&filter-searchStr=autophone

https://autophone.s3.amazonaws.com/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1430083818/autophone-webapp-1-nexus-5-kot49h-3-tombstone_00.1.txt

Build fingerprint: 'google/hammerhead/hammerhead:4.4.2/KOT49H/937116:user/release-keys'
Revision: '11'
pid: 24333, tid: 24347, name: Gecko  >>> org.mozilla.fennec:org.mozilla.fennec.Webapp0 <<<
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 400f7118
    r0 759fb9c4  r1 759fb9c4  r2 400f8000  r3 400f7108
    r4 7550b228  r5 757450bc  r6 759fba74  r7 75737000
    r8 00000000  r9 0015256e  sl 755200b4  fp 75520114
    ip 5f1b4001  sp 759fb9c0  lr 00000001  pc 754d6340  cpsr 800d0030
    d0  5a5a5a5a5a5a5a5a  d1  5a5a5a5a5a5a5a5a
    d2  0000000000000000  d3  0000000000000000
    d4  0000000000002400  d5  0000040000000000
    d6  0000360000000000  d7  0000020000000b00
    d8  0000000000000000  d9  0000000000000000
    d10 0000000000000000  d11 0000000000000000
    d12 0000000000000000  d13 0000000000000000
    d14 0000000000000000  d15 0000000000000000
    d16 0000000000000000  d17 0000000001000000
    d18 000003dd000003d2  d19 000003f2000003e7
    d20 0000003f0000003f  d21 0000003f0000003f
    d22 0000000000000000  d23 0000000000000000
    d24 0002aaa80002aaa8  d25 0002aaa80002aaa8
    d26 0707070703030303  d27 000000400000003f
    d28 0001000000010000  d29 0001000000010000
    d30 00f7400000f48000  d31 00fc800000f9c000
    scr 60000010

backtrace:
    #00  pc 00026340  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (ElfLoader::DebuggerHelper::Add(ElfLoader::link_map*)+59)
    #01  pc 00024ed5  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (CustomElf::Load(Mappable*, char const*, int)+892)
    #02  pc 00026781  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (ElfLoader::Load(char const*, int, LibHandle*)+220)
    #03  pc 00026859  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (__wrap_dlopen+16)
    #04  pc 00029d9b  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (_ZL11loadNSSLibsPKc.part.0+50)
    #05  pc 00029df7  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (loadSQLiteLibs(char const*)+10)
    #06  pc 0002acc7  /data/app-lib/org.mozilla.fennec-1/libmozglue.so (Java_org_mozilla_gecko_mozglue_GeckoLoader_loadSQLiteLibsNative+34)
    #07  pc 0001dbcc  /system/lib/libdvm.so (dvmPlatformInvoke+112)
    #08  pc 0004e123  /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+398)
    #09  pc 0004fb11  /system/lib/libdvm.so (dvmResolveNativeMethod(unsigned int const*, JValue*, Method const*, Thread*)+184)
    #10  pc 00026fe0  /system/lib/libdvm.so
    #11  pc 0002dfa0  /system/lib/libdvm.so (dvmMterpStd(Thread*)+76)
    #12  pc 0002b638  /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+184)
    #13  pc 00060581  /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)+336)
    #14  pc 000605a5  /system/lib/libdvm.so (dvmCallMethod(Thread*, Method const*, Object*, JValue*, ...)+20)
    #15  pc 0005528b  /system/lib/libdvm.so
    #16  pc 0000d170  /system/lib/libc.so (__thread_entry+72)
    #17  pc 0000d308  /system/lib/libc.so (pthread_create+240)

Is this a lingering leftover from bug 1152308, or a regression from it?
Bob when did this start? Is there a regression range pointing to bug 1152308? Does it happen on every startup?

Mike do you have any idea what's going on here? It looks like dbg->r_map->l_prev has garbage? It appears that d0 and d1 have poisoned data, but AFAICT those registers are only used for NEON stuff. Probably unrelated.
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(bob)
snorp: It is intermittent and not reproducible. I haven't seen it otherwise in the backfill back to 2015-04-17 on mozilla-inbound.
Flags: needinfo?(bob)
Android 4.4 signature 1 (ElfLoader::DebuggerHelper::Add)
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=546210eeaf4a&exclusion_profile=false&filter-searchStr=autophone
https://autophone.s3.amazonaws.com/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1430167516/autophone-s1s2-1-nexus-5-kot49h-3-tombstone_00.1.txt

Android 4.4 signature 2
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=b85a60205e15&exclusion_profile=false&filter-searchStr=autophone
https://autophone.s3.amazonaws.com/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1430171296/autophone-s1s2-1-nexus-5-kot49h-3-tombstone_00.1.txt

Android 4.3 signature 2
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=479891760c0c&exclusion_profile=false&filter-searchStr=autophone
https://autophone.s3.amazonaws.com/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1430179035/autophone-s1s2-1-nexus-7-jss15q-2-tombstone_00.1.txt

Android 4.3 signature 1
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=ac11b098effd&exclusion_profile=false&filter-searchStr=autophone
https://autophone.s3.amazonaws.com/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1430182215/autophone-s1s2-1-nexus-7-jss15q-2-tombstone_00.1.txt

snorp: should I file a different bug for signature 2?
Flags: needinfo?(snorp)
Summary: Crash [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] Android 4.4 webappstartup crash with 0x5a in d{0,1} → Crash [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] Android 4.3/4.4 webappstartup crash with 0x5a in d{0,1}
(In reply to Bob Clary [:bc:] from comment #3)
> snorp: should I file a different bug for signature 2?

Please do. We've seen this reported in the play store before, and I thought it was an Android bug, but now I'm not so sure.
Flags: needinfo?(snorp)
"5a5a5a5a5a5a5a5a" could be the poison value jemalloc stomps on freed memory -- in other words a use-after-free bug, and given the intermittency probably a racy one. I don't know if anything in the Android system itself uses a similar value.
There is no thread safety in ElfLoader::DebuggerHelper::Add/Remove, so that could come from there if we recently started loading libraries concurrently.
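To illustrate the kind of locking that is missing, here is a minimal sketch of a doubly linked list whose Add/Remove are serialized by a mutex. It is not the ElfLoader code: GuardedList, Node, and Count are hypothetical names made up for this example, and the real DebuggerHelper operates on the debugger-visible link_map chain rather than on a private list.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a link_map-style node (illustration only).
struct Node {
  Node* next = nullptr;
  Node* prev = nullptr;
};

// Hypothetical list whose mutating operations all take the same lock,
// so two threads loading libraries at once cannot interleave pointer updates.
class GuardedList {
 public:
  // Insert n at the head of the list.
  void Add(Node* n) {
    assert(n != nullptr);
    std::lock_guard<std::mutex> lock(mMutex);
    n->prev = nullptr;
    n->next = mHead;
    if (mHead) {
      mHead->prev = n;
    }
    mHead = n;
  }

  // Unlink n, which must currently be in the list.
  void Remove(Node* n) {
    assert(n != nullptr);
    std::lock_guard<std::mutex> lock(mMutex);
    if (n->prev) {
      n->prev->next = n->next;
    } else {
      mHead = n->next;
    }
    if (n->next) {
      n->next->prev = n->prev;
    }
    n->next = nullptr;
    n->prev = nullptr;
  }

  // Count the nodes currently linked.
  std::size_t Count() {
    std::lock_guard<std::mutex> lock(mMutex);
    std::size_t count = 0;
    for (Node* p = mHead; p; p = p->next) {
      ++count;
    }
    return count;
  }

 private:
  std::mutex mMutex;
  Node* mHead = nullptr;
};
```

Without the lock, two concurrent Add calls could both read the same head and each write its own node into the old head's prev pointer, leaving the list inconsistent. Whether that is what happens in this crash is not established here.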
Flags: needinfo?(mh+mozilla)
Assignee: nobody → mh+mozilla
tracking-fennec: ? → 40+
The code around the crash address looks like this:

   26336:       f7ff ff6c       bl      26212 <_ZN14EnsureWritableC1IPN9ElfLoader8link_mapEEEPT_j>
   2633a:       6823            ldr     r3, [r4, #0]
   2633c:       a801            add     r0, sp, #4
   2633e:       685b            ldr     r3, [r3, #4]
   26340:       611d            str     r5, [r3, #16]
   26342:       f7ff f957       bl      255f4 <_ZN14EnsureWritableD1Ev>

d0 and d1 are not involved.
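As a cross-check of the disassembly against the tombstone, the sketch below models the 32-bit structure layouts and verifies that the faulting store lands on l_prev. It assumes the conventional link_map and r_debug field order from <link.h>, and that r4 points at an object whose first word is the r_debug pointer; LinkMap32, RDebug32, and StoreTarget are hypothetical names for this example, with pointers spelled as uint32_t so the offsets match ARM on any host.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed 32-bit layout of link_map (conventional <link.h> field order).
struct LinkMap32 {
  uint32_t l_addr;  // offset 0
  uint32_t l_name;  // offset 4
  uint32_t l_ld;    // offset 8
  uint32_t l_next;  // offset 12
  uint32_t l_prev;  // offset 16
};

// Assumed 32-bit layout of the start of r_debug.
struct RDebug32 {
  int32_t r_version;  // offset 0
  uint32_t r_map;     // offset 4
};

// "ldr r3, [r3, #4]" loads r_map; "str r5, [r3, #16]" stores to l_prev.
static_assert(offsetof(RDebug32, r_map) == 4, "r_map is at +4");
static_assert(offsetof(LinkMap32, l_prev) == 16, "l_prev is at +16");

// Address written by "str r5, [r3, #16]" when r3 holds the r_map pointer.
constexpr uint32_t StoreTarget(uint32_t r_map_addr) {
  return r_map_addr + static_cast<uint32_t>(offsetof(LinkMap32, l_prev));
}

// r3 and the fault address as reported in the tombstone above.
static_assert(StoreTarget(0x400f7108u) == 0x400f7118u,
              "fault address is r3 + offsetof(l_prev)");
```

Under those assumptions the crashing instruction is the write of the new map (r5) into dbg->r_map->l_prev. SEGV_ACCERR means the address was mapped but the access was not permitted, which would be consistent with the page not being writable at the time of the store.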
Summary: Crash [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] Android 4.3/4.4 webappstartup crash with 0x5a in d{0,1} → Crash [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] Android 4.3/4.4 webappstartup crash
The poison values are scary, but this happens so early in startup, and depends on a race, so it would be extremely hard for anybody to actually exploit it.
Group: core-security → firefox-core-security
Crash Signature: [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] → [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] [@ ElfLoader::DebuggerHelper::Add
Flags: needinfo?(snorp)
droeh has seen this pretty frequently
Flags: needinfo?(snorp)
Crash Signature: [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] [@ ElfLoader::DebuggerHelper::Add → [@ ElfLoader::DebuggerHelper::Add(ElfLoader::link_map] [@ ElfLoader::DebuggerHelper::Add]
No reports on crash-stats in the last 6 months. Is this still an issue?
tracking-fennec: 40+ → ---
Flags: needinfo?(bob)
These were the only ones I could find in the tombstones in autophone's S3 storage. Both are with an AOSP build on Nexus 6 / Android 5.1. I think we can resolve this as incomplete unless we see it again.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Group: firefox-core-security
Product: Firefox for Android → Firefox for Android Graveyard