Closed Bug 951782 Opened 11 years ago Closed 10 years ago

Firefox 26 for Android crashes on HTC Desire Z due to on-demand decompression

Categories

(Firefox for Android Graveyard :: General, defect)

26 Branch
All
Android
defect
Not set
major

Tracking

(firefox26 wontfix, firefox27+ fixed, firefox28+ fixed, firefox29+ fixed, fennec27+)

RESOLVED FIXED
Firefox 29
Tracking Status
firefox26 --- wontfix
firefox27 + fixed
firefox28 + fixed
firefox29 + fixed
fennec 27+ ---

People

(Reporter: nico, Assigned: blassey)

References

Details

(Keywords: crash)

Attachments

(2 files, 1 obsolete file)

A user reported on French-speaking support forums that Firefox always crashes at startup since he updated to version 26, on his HTC Desire Z (Android 2.2). The following error appears in logs: "Could not find method android.nfc.NfcAdapter.getDefaultAdapter, referenced from method org.mozilla.gecko.BrowserApp.onCreate" (see longer log below).

Reinstalling Firefox 26 (or 27 beta) does not solve the problem. Switching back to Firefox 25.0.1 solves it.

Could Firefox 26 be mishandling phones that do not support NFC?

The forum topic where the problem was reported is here: http://forums.mozfr.org/viewtopic.php?f=32&t=116462

----------------- LOG -----------------
I/ActivityManager( 1295): Starting activity: Intent { act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10100000 cmp=org.mozilla.firefox/.App bnds=[243,586][357,704] }
I/dalvikvm( 6596): Could not find method android.nfc.NfcAdapter.getDefaultAdapter, referenced from method org.mozilla.gecko.BrowserApp.onCreate
D/dalvikvm( 6596): VFY: dead code 0x012f-013b in Lorg/mozilla/gecko/BrowserApp;.onCreate (Landroid/os/Bundle;)V
W/dalvikvm( 6596): Unable to resolve superclass of Lorg/mozilla/gecko/widget/GeckoActionProvider; (599)
W/dalvikvm( 6596): Link of class 'Lorg/mozilla/gecko/widget/GeckoActionProvider;' failed
E/dalvikvm( 6596): Could not find class 'org.mozilla.gecko.widget.GeckoActionProvider', referenced from method org.mozilla.gecko.BrowserApp.onCreateOptionsMenu
W/dalvikvm( 6596): VFY: unable to resolve new-instance 3105 (Lorg/mozilla/gecko/widget/GeckoActionProvider;) in Lorg/mozilla/gecko/BrowserApp;
D/dalvikvm( 6596): VFY: dead code 0x0056-005b in Lorg/mozilla/gecko/BrowserApp;.onCreateOptionsMenu (Landroid/view/Menu;)Z
I/dalvikvm( 6596): Could not find method android.nfc.NfcAdapter.getDefaultAdapter, referenced from method org.mozilla.gecko.BrowserApp.onDestroy
D/dalvikvm( 6596): VFY: dead code 0x007b-0083 in Lorg/mozilla/gecko/BrowserApp;.onDestroy ()V
8< 8< 8<
E/GeckoLinker( 6596): /data/app/org.mozilla.firefox-1.apk!/assets/libnss3.so: Warning: unhandled flags #8 not handled
I/ActivityManager( 1295): Process org.mozilla.firefox (pid 6596) has died.
I/WindowManager( 1295): WIN DEATH: Window{47c8bf60 org.mozilla.firefox/org.mozilla.firefox.App paused=false}
V/WindowManager( 1295): Remove Window{47c8bf60 org.mozilla.firefox/org.mozilla.firefox.App paused=false}: mSurface=Surface(name=org.mozilla.firefox/org.mozilla.firefox.App, identity=715) mExiting=false isAnimating=false app-animation=null inPendingTransaction=false mDisplayFrozen=false
V/WindowManager( 1295): Remove Window{47bca6e0 Starting org.mozilla.firefox paused=false}: mSurface=Surface(name=Starting org.mozilla.firefox, identity=714) mExiting=false isAnimating=false app-animation=null inPendingTransaction=false mDisplayFrozen=false
----------------- LOG END -----------------
The code in question is definitely guarded with a VERSION.SDK_INT >= 14 call and shouldn't be getting invoked on a device running Android 2.2.

http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/BrowserApp.java?rev=9c9c3e3e7bc2#546
Actually, looking at the logcat, this output is normal: that's what Dalvik prints when it culls code that is not available in a particular API version. Can we get a more complete log that contains everything from when Firefox is started to a little bit after it shuts down? Please attach it to this bug if possible.
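For anyone trimming a full logcat down to the relevant part, here is a minimal Python sketch that keeps the lines logged by the Firefox process plus lines from other processes that mention its pid. It assumes the brief `LEVEL/Tag( pid): message` format seen in the log above; the function names are made up for illustration.

```python
import re

# Matches the brief logcat format used in the log above, e.g.
# "I/dalvikvm( 6596): Could not find method ..."
LOGCAT_LINE = re.compile(r"^([VDIWEF])/(.+?)\(\s*(\d+)\):\s?(.*)$")


def parse_line(line):
    """Return (level, tag, pid, message), or None for non-logcat lines."""
    match = LOGCAT_LINE.match(line)
    if match is None:
        return None
    level, tag, pid, message = match.groups()
    return level, tag.strip(), int(pid), message


def lines_about_pid(lines, pid):
    """Keep lines logged by `pid`, plus lines from other processes that mention it."""
    kept = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, _, line_pid, message = parsed
        if line_pid == pid or f"pid {pid}" in message:
            kept.append(line)
    return kept
```

With the log above, `lines_about_pid(lines, 6596)` would keep the dalvikvm and GeckoLinker lines as well as the ActivityManager line reporting that pid 6596 has died.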
Changing topic as the NFC message is most likely unrelated to the crash.
Summary: Firefox 26 for Android crashes at startup if phone does not support NFC → Firefox 26 for Android crashes on HTC Desire Z
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> The code in question is definitely guarded with a VERSION.SDK_INT >= 14 call
> and shouldn't be getting invoked on a device running Android 2.2.
> 
> http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/BrowserApp.java?rev=9c9c3e3e7bc2#546

^ is from 29, though kats' comment also applies to 26. A link to the same code in 26 is below:

https://mxr.mozilla.org/mozilla-release/source/mobile/android/base/BrowserApp.java?rev=c9b696397007#495
This log has me quite baffled. There's zero info why exactly we crash. It's happening during Gecko loading, but you can see messages from the linker that's still trying to load Gecko after the crash, so it seems that isn't actually the cause. But if it's Java, why is there no backtrace?

Do we have a HTC Desire Z in QA? 

Nicolas, would the user seeing the problem be able to bisect this, i.e. find the first Firefox Nightly (going backwards in time) that works again?
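Bisecting nightlies by hand amounts to a binary search over build dates. A rough Python sketch follows; the `crashes` callback is hypothetical (in practice it means "install that day's nightly on the phone and see whether it starts"), and the search assumes every build before the regression works and every build after it crashes.

```python
from datetime import date, timedelta


def first_bad_build(good, bad, crashes):
    """Binary-search build dates between a known-good and a known-bad nightly.

    `crashes(day)` is a hypothetical callback: install the nightly from `day`
    on the device and return True if it crashes at startup.
    Assumes builds before the regression work and builds after it crash.
    """
    while (bad - good).days > 1:
        middle = good + timedelta(days=(bad - good).days // 2)
        if crashes(middle):
            bad = middle
        else:
            good = middle
    return bad
```

Each round halves the range, so even a year of nightlies needs fewer than ten installs.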
No.

Kevin has a HTC G2 though it does not have HTC's Sense UI.

Similar devices perhaps? Devices tested

* HTC Desire (Android 2.2) - Does not reproduce
* HTC Desire Z (Android 2.3.3) - Does not reproduce
* LG-P990 (Android 2.2.2) - Does not reproduce
* HTC Desire S (Android 2.3.3) - Does not reproduce
* HTC Wildfire S A510e - Does not reproduce (via [1])

Unlikely, but perhaps it's locale related? I tested en-US.

[1]: ftp://ftp.mozilla.org/pub/mobile/releases/26.0/android-armv6/en-US/
I am asking the user to try a Firefox in en-US too.

Just to be sure, as the HTC Desire Z chipset is based on ARMv7, the version to try is "mozilla-aurora-android/"? E.g. http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2013/12/2013-12-03-00-40-03-mozilla-aurora-android/
(In reply to Nicolas Turcot (:nico@nc) from comment #9)
> I am asking the user to try a Firefox in en-US too.
> 
> Just to be sure, as the HTC Desire Z chipset is based on ARMv7, the version
> to try is "mozilla-aurora-android/"? E.g.
> http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2013/12/2013-12-03-00-40-03-mozilla-aurora-android/

That should be fine, but it'd be better if the user tries all of these:

26 (release), http://ftp.mozilla.org/pub/mobile/releases/latest/android/en-US/
27 (beta), http://ftp.mozilla.org/pub/mobile/releases/latest-beta/android/en-US/
28 (aurora), http://ftp.mozilla.org/pub/mobile/nightly/latest-mozilla-aurora-android/en-US/
29 (nightly), http://ftp.mozilla.org/pub/mobile/nightly/latest-mozilla-central-android/en-US/

Does the crash reporter appear when the crash happens?
I found part of what I believe to be a stack trace:

E/GeckoGLController( 6596):    at org.mozilla.gecko.gfx.GLController$1.run(GLController.java:160)

in the attached log (without any other lines in the trace). Related?
That would be this line:
https://mxr.mozilla.org/mozilla-release/source/mobile/android/base/gfx/GLController.java#160

Which is inside a try/catch block... so 1) how can it crash there, and 2) why does it leave only a single line in the logs? Did the user try to sanitize the log? If so, he deleted too much.

Also, if I understood the French thread correctly, you pointed him to the wrong builds :) Nightly is "xxxx-xx-xx-xx-xx-xx-mozilla-central-android" not "-aurora".
(In reply to Gian-Carlo Pascutto (:gcp) from comment #12)
> Also, if I understood the French thread correctly, you pointed him to the
> wrong builds :) Nightly is "xxxx-xx-xx-xx-xx-xx-mozilla-central-android" not
> "-aurora".

Right. Can you also confirm the build dates of the nightlies that the user would have to test?
So the last working nightly is 24.0a1 of 2013-05-20. The crash is reproduced on 2013-05-21 and later versions.

Crash reporter does not appear, and Firefox home screen seems to appear very briefly before the crash.

I am waiting for more details about the logs.
That is good information, thanks.

If that date is correct, a range would look like http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2013-05-20&enddate=2013-05-21

and the likely candidate is:

e0e9b99639f8	Chris Lord — bug 869696 - Use an alternative method to unlock gralloc textures on Adreno (TM) 205. r=bjacob Targeting the NULL EGLImage causes slowness on the Geeksphone Peak, and assumedly, other "Adreno (TM) 205" devices. Achieve the same effect by deleting the GL texture instead.

1caf7322f918	Chris Lord — bug 869696 - Add AdrenoTM205 to renderer enum. r=bjacob Add the 'Adreno (TM) 205' renderer string to the renderer enum in GLContext.

from bug 869696
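The regression-range link in this comment can be generated from the last good and first bad build dates. A small Python sketch, assuming only the `startdate` and `enddate` query parameters visible in the URL above; `pushlog_url` is an illustrative name, not an existing tool.

```python
from urllib.parse import urlencode

PUSHLOG = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_good, first_bad):
    """Build a pushlog URL covering the pushes between two nightly dates (YYYY-MM-DD)."""
    return PUSHLOG + "?" + urlencode({"startdate": last_good, "enddate": first_bad})
```

`pushlog_url("2013-05-20", "2013-05-21")` reproduces the link given in this comment.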
Blocks: 869696
Flags: needinfo?(bjacob)
I'm very surprised, for two reasons:
 1. I didn't expect this code to be run at all on Android. I thought that that code path would only be taken on B2G.
 2. This is only making a couple of GL calls, shouldn't be crashy.

If this really is the culprit, there is a very easy way out by just enclosing this code in a #ifdef GONK...
Flags: needinfo?(bjacob)
Wait... the code added by e0e9b99639f8 does not seem to exist anymore in mozilla-central, or even in mozilla-release.
How about this:

http://hg.mozilla.org/mozilla-central/rev/ef5b7b1039ac

It caused some problems before on phones with really badly broken firmware. It's also consistent with crashing on startup during Gecko lib loading.
Here's a build of current Nightly, but with on-demand decompression disabled. If it fixes the problem, we at least have some idea where to look:

https://dl.dropboxusercontent.com/u/32496746/fennec-29.0a1.en-US.android-arm.apk
I'm going to un-block bug 869696 after having checked with :snorp that these gralloc paths really aren't used on B2G, implying that these changes can't have caused these crashes.
No longer blocks: 869696
aren't used on *Android*
See comment #19
Flags: needinfo?(nicotnc-gecko)
(In reply to Gian-Carlo Pascutto (:gcp) from comment #19)
> Here's a build of current Nightly, but with on-demand decompression
> disabled. If it fixes the problem, we at least have some idea where to look:
> 
> https://dl.dropboxusercontent.com/u/32496746/fennec-29.0a1.en-US.android-arm.apk

Note one can disable ondemand decompression with --es env0 MOZ_LINKER_ONDEMAND=0 on the am start command line with a normal build.
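To save retyping, the `am start` line can be assembled by a small helper. This Python sketch only builds the command string; the `-n org.mozilla.firefox/.App` component is the one used for release builds in this bug, and numbering additional variables as `env1`, `env2`, ... is an assumption, since only `env0` is shown here.

```python
import shlex


def am_start_command(package="org.mozilla.firefox", **env):
    """Build an `am start` command line that passes environment variables to Fennec.

    Each variable goes into a string extra named env0, env1, ...; only env0
    appears in this bug, so the numbering of further extras is an assumption.
    """
    args = ["am", "start", "-n", f"{package}/.App"]
    for index, (name, value) in enumerate(env.items()):
        args += ["--es", f"env{index}", f"{name}={value}"]
    return " ".join(shlex.quote(arg) for arg in args)
```

`am_start_command(MOZ_LINKER_ONDEMAND=0)` returns `am start -n org.mozilla.firefox/.App --es env0 MOZ_LINKER_ONDEMAND=0`, which can then be run through `adb shell`.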
(In reply to Gian-Carlo Pascutto (:gcp) from comment #19)
> Here's a build of current Nightly, but with on-demand decompression
> disabled.

The user reported that the problem is actually fixed with this build (<http://forums.mozfr.org/viewtopic.php?p=749078#p749078>). :)
Flags: needinfo?(nicotnc-gecko)
Status: UNCONFIRMED → NEW
tracking-fennec: --- → ?
Ever confirmed: true
Summary: Firefox 26 for Android crashes on HTC Desire Z → Firefox 26 for Android crashes on HTC Desire Z due to on-demand decompression
That tells us what causes it, but doesn't really give an idea how to fix, or more likely, work around it. Nicolas, Wikipedia and comment 7 show that the Desire Z has a newer firmware with Android 2.3.3, can't the user upgrade to that as this bug is apparently fixed there?

Mike, we had some other phones IIRC where this was a problem, what was done there to fix the issues? Do we have something like a blacklist?
(In reply to Gian-Carlo Pascutto (:gcp) from comment #25)
> Mike, we had some other phones IIRC where this was a problem, what was done
> there to fix the issues? Do we have something like a blacklist?

Obviously, those other phones had different issues, because we didn't fix them with a blacklist but by detecting the root-cause problems at runtime and disabling on-demand decompression accordingly. We /could/ add a blacklist, but in general, I'd rather find the actual cause and detect or work around it.

The first thing I'd want to see here is the logs with MOZ_DEBUG_LINKER set (am start  -n org.mozilla.firefox/.App --es env0 MOZ_DEBUG_LINKER=1)

Then a double check that the 26 build that crashes during startup *does* start when run with MOZ_LINKER_ONDEMAND=0 (am start  -n org.mozilla.firefox/.App --es env0 MOZ_LINKER_ONDEMAND=0)

With that being said, one thing we may want to do at the Java level is detect whether loading gecko/nss previously worked, and if it didn't, disable on-demand decompression. Something that would allow starting with MOZ_DEBUG_LINKER=1 and/or MOZ_LINKER_ONDEMAND=0 without having to type commands would be nice too.
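One way to read the "detect whether loading previously worked" idea is a marker file that is written before the libraries are loaded and removed once loading succeeds. The Python sketch below illustrates only that logic; it is not how Fennec does it, and the function names and marker path are hypothetical.

```python
import os


def choose_linker_env(marker_path):
    """Decide the linker environment before loading the Gecko libraries.

    A marker left over from the previous run means that run died while
    loading, so on-demand decompression is disabled this time.
    """
    env = {}
    if os.path.exists(marker_path):
        env["MOZ_LINKER_ONDEMAND"] = "0"
    # Create (or keep) the marker; it is only removed after a successful load.
    with open(marker_path, "w"):
        pass
    return env


def mark_load_succeeded(marker_path):
    """Call once the libraries have loaded; clears the crash marker."""
    if os.path.exists(marker_path):
        os.remove(marker_path)
```

As written, a successful fallback run clears the marker, so the next start would try on-demand decompression again; a real implementation would probably want to persist the decision.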
qawanted to help understand whether this issue is only happening on one device and whether there are any STR to reliably reproduce it. Is the crash happening in high volume, and under what circumstances?
Keywords: qawanted
As mentioned in comment #7, we only have the American variant of the device, the G2 from early 2010, and do not see this problem. The devices I mentioned were tested on AppThwack, and on their Desire Z we had a successful start. The steps to reproduce are supposedly just to launch Firefox on the HTC Desire Z.

There's no crash-reporter so I don't think you can get numbers.
Keywords: qawanted
NI to kbrosnan to flash his similar device to 2.2.

Nicolas, what are the chances that the user would be willing to stop by the Paris office for someone to have a look?
Flags: needinfo?(kbrosnan)
I can start Firefox 29, 28 and 26 on my HTC T-Mobile G2 (same hardware as the Desire Z). However, this phone is closer to a Nexus device, as it does not ship with HTC Sense or other HTC customizations.

On 27 I crash with bug 847021 which is one of the DB locked crashes.
Flags: needinfo?(kbrosnan)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #29)
> Nicolas, what are the chances that the user would be willing to stop by the
> Paris office for someone to have a look?

No, he lives in Lyon, 400km from Paris.
Hi Gcp,

Do you think you could give us a hand and provide the instructions to set up GDB to debug a Java issue in Firefox for Android, or maybe point us to the wiki page if there is one?

Thanks a lot,
Hermina
Flags: needinfo?(gpascutto)
I can do that, but this seems totally off-topic for this bug? Contact me by email or CC me on the relevant bug, please.

For info, start here: https://wiki.mozilla.org/Mobile/Fennec/Android#Debugging
Flags: needinfo?(gpascutto)
:glandium, I understand that the right, long-term approach to fixing this issue would be to investigate the root cause and fix it. But given the Firefox 27 release timeline, is there a way we can blacklist on-demand decompression for affected devices, or do you have thoughts on other alternatives we could pursue in the interim?
Flags: needinfo?(mh+mozilla)
The same as http://hg.mozilla.org/mozilla-central/rev/4655d7317a03 could be done here. Someone who knows the right values for the HTC Desire Z would need to come up with a patch.
Flags: needinfo?(mh+mozilla)
Can you try this build http://dump.lassey.us/fennec-desireZ.apk

and report back the log? Specifically I'm looking for these two lines:

I/GeckoLinker( ####): MANUFACTURER: ???
I/GeckoLinker( ####): MODEL: ?????????????
Flags: needinfo?(nicotnc-gecko)
This is a log I got from the user running with MOZ_DEBUG_LINKER=1.
The lack of "Caught segmentation fault" message suggests something is replacing our segfault handler despite the guards added in bug 874708.
One possibility is that the libc on those devices has a signal() implementation that doesn't call sigaction() but calls the signal system call directly. I'll ask for a copy of libc.so.
Attachment #8350180 - Attachment is obsolete: true
I'll also request the data from comment 36.
Flags: needinfo?(nicotnc-gecko)
I'm assuming glandium cleared the needinfo mistakenly
Flags: needinfo?(nicotnc-gecko)
I did mean to clear it.
Flags: needinfo?(nicotnc-gecko) → needinfo?(mh+mozilla)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #36)
> Can you try this build http://dump.lassey.us/fennec-desireZ.apk
> 
> and report back the log? Specifically I'm looking for these two lines:
> 
> I/GeckoLinker( ####): MANUFACTURER: ???
> I/GeckoLinker( ####): MODEL: ?????????????

That build crashed and didn't give those lines. I asked him to get the info with this app:
https://play.google.com/store/apps/details?id=com.nellymoser.deviceinfo

I got a copy of his libc.so, I'll look to see if it is what I think it is.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #41)
> I got a copy of his libc.so, I'll look to see if it is what I think it is.

And it's not :( signal() does call sigaction(). I'll have to come up with something to debug this.
BUILD.MANUFACTURER:HTC
BUILD.MODEL:HTC Vision
BUILD.HARDWARE:vision
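With these values, a device blacklist for on-demand decompression could be as simple as the Python sketch below. It uses the manufacturer and model strings reported here; which fields the actual patch matches on is not shown in this bug, so treat the lookup key and the function name as assumptions.

```python
# Values reported for the affected phone in this bug.
BLACKLIST = {("HTC", "HTC Vision")}


def linker_env_for(manufacturer, model):
    """Return extra environment variables for a device, disabling on-demand
    decompression when the (manufacturer, model) pair is blacklisted."""
    if (manufacturer, model) in BLACKLIST:
        return {"MOZ_LINKER_ONDEMAND": "0"}
    return {}
```

Matching on the exact pair keeps other HTC phones, which were not reported as crashing, on the faster on-demand path.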
tracking-fennec: ? → +
tracking-fennec: + → 27+
Given we have not found the forward risk or the root cause here, it would be great if we could get the blacklist patch to disable on-demand decompression ready by Monday, to land on our second-to-last beta without risking the final one. Brad, would you be the right assignee here?
Flags: needinfo?(blassey.bugs)
yup, I can take this.
Assignee: nobody → blassey.bugs
Flags: needinfo?(blassey.bugs)
Brad, any update here? I would rather wontfix this and relnote it than risk the final beta, unless the blacklist is really well *tested*. Thoughts?
Flags: needinfo?(blassey.bugs)
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: 
Testing completed (on m-c, etc.): 
Risk to taking this patch (and alternatives if risky): 
String or IDL/UUID changes made by this patch:
Attachment #8364510 - Flags: review?(mark.finkle)
Attachment #8364510 - Flags: approval-mozilla-beta?
Attachment #8364510 - Flags: approval-mozilla-aurora?
Flags: needinfo?(blassey.bugs)
Attachment #8364510 - Flags: review?(mark.finkle) → review+
(In reply to bhavana bajaj [:bajaj] from comment #46)
> Brad, any update here ? I would rather wontix this and relnote this than
> risk the final beta unless the blacklist is really well *tested*  ? Thoughts
> ?

QA does not have any devices to test this against.
Comment on attachment 8364510 [details] [diff] [review]
htc_desire_z.patch

Discussed the risk with :blassey on IRC; given it's low, let's land in parallel so this can get tested in beta.

Also requesting QA verification to see if the blacklisting works.
Attachment #8364510 - Flags: approval-mozilla-beta?
Attachment #8364510 - Flags: approval-mozilla-beta+
Attachment #8364510 - Flags: approval-mozilla-aurora?
Attachment #8364510 - Flags: approval-mozilla-aurora+
(In reply to Aaron Train [:aaronmt] from comment #48)
> (In reply to bhavana bajaj [:bajaj] from comment #46)
> > Brad, any update here ? I would rather wontix this and relnote this than
> > risk the final beta unless the blacklist is really well *tested*  ? Thoughts
> > ?
> 
> QA does not have any devices to test this against.

Ah, just seeing this; let me NI the reporter.
Mike (coucou), I guess you don't have this device.
Is there any way we could verify we are blacklisting the right device? Thanks.
Flags: needinfo?(mh+mozilla)
Keywords: crash
https://hg.mozilla.org/mozilla-central/rev/446676b8cc15
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 29
This is not reproducible on the HTC Desire Z (Android 2.3.3) but, as in comment 7, it was probably never reproducible on this device.
(In reply to Sylvestre Ledru [:sylvestre] from comment #52)
> Mike (coucou), I guess you don't have this device.
> Is there anyway we could verify we are blacklisting the right device? thanks

For devices not blacklisted, there should be a "GeckoLibLoad" message in logcat saying libs were loaded in n ms with n usually less than 1000. That can be compared with the number you get when doing "am start  -n org.mozilla.firefox/.App --es env0 MOZ_LINKER_ONDEMAND=0" (replace firefox with the relevant appname, fennec for nightlies), which should be significantly larger (at least 2 or 3 times).

I'll have the user who reported the crash test a nightly with the workaround in place.
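The timing comparison described above could be scripted roughly as follows. This Python sketch assumes a hypothetical message format (only the "GeckoLibLoad" tag and "loaded in n ms" wording are given here) and takes the "at least 2 or 3 times" slowdown as the default threshold.

```python
import re

# Hypothetical message format: the bug only says the "GeckoLibLoad" message
# reports that libs were loaded in n ms.
LOAD_TIME = re.compile(r"GeckoLibLoad.*?(\d+)\s*ms")


def load_time_ms(logcat_text):
    """Return the first GeckoLibLoad duration found in the log, or None."""
    match = LOAD_TIME.search(logcat_text)
    return int(match.group(1)) if match else None


def looks_blacklisted(default_ms, ondemand_off_ms, factor=2):
    """On a non-blacklisted device, the MOZ_LINKER_ONDEMAND=0 run should be at
    least `factor` times slower; similar timings suggest on-demand was already off."""
    return ondemand_off_ms < factor * default_ms
```

Take `default_ms` from a normal start and `ondemand_off_ms` from a start with `MOZ_LINKER_ONDEMAND=0`; if the two are close, the blacklist is likely in effect on that device.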
Flags: needinfo?(mh+mozilla)
Depends on: 1043033
Product: Firefox for Android → Firefox for Android Graveyard