Closed Bug 1513870 Opened 5 years ago Closed 5 years ago

Crash in java.lang.IllegalArgumentException: at org.mozilla.gecko.util.GeckoJarReader.getStream(GeckoJarReader.java)

Categories

(Firefox for Android Graveyard :: General, defect, P1)

Firefox 65
Unspecified
Android
defect

Tracking

(firefox-esr60 unaffected, firefox64 unaffected, firefox65+ fixed, firefox66 fixed)

RESOLVED FIXED
Tracking Status
firefox-esr60 --- unaffected
firefox64 --- unaffected
firefox65 + fixed
firefox66 --- fixed

People

(Reporter: calixte, Assigned: jlorenzo)

References

()

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is
report bp-060ecc44-e4ad-4f53-8c2c-82cc40181213.
=============================================================

Java stack trace:

java.lang.IllegalArgumentException
	at org.mozilla.gecko.util.GeckoJarReader.getStream(GeckoJarReader.java:201)
	at org.mozilla.gecko.util.GeckoJarReader.getText(GeckoJarReader.java:83)
	at org.mozilla.gecko.BrowserLocaleManager.getPackagedLocaleTags(BrowserLocaleManager.java:425)
	at org.mozilla.gecko.preferences.LocaleListPreference.getUsableLocales(LocaleListPreference.java:222)
	at org.mozilla.gecko.preferences.LocaleListPreference.buildList(LocaleListPreference.java:293)
	at org.mozilla.gecko.preferences.LocaleListPreference.<init>(LocaleListPreference.java:110)
	at java.lang.reflect.Constructor.newInstance0(Native Method)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:343)
	at android.preference.GenericInflater.createItem(GenericInflater.java:385)
	at android.preference.GenericInflater.createItemFromTag(GenericInflater.java:432)
	at android.preference.GenericInflater.rInflate(GenericInflater.java:483)
	at android.preference.GenericInflater.rInflate(GenericInflater.java:495)
	at android.preference.GenericInflater.inflate(GenericInflater.java:327)
	at android.preference.GenericInflater.inflate(GenericInflater.java:264)
	at android.preference.PreferenceManager.inflateFromResource(PreferenceManager.java:324)
	at android.preference.PreferenceFragment.addPreferencesFromResource(PreferenceFragment.java:332)
	at org.mozilla.gecko.preferences.GeckoPreferenceFragment.onCreate(GeckoPreferenceFragment.java:85)
	at android.app.Fragment.performCreate(Fragment.java:2503)
	at android.app.FragmentManagerImpl.moveToState(FragmentManager.java:1256)
	at android.app.FragmentManagerImpl.addAddedFragments(FragmentManager.java:2426)
	at android.app.FragmentManagerImpl.executeOpsTogether(FragmentManager.java:2205)
	at android.app.FragmentManagerImpl.removeRedundantOperationsAndExecute(FragmentManager.java:2161)
	at android.app.FragmentManagerImpl.execPendingActions(FragmentManager.java:2062)
	at android.app.FragmentManagerImpl$1.run(FragmentManager.java:738)
	at android.os.Handler.handleCallback(Handler.java:898)
	at android.os.Handler.dispatchMessage(Handler.java:107)
	at android.os.Looper.loop(Looper.java:198)
	at android.app.ActivityThread.main(ActivityThread.java:6716)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)

There are 42 crashes (27 installs) in 65.0b4 (and 1 in 65.0a1).
:cpeterson, could you investigate please ?
Flags: needinfo?(cpeterson)
This new crash is currently the #1 top crash on Fennec 65.0b4.
Flags: needinfo?(sdaswani)
Interestingly, from where this is thrown [1] there is a reference to bug 849589, with 
> origUrl=jar:jar:file:/data/app/org.mozilla.firefox_beta-1.apk!/assets/omni.ja!/res/multilocale.txt

[1] https://searchfox.org/mozilla-central/rev/3160ddc1f0ab55d230c595366662c62950e5c785/mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/GeckoJarReader.java#201
Matt, does this GeckoJarReader exception look like a regression from your LocaleList bug 1493306?
Blocks: 1493306
Flags: needinfo?(cpeterson) → needinfo?(mbrubeck)
Clearing my NI as this may be a Gecko issue. Feel free to NI me if not.
Flags: needinfo?(sdaswani)
Assignee: nobody → mbrubeck
Flags: needinfo?(mbrubeck)
(In reply to Chris Peterson [:cpeterson] from comment #3)
> Matt, does this GeckoJarReader exception look like a regression from your
> LocaleList bug 1493306?

Probably not, as that touched only GeckoView code and this crash does not seem to be happening in GeckoView.
According to comments in Socorro, this happens when switching languages:

> umschalten auf deutsche Sprache nicht möglich. Warum?

> setting language, firefox crash

I still couldn't reproduce this in Beta or Nightly on my own devices, though.
Depends on: 849589
Assignee: mbrubeck → nobody
This is by far the #1 top crash on Beta65.
Flags: needinfo?(sdaswani)
Priority: -- → P1
I see this only happening on 65.0b4 where it's impossible to change language.
Tried the following on two devices (Samsung S7 - Android 8, Nokia 7 - Android 9)
> Settings -> General -> Language
The app crashes every time.

Interestingly, on latest nightly I cannot reproduce this.

Stefan, can you help us with a small regression range?
Flags: needinfo?(sdaswani) → needinfo?(stefan.deiac)
Hi, I wasn't able to find any regression range for this crash on Nightly 66.0a1 using OnePlus 5T(Android 8.1.0). Also, I was able to reproduce this crash just first time on Beta 65.b4, after that moment I wasn't able anymore to reproduce the crash. I'll keep an eye on this issue to investigate it much more.
Flags: needinfo?(stefan.deiac)
Assignee: nobody → petru.lingurar
Status: NEW → ASSIGNED
(In reply to Petru-Mugurel Lingurar[:petru] from comment #2)
> Interestingly, from where this is thrown [1] there is a reference to bug
> 849589, with 
> > origUrl=jar:jar:file:/data/app/org.mozilla.firefox_beta-1.apk!/assets/omni.ja!/res/multilocale.txt
> 
> [1] https://searchfox.org/mozilla-central/rev/3160ddc1f0ab55d230c595366662c62950e5c785/mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/GeckoJarReader.java#201

Also, the initial exception causing this [1] has the following message
> Got class java.util.zip.InflaterInputStream, but expected ByteBufferInputStream!
something which should not happen - bug 849589 comment 14
> This error can theoretically not happen, because it would require the omni.ja to be deflated which it is not

I did a local build with artifacts, from the beta branch and this works ok for me while the beta from PlayStore crashes everytime.

[1] https://searchfox.org/mozilla-central/rev/3160ddc1f0ab55d230c595366662c62950e5c785/mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/GeckoJarReader.java#198
Following the
> > Settings -> General -> Language
scenario (the one most users reported)
does not result in a crash anymore with 65.0b5.multi.android-aarch64

So.. might have it all been because of a corrupted build?

Stefan, can you confirm if the issue does not reproduce anymore?
Flags: needinfo?(stefan.deiac)
Hi, I tried to reproduce this crash multiple times on the latest version of Beta 65.0b5 using your scenario Petru, but without any success. Devices used: Nokia 6(Android 7.1.1) and OnePlus 5T(Android 8.1.0).
Flags: needinfo?(stefan.deiac)
After updating from PlayStore to beta 5 I cannot reproduce the issue on the devices on which the app would crash everytime I would go to `Settings -> General -> Language`
Also in crash-stats I see no crashes for beta 5.
I'm glad to hear that this is gone, but it's still worrying to me that we shipped a buggy build like this without any red flags on our end prior to it going out. I'd feel better if we had a better understanding of the root cause here before closing the bug out.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #14)
> I'm glad to hear that this is gone, but it's still worrying to me that we
> shipped a buggy build like this without any red flags on our end prior to it
> going out. I'd feel better if we had a better understanding of the root
> cause here before closing the bug out.

According to :glandium - https://bugzilla.mozilla.org/show_bug.cgi?id=849589#c14
> This error can theoretically not happen, because it would require the omni.ja to be deflated which it is not
That's why I was inclined to consider a somehow corrupted build.
Maybe Mike can chime in.
Flags: needinfo?(mh+mozilla)
Do we have a copy of the apk that crashed?
Flags: needinfo?(mh+mozilla)
Should be http://archive.mozilla.org/pub/mobile/releases/65.0b5/android-api-16/multi/

Have not read the full history but what if this is an upgrade issue? Going from 64b -> 65b resulted in a one time spike of crashes. We might see a similar one time spike going from 64.0.x release to 65.0 release.
The app that crashed should be 65.0b4 - http://archive.mozilla.org/pub/mobile/releases/65.0b4/android-api-16/multi/
The crash doesn't appear in the subsequent 65.0b5.

Trying to see if there's a correlation between updating betas I've tested locally
- 64.0b15 (last 64 beta) - app works normally   updated to   65.0b4 - app crashes when going to `Settings -> General -> Language`
- 64.0b15 (last 64 beta) - app works normally   updated to   65.0b5 - app works normally
65b4 was definitely busted. Both the en-US and the multi build had a deflated omni.ja in the APK. That's not the case in 65b5, and that wasn't the case in 64b15.
AIUI, the apk is generated by gradle or something these days. Maybe we're missing some explicitness about not deflating omni.ja, and have only been lucky so far?
Flags: needinfo?(nalexander)
I downloaded the Fennec 64.0b4 APKs from ftp.mozilla.org and the omni.ja is not deflated in them.

If omni.ja were deflated, it seems like it would cause problems much earlier, and not only when users go to change the locale preferences.
I pulled from these URLs:

http://archive.mozilla.org/pub/mobile/releases/65.0b4/android-api-16/multi/fennec-65.0b4.multi.android-arm.apk
http://archive.mozilla.org/pub/mobile/releases/65.0b5/android-api-16/multi/fennec-65.0b5.multi.android-arm.apk

Something looks suspicious, because:

  -rw-rw-rw-@  1 nalexander  staff   49745761 20 Dec 10:54 fennec-65.0b4.multi.android-arm.apk
  -rw-rw-rw-@  1 nalexander  staff   53593425 20 Dec 10:54 fennec-65.0b5.multi.android-arm.apk

It's unusual to have a ~4Mb filesize change.  Let's investigate:

And beta 4 does in fact have a deflated omni.ja:

unzip -vl fennec-65.0b4.multi.android-arm.apk | grep omni.ja
10490268  Defl:N  9618335   8% 00-00-1980 00:00 545e6c4b  assets/omni.ja

where-as beta 5 does not:

unzip -vl fennec-65.0b4.multi.android-arm.apk | grep omni.ja
10491577  Stored 10491577   0% 00-00-1980 00:00 58e87a25  assets/omni.ja

So that explains the problem.  Now, how is this even possible?  Because we're _very_ explicit about this stuff when packaging Fennec:

https://searchfox.org/mozilla-central/rev/232ced2697b8938073fa79b8e6aa3718876c0b69/python/mozbuild/mozbuild/action/package_fennec_apk.py#97-98

(And I'm not aware of any changes in beta.)

I think the answer is that this was fallout from the APK signing machinery from the releng org.  It completely borked FxR, for example; my guess is that it borked us here too.  I will try to find a ticket.
Flags: needinfo?(nalexander)
> I think the answer is that this was fallout from the APK signing machinery
> from the releng org.  It completely borked FxR, for example; my guess is
> that it borked us here too.  I will try to find a ticket.

The relevant ticket is Bug 1506598.  I don't know how to figure out if that ticket's timeline works this out, but given that the issue appears to be fixed, and that ticket is in the right ballpark, I'm inclined to close this.  RyanVM, is that sufficient?

If we really feel strongly about testing this, we could ensure that omni.ja is STOR and not DEFL in https://searchfox.org/mozilla-central/source/mobile/android/tests/browser/robocop/src/org/mozilla/gecko/tests/testJarReader.java.
Depends on: 1506598
Flags: needinfo?(ryanvm)
We had APK signing issues with 65.0b4, see bug 1513564. So yeah, I could see all of that being a contributing factor to what happened and that at least gives us some reason to believe it was a one-time thing.

But yes, I also think we should have test coverage for this. At least, I don't see a good reason for *not* having it given the severity of the issue if it goes uncaught.
Flags: needinfo?(ryanvm)
I think the issue is that for b4, we changed signing infra (now we use Autograph service from secops) for b4 but had failures so we manually signed the apk. We fixed automation for b5 so it seems like we are back to STOR now.

I imagine we did something funny while signing manually so it will be just a one off and we can close this bug. Perhaps coverage makes sense.

What impact did this have on the product?
I agree with comment 23 and comment 25: bug 1506598 is the root cause of the problem. APKs aren't re-zipped anymore by default. That default behavior changed 25 days ago[1]. :ulfr signed the APK on his local machine so I see 2 things that may have happened: 

a. either the local config overwrote the default behavior
b. or the local code predated [1]

Julien, do you think one of them is the explanation? 

[1] https://github.com/mozilla-services/autograph/pull/170
Flags: needinfo?(jvehent)
:g-k did the actual signing, but I'm pretty sure that in the rush to sign and push the APK, we forgot to run zipalign post-signing.

Lesson learned: we should have just cut a new release and rerun everything through the pipeline, instead of short-cutting the signing step.
Flags: needinfo?(jvehent)
Got it. For the record, I'm certain that's not a zipalign problem. Google Play rejects APKs when they're not zipaligned. Moreover, zipaligns don't change the type of compression of an APK.

Anyway, I agree with the moral of the story.
I'm clearing Petru as the assignee, please let me know if this isn't OK.
Assignee: petru.lingurar → nobody
Status: ASSIGNED → NEW
Julien and Johan let me know if there is work to be done in Fennec or if the linked releng issue will fix this.
Marking this fixed.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Flagging this because it is tracking 65 and has no assignee, and it shows up on a list I check weekly. Liz can we remove the tracking?
Flags: needinfo?(lhenry)
Adding an assignee to get it off the radar, though I'm not entirely happy about this bug being closed while comment 24 remains unaddressed...
Assignee: nobody → jlorenzo
Flags: needinfo?(lhenry)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> We had APK signing issues with 65.0b4, see bug 1513564. So yeah, I could see
> all of that being a contributing factor to what happened and that at least
> gives us some reason to believe it was a one-time thing.
> 
> But yes, I also think we should have test coverage for this. At least, I
> don't see a good reason for *not* having it given the severity of the issue
> if it goes uncaught.

Nick is this something we can do? (Not asking with urgency, just didn't want it to fall through the cracks)
Flags: needinfo?(nalexander)
I'm OK with this being closed - there is a followup bug filed at https://github.com/mozilla-releng/mozapkpublisher/issues/146.
(In reply to David Bolter [:davidb] (NeedInfo me for attention) from comment #35)
> (In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> > We had APK signing issues with 65.0b4, see bug 1513564. So yeah, I could see
> > all of that being a contributing factor to what happened and that at least
> > gives us some reason to believe it was a one-time thing.
> > 
> > But yes, I also think we should have test coverage for this. At least, I
> > don't see a good reason for *not* having it given the severity of the issue
> > if it goes uncaught.
> 
> Nick is this something we can do? (Not asking with urgency, just didn't want
> it to fall through the cracks)

Yes: Bug 1517894.
Flags: needinfo?(nalexander)
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.