Closed Bug 964854 Opened 6 years ago Closed 1 year ago

crash in java.lang.IllegalStateException: stateLabelString must not be null at org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState(AndroidFxAccount.java)

Categories

(Firefox for Android :: Android Sync, defect, P5)

Firefox 29
All
Android
defect

Tracking


RESOLVED WONTFIX
Firefox 38
Tracking Status
firefox31 --- wontfix
firefox32 --- affected
firefox33 --- affected
firefox34 --- affected
firefox35 --- affected
firefox36 --- affected
firefox37 --- affected
firefox38 --- affected
firefox39 --- affected
fennec + ---

People

(Reporter: aaronmt, Assigned: nalexander)

References

(Depends on 1 open bug)

Details

(Keywords: crash, Whiteboard: [workaround: delete account])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-5884a69b-a8f2-415d-8427-8eb842140127.
=============================================================

java.lang.IllegalStateException: stateLabelString must not be null
	at org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState(AndroidFxAccount.java:342)
	at org.mozilla.gecko.fxa.activities.FxAccountStatusActivity.refresh(FxAccountStatusActivity.java:191)
	at org.mozilla.gecko.fxa.activities.FxAccountStatusActivity.refresh(FxAccountStatusActivity.java:213)
	at org.mozilla.gecko.fxa.activities.FxAccountStatusActivity.onResume(FxAccountStatusActivity.java:163)
	at android.app.Instrumentation.callActivityOnResume(Instrumentation.java:1202)
	at android.app.Activity.performResume(Activity.java:5361)
	at android.app.ActivityThread.performResumeActivity(ActivityThread.java:2816)
	at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:2855)
	at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2300)
	at android.app.ActivityThread.access$700(ActivityThread.java:150)
	at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1280)
	at android.os.Handler.dispatchMessage(Handler.java:99)
	at android.os.Looper.loop(Looper.java:175)
	at android.app.ActivityThread.main(ActivityThread.java:5279)
	at java.lang.reflect.Method.invokeNative(Native Method)
	at java.lang.reflect.Method.invoke(Method.java:511)
	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1102)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:869)
	at dalvik.system.NativeStart.main(Native Method)


lolAndroid
Component: General → Android Sync
Product: Firefox for Android → Android Background Services
Think we should get a crash-signature for Android Background Services :)

Stripped from product change

[@ java.lang.IllegalStateException: stateLabelString must not be null at org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState(AndroidFxAccount.java)]
Nah, this is us.  It's a corrupt Account on disk, and little error checking since this has been under construction.  Basically, delete account and begin again and we shouldn't see this.
Managed to reproduce using an HTC One X (Android 4.1.1).
I don't think the account I used is corrupt, because I can successfully log in and use it on other devices.
Flaviu: that's not what Nick means by "corrupt"; there's a bundle of account data on the phone, and that's what's corrupt. That you can log in and use the account on other devices proves the point.

The only reason I'd be concerned about this bug is if you signed in on this phone very recently, or we're seeing lots of these crashes. If your account is a few weeks old, then I chalk this up to "QA risk".
Severity: critical → normal
Priority: -- → P2
Whiteboard: [workaround: delete account]
This feels like it's related to Bug 1046285.
tracking-fennec: --- → ?
Assignee: nobody → nalexander
tracking-fennec: ? → +
I spent some more time thinking about this, and my best guess is that we're hitting bugs like http://stackoverflow.com/a/11696961.

It's as if our account state isn't being written by setUserData, although we have no real way of knowing if that's the case or if we have legitimate corrupt data being produced.

If my hunch is correct, I have no ideas on how to work around this, save for maintaining our own in-memory cache of the Account state, and only falling back to the Android Account's cache when we haven't populated our own.  We write using setUserData each update but avoid the possibly buggy getUserData.
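A minimal sketch of the write-through cache idea described above, not the patch that later landed. `UserDataStore` and `CachedAccountState` are hypothetical names; the interface stands in for the `AccountManager` `setUserData`/`getUserData` calls (simplified to drop the `Account` argument) so the sketch runs without Android.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the two AccountManager calls discussed above.
interface UserDataStore {
    void setUserData(String key, String value);
    String getUserData(String key);
}

// Hypothetical wrapper: every write goes to the store AND to an in-memory map;
// reads consult the map first and fall back to the store only on a miss.
final class CachedAccountState {
    private final UserDataStore store;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachedAccountState(UserDataStore store) {
        this.store = store;
    }

    void put(String key, String value) {
        // Always persist, so the data survives a process restart.
        store.setUserData(key, value);
        if (value == null) {
            cache.remove(key); // ConcurrentHashMap does not accept null values
        } else {
            cache.put(key, value);
        }
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // avoids the possibly unreliable store read
        }
        String stored = store.getUserData(key);
        if (stored != null) {
            cache.put(key, stored);
        }
        return stored;
    }
}
```

This only helps within a single process lifetime: after a restart the map is empty, so the first read still goes through the store, and genuinely bad data on disk would surface just the same.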
Just updating the status flags -- this is currently the #3 crash in Firefox for Android 35.0a1.
We're currently at an eye-popping 5 crashes per 100ADI on Android 35.0a1.  That's a five alarm fire to me.
Status: NEW → ASSIGNED
I agree that this is urgent, but for the record: that doesn't mean 5%. We'll get one crash per sync, potentially dozens per day, for affected users. 

Sucks hard for those folks, though.
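To illustrate why 5 crashes per 100 ADI need not mean 5% of users, here is a rough sketch. The helper name and the assumption that every affected user crashes the same number of times per day are illustrative only; the thread gives no per-user figure beyond "potentially dozens per day".

```java
// Hypothetical helper; the per-user crash count is an assumed input, not measured data.
final class CrashRateEstimate {
    /**
     * Converts crashes per 100 ADI into affected users per 100 ADI,
     * assuming every affected user crashes the same number of times per day.
     */
    static double affectedUsersPer100Adi(double crashesPer100Adi,
                                         double crashesPerAffectedUserPerDay) {
        if (crashesPerAffectedUserPerDay <= 0) {
            throw new IllegalArgumentException("crashes per affected user must be positive");
        }
        return crashesPer100Adi / crashesPerAffectedUserPerDay;
    }
}
```

With one crash per affected user per day, `affectedUsersPer100Adi(5, 1)` gives 5 users per 100; with ten crashes per user per day, the same crash rate corresponds to only 0.5 users per 100.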
Duplicate of this bug: 1090332
Duplicate of this bug: 1090352
Duplicate of this bug: 1108680
Hi, Nick:
In our Chinese local market, we received a lot of complaints about sync crashes.

I ran a crash search and found that most affected devices are made by Chinese manufacturers such as Xiaomi and Lenovo.

Do you happen to know if there's a plan to fix it? 

For more detailed information, please refer to 

https://crash-stats.mozilla.com/report/list?signature=java.lang.IllegalStateException%3A+stateLabelString+must+not+be+null+at+org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState%28AndroidFxAccount.java%29&product=FennecAndroid

Click on "Mobile Devices" on the page.
Flags: needinfo?(nalexander)
Duplicate of this bug: 1046285
(In reply to xshen from comment #14)
> Hi, Nick:
> In our Chinese local market, we received a lot of complaints about sync
> crashes.
> 
> I ran a crash search and found that most affected devices are made by Chinese
> manufacturers such as Xiaomi and Lenovo.
> 
> Do you happen to know if there's a plan to fix it? 

There is no concrete plan, partly because there has been no strong signal that this is worth spending time on.  I outline a work-around in https://bugzilla.mozilla.org/show_bug.cgi?id=964854#c7.

I don't think a quick work-around will take long, so I'll try to get to it.  In the interim, can you help me understand how many crashes we're really getting, and how many actual users are seeing this crash?  Many of the crash reports are clearly from the same device, but crash-stats is pretty awful and I can't "group" crash reports by device.
Flags: needinfo?(nalexander) → needinfo?(xshen)
rnewman: here are a couple of small patches to shed light, perhaps, on caching issues, and then to work around Android caching by doing it ourselves.  The caching logic itself could use your attention, since it's easy to make a mistake.

This would make any caching read issues less frequent, but if we really have bad data on disk we would see the issues just the same.
Attachment #8562243 - Flags: review?(rnewman)
(In reply to Nick Alexander :nalexander from comment #16)
Nick: 
Thanks for your quick response. Fixing this would really improve our users' experience.

In the crash-stats report I linked (a 7-day report ending Feb 11th, 2015), we can see:
The average "crashes per install" rate is 1.9, so about 404 installations are seeing this crash in our local market (see below).

There are about 8000 Sync users in China, which means about 5% of Chinese Sync users are affected by this bug. 

=======================================
Here's the detailed calculation:

Version   Crashes   Installations   Crashes per install
35.0.1    999       526             1.9
35.0      725       417             1.7

Brand       Crashes   Estimated installations (crashes / 1.9)
XiaoMi      510       510 / 1.9 ≈ 270
LENOVO      254       254 / 1.9 ≈ 134
CN market             404 (total)
=======================================
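The estimate above can be reproduced with a short sketch. The class and method names are hypothetical; the inputs (510 and 254 crashes, 1.9 crashes per install, about 8000 Sync users) are the figures quoted in this comment.

```java
// Hypothetical helper reproducing the arithmetic from the comment above.
final class AffectedInstallEstimate {
    /** Estimated installations = crash count divided by average crashes per install. */
    static double estimateInstalls(int crashes, double crashesPerInstall) {
        return crashes / crashesPerInstall;
    }

    /** Fraction of the user base represented by the estimated installations. */
    static double affectedShare(double installs, int totalUsers) {
        return installs / totalUsers;
    }
}
```

Unrounded, `estimateInstalls(510, 1.9)` is about 268 and `estimateInstalls(254, 1.9)` is about 134; the comment rounds these to 270 and 134 for a total of 404. Either way, `affectedShare` over 8000 users comes out at roughly 5%. Note that this treats one installation as one Sync user, which is an assumption.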
Flags: needinfo?(xshen)
https://hg.mozilla.org/mozilla-central/rev/c4265d0b3fbf
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 38
According to user feedback on SUMO, the issue still persists with current Firefox 38 Aurora and 39 Nightly builds: https://support.mozilla.org/en-US/questions/1048946#answer-698773
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
url:        https://hg.mozilla.org/integration/fx-team/rev/078576757615b9c4841814a40d7f1d35b87524e5
changeset:  078576757615b9c4841814a40d7f1d35b87524e5
user:       Nick Alexander <nalexander@mozilla.com>
date:       Fri Jun 19 10:50:34 2015 -0700
description:
Bug 964854 - Reveal cause of exception. r=rnewman

This fixes an oversight from Bug 1042929, which neglected to push the
underlying exception cause through to the final exception.
An update on this and related tickets, like Bug 1138943.  ncroiset has helped me do a lot of debugging, and the result is that, on some devices, org.json.simple just completely fails to parse JSON.  org.json parses it just fine.

It is possible that ncroiset's guess about 64-bit processor architecture is accurate; it's quite hard to tell from the crash dumps (or even having the device on hand!) if there's a pattern in the device fingerprints.

I don't have an offending device -- I might try to get one for this ticket -- but the path ahead probably looks like removing org.json.simple from the android-sync (and therefore Fennec) codebases entirely.  That will be a separate ticket and will require some investigation, since we have a JUnit 4 test suite that will need to be massaged to use org.json.
Keywords: leave-open
Depends on: 1182193
Depends on: 1204559
Crash Signature: [@ java.lang.IllegalStateException: stateLabelString must not be null at org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState(AndroidFxAccount.java)] → [@ java.lang.IllegalStateException: stateLabelString must not be null at org.mozilla.gecko.fxa.authenticator.AndroidFxAccount.getState(AndroidFxAccount.java)] [@ java.lang.IllegalStateException: stateLabelString must not be null at org.mozilla.gecko.fxa.…
I sent the screenshots of CPU-z to Nick just now.
Duplicate of this bug: 1146526
Bumping this way down in priority; only one crash reported in a while, and it's on v33.
Priority: P2 → P5
Product: Android Background Services → Firefox for Android
Not a top crash anymore.
Re-triaging per https://bugzilla.mozilla.org/show_bug.cgi?id=1473195

Needinfo :susheel if you think this bug should be re-triaged.
Closing because no crashes reported for 12 weeks.
Status: REOPENED → RESOLVED
Closed: 5 years ago → 1 year ago
Resolution: --- → WONTFIX