Closed Bug 895775 Opened 11 years ago Closed 10 years ago

java.lang.IllegalArgumentException: Receiver not registered: org.mozilla.gecko.GeckoNetworkManager@<addr>: at android.app.LoadedApk.forgetReceiverDispatcher(LoadedApk.java) at org.mozilla.gecko.GeckoNetworkManager.stopListening(GeckoNetworkManager.java)

Categories

(Firefox for Android Graveyard :: General, defect)

25 Branch
ARM
Android
defect
Not set
critical

Tracking

(firefox34+ verified, firefox35+ verified, firefox36 verified, fennec34+)

VERIFIED FIXED
Firefox 36
Tracking Status
firefox34 + verified
firefox35 + verified
firefox36 --- verified
fennec 34+ ---

People

(Reporter: scoobidiver, Assigned: mfinkle)

Details

(Keywords: crash, regression, topcrash-android-armv7, Whiteboard: [native-crash])

Crash Data

Attachments

(1 file, 3 obsolete files)

There are a few crashes in 22.0, 23.0 Beta and one crash in 25.0a1: bp-df74a893-78a4-4ac6-a06c-384712130718.

java.lang.IllegalArgumentException: Receiver not registered: org.mozilla.gecko.GeckoNetworkManager@40e88448
	at android.app.LoadedApk.forgetReceiverDispatcher(LoadedApk.java:657)
	at android.app.ContextImpl.unregisterReceiver(ContextImpl.java:1358)
	at android.content.ContextWrapper.unregisterReceiver(ContextWrapper.java:445)
	at org.mozilla.gecko.GeckoNetworkManager.stopListening(GeckoNetworkManager.java:144)
	at org.mozilla.gecko.GeckoNetworkManager.stop(GeckoNetworkManager.java:139)
	at org.mozilla.gecko.GeckoApplication.onActivityPause(GeckoApplication.java:79)
	at org.mozilla.gecko.GeckoActivity.onPause(GeckoActivity.java:25)
	at org.mozilla.gecko.GeckoApp.onPause(GeckoApp.java:2008)
	at android.app.Activity.performPause(Activity.java:5319)
	at android.app.Instrumentation.callActivityOnPause(Instrumentation.java:1226)
	at android.app.ActivityThread.performPauseActivity(ActivityThread.java:3131)
	at android.app.ActivityThread.performPauseActivity(ActivityThread.java:3100)
	at android.app.ActivityThread.handlePauseActivity(ActivityThread.java:3078)
	at android.app.ActivityThread.access$800(ActivityThread.java:156)
	at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1267)
	at android.os.Handler.dispatchMessage(Handler.java:99)
	at android.os.Looper.loop(Looper.java:137)
	at android.app.ActivityThread.main(ActivityThread.java:5230)
	at java.lang.reflect.Method.invokeNative(Native Method)
	at java.lang.reflect.Method.invoke(Method.java:525)
	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:799)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:566)
	at dalvik.system.NativeStart.main(Native Method)

More reports at:
https://crash-stats.mozilla.com/report/list?product=FennecAndroid&signature=java.lang.IllegalArgumentException%3A+Receiver+not+registered%3A+org.mozilla.gecko.GeckoNetworkManager%40%3Caddr%3E%3A+at+android.app.LoadedApk.forgetReceiverDispatcher%28LoadedApk.java%29
It has spiked since 25.0a1/20130718. The regression window for the spike is (3 days - worst case):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ff0a372e3170&tochange=f26e4c26ce4a
Keywords: regression
Version: Trunk → Firefox 25
I took a quick look at the code diff of that range and nothing jumped out me. I also looked at the code that throws this exception and I noticed that GeckoConnectivityReceiver and GeckoNetworkManager are treated exactly the same in GeckoApplication. So If GeckoNetworkManager is throwing but GeckoConnectivityReceiver is not, it must be because of the code difference in the two start/stop functions. In particular GeckoConnectivityReceiver checks the return value from registerReceiver and doesn't try to unregister if that "failed". (I use "failed" in quotes because the comment that explains the failure is incorrect when compared to the Android documentation for the function).

So in summary: GeckoNetworkManager appears correct but is throwing the exception, and GeckoConnectivityReceiver appears incorrect but behaves fine. Weird.
I wonder if enableNotifications() is being called while Fennec is in the background. If so, that means GeckoNetworkManager is stopped, and startListening() isn't called (even though mShouldNotify is set to true). The next call to stop() would cause a crash.
Nevermind, that theory doesn't make sense since stop() wouldn't be called again. How about the fact that GeckoNetworkManager startListening/stopListening isn't synchronized (GeckoConnectivityReceiver start/stop is, though it doesn't seem like it needs to be)? startListening() and stopListening() are called from both the UI thread and the Gecko thread.
I think you're right. If stop() gets called on the UI thread while the Gecko thread is in the middle of enableNotifications (after setting mShouldNotify to true, but before executing startListening()), then this exception will result.

I hadn't realized these functions are called on different threads, there totally needs to be some synchronization here.
Attached patch network-crash v0.1 (obsolete) — Splinter Review
I noticed that GeckoNetworkManager did not use the same pattern as GeckoBatteryManager and GeckoConnectivityReceiver, where we use mIsEnabled and check on start/stop.

GeckoNetworkManager has an added complication (shared by GeckoBatteryManager) where Gecko can start/stop the listening in addition to Android pause/resume. GeckoNetworkManager seems a little more complicated than GeckoBatteryManager though, using more indirection. I didn't want to try to refactor that away in this bug.
Assignee: nobody → mark.finkle
Attachment #8509515 - Flags: review?(rnewman)
Attached patch network-crash v0.2 (obsolete) — Splinter Review
Same as previous, but with the world's simplest test. nsINetworkServiceLink uses the GeckoNetworkManager as it's impl. I might try adding an HTML file and using window.navigator.connection as well.
Attachment #8509515 - Attachment is obsolete: true
Attachment #8509515 - Flags: review?(rnewman)
Attachment #8509534 - Flags: review?(rnewman)
Attached patch network-crash v0.3 (obsolete) — Splinter Review
Forgot to qref
Attachment #8509534 - Attachment is obsolete: true
Attachment #8509534 - Flags: review?(rnewman)
Attachment #8509678 - Flags: review?(rnewman)
Comment on attachment 8509678 [details] [diff] [review]
network-crash v0.3

Review of attachment 8509678 [details] [diff] [review]:
-----------------------------------------------------------------

This class has some scary threading issues, once you look at it, so r+ with those addressed.

startListening is called via JNI from the Gecko thread. stopListening is called on the UI thread.

To avoid lots of unpleasant futzing, I suggest we synchronize startListening/stopListening (and perhaps start/stop -- take a look), plus making the volatile changes in my comments inline.

Endless thanks for writing a test.

::: mobile/android/base/GeckoNetworkManager.java
@@ +83,4 @@
>  
>      // Whether the manager should notify Gecko that a change in Network
>      // Information happened.
>      private boolean mShouldNotify;

This little guy should be volatile, as should mShouldBeListening.

@@ +87,5 @@
>  
>      // The application context used for registering receivers, so
>      // we can unregister them again later.
>      private volatile Context mApplicationContext;
> +    private boolean mIsEnabled;

Because we have both start/stop and startListening/stopListening, this should probably be called mIsListening.

Please leave a comment that this is guarded by synchronization on the instance.

@@ +94,5 @@
>          if (sInstance == null) {
>              sInstance = new GeckoNetworkManager();
>          }
>  
>          return sInstance;

sInstance should be volatile.

::: mobile/android/base/tests/testNetworkManager.java
@@ +1,3 @@
> +package org.mozilla.gecko.tests;
> +
> +

Kill a couple lines here. And license block/PD block.
Attachment #8509678 - Flags: review?(rnewman) → review+
What if I just put any JNI entry point on the UI thread? I see that the Gamepad API does this in GeckoAppShell, so I did the same for the Network notifications.

I renamed to mIsListening, but did not do the volatile & synchronizing changes if posting to the UI thread nullifies the threading issues. Let me know.

If this seems sane, I might do the same for GeckoBatteryManager too.
Attachment #8509678 - Attachment is obsolete: true
Attachment #8511703 - Flags: review?(rnewman)
Comment on attachment 8511703 [details] [diff] [review]
network-crash v0.4

Review of attachment 8511703 [details] [diff] [review]:
-----------------------------------------------------------------

Yup, this is fine. It does change behavior -- previously, a call to enableNetworkNotifications would have the side-effect of synchronously updating the connection type before the call returns. I don't see anywhere that we rely on that behavior in the tree.

There are two small changes I think we should make while we're here to make the class thread-safe (at least apparently so) --

* Make mConnectionType volatile.
* In getCurrentInformation, capture mConnectionType to a local final to avoid access/update races. (I.e., use the volatile correctly.)

This is preferable to synchronizing access to mConnectionType (updateConnectionType on the main thread, GAS.getCurrentNetworkInformation from the Gecko thread via JNI). We can't employ the same posting approach there, because it returns a value.

I'll make those two changes and land.

Thanks for writing a test!
Attachment #8511703 - Flags: review?(rnewman) → review+
The Android 4 infra reports no network. The Android 2.3 infra reports a network. I adjusted the test to deal with either situation, still checking that the interface doesn't fail with an error:

https://hg.mozilla.org/integration/fx-team/rev/5d04d8f6e856
https://hg.mozilla.org/mozilla-central/rev/d7460d5fee44
https://hg.mozilla.org/mozilla-central/rev/5d04d8f6e856
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 36
[Tracking Requested - why for this release]:
tracking-fennec: --- → ?
mfinkle, as this is the #1 crash signature on 34 beta, can you please request uplifts for this?
Flags: needinfo?(mark.finkle)
Comment on attachment 8511703 [details] [diff] [review]
network-crash v0.4

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: Top crash in Beta
[Describe test coverage new/current, TBPL]:Adds a simple test
[Risks and why]: low
[String/UUID change made/needed]: none
Flags: needinfo?(mark.finkle)
Attachment #8511703 - Flags: approval-mozilla-beta?
Attachment #8511703 - Flags: approval-mozilla-aurora?
Topcrash in 34, so 34+.
tracking-fennec: ? → 34+
Comment on attachment 8511703 [details] [diff] [review]
network-crash v0.4

Beta+
Aurora+
Attachment #8511703 - Flags: approval-mozilla-beta?
Attachment #8511703 - Flags: approval-mozilla-beta+
Attachment #8511703 - Flags: approval-mozilla-aurora?
Attachment #8511703 - Flags: approval-mozilla-aurora+
Current stats for the last 7 days:
* Fennec 34.0b4 - 1176 crashes
* Fennec 34.0b6 - 0 crashes
* Fennec 35.0a2 - 0 crashes with builds generated on November 1st or later
* Fennec 36.0a1 - 0 crashes with builds generated on November 1st or later

I think this bug can be marked verified fixed based on statistics.
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: