Crash in java.lang.IllegalArgumentException: Doorhanger:Add was not registered at org.mozilla.gecko.EventDispatcher.unregisterListener(EventDispatcher.java)

VERIFIED FIXED in Firefox 55

Status

()

defect
P1
critical
VERIFIED FIXED
3 years ago
2 years ago

People

(Reporter: sebastian, Assigned: droeh)

Tracking

(Blocks 2 bugs, {crash})

unspecified
Firefox 55
All
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox52 unaffected, firefox-esr52 unaffected, firefox53 disabled, firefox54 disabled, firefox55 fixed, firefox58 verified)

Details

(crash signature)

Attachments

(3 attachments)

This bug was filed from the Socorro interface and is 
report bp-8c7e5664-d7b3-4ba2-8962-d5b362161221.
=============================================================
This is happened will switching between browser and custom tabs afaik.
This is weird. We register in the constructor of DoorHangerPopup and unregister when calling destroy(). There's no way to call destroy() without constructing the object first - so why are we not registered anymore?
DoorHangerPopup is initialized in onGlobalLayout(). Not sure is there any document or code guarantees that onGlobalLayout() will always be triggered before onDestroy() is called?

If onDestroy() is called without calling onGlobalLayout() before, this exception would be happened.
Here's a simple fix to make sure we do not call destroy() on the DoorHangerPopup multiple times, which will at least prevent this from being a crash.

The bigger issue here seems to be that we are hitting BrowserApp/GeckoApp.onDestroy() multiple times in a row without hitting GeckoApp.initialize() in between -- I haven't been able to reproduce this or figure out how it is occurring.
Assignee: nobody → droeh
Attachment #8845594 - Flags: review?(s.kaspari)
Attachment #8845594 - Flags: review?(s.kaspari) → review+
Pushed by droeh@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a8448c330bab
Set mDoorHangerPopup to null after destroying it. r=sebastian
https://hg.mozilla.org/mozilla-central/rev/a8448c330bab
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Still seems to happen in latest Nightly (e.g. bp-49acd1c2-3c4f-4269-8b49-b81d12170322)
Status: RESOLVED → REOPENED
Flags: needinfo?(droeh)
Resolution: FIXED → ---
(In reply to Jim Chen [:jchen] [:darchons] from comment #8)
> Still seems to happen in latest Nightly (e.g.
> bp-49acd1c2-3c4f-4269-8b49-b81d12170322)

Yeah. This particular bug is essentially the same as bug 1338055*; unfortunately, fixing it just leads to another crash when taking the same steps. I'll likely land a single patch to handle this and the other crashes in bug 1347584 when I have them figured out.

* - Except instead of swiping to kill a custom tab while Fennec is open, we trigger this by swiping to kill Fennec while a custom tab is open.
Flags: needinfo?(droeh)
Hi Julian 
I think you are working on CustomTab now. Do you have any idea ?
Flags: needinfo?(walkingice0204)
Hi Sebastian
Please help look at this code..[0]
Do you think the collection / remove logic could be the culprit?


[0] http://searchfox.org/mozilla-central/rev/9af9b1c7946ebef6056d2ee4c368d496395cf1c8/mobile/android/geckoview/src/main/java/org/mozilla/gecko/EventDispatcher.java#160
Flags: needinfo?(s.kaspari)
In CustomTabs the code just extends GeckoApp and haven't touch anything relate to DoorHanger.
Flags: needinfo?(walkingice0204)
(In reply to Nevin Chen [:nechen] from comment #11)
> Hi Sebastian
> Please help look at this code..[0]
> Do you think the collection / remove logic could be the culprit?

Yeah, that is very likely. Most of those components are written with the constraint that there will always only be one BrowserApp/GeckoApp instance. However now with custom tabs and web apps this is no longer true.

We register for the event in the constructor of DoorHangerPopup (via GeckoApp). However we only call destroy() from BrowserApp and never from CustomTabsActivity. This looks wrong. Apart from that we should audit that both activities (or multiple in general) can register/unregister and do not interfere with each other.
Flags: needinfo?(s.kaspari)
Hi Dylan 
Per comment 13, I think the foundamental fix is to use the safer way to remove items in the collection.
Maybe you can add the fix to your patch? Please feel free to let me know how do you think.
Flags: needinfo?(droeh)
So the problem that's happening here is that we're trying to remove listeners from an EventDispatcher that belongs to a separate activity; I don't think a change in how we remove items from the collection would be able to fix this; at best it would be papering over the bug.

I'm trying to address all the crashes that happen when you swipe-to-kill Fennec while focused on a custom tab, and part of that will be making sure that the listener unregistering that happens in BrowserApp.onDestroy() will be done on the correct EventDispatcher; another part, which Sebastian brought up, is making sure there is creation/destruction parity -- nothing should be created in GeckoApp and destroyed in BrowserApp or vice versa, etc.
Flags: needinfo?(droeh)
Duplicate of this bug: 1352378
Also seen during development of bug 1351739 when enabling "Don't keep activities".
There, I'm launching a CustomTabsActivity from inside BrowserApp, so we end up with a sequence of
- BrowserApp onPause
- CustomTabs onCreate/Start/Resume
- BrowserApp onStop/Destroy and crash,

which I guess is the same sequence that happens when you keep activities, but then manually swipe the background activity away.
Blocks: 1351739
Comment on attachment 8853740 [details]
Bug 1325021 - Safely removing the listener in the collection to prevent concurrency issues.

In the situation described above in comment 17, this now crashes with

> 04-02 22:32:19.426 14616-14616 E/GeckoCrashHandler: >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main")
> java.lang.UnsupportedOperationException
>    at java.util.concurrent.CopyOnWriteArrayList$CowIterator.remove(CopyOnWriteArrayList.java:744)
>   at org.mozilla.gecko.EventDispatcher.safelyRemove(EventDispatcher.java:181)
>    at org.mozilla.gecko.EventDispatcher.unregisterListener(EventDispatcher.java:168)
>    at org.mozilla.gecko.EventDispatcher.unregisterGeckoThreadListener(EventDispatcher.java:221)
>    at org.mozilla.gecko.GeckoApp.onDestroy(GeckoApp.java:2425)
>    at org.mozilla.gecko.customtabs.CustomTabsActivity.onDestroy(CustomTabsActivity.java:152)
>    at android.app.Activity.performDestroy(Activity.java:6430)
>    at android.app.Instrumentation.callActivityOnDestroy(Instrumentation.java:1165)
>    at android.app.ActivityThread.performDestroyActivity(ActivityThread.java:3855)
>    at android.app.ActivityThread.handleDestroyActivity(ActivityThread.java:3886)
>    at android.app.ActivityThread.-wrap5(ActivityThread.java)
>    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1411)
>    at android.os.Handler.dispatchMessage(Handler.java:102)
>    at android.os.Looper.loop(Looper.java:148)
>    at android.app.ActivityThread.main(ActivityThread.java:5459)
>    at java.lang.reflect.Method.invoke(Native Method)
>    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:728)
>    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:618)
>    at de.robv.android.xposed.XposedBridge.main(XposedBridge.java:102)

Actually in fact it's even worse than before, because now it crashes already when switching from custom tab to BrowserApp, whereas previously only the other route was affected. Besides, even without the above crash, your patch doesn't look like it solves the fundamental problem: The EventDispatcher wants to enforce (by crashing if necessary) that registering and unregistering of event listeners is done properly.
Attachment #8853740 - Flags: feedback-
Comment on attachment 8853740 [details]
Bug 1325021 - Safely removing the listener in the collection to prevent concurrency issues.

Looks like Jan has this covered.
Attachment #8853740 - Flags: review?(s.kaspari)
Blocks: 1285858
Priority: -- → P1
(In reply to Sebastian Kaspari (:sebastian) from comment #21)
> Comment on attachment 8853740 [details]
> Bug 1325021 - Safely removing the listener in the collection to prevent
> concurrency issues.
> 
> Looks like Jan has this covered.

If this is no longer happening, we can close the bug.
Can you confirm?
Flags: needinfo?(s.kaspari)
I've been testing Focus as my primary browser lately - So I do not have much personal experience. Looking at the crash reports there's at least build 20170503100516 affected.
Flags: needinfo?(s.kaspari)
The saving grace is that these are diagnostic crashes (https://dxr.mozilla.org/mozilla-central/rev/a748acbebbde373a88868dc02910fb2bc5e6a023/mobile/android/geckoview/src/main/java/org/mozilla/gecko/EventDispatcher.java#129), so at least we won't crash on Beta/Release.
However as this means we're having some lifecycle issues with these event listeners, depending on how these occur exactly we might or might not experience some other doorhanger related issues on Beta/Release instead...
Right, I thought we could sidestep this with GeckoView-based custom tabs, but we actually run into a similar issue there as well. Here's a patch that should solve both.

I was aiming to remove GeckoApp.getEventDispatcher entirely, but unfortunately it's quite difficult to remove it from FormAssistPopup; we rely on being able to get the (instance) EventDispatcher in onAttachedToWindow there, and I don't see any obvious way of passing a GeckoApp reference to it before that happens. There are also some calls to GeckoApp.getEventDispatcher in testing, but I'm not so worried about those.
Attachment #8864581 - Flags: review?(nchen)
Comment on attachment 8864581 [details] [diff] [review]
Reduce reliance on GeckoAppShell.getGeckoInterface

Review of attachment 8864581 [details] [diff] [review]:
-----------------------------------------------------------------

For FindInPageBar and FormAssistPopup, see if you can get the GeckoApp instance from the views themselves (see http://stackoverflow.com/a/32973351).
Attachment #8864581 - Flags: review?(nchen) → review+
Pushed by droeh@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d85d25fa905b
Reduce reliance on GeckoAppShell.getGeckoInterface to solve some custom tabs crashes. r=jchen
https://hg.mozilla.org/mozilla-central/rev/d85d25fa905b
Status: REOPENED → RESOLVED
Closed: 3 years ago2 years ago
Resolution: --- → FIXED
The crash is not reproducing anymore, the changes made here were applied to a different implementation of PWAs, we can close the bug as fixed.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.