There's one crash in 17.0a1/20120808: bp-ab6d1629-0e23-4459-8b71-79abe2120808. java.lang.NullPointerException at org.mozilla.gecko.DoorHangerPopup$3.run(DoorHangerPopup.java:181) at android.app.Activity.runOnUiThread(Activity.java:4591) at org.mozilla.gecko.DoorHangerPopup.removeDoorHanger(DoorHangerPopup.java:179) at org.mozilla.gecko.DoorHangerPopup.removeTransientDoorHangers(DoorHangerPopup.java:196) at org.mozilla.gecko.DoorHangerPopup.onTabChanged(DoorHangerPopup.java:104) at org.mozilla.gecko.Tabs.notifyListeners(Tabs.java:361) at org.mozilla.gecko.GeckoApp$1.run(GeckoApp.java:697) at android.os.Handler.handleCallback(Handler.java:615) at android.os.Handler.dispatchMessage(Handler.java:92) at android.os.Looper.loop(Looper.java:137) at android.app.ActivityThread.main(ActivityThread.java:4745) at java.lang.reflect.Method.invokeNative(Native Method) at java.lang.reflect.Method.invoke(Method.java:511) at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:786) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553) at dalvik.system.NativeStart.main(Native Method) More reports at: https://crash-stats.mozilla.com/report/list?signature=java.lang.NullPointerException%3A+at+org.mozilla.gecko.DoorHangerPopup%243.run%28DoorHangerPopup.java%29
It might be a regression from bug 732336.
It's #9 top crasher (xul crash signatures count for one) in 17.0a2.
Sending over to Margaret for investigation.
I reproduced this crash on the latest Aurora and Nightly builds by following this STR: * before running the following steps, copy this webpage to the clipboard: http://popuptest.com/popuptest1.html 1. Go to Settings and clear all data 2. Quit Firefox 3. Reopen it and long tap immediately on URL bar (3 options should be displayed in context menu) 4. Tap on Paste&Go option Actual result: Aurora: https://crash-stats.mozilla.com/report/index/bp-ca7bf3db-cc4a-43ab-aa15-cfb8e2121001 Nightly: https://crash-stats.mozilla.com/report/index/bp-316756d8-c5ca-4da2-a1f5-403062121001 -- Firefox 17.0a2 (2012-09-30) Firefox 18.0a1 (2012-09-30) Device: Galaxy Note OS: Android 4.0.4
We could uplift this is low risk
It's #29 top crasher in 17.0b1.
At #29 Fennec topcrasher, that's just not much to track on. We'll consider an uplift if someone has a verifiable fix.
It starts spiking in 19.0a1/20121018 with startup crashes. The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=dac5700acf8b&tochange=cb573b9307e5
There are about 10 crashes per build from the spike.
I was able to reproduce this crash in Nightly: with a clean profile, open Fennec and rotate the phone before the telemetry prompt appears.
Adding some debugging messages, it looks like the following is happening: 1) addDoorHanger() is called. 2) The Runnable in addDoorHanger() is posted to the UI thread, but not yet run. Inside of this Runnable is where DoorHangerPopup is initialized. 3) removeDoorHanger() is called. This is called because we receive a LOCATION_CHANGE event for about:home, which calls removeTransientDoorHangers(), which then calls removeDoorHanger() for the doorhanger added in #1 4) The Runnable in removeDoorHanger() is posted to the UI thread, and it is run immediately. It is run immediately because our LOCATION_CHANGE callback is executed on the UI thread, and as described in the runOnUiThread() API , runOnUiThread() will immediately run the given Runnable if we're already on the UI thread. 5) We then call mContent.removeView() inside of this Runnable. Since it ran before the Runnable in #2, we never called init(), so mContent is null, causing the NPE. This means we have two bugs here. One, we need to make sure the Runnable from addDoorHanger() always executes before the Runnable in removeDoorHanger(). Two, we shouldn't be triggering a removeDoorHanger() call at startup when we receive the LOCATION_CHANGE event for about:home.  http://developer.android.com/reference/android/app/Activity.html#runOnUiThread%28java.lang.Runnable%29
It's quite easy to reproduce this crash if you go to popuptest.com/popuptest1.com, quit and restart Fennec and then tap on its thumbnail from about:home just after Fennec is opened.
Brian, do you have any new info about this crash? It's the #2 topcrash for Nightly 19 (and #28 on Beta 17b3).
Created attachment 676502 [details] [diff] [review] Move DoorHangerPopup code to UI thread DoorHangerPopup isn't designed to be thread-safe, but we're using and modifying mDoorHangers on both the Gecko thread and the UI thread. This is creating race conditions (see comment 12). The easiest way to fix this is to make addDoorHanger(), removeDoorHanger(), etc. all run on the UI thread. There's no heavy operations like DB accesses or JSON parsing being done in these methods, so there's no need for these methods to be split between running on the Gecko/UI threads. Also, bug 800199 makes the telemetry prompt doorhanger appear after about:home's location change event, so there's no need to use a persistence hack anymore.
Comment on attachment 676502 [details] [diff] [review] Move DoorHangerPopup code to UI thread [Approval Request Comment] Bug caused by (feature/regressing bug #): User impact if declined: doorhanger crashes Testing completed (on m-c, etc.): locally Risk to taking this patch (and alternatives if risky): low String or UUID changes made by this patch: none
Comment on attachment 676502 [details] [diff] [review] Move DoorHangerPopup code to UI thread Might be too late, but nom'ing for beta in case.
Comment on attachment 676502 [details] [diff] [review] Move DoorHangerPopup code to UI thread build on trunk looks ok - approving for branches.