Closed Bug 907720 Opened 12 years ago Closed 12 years ago

Intermittent Android tcheck2 | application crashed [@ mozalloc_abort(char const*)] ("java.lang.IllegalStateException: Fragment already added: PinBookmarkDialog{48425208 #3 pin_bookmark}")

Categories

(Firefox for Android Graveyard :: Awesomescreen, defect, P1)

ARM
Android
defect

Tracking

(firefox24 unaffected, firefox25 unaffected, firefox26 fixed, fennec26+)

RESOLVED FIXED
Firefox 26
Tracking Status
firefox24 --- unaffected
firefox25 --- unaffected
firefox26 --- fixed
fennec 26+ ---

People

(Reporter: emorley, Assigned: Margaret)

References

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file, 1 obsolete file)

Android Tegra 250 fig talos remote-trobocheck2 on 2013-08-21 05:18:17 PDT for push 7e83ba2dfb51 slave: tegra-039 https://tbpl.mozilla.org/php/getParsedLog.php?id=26815583&full=1&branch=fig { FIRE PROC: ' "MOZ_CRASHREPORTER_SHUTDOWN=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec_fig.tests.testCheck2 org.mozilla.roboexample.test/org.mozilla.fennec_fig.FennecInstrumentationTestRunner' INFO : EventExpecter: no longer listening for Gecko:Ready PaintExpecter: no longer listening for events __startBeforeLaunchTimestamp1377087902288__endBeforeLaunchTimestamp __startAfterTerminationTimestamp1377087943607__endAfterTerminationTimestamp pushing directory: /tmp/tmpyiu6Ws/profile to /mnt/sdcard/tests/profile Screen width/height:1024/768 colorDepth:24 Browser inner width/height: 1024/695 browser_name:Fennec browser_version:26.0a1 buildID:20130821040220 removing file: /mnt/sdcard/tests/fennec_ids.txt removing file: /mnt/sdcard/tests/robotium.config removing file: /mnt/sdcard/tests/browser_output.txt removing file: /mnt/sdcard/tests/browser_output.txt removing file: /mnt/sdcard/tests/profile/sessionstore.js getting files in '/mnt/sdcard/tests/profile/minidumps/' removing file: /mnt/sdcard/tests/browser_output.txt removing file: /mnt/sdcard/tests/profile/sessionstore.js getting files in '/mnt/sdcard/tests/profile/minidumps/' PROCESS-CRASH | tcheck2 | application crashed [@ mozalloc_abort(char const*)] Crash dump filename: /tmp/tmpPtM48s/469f2341-9b77-5a0d-31e4eb41-270f459d.dmp Operating system: Android 0.0.0 Linux 2.6.32.9-00002-gd8084dc-dirty #1 SMP PREEMPT Wed Feb 2 11:32:06 PST 2011 armv7l nvidia/harmony/harmony/harmony:2.2/FRF91/20110202.102810:eng/test-keys CPU: arm 2 CPUs Crash reason: SIGSEGV Crash address: 0x0 Thread 0 (crashed) 0 libmozalloc.so!mozalloc_abort(char const*) [mozalloc_abort.cpp:7e83ba2dfb51 : 30 + 0x4] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe969728 lr = 0x4498a0bb pc = 0x4498a0be Found by: given as instruction pointer in context 1 libxul.so!Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash [AndroidJNI.cpp:7e83ba2dfb51 : 102 + 0x3] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe969730 pc = 0x55700ef7 Found by: call frame info 2 libmozglue.so!Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash [jni-stubs.inc:7e83ba2dfb51 : 108 + 0x1] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe9697a8 pc = 0x80c1b4a3 Found by: call frame info 3 libdvm.so + 0x16e36 r4 = 0xbe9697d0 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe9697b0 pc = 0x80816e38 Found by: call frame info 4 dalvik-LinearAlloc (deleted) + 0x22dfb2 sp = 0xbe9697b8 pc = 0x432f9fb4 Found by: stack scanning 5 2 (deleted) + 0x6d5ae sp = 0xbe9697bc pc = 0x483f35b0 Found by: stack scanning 6 libdvm.so + 0x16ffe sp = 0xbe9697c4 pc = 0x80817000 Found by: stack scanning } From logcat: { 08-21 05:25:09.177 W/GeckoGlobalHistory( 2110): Rebuilding visited link set... 08-21 05:25:09.357 I/Robocop ( 2110): PaintExpecter: no longer listening for events 08-21 05:25:09.414 I/dalvikvm-heap( 2110): Grow heap (frag case) to 9.185MB for 2846736-byte allocation 08-21 05:25:09.444 I/SUTAgentAndroid( 1516): 10.250.49.155 : isdir /mnt/sdcard/tests 08-21 05:25:09.464 I/SUTAgentAndroid( 1516): 10.250.49.155 : cd /mnt/sdcard/tests 08-21 05:25:09.494 I/SUTAgentAndroid( 1516): 10.250.49.155 : ls 08-21 05:25:09.554 I/SUTAgentAndroid( 1516): 10.250.49.155 : pull /mnt/sdcard/tests/browser_output.txt 08-21 05:25:09.644 I/global ( 2110): Default buffer size used in BufferedOutputStream constructor. It would be better to be explicit if an 8k buffer is required. 08-21 05:25:10.137 I/RobocopMotionEventReplayer( 2110): Initialized using offset (0,73) 08-21 05:25:10.137 I/global ( 2110): Default buffer size used in BufferedReader constructor. It would be better to be explicit if an 8k-char buffer is required. 08-21 05:25:10.147 V/RobocopMotionEventReplayer( 2110): Injecting MotionEvent{484e83f8 action=0 x=566.6796 y=426.84274 pressure=0.0 size=0.0} 08-21 05:25:10.184 V/RobocopMotionEventReplayer( 2110): Injecting MotionEvent{48414d40 action=2 x=569.52014 y=426.17374 pressure=0.0 size=0.0} 08-21 05:25:10.257 V/RobocopMotionEventReplayer( 2110): Injecting MotionEvent{484ffa30 action=1 x=569.52014 y=426.17374 pressure=0.0 size=0.0} 08-21 05:25:10.284 V/RobocopMotionEventReplayer( 2110): Injecting MotionEvent{48445948 action=0 x=563.8391 y=388.70782 pressure=0.0 size=0.0} 08-21 05:25:10.347 V/RobocopMotionEventReplayer( 2110): Injecting MotionEvent{48513010 action=1 x=563.8391 y=388.70782 pressure=0.0 size=0.0} 08-21 05:25:10.584 D/AndroidRuntime( 2110): Shutting down VM 08-21 05:25:10.594 E/GeckoAppShell( 2110): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main") 08-21 05:25:10.594 E/GeckoAppShell( 2110): java.lang.IllegalStateException: Fragment already added: PinBookmarkDialog{48425208 #3 pin_bookmark} 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl.addFragment(FragmentManager.java:1159) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.BackStackRecord.run(BackStackRecord.java:616) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl.execPendingActions(FragmentManager.java:1444) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl$1.run(FragmentManager.java:429) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Handler.handleCallback(Handler.java:587) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Handler.dispatchMessage(Handler.java:92) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Looper.loop(Looper.java:123) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.app.ActivityThread.main(ActivityThread.java:4627) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at java.lang.reflect.Method.invokeNative(Native Method) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at java.lang.reflect.Method.invoke(Method.java:521) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:868) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:626) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at dalvik.system.NativeStart.main(Native Method) 08-21 05:25:10.594 E/Gecko ( 2110): mozalloc_abort: Redirecting call to abort() to mozalloc_abort }
Summary: Intermittent PROCESS-CRASH | tcheck2 | application crashed [@ mozalloc_abort(char const*)] ("java.lang.IllegalStateException: Fragment already added: PinBookmarkDialog{48425208 #3 pin_bookmark}") → Intermittent Android tcheck2 | application crashed [@ mozalloc_abort(char const*)] ("java.lang.IllegalStateException: Fragment already added: PinBookmarkDialog{48425208 #3 pin_bookmark}")
This java crash is pretty frequent - would it be possible to prioritise it above some of the other intermittent failures? :-)
Flags: needinfo?(lucasr.at.mozilla)
tracking-fennec: --- → ?
Flags: needinfo?(lucasr.at.mozilla)
Priority: -- → P1
Assignee: nobody → lucasr.at.mozilla
tracking-fennec: ? → 26+
Running testCheck2 locally, I'm seeing the about:home content get panned and scrolled, so I think there's some issue here where we start doing the test actions before the HomePager is hidden. In bug 905591, we updated loadAndGetPainted to use loadUrl, which does not go through the manual process of entering a URL. I think the problem is that we have code to hide the HomePager in commitEditingMode/dismissEditingMode, but we don't go through that logic when we use loadUrl, so we're not hiding it until we get a location change event (one of the other places we end up hiding HomePager). I think we're not waiting long enough for that location change event. I haven't looked through the implementation of PaintExpecter yet, but if it's just checking on the state of the gecko stuff happening underneath HomePager, it may be saying that the paint is done before we actually hide the HomePager.
I can reproduce this crash pretty reliably locally, and when I added some debug logging, I found is that we're actually never calling hideHomePager() on location change. For some reason on location change the isAboutHome(tab) call here sometimes returns true: http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/BrowserApp.java#192 I'm really confused about how this could even happen, since Tab.handleLocationChange takes care of updating the URL itself, and then it sends a LOCATION_CHANGE TabEvent afterwards: http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/Tab.java#602 I will continue to investigate...
(In reply to :Margaret Leibovic from comment #140) > I can reproduce this crash pretty reliably locally, and when I added some > debug logging, I found is that we're actually never calling hideHomePager() > on location change. For some reason on location change the isAboutHome(tab) > call here sometimes returns true: > http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/BrowserApp. > java#192 Correction: it always returns true. The browser just sometimes doesn't crash because it sometimes will end up tapping on a bookmark thumbnail, which gets loaded and panned/scrolled. So this test is consistently doing the wrong thing.
I think I may been reading this debug logging incorrectly... I think we're just slow in processing these events, so I was seeing the events from the original about:home load on startup. I added some more logging, and I do see us attempt to load the URL we're actually trying to test, but I don't see us getting a location change event for that at all. In fact, I do sometimes see us successfully load that URL if we don't crash or accidentally tap on a bookmark thumbnail to load a different page. So, I think we just need to do something to make sure we wait for the HomePager to be gone before we start touching things.
This fix is working for me locally, pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=93da8a985e6a I'm still not sure how much I like this approach, but at least it's a band-aid that would fix this test. However, I'm wondering if this problem affects other tests as well.
Here's the patch I pushed to try. I can trigger a bunch of test runs to see if it makes things better.
Comment on attachment 794960 [details] [diff] [review] patch to wait for non-empty title Looking pretty green so far, so flagging for review.
Attachment #794960 - Flags: review?(gbrown)
Comment on attachment 794960 [details] [diff] [review] patch to wait for non-empty title Gr, it still crashes sometimes on 2.2.
Attachment #794960 - Flags: review?(gbrown)
Assignee: lucasr.at.mozilla → margaret.leibovic
Some thoughts on the logs from the try run (comment 143): - Even in successful runs, the new waitForTest often times-out: INFO : waitForTest timeout after 3000 ms INFO : __start_report2.7865772__end_report __startTimestamp1377294590297__endTimestamp - In the failure logs, I usually (always?) see just one drawFinished: 08-24 00:27:32.226 D/Robocop ( 2276): Received drawFinished notification I think this is technically okay, but it seems unusual to me. I thought a page load normally generated quite a few drawFinished's.
I decided to try something new before seeing gbrown's comment: https://tbpl.mozilla.org/?tree=Try&rev=b9fc920c0f5f (In reply to Geoff Brown [:gbrown] from comment #245) > Some thoughts on the logs from the try run (comment 143): > > - Even in successful runs, the new waitForTest often times-out: > > INFO : waitForTest timeout after 3000 ms > > INFO : __start_report2.7865772__end_report > __startTimestamp1377294590297__endTimestamp So maybe that new waitForTest was just acting like a timeout to delay the test a little bit. In my new patch I'm trying to explictly make sure the home pager is hidden, maybe that will work better. > - In the failure logs, I usually (always?) see just one drawFinished: > > 08-24 00:27:32.226 D/Robocop ( 2276): Received drawFinished notification > > I think this is technically okay, but it seems unusual to me. I thought a > page load normally generated quite a few drawFinished's. Maybe we're crashing before we have the chance for more drawFinished notifications to fire?
Trying again with a longer timeout: https://tbpl.mozilla.org/?tree=Try&rev=4f93bd2715ee Maybe this is an issue with the tegras just being slower to hide the HomePager? I'm not sure why else this just seems to be an issue on 2.2 not 4.0.
Margaret - Shane Tully had a Tegra (I think) in the SF office. Could you get it and try to run the test? Maybe the Android 2.2 OS is really slow to show/hide the Home page UI? I thought I saw a bug about Fig being slow on 2.2.
The last try run was a lot greener, I think we just needed a larger max wait time for the tegras. I triggered more test runs to get more data, but at this point I think we should land something if we know it's an improvement.
Attachment #794960 - Attachment is obsolete: true
Attachment #796072 - Flags: review?(gbrown)
Comment on attachment 796072 [details] [diff] [review] wait for HomePager to be hidden Should verifyHomePagerHidden() be part of loadAndPaint()?
Attachment #796072 - Flags: review?(gbrown) → review+
(In reply to Geoff Brown [:gbrown] from comment #319) > Comment on attachment 796072 [details] [diff] [review] > wait for HomePager to be hidden > > Should verifyHomePagerHidden() be part of loadAndPaint()? Here's a try push that does that: https://tbpl.mozilla.org/?tree=Try&rev=4ba851225b4c I was hesitant to do this before because I didn't want to risk problems to other tests, but it does seem like a sensible change to try. If this doesn't work out, I'll just go ahead and land the previous patch.
(In reply to :Margaret Leibovic from comment #322) > (In reply to Geoff Brown [:gbrown] from comment #319) > > Comment on attachment 796072 [details] [diff] [review] > > wait for HomePager to be hidden > > > > Should verifyHomePagerHidden() be part of loadAndPaint()? > > Here's a try push that does that: > https://tbpl.mozilla.org/?tree=Try&rev=4ba851225b4c > > I was hesitant to do this before because I didn't want to risk problems to > other tests, but it does seem like a sensible change to try. > > If this doesn't work out, I'll just go ahead and land the previous patch. This is causing testSystemPages to fail, which actually makes me a bit concerned that that test isn't doing what we want. In any case, we can work using verifyHomePagerHidden more in a separate bug, I want to land the fix that I know improves things so that I'll stop getting these damn TBPL Robot bugmails! https://hg.mozilla.org/integration/fx-team/rev/25ccd1446424
Backed out the testPan part of this for causing testPan to fail: https://hg.mozilla.org/integration/fx-team/rev/cd5991a56874 I'm worried that means that testPan says it's succeeding when it shouldn't be, but we can sort that out in bug 908823.
(In reply to TBPL Robot from comment #357) > RyanVM > https://tbpl.mozilla.org/php/getParsedLog.php?id=27093370&tree=Fx-Team > Android Tegra 250 fx-team talos remote-trobocheck2 on 2013-08-27 18:37:34 > revision: cd5991a56874 > slave: tegra-129 > > PROCESS-CRASH | tcheck2 | application crashed [@ mozalloc_abort(char const*)] > talosError: 'Verify HomePager is hidden: HomePager is hidden > [browser_output.txt]' The crash has morphed now (but it's at least far less frequent).
Whiteboard: [leave open]
Blocks: 910106
One variation, from Comment 360: 08-27 21:59:50.191 V/RobocopMotionEventReplayer( 1974): Injecting MotionEvent{484c0e60 action=2 x=614.9681 y=323.14246 pressure=0.0 size=0.0} 08-27 21:59:50.191 D/AndroidRuntime( 1974): Shutting down VM 08-27 21:59:50.201 E/GeckoAppShell( 1974): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main") 08-27 21:59:50.201 E/GeckoAppShell( 1974): java.lang.NullPointerException 08-27 21:59:50.201 E/GeckoAppShell( 1974): at org.mozilla.gecko.home.BookmarksPage$PinBookmarkListener.onBookmarkSelected(BookmarksPage.java:328) 08-27 21:59:50.201 E/GeckoAppShell( 1974): at org.mozilla.gecko.home.PinBookmarkDialog$2.onItemClick(PinBookmarkDialog.java:115) 08-27 21:59:50.201 E/GeckoAppShell( 1974): at android.widget.AdapterView.performItemClick(AdapterView.java:284) 08-27 21:59:50.201 E/GeckoAppShell( 1974): at android.widget.ListView.performItemClick(ListView.java:3382)
I also noticed that the crashes would sometimes vary. I wonder if we're basically fuzzing the new about:home with this test :) I can investigate these stacks to see if any of them might be real bugs we should fix.
Depends on: 910320
I didn't look through every log message, but it looks like after I landed my patch, the failures here all morphed to be about "Verify HomePager is hidden: HomePager is hidden", so I think we should close this bug and just use bug 910106 to track those failures.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → Firefox 26
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: