Closed Bug 1051131 Opened 10 years ago Closed 10 years ago

Intermittent Android 2.3 mochitest-4 application ran for longer than allowed maximum time [@ libc.so + 0xc51c] (test_channelMergerNode.html, test_bug972678.html)

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1053502

People

(Reporter: RyanVM, Unassigned)

References

Details

(Keywords: intermittent-failure)

I think these have been happening for awhile now and generally just get ignored along with the other usual Android flakiness. This signature seems to be popping up often, though.

https://tbpl.mozilla.org/php/getParsedLog.php?id=45571765&tree=Mozilla-Inbound

Android 2.3 Emulator mozilla-inbound opt test mochitest-4 on 2014-08-08 15:14:15 PDT for push c5594b478826
slave: tst-linux64-spot-547

16:30:07     INFO -  460 INFO TEST-START | /tests/content/media/webaudio/test/test_bug956489.html
16:30:07     INFO -  org.mozilla.fennec still alive after SIGABRT: waiting...
16:30:07  WARNING -  TEST-UNEXPECTED-FAIL | /tests/content/media/webaudio/test/test_bug956489.html | application ran for longer than allowed maximum time
16:30:07     INFO -  INFO | automation.py | Application ran for: 1:03:08.674701
16:30:07     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmp7wRVrgpidlog
16:30:07     INFO -  Contents of /data/anr/traces.txt:
16:30:07     INFO -  
16:30:07     INFO -  
16:30:07     INFO -  
16:30:07     INFO -  ----- pid 577 at 2014-08-08 16:28:10 -----
16:30:07     INFO -  Cmd line: org.mozilla.fennec
16:30:07     INFO -  
16:30:07     INFO -  DALVIK THREADS:
16:30:07     INFO -  (mutexes: tll=0 tsl=0 tscl=0 ghl=0 hwl=0 hwll=0)
16:30:07     INFO -  "main" prio=5 tid=1 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x4001f188 self=0xce48
16:30:07     INFO -    | sysTid=577 nice=0 sched=0/0 cgrp=default handle=-1345006528
16:30:07     INFO -    | schedstat=( 24894305033 42182087287 6179 )
16:30:07     INFO -    at android.os.MessageQueue.nativePollOnce(Native Method)
16:30:07     INFO -    at android.os.MessageQueue.next(MessageQueue.java:119)
16:30:07     INFO -    at android.os.Looper.loop(Looper.java:117)
16:30:07     INFO -    at android.app.ActivityThread.main(ActivityThread.java:3683)
16:30:07     INFO -    at java.lang.reflect.Method.invokeNative(Native Method)
16:30:07     INFO -    at java.lang.reflect.Method.invoke(Method.java:507)
16:30:07     INFO -    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:839)
16:30:07     INFO -    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:597)
16:30:07     INFO -    at dalvik.system.NativeStart.main(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "Binder Thread #3" prio=5 tid=14 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x40696888 self=0x336710
16:30:07     INFO -    | sysTid=1621 nice=0 sched=0/0 cgrp=default handle=3169944
16:30:07     INFO -    | schedstat=( 13301411 21266682 5 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "RefQueueWorker@org.apache.http.impl.conn.tsccm.ConnPoolByRoute@40608160" daemon prio=5 tid=16 WAIT
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x4060e958 self=0xa7ae0
16:30:07     INFO -    | sysTid=636 nice=0 sched=0/0 cgrp=default handle=687128
16:30:07     INFO -    | schedstat=( 1114737 9421228 3 )
16:30:07     INFO -    at java.lang.Object.wait(Native Method)
16:30:07     INFO -    - waiting on <0x405f98a0> (a java.lang.ref.ReferenceQueue)
16:30:07     INFO -    at java.lang.Object.wait(Object.java:395)
16:30:07     INFO -    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:107)
16:30:07     INFO -    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:80)
16:30:07     INFO -    at org.apache.http.impl.conn.tsccm.RefQueueWorker.run(RefQueueWorker.java:102)
16:30:07     INFO -    at java.lang.Thread.run(Thread.java:1019)
16:30:07     INFO -  
16:30:07     INFO -  "Thread-19" prio=5 tid=15 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x4064e970 self=0x25a2b8
16:30:07     INFO -    | sysTid=617 nice=0 sched=0/0 cgrp=default handle=2724112
16:30:07     INFO -    | schedstat=( 159395566539 300062273478 66020 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "actionMode" prio=5 tid=13 WAIT
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x40648df0 self=0x26ffc8
16:30:07     INFO -    | sysTid=590 nice=0 sched=0/0 cgrp=default handle=2557016
16:30:07     INFO -    | schedstat=( 1934567 35421370 3 )
16:30:07     INFO -    at java.lang.Object.wait(Native Method)
16:30:07     INFO -    - waiting on <0x40648df0> (a java.util.Timer$TimerImpl)
16:30:07     INFO -    at java.lang.Object.wait(Object.java:358)
16:30:07     INFO -    at java.util.Timer$TimerImpl.run(Timer.java:214)
16:30:07     INFO -  
16:30:07     INFO -  "Gecko" prio=5 tid=12 SUSPENDED
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x40644738 self=0x2721b8
16:30:07     INFO -    | sysTid=589 nice=0 sched=0/0 cgrp=default handle=2550920
16:30:07     INFO -    | schedstat=( 1996416217693 1504753543119 280734 )
16:30:07     INFO -    at org.mozilla.gecko.mozglue.GeckoLoader.nativeRun(Native Method)
16:30:07     INFO -    at org.mozilla.gecko.GeckoAppShell.runGecko(GeckoAppShell.java:353)
16:30:07     INFO -    at org.mozilla.gecko.GeckoThread.run(GeckoThread.java:178)
16:30:07     INFO -  
16:30:07     INFO -  "pool-1-thread-1" prio=5 tid=11 WAIT
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x405a9518 self=0x22d370
16:30:07     INFO -    | sysTid=588 nice=0 sched=0/0 cgrp=default handle=2324736
16:30:07     INFO -    | schedstat=( 2429167 20588201 4 )
16:30:07     INFO -    at java.lang.Object.wait(Native Method)
16:30:07     INFO -    - waiting on <0x405a2260> (a java.lang.VMThread)
16:30:07     INFO -    at java.lang.Thread.parkFor(Thread.java:1424)
16:30:07     INFO -    at java.lang.LangAccessImpl.parkFor(LangAccessImpl.java:48)
16:30:07     INFO -    at sun.misc.Unsafe.park(Unsafe.java:337)
16:30:07     INFO -    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:157)
16:30:07     INFO -    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2016)
16:30:07     INFO -    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:411)
16:30:07     INFO -    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1021)
16:30:07     INFO -    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1081)
16:30:07     INFO -    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:581)
16:30:07     INFO -    at java.lang.Thread.run(Thread.java:1019)
16:30:07     INFO -  
16:30:07     INFO -  "GeckoANRReporter" daemon prio=5 tid=10 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x405887f8 self=0x20b9b8
16:30:07     INFO -    | sysTid=586 nice=0 sched=0/0 cgrp=default handle=2145008
16:30:07     INFO -    | schedstat=( 2886266 18601054 4 )
16:30:07     INFO -    at android.os.MessageQueue.nativePollOnce(Native Method)
16:30:07     INFO -    at android.os.MessageQueue.next(MessageQueue.java:119)
16:30:07     INFO -    at android.os.Looper.loop(Looper.java:117)
16:30:07     INFO -    at org.mozilla.gecko.ANRReporter$1.run(ANRReporter.java:95)
16:30:07     INFO -    at java.lang.Thread.run(Thread.java:1019)
16:30:07     INFO -  
16:30:07     INFO -  "GeckoBackgroundThread" prio=5 tid=9 SUSPENDED
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x405a9220 self=0x94ad8
16:30:07     INFO -    | sysTid=585 nice=0 sched=0/0 cgrp=default handle=1966024
16:30:07     INFO -    | schedstat=( 3999718184 8924031271 2131 )
16:30:07     INFO -    at android.os.MessageQueue.nativePollOnce(Native Method)
16:30:07     INFO -    at android.os.MessageQueue.next(MessageQueue.java:119)
16:30:07     INFO -    at android.os.Looper.loop(Looper.java:117)
16:30:07     INFO -    at org.mozilla.gecko.util.GeckoBackgroundThread.run(GeckoBackgroundThread.java:32)
16:30:07     INFO -  
16:30:07     INFO -  "Binder Thread #2" prio=5 tid=8 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x405128d8 self=0x93230
16:30:07     INFO -    | sysTid=584 nice=0 sched=0/0 cgrp=default handle=1869864
16:30:07     INFO -    | schedstat=( 65326470 64866261 53 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "Binder Thread #1" prio=5 tid=7 NATIVE
16:30:07     INFO -    | group="main" sCount=1 dsCount=0 obj=0x40510890 self=0x1a2fa8
16:30:07     INFO -    | sysTid=583 nice=0 sched=0/0 cgrp=default handle=592728
16:30:07     INFO -    | schedstat=( 58235324 37835260 60 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "Compiler" daemon prio=5 tid=6 VMWAIT
16:30:07     INFO -    | group="system" sCount=1 dsCount=0 obj=0x4050f5a0 self=0x1a2e70
16:30:07     INFO -    | sysTid=582 nice=0 sched=0/0 cgrp=default handle=1860904
16:30:07     INFO -    | schedstat=( 4134866932 3221594350 3643 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "JDWP" daemon prio=5 tid=5 VMWAIT
16:30:07     INFO -    | group="system" sCount=1 dsCount=0 obj=0x4050f4f0 self=0x1c6368
16:30:07     INFO -    | sysTid=581 nice=0 sched=0/0 cgrp=default handle=1643344
16:30:07     INFO -    | schedstat=( 1407927 1969527 7 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "Signal Catcher" daemon prio=5 tid=4 RUNNABLE
16:30:07     INFO -    | group="system" sCount=0 dsCount=0 obj=0x4050f430 self=0x191218
16:30:07     INFO -    | sysTid=580 nice=0 sched=0/0 cgrp=default handle=1601768
16:30:07     INFO -    | schedstat=( 23452484 57706166 7 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "GC" daemon prio=5 tid=3 VMWAIT
16:30:07     INFO -    | group="system" sCount=1 dsCount=0 obj=0x4050f388 self=0x1910e0
16:30:07     INFO -    | sysTid=579 nice=0 sched=0/0 cgrp=default handle=1642576
16:30:07     INFO -    | schedstat=( 2366692821 7045239191 385 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  "HeapWorker" daemon prio=5 tid=2 VMWAIT
16:30:07     INFO -    | group="system" sCount=1 dsCount=0 obj=0x4050f2d0 self=0x190f18
16:30:07     INFO -    | sysTid=578 nice=0 sched=0/0 cgrp=default handle=1643720
16:30:07     INFO -    | schedstat=( 1212618751 3116183230 267 )
16:30:07     INFO -    at dalvik.system.NativeStart.run(Native Method)
16:30:07     INFO -  
16:30:07     INFO -  ----- end 577 -----
16:30:07     INFO -  
16:30:07     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1407534071/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
16:30:07  WARNING -  PROCESS-CRASH | /tests/content/media/webaudio/test/test_bug956489.html | application crashed [@ libc.so + 0xc51c]
16:30:07     INFO -  Crash dump filename: /tmp/tmp2Kc3fj/008e8bc2-1bc3-ccca-2c1d1470-334e5dfb.dmp
16:30:07     INFO -  Operating system: Android
16:30:07     INFO -                    0.0.0 Linux 2.6.29-ge3d684d #1 Mon Dec 16 22:26:51 UTC 2013 armv7l generic/sdk/generic:2.3.7/GINGERBREAD/eng.ubuntu.20140123.014351:eng/test-keys
16:30:07     INFO -  CPU: arm
16:30:07     INFO -       0 CPUs
16:30:07     INFO -  
16:30:07     INFO -  Crash reason:  SIGABRT
16:30:07     INFO -  Crash address: 0xb48
16:30:07     INFO -  
16:30:07     INFO -  Thread 0 (crashed)
16:30:07     INFO -   0  libc.so + 0xc51c
16:30:07     INFO -       r4 = 0x00093620    r5 = 0x00000000    r6 = 0xffffffff    r7 = 0x000000fc
16:30:07     INFO -       r8 = 0x00000000    r9 = 0x00000014   r10 = 0x4428cdf4    fp = 0x4064bde8
16:30:07     INFO -       sp = 0xbec842c8    lr = 0xa8121817    pc = 0xafd0c51c
16:30:07     INFO -      Found by: given as instruction pointer in context
16:30:07     INFO -   1  libutils.so + 0x254bf
16:30:07     INFO -       sp = 0xbec842e0    pc = 0xa81254c1
16:30:07     INFO -      Found by: stack scanning
16:30:07     INFO -   2  libbinder.so + 0x1644b
16:30:07     INFO -       sp = 0xbec84330    pc = 0xa821644d
16:30:07     INFO -      Found by: stack scanning
16:30:07     INFO -   3  libmozglue.so!arena_malloc [jemalloc.c:c5594b478826 : 1647 + 0x3]
16:30:07     INFO -       sp = 0xbec84360    pc = 0x8032835f
16:30:07     INFO -      Found by: stack scanning
16:30:07     INFO -   4  0x51a32862
16:30:07     INFO -       r4 = 0x00000000    r5 = 0x46d66040    r6 = 0x56bbe400    r7 = 0x46d66044
16:30:07     INFO -       r8 = 0x00000000    r9 = 0x00000200   r10 = 0x8032835f    fp = 0xbec843d0
16:30:07     INFO -       sp = 0xbec84388    pc = 0x51a32864
16:30:07     INFO -      Found by: call frame info
16:30:07     INFO -   5  data@app@org.mozilla.fennec-1.apk@classes.dex + 0x10eeb2
16:30:07     INFO -       sp = 0xbec84390    pc = 0x46adceb4
16:30:07     INFO -      Found by: stack scanning
16:30:07     INFO -   6  data@app@org.mozilla.fennec-1.apk@classes.dex + 0x10eeb2
16:30:07     INFO -       sp = 0xbec843a8    pc = 0x46adceb4
16:30:07     INFO -      Found by: stack scanning
https://tbpl.mozilla.org/php/getParsedLog.php?id=45572798&tree=Mozilla-Inbound
Summary: Intermittent Android webaudio application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c] → Intermittent Android 2.3 mochitest-4 webaudio application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c]
https://tbpl.mozilla.org/php/getParsedLog.php?id=45583906&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=45583781&tree=Mozilla-Inbound
Component: Web Audio → General
Product: Core → Firefox for Android
Summary: Intermittent Android 2.3 mochitest-4 webaudio application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c] → Intermittent Android 2.3 mochitest application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c]
https://tbpl.mozilla.org/php/getParsedLog.php?id=45653734&tree=B2g-Inbound
Summary: Intermittent Android 2.3 mochitest application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c] → Intermittent Android 2.3 mochitest-4 application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c] (test_channelMergerNode.html, test_bug972678.html)
See Also: → 1046378
Maire, I'm working to try to narrow down a regression range on this, but can you possibly suggest someone who can look into this? It's pretty frequent and a pain to star on TBPL due to the randomness with which it hits during the test run.
Flags: needinfo?(mreavy)
Interestingly, I can reproduce these as far back as a week ago, but the frequency seems to have gotten dramatically higher around late last week. I'm still looking for a real regression range here, but am not sure when I'll get one.
jchen is helping me get a useful/usable stack trace.  Then depending on what we find, I'll try to get the right person to look at this.
Flags: needinfo?(mreavy)
The 'crash' is just the process being killed because it has exceeded the max permitted runtime for the entire job, surely?
Keywords: crash
Summary: Intermittent Android 2.3 mochitest-4 application ran for longer than allowed maximum time | application crashed [@ libc.so + 0xc51c] (test_channelMergerNode.html, test_bug972678.html) → Intermittent Android 2.3 mochitest-4 application ran for longer than allowed maximum time [@ libc.so + 0xc51c] (test_channelMergerNode.html, test_bug972678.html)
I've been able to hit these same failures on an m-c run from August 1, however the frequency was 1/50 vs. the current extreme. I can try to narrow down when things got worse, but I'm afraid that'll be fuzzy at best.
(In reply to Ed Morley [:edmorley] from comment #53)
> The 'crash' is just the process being killed because it has exceeded the max
> permitted runtime for the entire job, surely?

Yes, that is what is happening: mochitest-4 is often taking more than 60 minutes to complete, the harness kills Firefox, and a crash dump is usually collected. When mochitest-4 passes, it often takes 55 to 60 minutes -- it barely completes in time.

(I had not realized this when talking about this yesterday - sorry!)

Android 2.3 mochitest chunks seem to have a wide range of completion times: m-4 takes about 60 minutes, but m-3 takes about 30 minutes.

The longest-running m-4 test is often 

Slowest: 229467ms - /tests/content/media/test/test_playback.html
Slowest: 232164ms - /tests/content/media/test/test_playback.html
Slowest: 218584ms - /tests/content/media/test/test_playback.html

It would be nice if we could speed-up or skip a few long-running tests.

Otherwise, I suppose we need to run Android 2.3 mochitests in more chunks.
See Also: 1046378
Depends on: 1044977
Depends on: 1053502
(In reply to Geoff Brown [:gbrown] from comment #56)
> It would be nice if we could speed-up or skip a few long-running tests.
> 
> Otherwise, I suppose we need to run Android 2.3 mochitests in more chunks.

We did the easy thing -- more chunks -- and that seems to have resolved it.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.