Closed Bug 873029 Opened 11 years ago Closed 10 years ago

Intermittent Android remoteautomation.py, testOrderedBroadcast | application crashed [@ libc.so + 0x23226]

Categories

(Firefox for Android Graveyard :: General, defect, P4)

ARM
Android
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: RyanVM, Unassigned)

References

Details

(Keywords: crash, intermittent-failure, Whiteboard: [spurious])

https://tbpl.mozilla.org/php/getParsedLog.php?id=23020459&tree=Mozilla-Inbound

Android 4.0 Panda mozilla-inbound opt test robocop-1 on 2013-05-15 23:37:16 PDT for push 6358dfe4de69
slave: panda-0626

0 INFO SimpleTest START
1 INFO TEST-START | testOrderedBroadcast
Registering org.mozilla.gecko.test.receiver broadcast receiver
EventExpecter: no longer listening for Gecko:Ready
Registered listener for Robocop:Status
Loading JavaScript test from http://mochi.test:8888/tests/robocop/robocop_javascript.html?path=testOrderedBroadcast.js
INFO | automation.py | Application ran for: 0:00:07.554607
INFO | zombiecheck | Reading PID log: /tmp/tmpIKnzG8pidlog
mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1368679383/fennec-24.0a1.en-US.android-arm.crashreporter-symbols.zip
Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1368679383/fennec-24.0a1.en-US.android-arm.crashreporter-symbols.zip
PROCESS-CRASH | testOrderedBroadcast | application crashed [@ libc.so + 0x23226]
Crash dump filename: /tmp/tmphqm_83/340eaffc-ee06-fb69-77decaa2-6b0c2c5b.dmp
Operating system: Android
                  0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
CPU: arm
     2 CPUs

Crash reason:  SIGSEGV
Crash address: 0xd8

Thread 10 (crashed)
 0  libc.so + 0x23226
     r4 = 0x401075e4    r5 = 0x405e12bc    r6 = 0x5bd0167c    r7 = 0x01f94b60
     r8 = 0x01f94a90    r9 = 0x0000000b   r10 = 0x405dee0c    fp = 0x5bd018ec
     sp = 0x5bd011f8    lr = 0x400bf277    pc = 0x400bf226
    Found by: given as instruction pointer in context
 1  libicuuc.so + 0xa4a8b
     sp = 0x5bd01220    pc = 0x405a2a8d
    Found by: stack scanning
Nick, please look into this when you're done with generation and reviews.
Status: NEW → ASSIGNED
Priority: -- → P2
I did; I just never commented on the ticket.  I looked at the backtrace and the logs, and it appears that the testOrderedBroadcast test code never actually gets called.  That is, Fennec crashes (deep in native code) before the TestJavascript.setUp() method is executed -- based on the lack of a log line saying that the broadcast listener is being registered.

There are some safety checks that we can test: verifying that we correctly unregister Gecko and Android listeners.

But until we start seeing this more frequently, or similar things on client devices, I believe this has nothing to do with the ordered broadcast code.
Awesome!
Priority: P2 → P4
Whiteboard: [spurious]
(In reply to TinderboxPushlog Robot from comment #4)
> RyanVM
> https://tbpl.mozilla.org/php/getParsedLog.php?id=24107282&tree=Mozilla-
> Inbound
> Android 4.0 Panda mozilla-inbound opt test mochitest-gl on 2013-06-13
> 08:25:50
> revision: f9e6eb0d5239
> slave: panda-0785
> 
> PROCESS-CRASH | remoteautomation.py | application crashed [@ libc.so +
> 0x23226]

This log doesn't mention testOrderedBroadcast, which supports my claim that this has nothing to do with these tests and is truly intermittent.  In fact, testOrderedBroadcast is in rc1 or rc2, and this was gl!
(In reply to TinderboxPushlog Robot from comment #6)
> RyanVM
> https://tbpl.mozilla.org/php/getParsedLog.php?id=24181738&tree=Mozilla-
> Central
> Android 4.0 Panda mozilla-central opt test mochitest-1 on 2013-06-14 19:57:50
> revision: 05d9196b27a1
> slave: panda-0765
> 
> PROCESS-CRASH | remoteautomation.py | application crashed [@ libc.so +
> 0x23226]

This log is during mochitest-1 (and therefore doesn't mention testOrderedBroadcast).  RyanVM, can you update the title of this bug to be whatever you feel is more appropriate?
Flags: needinfo?(ryanvm)
Assignee: nalexander → nobody
Component: Client: Android → General
Flags: needinfo?(ryanvm)
Product: Firefox Health Report → Firefox for Android
Summary: Intermittent testOrderedBroadcast | application crashed [@ libc.so + 0x23226] → Intermittent Android remoteautomation.py, testOrderedBroadcast | application crashed [@ libc.so + 0x23226]
I am not sure that all of the crashes reported here are related, but some of the more recent ones have similar crash stacks. Comment 10 has:

 -   5  libxul.so!UTCToLocalStandardOffsetSeconds [DateTime.cpp:a0fa8c9992a5 : 19 + 0x9]
 -       sp = 0x68fffbe0    pc = 0x6359e45d
 -      Found by: stack scanning
 -   6  libxul.so!js::DateTimeInfo::updateTimeZoneAdjustment() [DateTime.cpp:a0fa8c9992a5 : 136 + 0x3]
 -       r4 = 0x65c466a0    sp = 0x68fffc20    pc = 0x6359e511
 -      Found by: call frame info
 -   7  libxul.so!JSCompartment::init(JSContext*) [jscompartment.cpp:a0fa8c9992a5 : 97 + 0xb]
 -       r4 = 0x65c466a0    r5 = 0x68df9530    r6 = 0x69223400    sp = 0x68fffc30
 -       pc = 0x6342e021
 -      Found by: call frame info
 -   8  libxul.so!js::NewCompartment(JSContext*, JS::Zone*, JSPrincipals*, JS::CompartmentOptions const&) [jsgc.cpp:a0fa8c9992a5 : 4739 + 0x7]
 -       r4 = 0x65c466a0    r5 = 0x68df9530    r6 = 0x69223400    sp = 0x68fffc40
 -       pc = 0x6344ca31
 -      Found by: call frame info
 -   9  libxul.so!JS_NewGlobalObject(JSContext*, JSClass*, JSPrincipals*, JS::CompartmentOptions const&) [jsapi.cpp:a0fa8c9992a5 : 3344 + 0x7]
 -       r4 = 0x68df9530    r5 = 0x68fffcb0    r6 = 0x684bc000    r7 = 0x63c7deb0
 -       r8 = 0x63c02fb4    r9 = 0x6851a65c   r10 = 0x6851a400    sp = 0x68fffc68
 -       pc = 0x634135fd
 -      Found by: call frame info
 -  10  libxul.so!mozilla::dom::workers::CreateDedicatedWorkerGlobalScope(JSContext*) [WorkerScope.cpp:a0fa8c9992a5 : 981 + 0x11]
 -       r4 = 0x68df9530    r5 = 0x68df9530    r6 = 0x68df9530    r7 = 0x5be48cd8
 -       r8 = 0x63c02fb4    r9 = 0x6851a65c   r10 = 0x6851a400    sp = 0x68fffc88
 -       pc = 0x62df0987
 -      Found by: call frame info

Comment 11 is very similar. In both failures, UTCToLocalStandardOffsetSeconds is being called in JSCompartment::init in 2 threads.

:bhackett -- could you look at this? (I'm not familiar with this code and I see you have been active in jscompartment.cpp.)
Flags: needinfo?(bhackett1024)
These crashes seem to consistently be under the localtime_r library call in ComputeLocalTime under UTCToLocalStandardOffsetSeconds, at least going by the stacks.  Very strange; we're definitely not passing any corrupt data in.  This doesn't seem to have anything to do with compartment code.
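
For anyone trying to reproduce this outside the harness, here is a minimal sketch of the pattern described above: two threads asking the C library for the local time offset at the same moment. It is not the code in DateTime.cpp. LocalOffsetSeconds and ConcurrentOffsetsAgree are hypothetical helpers written for illustration, and the only library calls assumed are POSIX localtime_r and gmtime_r plus std::thread.

```cpp
#include <time.h>

#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (NOT the Gecko implementation): returns the offset of
// local time from UTC, in seconds, as reported by localtime_r for `utc`.
long LocalOffsetSeconds(time_t utc) {
    struct tm local {};
    struct tm asUtc {};
    // Both calls write into caller-owned buffers, so no pointers are shared
    // between threads at this level.
    if (!localtime_r(&utc, &local) || !gmtime_r(&utc, &asUtc)) {
        return 0;  // Assumption: treat a failed conversion as "no offset".
    }

    // Local time and UTC differ by less than a day, so the calendar-day
    // difference is -1, 0, or +1.
    int dayDiff = local.tm_yday - asUtc.tm_yday;
    if (local.tm_year != asUtc.tm_year) {
        dayDiff = (local.tm_year > asUtc.tm_year) ? 1 : -1;
    }

    return dayDiff * 86400L +
           (local.tm_hour - asUtc.tm_hour) * 3600L +
           (local.tm_min - asUtc.tm_min) * 60L +
           (local.tm_sec - asUtc.tm_sec);
}

// Hypothetical stress helper: runs LocalOffsetSeconds on `threadCount`
// threads at once and reports whether every thread saw the same offset.
bool ConcurrentOffsetsAgree(time_t utc, std::size_t threadCount) {
    std::vector<long> results(threadCount, 0);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    for (std::size_t i = 0; i < threadCount; ++i) {
        // Each worker writes only to its own slot in `results`.
        workers.emplace_back([&results, utc, i] {
            results[i] = LocalOffsetSeconds(utc);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }

    for (long value : results) {
        if (value != results[0]) {
            return false;
        }
    }
    return true;
}
```

Calling ConcurrentOffsetsAgree(timestamp, 2) in a loop mirrors the two-thread situation seen in the stacks. Whether that is enough to hit the libc.so + 0x23226 crash on the pandas is untested; the sketch only shows that the caller side passes nothing shared or corrupt into localtime_r.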
Flags: needinfo?(bhackett1024)
Closing bugs where TBPLbot has previously commented, but which have not been modified for >3 months and do not contain the whiteboard strings for disabled/annotated tests or use the keyword leave-open. Filter on: mass-intermittent-bug-closure-2014-07
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Product: Firefox for Android → Firefox for Android Graveyard