Closed Bug 903238 Opened 11 years ago Closed 11 years ago

greening up testsuite Android 4.0 Panda <gecko 26> opt test crashtest

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Callek, Assigned: snorp)

assign to snorp
cc sheriffs, blassey, ctalbert

from irc w/blassey and snorp just now
1) testsuite Android 4.0 Panda mozilla-inbound opt test crashtest was green for inbound f538bf3eafc3 and turned red on 82a46209e728
- [green] https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Panda.*crashtest&rev=f538bf3eafc3&showall=1 
- [red] https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Panda.*crashtest&rev=82a46209e728&showall=1

2) Instead of backing out this changeset, blassey + snorp preferred to leave the test broken; blassey hid the test suite on tbpl, and snorp promised that he would fix the test "later". Hence, filing this bug and assigning it to snorp.

snorp, if you need a physical panda to debug this, let us know your address and we'll ship you one. We can also get you loaner-access to a production panda, which is even faster (we can get you one in hours if pinged). 

If, however, this test suite is *not* something that will be fixed soon, please let us know and we can decide whether to disable the testsuite or back out the landed changeset, instead of wasting CPU on a broken hidden test.
Any news on this?

Also, why did we hide a suite rather than backing out?
(In reply to Ed Morley [:edmorley UTC+1] from comment #1)
> Any news on this?
No, Snorp needs a panda to investigate. Also, that investigation will come after he finishes the work to get SkiaGL canvas working on b2g.

> Also, why did we hide a suite rather than backing out?
The test suite offers almost no value to us. It apparently only passes on our test boards and can't be run on actual devices. It going red is a good sign that we should figure out why it went red (something either started using more memory than it did before or started taking longer to run than it did before), but it doesn't actually indicate that anything is necessarily wrong and the tolerances are entirely arbitrary.
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #2)
> No, Snorp needs a panda to investigate. Also, that investigation will come
> after he finishes the work to get SkiaGL canvas working on b2g.

Snorp, have you requested a panda yet, so you're not waiting once the SkiaGL work is complete? :-) (comment 0 asks if you need one & there's no other comments in this bug, but don't know if you've discussed with Callek over IRC since).

> The test suite offers almost no value to us. It apparently only passes on
> our test boards and can't be run on actual devices. It going red is a good
> sign that we should figure out why it went red (something either started
> using more memory than it did before or started taking longer to run than it
> did before), but it doesn't actually indicate that anything is necessarily
> wrong and the tolerances are entirely arbitrary.

Thank you - agree with the logic (was just missing this explanation in the bug) :-)
Flags: needinfo?(snorp)
This is distorting the failure stats in bug 663657 & wasting resources - we should turn it off for now.
(In reply to Ed Morley [:edmorley UTC+1] from comment #3)
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #2)
> > No, Snorp needs a panda to investigate. Also, that investigation will come
> > after he finishes the work to get SkiaGL canvas working on b2g.
> 
> Snorp, have you requested a panda yet, so you're not waiting once the SkiaGL
> work is complete? :-) (comment 0 asks if you need one & there's no other
> comments in this bug, but don't know if you've discussed with Callek over
> IRC since).

I have a panda now, yes. I hope to work on fixing the crashtests today.
Flags: needinfo?(snorp)
That's great - thank you :-)
I reproduced the crashtest hang:

#0  0x400b01f8 in __futex_syscall3 () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libc.so
#1  0x400b50c8 in __pthread_cond_timedwait_relative () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libc.so
#2  0x400b519c in __pthread_cond_timedwait () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libc.so
#3  0x5f4e69e0 in PR_WaitCondVar (cvar=0x6a4ec0e0, timeout=4294967295) at /Users/snorp/source/mozilla-central/nsprpub/pr/src/pthreads/ptsynch.c:385
#4  0x62e5c6dc in Wait (this=<optimized out>, interval=4294967295) at ../../dist/include/mozilla/CondVar.h:70
#5  Wait (interval=4294967295, this=<optimized out>) at ../../dist/include/mozilla/Monitor.h:47
#6  mozilla::ipc::SyncChannel::WaitForNotify (this=0x6a44454c) at /Users/snorp/source/mozilla-central/ipc/glue/SyncChannel.cpp:376
#7  0x62e5c76c in Send (reply=0x5d32e6bc, _msg=<optimized out>, this=0x6a44454c) at /Users/snorp/source/mozilla-central/ipc/glue/SyncChannel.cpp:150
#8  mozilla::ipc::SyncChannel::Send (this=0x6a44454c, _msg=<optimized out>, reply=0x5d32e6bc) at /Users/snorp/source/mozilla-central/ipc/glue/SyncChannel.cpp:103
#9  0x62e5aa02 in mozilla::ipc::RPCChannel::Send (this=0x6a44454c, msg=0x5bebf420, reply=0x5d32e6bc) at /Users/snorp/source/mozilla-central/ipc/glue/RPCChannel.cpp:92
#10 0x62eeeb70 in mozilla::layers::PCompositorChild::SendWillStop (this=0x6a444540) at /Users/snorp/source/mozilla-central/objdir-android/ipc/ipdl/PCompositorChild.cpp:179
#11 0x62e07728 in nsBaseWidget::DestroyCompositor (this=<optimized out>) at /Users/snorp/source/mozilla-central/widget/xpwidgets/nsBaseWidget.cpp:163
#12 0x62e08166 in nsBaseWidget::~nsBaseWidget (this=0x69096000, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/widget/xpwidgets/nsBaseWidget.cpp:209
#13 0x62e001f6 in nsWindow::~nsWindow (this=0x69096000, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/widget/android/nsWindow.cpp:187
#14 0x62e00214 in nsWindow::~nsWindow (this=0x69096000, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/widget/android/nsWindow.cpp:187
#15 0x62e0730e in Release (this=<optimized out>) at /Users/snorp/source/mozilla-central/widget/xpwidgets/nsBaseWidget.cpp:71
#16 nsBaseWidget::Release (this=<optimized out>) at /Users/snorp/source/mozilla-central/widget/xpwidgets/nsBaseWidget.cpp:71
#17 0x626befac in nsCOMPtr_base::~nsCOMPtr_base (this=0x6ce8d6a8, __in_chrg=<optimized out>) at ../../dist/include/nsCOMPtr.h:430
#18 0x62842028 in ~nsCOMPtr (this=0x6ce8d6a8, __in_chrg=<optimized out>) at ../../dist/include/nsCOMPtr.h:469
#19 nsDeviceContext::~nsDeviceContext (this=0x6ce8d680, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/gfx/src/nsDeviceContext.cpp:245
#20 0x6288b54e in Release (this=0x6ce8d680) at ../../dist/include/nsDeviceContext.h:34
#21 Release (this=0x6ce8d680) at /Users/snorp/source/mozilla-central/layout/base/nsPresContext.cpp:2997
#22 ~nsRefPtr (this=0x6d19d014, __in_chrg=<optimized out>) at ../../dist/include/nsAutoPtr.h:880
#23 nsPresContext::~nsPresContext (this=0x6d19d000, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/layout/base/nsPresContext.cpp:337
#24 0x6288b5a8 in nsPresContext::~nsPresContext (this=0x6d19d000, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/layout/base/nsPresContext.cpp:337
#25 0x628889f0 in nsPresContext::DeleteCycleCollectable (this=<optimized out>) at /Users/snorp/source/mozilla-central/layout/base/nsPresContext.cpp:345
#26 0x6288bf56 in nsPresContext::cycleCollection::DeleteCycleCollectable (this=<optimized out>, p=<optimized out>) at /Users/snorp/source/mozilla-central/layout/base/nsPresContext.h:149
#27 0x6310677e in SnowWhiteKiller::~SnowWhiteKiller (this=0x5d32e7a4, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/xpcom/base/nsCycleCollector.cpp:2002
#28 0x63104ee6 in nsCycleCollector::FreeSnowWhite (this=0x5be80000, aUntilNoSWInPurpleBuffer=<optimized out>) at /Users/snorp/source/mozilla-central/xpcom/base/nsCycleCollector.cpp:2109
#29 0x62ceb4ce in AsyncFreeSnowWhite::Run (this=0x5bec7180) at /Users/snorp/source/mozilla-central/js/xpconnect/src/XPCJSRuntime.cpp:231
#30 0x630ff5e4 in nsThread::ProcessNextEvent (this=0x5be3a640, mayWait=<optimized out>, result=0x5d32e837) at /Users/snorp/source/mozilla-central/xpcom/threads/nsThread.cpp:622
#31 0x630dcf5a in NS_ProcessNextEvent (thread=<optimized out>, mayWait=<optimized out>) at /Users/snorp/source/mozilla-central/objdir-android/xpcom/build/nsThreadUtils.cpp:238
#32 0x630ffc04 in Shutdown (this=0x66182400) at /Users/snorp/source/mozilla-central/xpcom/threads/nsThread.cpp:463
#33 nsThread::Shutdown (this=0x66182400) at /Users/snorp/source/mozilla-central/xpcom/threads/nsThread.cpp:425
#34 0x630fd6de in mozilla::LazyIdleThread::ShutdownThread (this=0x69ae9f60) at /Users/snorp/source/mozilla-central/xpcom/threads/LazyIdleThread.cpp:281
#35 0x630fd804 in mozilla::LazyIdleThread::Shutdown (this=0x69ae9f60) at /Users/snorp/source/mozilla-central/xpcom/threads/LazyIdleThread.cpp:425
#36 0x62b610ec in mozilla::dom::quota::QuotaManager::Observe (this=0x69a16b80, aSubject=<optimized out>, aTopic=<optimized out>, aData=<optimized out>) at /Users/snorp/source/mozilla-central/dom/quota/QuotaManager.cpp:1255
#37 0x630e669c in nsObserverList::NotifyObservers (this=<optimized out>, aSubject=0x0, aTopic=0x6382d702 "profile-before-change", someData=0x63995af0 <nsXREDirProvider::DoShutdown()::kShutdownPersist>) at /Users/snorp/source/mozilla-central/xpcom/ds/nsObserverList.cpp:96
#38 0x630e6c0a in NotifyObservers (someData=0x63995af0 <nsXREDirProvider::DoShutdown()::kShutdownPersist>, aTopic=0x6382d702 "profile-before-change", aSubject=0x0, this=0x5beb1670) at /Users/snorp/source/mozilla-central/xpcom/ds/nsObserverService.cpp:161
#39 nsObserverService::NotifyObservers (this=0x5beb1670, aSubject=0x0, aTopic=0x6382d702 "profile-before-change", someData=0x63995af0 <nsXREDirProvider::DoShutdown()::kShutdownPersist>) at /Users/snorp/source/mozilla-central/xpcom/ds/nsObserverService.cpp:150
#40 0x626c30de in nsXREDirProvider::DoShutdown (this=0x5d32eaa8) at /Users/snorp/source/mozilla-central/toolkit/xre/nsXREDirProvider.cpp:868
#41 0x626bf544 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x5be3b148, __in_chrg=<optimized out>) at /Users/snorp/source/mozilla-central/toolkit/xre/nsAppRunner.cpp:1125
#42 0x626c212a in XREMain::XRE_main (this=0x5d32ea8c, argc=<optimized out>, argv=<optimized out>, aAppData=<optimized out>) at /Users/snorp/source/mozilla-central/toolkit/xre/nsAppRunner.cpp:3956
#43 0x626c227c in XRE_main (argc=15, argv=0x5be46288, aAppData=0x5bc3cca0 <sAppData>, aFlags=<optimized out>) at /Users/snorp/source/mozilla-central/toolkit/xre/nsAppRunner.cpp:4133
#44 0x626be3fe in GeckoStart (data=0x5be54040, appData=0x5bc3cca0 <sAppData>) at /Users/snorp/source/mozilla-central/toolkit/xre/nsAndroidStartup.cpp:73
#45 0x5bc1308e in Java_org_mozilla_gecko_mozglue_GeckoLoader_nativeRun (jenv=0x190f4f8, jc=<optimized out>, jargs=0x22e00005) at /Users/snorp/source/mozilla-central/mozglue/android/APKOpen.cpp:379
#46 0x409257b4 in dvmPlatformInvoke () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
#47 0x4096dffa in dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*) () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
#48 0x4094dd8c in dvmCheckCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*) () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
#49 0x409702ee in dvmResolveNativeMethod(unsigned int const*, JValue*, Method const*, Thread*) () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
#50 0x40937610 in dvmJitToInterpNoChain () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
#51 0x40937610 in dvmJitToInterpNoChain () from /Users/snorp/jimdb/lib/0123456789ABCDEF/system/lib/libdvm.so
So the Gecko thread is trying to tell the compositor to shut down, but the compositor appears to be locked up in some GL command. I've narrowed it down to one of the following crashtests:

load content/canvas/crashtests/360293-1.html
load content/canvas/crashtests/421715-1.html
load content/canvas/crashtests/743499-negative-size.html
load content/canvas/crashtests/794463-1.html
load layout/base/crashtests/640272.html
load layout/base/crashtests/640272-ref.html
load layout/style/crashtests/867487.html
load layout/style/crashtests/880862.html
load image/test/crashtests/delaytest.html?523528-1.gif
load image/test/crashtests/delaytest.html?523528-2.gif
load dom/plugins/test/crashtests/626602-1.html
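
A note on the trace above: the timeout=4294967295 in frame #3 is 0xFFFFFFFF, which matches NSPR's PR_INTERVAL_NO_TIMEOUT, so the thread sending WillStop waits indefinitely for a reply. The Python sketch below only illustrates that waiting pattern. ToySyncChannel and its methods are hypothetical stand-ins, not Gecko's ipc code, and the bounded wait is shown for contrast, not as the fix that was applied here.

```python
import threading

# Same numeric value as the timeout/interval arguments in frames #3 to #5.
NO_TIMEOUT = 0xFFFFFFFF


class ToySyncChannel:
    """Hypothetical synchronous channel: send() blocks until a reply arrives."""

    def __init__(self):
        self._cond = threading.Condition()
        self._reply = None

    def deliver_reply(self, reply):
        # Called by the peer (the "compositor" in this analogy) when it answers.
        with self._cond:
            self._reply = reply
            self._cond.notify_all()

    def send(self, msg, timeout_s=None):
        # timeout_s=None waits forever, like passing NO_TIMEOUT in the trace.
        # With a finite timeout, a stuck peer yields None instead of a hang.
        with self._cond:
            got_reply = self._cond.wait_for(
                lambda: self._reply is not None, timeout=timeout_s
            )
            return self._reply if got_reply else None


# A peer that never answers: the bounded wait gives up and returns None.
stuck = ToySyncChannel()
stuck_result = stuck.send("WillStop", timeout_s=0.05)

# A peer that answers shortly after the send: the reply is returned.
healthy = ToySyncChannel()
threading.Timer(0.01, healthy.deliver_reply, args=("stopped",)).start()
healthy_result = healthy.send("WillStop", timeout_s=2.0)
```

With timeout_s=None and a peer that never replies, send() would block for good, which is the shape of the hang in the trace.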
Ok, it's just dom/plugins/test/crashtests/626602-1.html that's causing the issue. The test looks pretty benign, not sure what's going on. Will figure it out, but we could just disable that test and unhide the crashtests in the meantime if you want.
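
Narrowing a candidate list down to one test can be mechanized by halving. The sketch below is a minimal illustration, assuming exactly one test triggers the hang on its own and that a hypothetical hangs(subset) callback runs a subset on the device and reports whether it hung. It is not necessarily how the culprit was found in this bug.

```python
CANDIDATES = [
    "content/canvas/crashtests/360293-1.html",
    "content/canvas/crashtests/421715-1.html",
    "content/canvas/crashtests/743499-negative-size.html",
    "content/canvas/crashtests/794463-1.html",
    "layout/base/crashtests/640272.html",
    "layout/base/crashtests/640272-ref.html",
    "layout/style/crashtests/867487.html",
    "layout/style/crashtests/880862.html",
    "image/test/crashtests/delaytest.html?523528-1.gif",
    "image/test/crashtests/delaytest.html?523528-2.gif",
    "dom/plugins/test/crashtests/626602-1.html",
]


def find_hanging_test(tests, hangs):
    """Return the one test for which hangs(subset) is true, by repeated halving.

    Assumes a single culprit that hangs regardless of which other tests run.
    """
    candidates = list(tests)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the hang.
        candidates = first_half if hangs(first_half) else candidates[mid:]
    return candidates[0]


def fake_hangs(subset):
    # Stand-in for a real device run: pretend only the plugin test hangs.
    return "dom/plugins/test/crashtests/626602-1.html" in subset


culprit = find_hanging_test(CANDIDATES, fake_hangs)
```

For 11 candidates this takes at most four runs, compared with up to eleven when running each test individually.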
Depends on: 908363
Unhidden.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Thanks James! :-)