Closed Bug 722941 Opened 13 years ago Closed 12 years ago

unknown exception in runTest: Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] on www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html

Categories

(Testing :: Talos, defect)

15 Branch
x86
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mbrubeck, Unassigned)

Details

(Keywords: crash, intermittent-failure, Whiteboard: [red])

Crash Data

I've seen this before but I couldn't find an open bug for it. Similar to bug 701935.

https://tbpl.mozilla.org/php/getParsedLog.php?id=8989725&tree=Firefox
Rev3 Fedora 12 mozilla-central talos tp_responsiveness on 2012-01-31 15:13:18 PST for push 1410782d557d

NOISE: Cycle 9: loaded http://localhost/page_load_test/tp5/yahoo.co.jp/www.yahoo.co.jp/index.html (next: http://localhost/page_load_test/tp5/amazon.com/www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html)
NOISE: Found crashdump: /tmp/tmprKrcUg/profile/minidumps/5a5ed356-53fb-8c70-3d889a48-2b3a6d90.dmp
Operating system: Linux
 0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686
CPU: x86
 GenuineIntel family 6 model 23 stepping 10
 2 CPUs

Crash reason: SIGABRT
Crash address: 0x86a

Thread 0 (crashed)
0 linux-gate.so + 0x424
 eip = 0x00ff2424 esp = 0xbff3a9f4 ebp = 0xbff3aa18 ebx = 0xb03f0600
 esi = 0x00000000 edi = 0x00c58ff4 eax = 0xfffffffc ecx = 0x00000009
 edx = 0xffffffff efl = 0x00000293
 Found by: given as instruction pointer in context
1 libglib-2.0.so.0.2200.2 + 0x47a0b
 eip = 0x00d15a0c esp = 0xbff3aa20 ebp = 0xbff3aa38
 Found by: previous frame's frame pointer
2 libxul.so!PollWrapper [nsAppShell.cpp : 66 + 0x12]
 eip = 0x01b9560b esp = 0xbff3aa40 ebp = 0xbff3aae8
 Found by: previous frame's frame pointer
3 libglib-2.0.so.0.2200.2 + 0x3a882
 eip = 0x00d08883 esp = 0xbff3aa70 ebp = 0xbff3aae8 ebx = 0x00db61a4
 Found by: call frame info
4 libglib-2.0.so.0.2200.2 + 0x3ab73
 eip = 0x00d08b74 esp = 0xbff3aaf0 ebp = 0xbff3ab28
 Found by: previous frame's frame pointer
5 libxul.so!nsAppShell::ProcessNextNativeEvent [nsAppShell.cpp : 162 + 0x7]
 eip = 0x01b955ae esp = 0xbff3ab30 ebp = 0x36449a06
 Found by: previous frame's frame pointer
6 libxul.so!nsBaseAppShell::DoProcessNextNativeEvent [nsBaseAppShell.cpp : 171 + 0x4]
 eip = 0x01baa52e esp = 0xbff3ab50 ebp = 0x36449a06 ebx = 0x024b8430
 Found by: call frame info
7 libxul.so!nsBaseAppShell::OnProcessNextEvent [nsBaseAppShell.cpp : 324 + 0xd]
 eip = 0x01baa71b esp = 0xbff3ab70 ebp = 0x36449a06 ebx = 0x024b8430
 esi = 0xb56c73d0 edi = 0xb763c740
 Found by: call frame info
8 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 619 + 0x8]
 eip = 0x01d00870 esp = 0xbff3abb0 ebp = 0xb76ebe90 ebx = 0x024b8430
 esi = 0xb763c740 edi = 0x00000000
 Found by: call frame info
9 libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 245 + 0x12]
 eip = 0x01cd043d esp = 0xbff3ac20 ebp = 0xb76ebe90 ebx = 0x024b8430
 esi = 0x00000001 edi = 0xb761c9d0
 Found by: call frame info
10 libxul.so!mozilla::ipc::MessagePump::Run [MessagePump.cpp : 134 + 0xb]
 eip = 0x01c56af6 esp = 0xbff3ac50 ebp = 0xb76ebe90 ebx = 0x024b8430
 esi = 0xb76ebe80 edi = 0xb761c9d0
 Found by: call frame info

https://tbpl.mozilla.org/php/getParsedLog.php?id=8985510&tree=Firefox
Rev3 Fedora 12 mozilla-central pgo talos tp_responsiveness on 2012-01-31 12:27:55 PST for push 29514d9b4216

NOISE: Cycle 9: loaded http://localhost/page_load_test/tp5/msn.com/www.msn.com/index.html (next: http://localhost/page_load_test/tp5/yahoo.co.jp/www.yahoo.co.jp/index.html)
NOISE: Cycle 9: loaded http://localhost/page_load_test/tp5/yahoo.co.jp/www.yahoo.co.jp/index.html (next: http://localhost/page_load_test/tp5/amazon.com/www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html)
NOISE: Found crashdump: /tmp/tmpGnHeOQ/profile/minidumps/31b0b0e2-5975-2e80-2d5f1dbe-28d9f080.dmp
Operating system: Linux
 0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686
CPU: x86
 GenuineIntel family 6 model 23 stepping 10
 2 CPUs

Crash reason: SIGABRT
Crash address: 0x844

Thread 0 (crashed)
0 linux-gate.so + 0x424
 eip = 0x00fee424 esp = 0xbf90d9b4 ebp = 0xbf90d9d8 ebx = 0xb03f0bf0
 esi = 0x00000000 edi = 0x00c58ff4 eax = 0xfffffffc ecx = 0x00000009
 edx = 0xffffffff efl = 0x00000293
 Found by: given as instruction pointer in context
1 libglib-2.0.so.0.2200.2 + 0x47a0b
 eip = 0x001d5a0c esp = 0xbf90d9e0 ebp = 0xbf90d9f8
 Found by: previous frame's frame pointer
2 libxul.so!PollWrapper [nsAppShell.cpp:29514d9b4216 : 66 + 0x1c]
 eip = 0x01d70f58 esp = 0xbf90da00 ebp = 0xbf90da98
 Found by: previous frame's frame pointer
3 libglib-2.0.so.0.2200.2 + 0x3a882
 eip = 0x001c8883 esp = 0xbf90da20 ebp = 0xbf90da98 ebx = 0x002761a4
 esi = 0xb7647580
 Found by: call frame info
4 libglib-2.0.so.0.2200.2 + 0x3ab73
 eip = 0x001c8b74 esp = 0xbf90daa0 ebp = 0xbf90dad8
 Found by: previous frame's frame pointer
5 libxul.so!nsAppShell::ProcessNextNativeEvent [nsAppShell.cpp:29514d9b4216 : 162 + 0x14]
 eip = 0x01d70ed4 esp = 0xbf90dae0 ebp = 0x00000000
 Found by: previous frame's frame pointer
6 libxul.so!nsBaseAppShell::OnProcessNextEvent [nsBaseAppShell.cpp:29514d9b4216 : 171 + 0xb]
 eip = 0x01d89127 esp = 0xbf90db00 ebp = 0x00000000 ebx = 0x027819b4
 Found by: call frame info
7 libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:29514d9b4216 : 619 + 0x3a]
 eip = 0x01f2e947 esp = 0xbf90db40 ebp = 0x00000001 ebx = 0x027819b4
 esi = 0xb763c740 edi = 0xbf90db88
 Found by: call frame info
8 libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp:29514d9b4216 : 245 + 0x16]
 eip = 0x01ef54df esp = 0xbf90dbb0 ebp = 0x02775860 ebx = 0x027819b4
 esi = 0xbf90dbdf edi = 0xb761c9d0
 Found by: call frame info
9 libxul.so!mozilla::ipc::MessagePump::Run [MessagePump.cpp:29514d9b4216 : 134 + 0x12]
 eip = 0x01e4974f esp = 0xbf90dbf0 ebp = 0x02775860 ebx = 0x027819b4
 esi = 0xb76ebe80 edi = 0xb761c9d0
 Found by: call frame info
10 libxul.so!MessageLoop::Run [message_loop.cc:29514d9b4216 : 208 + 0xb]
 eip = 0x01f6395f esp = 0xbf90dc40 ebp = 0xbf90e1ec ebx = 0x027819b4
 esi = 0xb56c73d0 edi = 0xb763c740
Whiteboard: [red] → [red][orange]
Crash Signature: [@ PollWrapper]
Summary: Intermittent red Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] → Intermittent red Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper]
Depends on: 720852
(In reply to Matt Brubeck (:mbrubeck) from comment #0)
> Crash reason: SIGABRT
> Crash address: 0x86a
>
> Thread 0 (crashed)
> 0 linux-gate.so + 0x424
> eip = 0x00ff2424 esp = 0xbff3a9f4 ebp = 0xbff3aa18 ebx = 0xb03f0600
> esi = 0x00000000 edi = 0x00c58ff4 eax = 0xfffffffc ecx = 0x00000009
> edx = 0xffffffff efl = 0x00000293
> Found by: given as instruction pointer in context
> 1 libglib-2.0.so.0.2200.2 + 0x47a0b
> eip = 0x00d15a0c esp = 0xbff3aa20 ebp = 0xbff3aa38
> Found by: previous frame's frame pointer

% addr2line -if -e glib2-debuginfo-2.22.2-1.fc12.i686/usr/lib/debug/usr/lib/libglib-2.0.so.debug 0x47a0b
IA__g_poll
/usr/src/debug/glib-2.22.2/glib/gpoll.c:127

http://git.gnome.org/browse/glib/tree/glib/gpoll.c?id=2.22.2#n127

I can't think of a good reason why poll should raise SIGABRT (particularly when there is no error message). The test suite uses SIGABRT to kill the process, and that seems more likely, but I don't see any timeout reported in the output.
Same stack, morphing to cover mochitest as well (happy to break out if that is preferred). https://tbpl.mozilla.org/php/getParsedLog.php?id=11780778&tree=Mozilla-Inbound
Summary: Intermittent red Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] → Intermittent Linux tp5 and mochitest | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper]
Comment 200 is a timeout in browser_tilt_03_tab_switch.js and so would be better filed separately. The stack here merely means that the app is waiting for an event.
Summary: Intermittent Linux tp5 and mochitest | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] → Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper]
Summary: Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] → Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] on www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html
I see that "terminated" here is the transitive verb. The terminator is talos: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/ffprocess_linux.py#l94
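To illustrate the point about who does the terminating, here is a minimal Python sketch, assuming a POSIX system. It is not the talos code linked above; `abort_child` is a hypothetical helper. A parent sends SIGABRT to an otherwise idle child, and the child then appears to have "terminated with SIGABRT" even though it did nothing wrong itself, which is consistent with a stack that only shows the app waiting in poll.

```python
import signal
import subprocess
import sys


def abort_child(seconds=60):
    """Start an idle child process and kill it from outside with SIGABRT.

    Hypothetical helper for illustration only; not part of talos.
    """
    # The child just sleeps, a stand-in for a browser waiting for events.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(%d)" % seconds]
    )
    # The parent is the "terminator": the signal comes from outside.
    child.send_signal(signal.SIGABRT)
    # On POSIX, a negative return code -N means "killed by signal N".
    return child.wait()


if __name__ == "__main__":
    code = abort_child()
    print("child exit status: %d" % code)
```

On POSIX the returned status is the negated signal number, so a harness that kills a hung process this way will always find a SIGABRT crash report afterwards.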
An unknown exception is raised in this block in runTest():
http://hg.mozilla.org/build/talos/annotate/d00eba6055f5/talos/ttest.py#l243

When cleanupAndCheckForCrashes raises its own exception, we seem to lose any info about the previous exception.

My python is sketchy and I don't know why we don't get the exception traceback printed after this "raise" re-raises the previous exception:
http://hg.mozilla.org/build/talos/annotate/d00eba6055f5/talos/ttest.py#l437
Component: General → Talos
Product: Core → Testing
QA Contact: general → talos
Summary: Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] on www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html → unknown exception in runTest: Intermittent Linux tp5 | stack found after process termination: terminated with SIGABRT [@ linux-gate.so + 0x424] [@ PollWrapper] on www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M/507846.html
Blocks: 438871
Depends on: 720852, 755998
(In reply to Karl Tomlinson (:karlt) from comment #203)
> An unknown exception is raised in this block in runTest():
> http://hg.mozilla.org/build/talos/annotate/d00eba6055f5/talos/ttest.py#l243
>
> When cleanupAndCheckForCrashes raises its own exception, we seem to lose any
> info about the previous exception.
>
> My python is sketchy and I don't know why we don't get the exception
> traceback printed after this "raise" re-raises the previous exception:
> http://hg.mozilla.org/build/talos/annotate/d00eba6055f5/talos/ttest.py#l437

It seems my understanding of "current scope" in "If no expressions are present, ``raise`` re-raises the last exception that was active in the current scope" wasn't quite right.

"As of Python 1.5, the [exception] variables are restored to their previous values (before the call) when returning from a function that handled an exception" implies that cleanupAndCheckForCrashes exceptions might need to be handled in a separate function, not just a separate try block, in order for ``raise`` to re-raise the original exception.

But it would seem preferable that cleanupAndCheckForCrashes didn't raise an exception unless something has gone wrong. Finding a stack after killing the process is expected.
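A minimal Python sketch of the separate-function idea, under stated assumptions: `TalosTimeout`, `CrashFound`, `safe_cleanup` and `run_test` are hypothetical names for illustration, and `cleanup_and_check_for_crashes` is only a stand-in for cleanupAndCheckForCrashes, not the talos implementation. The cleanup failure is handled inside its own function, and the original exception is captured by name so the re-raise does not depend on what a bare ``raise`` treats as the current exception.

```python
class TalosTimeout(Exception):
    """Hypothetical stand-in for the original failure, e.g. a timeout."""


class CrashFound(Exception):
    """Hypothetical stand-in for the exception the cleanup step raises."""


def cleanup_and_check_for_crashes():
    # Stand-in for cleanupAndCheckForCrashes: it always reports a stack here.
    raise CrashFound("stack found after process termination")


def safe_cleanup():
    # Handle the cleanup failure in its own function so it cannot replace
    # the exception that the caller is still handling.
    try:
        cleanup_and_check_for_crashes()
    except CrashFound as cleanup_error:
        return cleanup_error
    return None


def run_test():
    try:
        raise TalosTimeout("timeout exceeded")
    except TalosTimeout as original:
        cleanup_error = safe_cleanup()
        if cleanup_error is not None:
            # Report the secondary problem without hiding the first one.
            print("cleanup also reported: %s" % cleanup_error)
        # Re-raise the captured exception explicitly.
        raise original
```

With this shape the caller of `run_test()` sees the timeout, and the stack found after the kill is only logged as a secondary message.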
Blocks: 738716
Given bug 720852 and the consistent elapsed times of 369x seconds, the original exception is most likely a timeout.
Severity: normal → critical
https://tbpl.mozilla.org/php/getParsedLog.php?id=13181516&tree=Mozilla-Aurora

Calling this Mozilla 15 isn't entirely accurate, since it's really the Talos branch that 15 and below pull, but it's WFM now and that WFM will ride the trains with 15.
Version: Trunk → 15 Branch
Whiteboard: [red][orange] → [red]
Crash Signature: [@ PollWrapper] → [@ PollWrapper] [@ linux-gate.so + 0x424]
This bug was about tp5 hanging because it was hitting the network, and it has been fixed since tp5 became tp5n, the flavor that no longer hits the network. I don't know what unfiled things the trunk failures that were starred as this between last July and now actually were (the last few seem to be a different hang, in dirtypaint), but they were not this. The Aurora failures starting in November were the permaorange in bug 823989, which is neither tp5 nor talos.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED