Closed
Bug 889869
Opened 11 years ago
Closed 10 years ago
Intermittent LOOK AT THE STACK ipc | application timed out after 330 seconds with no output | application crashed [@ linux-gate.so + 0x424] | libxcb.so.1.1.0 + 0x811f
Categories
(Core :: Graphics: Layers, defect, P5)
Tracking
()
RESOLVED
DUPLICATE
of bug 975216
People
(Reporter: RyanVM, Unassigned)
References
Details
(Keywords: crash, intermittent-failure)
Crash Data
Attachments
(1 file)
768.39 KB,
image/png
https://tbpl.mozilla.org/php/getParsedLog.php?id=24886335&tree=Mozilla-Central
Ubuntu VM 12.04 mozilla-central opt test crashtest-ipc on 2013-07-03 07:38:18 PDT for push 2cae857c17cb
slave: tst-linux32-ec2-019

07:48:43 INFO - REFTEST TEST-START | file:///builds/slave/test/build/tests/reftest/tests/layout/style/crashtests/645951-1.html
07:48:43 INFO - REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/style/crashtests/645951-1.html | 1815 / 2487 (72%)
07:54:13 WARNING - TEST-UNEXPECTED-FAIL | file:///builds/slave/test/build/tests/reftest/tests/layout/style/crashtests/645951-1.html | application timed out after 330 seconds with no output
07:54:13 INFO - args: ['/builds/slave/test/build/tests/bin/screentopng']
07:54:13 INFO - Xlib: extension "RANDR" missing on display ":0".
07:54:22 INFO - SCREENSHOT: <see attached>
07:54:22 INFO - INFO | automation.py | Application ran for: 0:11:23.070918
07:54:22 INFO - INFO | zombiecheck | Reading PID log: /tmp/tmpFtLN8wpidlog
07:54:22 INFO - ==> process 2286 launched child process 2317
07:54:22 INFO - ==> process 2317 launched child process 2348
07:54:22 INFO - INFO | zombiecheck | Checking for orphan process with PID: 2317
07:54:22 INFO - INFO | zombiecheck | Checking for orphan process with PID: 2348
07:54:22 INFO - mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1372857248/firefox-25.0a1.en-US.linux-i686.crashreporter-symbols.zip
07:54:22 INFO - Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1372857248/firefox-25.0a1.en-US.linux-i686.crashreporter-symbols.zip
07:55:03 WARNING - PROCESS-CRASH | file:///builds/slave/test/build/tests/reftest/tests/layout/style/crashtests/645951-1.html | application crashed [@ linux-gate.so + 0x424]
07:55:03 INFO - Crash dump filename: /tmp/tmp2mR9tf/minidumps/2d56bc86-7869-1260-7f6b698b-7d170286.dmp
07:55:03 INFO - Operating system: Linux
07:55:03 INFO - 0.0.0 Linux 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686
07:55:03 INFO - CPU: x86
07:55:03 INFO - GenuineIntel family 6 model 45 stepping 7
07:55:03 INFO - 1 CPU
07:55:03 INFO - Crash reason: SIGABRT
07:55:03 INFO - Crash address: 0x8eb
07:55:03 INFO - Thread 0 (crashed)
07:55:03 INFO - 0 linux-gate.so + 0x424
07:55:03 INFO - eip = 0xb7702424 esp = 0xbfd21a50 ebp = 0x00000000 ebx = 0xbfd21aa8
07:55:03 INFO - esi = 0x00000000 edi = 0xb758fff4 eax = 0xfffffffc ecx = 0x00000001
07:55:03 INFO - edx = 0xffffffff efl = 0x00200282
07:55:03 INFO - Found by: given as instruction pointer in context
07:55:03 INFO - 1 libc-2.15.so + 0xdc37f
07:55:03 INFO - eip = 0xb74cb380 esp = 0xbfd21a60 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 2 libxcb.so.1.1.0 + 0x1fff3
07:55:03 INFO - eip = 0xb3773ff4 esp = 0xbfd21a74 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 3 libxcb.so.1.1.0 + 0x811f
07:55:03 INFO - eip = 0xb375c120 esp = 0xbfd21a80 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 4 libxcb.so.1.1.0 + 0x9932
07:55:03 INFO - eip = 0xb375d933 esp = 0xbfd21a90 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 5 libxcb.so.1.1.0 + 0x1fff3
07:55:03 INFO - eip = 0xb3773ff4 esp = 0xbfd21ac0 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 6 libxcb.so.1.1.0 + 0x9a4f
07:55:03 INFO - eip = 0xb375da50 esp = 0xbfd21ad0 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
07:55:03 INFO - 7 libxcb.so.1.1.0 + 0x1fff3
07:55:03 INFO - eip = 0xb3773ff4 esp = 0xbfd21ae4 ebp = 0x00000000
07:55:03 INFO - Found by: stack scanning
Updated•11 years ago
|
Crash Signature: [@ linux-gate.so@0x424]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 3•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=25343400&tree=Mozilla-Inbound
Reporter | ||
Comment 20•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=25844189&tree=Birch
Comment 22•11 years ago
|
||
Looking at the stack frame candidates in the log of comment 20 and using addr2line -if -e with dbg packages from precise, the application appears to be waiting in poll for the X server to reply, with this stack:

3 _xcb_conn_wait /build/buildd/libxcb-1.8.1/build/src/../../src/xcb_conn.c:400
4 _xcb_in_wake_up_next_reader /build/buildd/libxcb-1.8.1/build/src/../../src/xcb_in.c:621
6 wait_for_reply /build/buildd/libxcb-1.8.1/build/src/../../src/xcb_in.c:390
11 xcb_wait_for_reply /build/buildd/libxcb-1.8.1/build/src/../../src/xcb_in.c:420
16 _XReply /build/buildd/libx11-1.4.99.1/build/src/../../src/xcb_io.c:601
22 XScreenSaverQueryInfo /build/buildd/libxss-1.2.1/build/src/../../src/XScrnSaver.c:220 (discriminator 2)

There isn't much remarkable about that. I might have suspected an X server hang, except that the screenshot took only 9 seconds to capture (from the server).

That leaves only speculation. Perhaps a file descriptor problem. Perhaps https://bugs.freedesktop.org/show_bug.cgi?id=56508

No clues here.
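For anyone repeating the symbolication step, here is a minimal Python sketch of how the "module + offset" frame lines from the minidump output could be turned into addr2line invocations. The helper names and the debug-file path are hypothetical; only the addr2line flags (-i, -f, -e) come from the comment above, and whether an offset needs adjusting before lookup is not covered here.

```python
import re

# Matches frame lines such as "07:55:03 INFO - 3 libxcb.so.1.1.0 + 0x811f".
# The pattern is an assumption based on the log format quoted in this bug.
FRAME_RE = re.compile(r"(?:^|\s)(\d+)\s+(\S+)\s+\+\s+(0x[0-9a-fA-F]+)\s*$")


def parse_frames(lines):
    """Return (frame_index, module, offset) tuples for lines that look like frames."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append((int(match.group(1)), match.group(2), int(match.group(3), 16)))
    return frames


def addr2line_command(debug_file, offset):
    """Build an addr2line argument list; debug_file is a caller-supplied path."""
    return ["addr2line", "-i", "-f", "-e", debug_file, hex(offset)]


if __name__ == "__main__":
    sample = [
        "07:55:03 INFO - 3 libxcb.so.1.1.0 + 0x811f",
        "07:55:03 INFO - Found by: stack scanning",
    ]
    for index, module, offset in parse_frames(sample):
        # "/path/to/debug/" is a placeholder, not a real location.
        print(index, " ".join(addr2line_command("/path/to/debug/" + module, offset)))
```

The argument list can be passed to subprocess.run on a machine that has the matching dbg packages installed.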
Comment 35•11 years ago
|
||
Do we have a Linux machine in the test farm that reproduces this more often than others? We may have to remote into that box to debug. I see these 3 machines in the logs:

slave: tst-linux32-ec2-019
slave: tst-linux32-ec2-084
slave: tst-linux32-ec2-090

Note that the symptoms Karl noted in comment 22 seem identical to this resolved bug: https://bugzilla.mozilla.org/show_bug.cgi?id=555352#c31
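One way to answer the "which machine reproduces this most often" question is to tally the "slave:" field across the failing logs. The sketch below assumes each log's text has already been fetched into a string; the function name is hypothetical and the pattern is based only on the "slave: tst-linux32-ec2-019" form seen in this bug.

```python
import re
from collections import Counter

# Assumed format: "slave: <hostname>" appears once near the top of each log.
SLAVE_RE = re.compile(r"\bslave:\s*(\S+)")


def tally_slaves(log_texts):
    """Count how many of the given logs ran on each test slave."""
    counts = Counter()
    for text in log_texts:
        match = SLAVE_RE.search(text)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    logs = [
        "for push 2cae857c17cb slave: tst-linux32-ec2-019 07:48:43 INFO - ...",
        "slave: tst-linux32-ec2-084",
        "slave: tst-linux32-ec2-019",
    ]
    for host, count in tally_slaves(logs).most_common():
        print(host, count)
```

With only a handful of logs the counts will not be statistically meaningful, so this is a starting point for picking a box to remote into rather than evidence of a bad machine.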
Updated•11 years ago
|
Summary: Intermittent LOOK AT THE STACK 645951-1.html | application timed out after 330 seconds with no output | application crashed [@ linux-gate.so + 0x424] → Intermittent LOOK AT THE STACK ipc 645951-1.html | application timed out after 330 seconds with no output | application crashed [@ linux-gate.so + 0x424]
Updated•11 years ago
|
Summary: Intermittent LOOK AT THE STACK ipc 645951-1.html | application timed out after 330 seconds with no output | application crashed [@ linux-gate.so + 0x424] → Intermittent LOOK AT THE STACK ipc | application timed out after 330 seconds with no output | application crashed [@ linux-gate.so + 0x424] | libxcb.so.1.1.0 + 0x811f
Updated•11 years ago
|
Priority: -- → P5
Comment 61•11 years ago
|
||
This only happens on ipc tests. The X server seems to be responding to other clients, but the client here is waiting for a reply, suggesting that libxcb may be confused about its state. Could browser.tabs.remote cause Xlib to be used from more than one thread?
Component: Layout → Graphics: Layers
Comment 65•11 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #61)
> Could browser.tabs.remote cause Xlib to be used from more than one thread?

Fwiw, that was my impression too on the duped bug 943241.
Comment 66•11 years ago
|
||
Can you explain in small simple words I can understand what makes a test failure _this bug_, or alternately what makes it not this bug? Right now, you have it set up to get tbplbot spammed for every single 330 seconds without output hang on Linux, since they all get a SIGABRT in linux-gate.so + 0x424. Soon, tbpl is going to blacklist that signature, because we do just throw every single unfiled timeout into whatever bug happens to have that in the summary.
Comment 67•11 years ago
|
||
The significant things to match here are:

1) This happens only in reftest-ipc and crashtest-ipc
2) libxcb.so.1.1.0 + 0x811f is on the stack

I don't know what libxcb.so.1.1.0 + 0x1fff3 is, but given that it occurs multiple times on the stack and addr2line doesn't know, it may not be a return address but just a pointer on the stack to some data, which the stack scanning algorithm thought might be interesting.

Yes, linux-gate.so + 0x424 just means a system call afaik.
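The two matching criteria above can be expressed as a small check, sketched here in Python. The function name and the idea of passing the suite name separately are assumptions for illustration; this is not how tbpl actually matched failures to bugs.

```python
# Suites and stack marker taken from the criteria in this comment.
IPC_SUITES = ("reftest-ipc", "crashtest-ipc")
STACK_MARKER = "libxcb.so.1.1.0 + 0x811f"


def matches_this_bug(suite_name, log_text):
    """Return True only if both criteria hold: an ipc suite and the libxcb frame."""
    # Collapse runs of whitespace so column alignment in the log does not matter.
    normalized = " ".join(log_text.split())
    return suite_name in IPC_SUITES and STACK_MARKER in normalized


if __name__ == "__main__":
    log = "0 linux-gate.so + 0x424\n 3  libxcb.so.1.1.0 +  0x811f\n"
    print(matches_this_bug("crashtest-ipc", log))
    print(matches_this_bug("crashtest", log))
```

Note that linux-gate.so + 0x424 is deliberately not part of the check, since it appears in every such timeout on Linux.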
Comment 68•11 years ago
|
||
Bug 910488 may be the same issue. It was semi-reproducible before graffiti covered over the bug.
Comment 69•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=31424675&tree=Mozilla-Inbound
Comment 70•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=31607504&tree=Mozilla-Inbound
Reporter | ||
Comment 71•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=32042926&tree=Fx-Team
Reporter | ||
Comment 72•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=32229481&tree=Fx-Team
Comment 73•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=33234562&tree=Mozilla-Inbound
Reporter | ||
Comment 74•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=33578160&tree=Mozilla-Esr24
Reporter | ||
Comment 75•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=33831500&tree=Fx-Team
Reporter | ||
Comment 76•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=33891301&tree=B2g-Inbound
Reporter | ||
Comment 77•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34076224&tree=Mozilla-Inbound
Reporter | ||
Comment 78•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34292210&tree=Mozilla-Inbound
Comment 79•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34322865&tree=Mozilla-Aurora
Reporter | ||
Comment 81•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34860496&tree=Mozilla-Inbound
Reporter | ||
Comment 82•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34861548&tree=Fx-Team
https://tbpl.mozilla.org/php/getParsedLog.php?id=34857513&tree=Fx-Team
Reporter | ||
Comment 86•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=34930082&tree=Mozilla-Inbound

Karl and Matt, I know we did a little IRC chatting about this bug last night. Any chance we could summarize that here? This is getting more frequent now that we're running reftests on B2G desktop builds and will get even more so once we start running more than just reftest-sanity on them. I think I could justifiably argue that this should block that work, actually.
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(karlt)
Comment 87•10 years ago
|
||
Fwiw, getting more reftests running on b2g desktop is pretty low priority for me atm, for two reasons:

1) I'm going to shift focus towards intermittent issues/log mangling
2) b2g desktop is being replaced by mulet and I'd like to avoid triaging reftest failures on two separate platforms if at all possible.
Comment 88•10 years ago
|
||
All we decided was that b2g reftests, along with linux Cipc/Ripc, were places where we are accessing Xlib from multiple threads. This should be ok (we initialize Xlib in threaded mode for trunk builds), but it's plausible that we're hitting bugs that are specific to this code path.
Reporter | ||
Comment 89•10 years ago
|
||
Any suggestions for how to proceed? This is quickly climbing the ranks.
Comment 90•10 years ago
|
||
I wonder whether this could be the same issue that http://cgit.freedesktop.org/xcb/libxcb/commit/?id=23911a707b8845bff52cd7853fc5d59fb0823cef addressed.

I expect the fix is in 1.8.1-1ubuntu0.1
https://bugs.launchpad.net/ubuntu/+source/libxcb/+bug/1059276/comments/26

Probably worth going to 1.8.1-1ubuntu0.2 to fix the memory safety bug too
http://changelogs.ubuntu.com/changelogs/pool/main/libx/libxcb/libxcb_1.8.1-1ubuntu0.2/changelog
Flags: needinfo?(karlt)
Reporter | ||
Comment 91•10 years ago
|
||
By my perusal of the logs:

libxcb-devel.i686 0:1.5-1.el6

Ouch. I'll file a RelEng bug for getting that updated.
Flags: needinfo?(matt.woodrow)
Comment 92•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #91)
> libxcb-devel.i686 0:1.5-1.el6

I'm guessing that is on a CentOS build machine, but we need the update on the Ubuntu test machines.
Updated•10 years ago
|