Closed
Bug 591898
Opened 14 years ago
Closed 12 years ago
[Linux SeaMonkey 2.1, crashtest] 457362-1.xhtml segfaults on tinderbox but not locally
Categories
(SeaMonkey :: Testing Infrastructure, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: iannbugzilla, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: intermittent-failure, regression)
Attachments
(1 file)
18.51 KB,
text/plain
On 13th August the SeaMonkey Linux comm-central-trunk debug test crashtest was switched from cn-sea-qm-centos5-01 to cb-seamonkey-linux-01. Since then, crashtest 457362-1.xhtml segfaults on that tinderbox with:

TEST-UNEXPECTED-FAIL | file:///builds/slave/comm-central-trunk-linux-debug-unittest-crashtest/build/reftest/tests/layout/base/crashtests/457362-1.xhtml | Exited with code -11 during test run

If you run the tests locally with:

make crashtest TEST_PATH=layout/base/crashtests/crashtests.list

there is no segfault:

REFTEST INFO | Result summary:
REFTEST INFO | Successful: 305 (0 pass, 305 load only)
REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)
REFTEST INFO | Known problems: 2 (0 known fail, 0 known asserts, 0 random, 2 skipped, 0 slow)
REFTEST INFO | Total canvas count = 0

The pushes that happened in the window between passing and failing are:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=29114207a571&tochange=e1d20276ef6d
http://hg.mozilla.org/comm-central/pushloghtml?fromchange=16d674d435cf&tochange=579f8b02ac29
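A note on reading "Exited with code -11": assuming the test runner reports the child status using the Python subprocess convention (an assumption, not confirmed anywhere in this bug), a negative code is the number of the signal that terminated the process, so -11 would mean signal 11, which is SIGSEGV on Linux. A minimal sketch of that decoding; `describe_exit_code` is a hypothetical helper, not part of the reftest harness:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a subprocess-style return code.

    Assumption: negative values follow the Python subprocess convention,
    i.e. the process was terminated by signal number -code.
    """
    if code < 0:
        signum = -code
        try:
            # Look up the symbolic name for this signal number.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


# The value reported in the tinderbox log above.
print(describe_exit_code(-11))
```

This only interprets the number; it says nothing about where in the code the segfault happened.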
Comment 1•14 years ago
Seems more like releng territory, punting to them.
Assignee: server-ops → nobody
Component: Server Operations: Tinderbox Maintenance → Release Engineering
QA Contact: mrz → release
Comment 2•14 years ago
(In reply to comment #0)
> On 13th August the SeaMonkey Linux comm-central-trunk debug test crashtest was
> switched from cn-sea-qm-centos5-01 to cb-seamonkey-linux-01,

It was never switched between boxes; it just executes on whatever box is free for it, and those two are just two random possibilities out of the five we have. If it only fails on a single box, it might be a box problem and we need to find it; otherwise this is a code problem, which from all I've seen is way more likely.
Component: Release Engineering → Testing Infrastructure
Product: mozilla.org → SeaMonkey
QA Contact: release → testing-infrastructure
Version: other → Trunk
Updated•14 years ago
Summary: Crashtest 457362-1.xhtml segfaults on tinderbox but not locally → [SeaMonkey 2.1, crashtest] 457362-1.xhtml segfaults on tinderbox but not locally
It's just strange that the tests work locally but not on tinderbox; someone with tinderbox access would need to debug the segfault. Perhaps a package requirement was introduced by one of the pushes in http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=29114207a571&tochange=e1d20276ef6d
Comment 4•14 years ago
(In reply to comment #3)
> It's just strange that the tests work locally but don't on tinderbox, someone
> with tinderbox access would need to debug the seg fault. Perhaps a package
> requirement introduced by one of the pushes in
> http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=29114207a571&tochange=e1d20276ef6d

I haven't seen a trace of any such thing, at least for Linux. But then, all the OSes tested for Firefox are different from what we test right now: we can't afford Talos minis, and they run all their tests exclusively on such boxes, with different OSes than the builder machines, while we run them all on the builder machines. So if anything is needed on test boxes only, we don't see it, and don't have much of a chance to get the right packages.

Oh, and I doubt anyone with debug knowledge and access to our build boxes exists, unless Callek knows enough (I don't). And we have few enough boxes that we can't just set one aside for a longer period to dig in very much.
Comment 5•14 years ago
I got a hang during that test, attached ddd to it and got the output in the attachment. I cut it after line #40, since it went on endlessly with the same output around nsWindow::OnExposeEvent (I hit ctrl+c at about #6000).

But it's not completely reproducible; I also got clean test runs without any crash or hang for that test.
Comment 6•14 years ago
I let the test run again and at some point I got the following output:

REFTEST TEST-START | file:///home/i6stud/sibresch/nobackup/seamonkey_hg/trees/comm-central/mozilla/layout/base/crashtests/457362-1.xhtml
++DOMWINDOW == 92 (0x2ab7fff84468) [serial = 1526] [outer = 0x2ab7f80b0c00]
WARNING: g_closure_ref: assertion `closure->ref_count < CLOSURE_MAX_REF_COUNT' failed: 'glib warning', file /home/i6stud/sibresch/nobackup/seamonkey_hg/trees/comm-central/mozilla/toolkit/xre/nsSigHandlers.cpp, line 193

(seamonkey-bin:31479): GLib-GObject-CRITICAL **: g_closure_ref: assertion `closure->ref_count < CLOSURE_MAX_REF_COUNT' failed
WARNING: g_closure_ref: assertion `closure->ref_count < CLOSURE_MAX_REF_COUNT' failed: 'glib warning', file /home/i6stud/sibresch/nobackup/seamonkey_hg/trees/comm-central/mozilla/toolkit/xre/nsSigHandlers.cpp, line 193

The second block was again repeated endlessly. I interrupted the test again, and crashtest.log had 4.7 million lines.
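With a log of several million lines, it can help to see which messages dominate before reading it by hand. A minimal sketch, assuming the log is plain text with one message per line; `top_repeated_lines` is a hypothetical helper, and the sample lines below are shortened stand-ins rather than real log content:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_repeated_lines(lines: Iterable[str], limit: int = 5) -> List[Tuple[str, int]]:
    """Return the `limit` most frequent non-empty lines with their counts.

    Lines are consumed one at a time, so an open file object can be passed
    directly without loading the whole log into memory first.
    """
    counts = Counter()
    for raw in lines:
        text = raw.strip()
        if text:  # skip blank separator lines
            counts[text] += 1
    return counts.most_common(limit)


# Shortened stand-in lines; a real run would pass open("crashtest.log").
sample = [
    "REFTEST TEST-START | 457362-1.xhtml\n",
    "WARNING: g_closure_ref: assertion failed\n",
    "\n",
    "GLib-GObject-CRITICAL **: g_closure_ref: assertion failed\n",
    "WARNING: g_closure_ref: assertion failed\n",
]
for text, count in top_repeated_lines(sample):
    print(count, text)
```

Counting distinct lines rather than collapsing consecutive duplicates is a deliberate choice here, because the repeated block alternates between two different messages.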
Comment 7•14 years ago
m-c rev 29114207a571 doesn't give me a hang, but m-c rev 9fd11a17eb1a does (all tests run with a 64-bit SeaMonkey debug build).

I sometimes also get a warning after 457362-1.xhtml is loaded, but I have no idea if this is related:

WARNING: ContentViewer exists outside gHistoryMaxViewer range: '!viewer', file /home/i6stud/sibresch/nobackup/seamonkey_hg/trees/comm-central/mozilla/docshell/shistory/src/nsSHistory.cpp, line 846

The strange thing is that I don't see the issue if I only run the crashtests from mozilla/layout/base/. I have to run make crashtest in mozilla/.

CCing Markus as the patch author of bug 506826 and roc/dbaron as the reviewers. Perhaps they can tell us more about this issue.
Comment 8•14 years ago
Additional info: every time I attach ddd, I get a different beginning of the stack trace. Only the repeating OnExposeEvent calls are the same.
Comment 9•14 years ago
I've tested again on another machine with a newer kernel (2.6.31.12 instead of 2.6.27.45) and a newer Xorg (1.6.5 instead of 1.5.2) and can't see the problem there. So maybe it's an issue with the OS itself :/
Comment 10•14 years ago
(In reply to comment #9)
> I've tested again on another machine with a newer kernel (2.6.31.12 instead of
> 2.6.27.45) and a newer Xorg (1.6.5 instead of 1.5.2) and can't see the problem
> there. So maybe it's an issue with the OS itself :/

Make sure what you test is a debug build, as you found an assertion failure there, and assertions are fatal on debug but ignored on optimized builds.

It's of course entirely possible that the GTK version in CentOS 5 has some subtle problem we are running into there. We can upgrade it in some way as long as it doesn't harm the runtime requirements of the builds generated on the same machines, but we'd need RPMs applicable to this OS. And we can't run tests on any other platform, as we can't afford the luxury of running Talos with a different set of machines and OSes like FF does.
Reporter
Comment 11•14 years ago
I've tested again on a debug build, still not crashing.

My kernel is 2.6.32.19-163.fc12.i686
My xorg-x11-server-Xorg is 1.7.6-4.fc12.i686
My gtk2 is 2.18.9-3.fc12.i686
My gcc is 4.4.4 2010630 (Red Hat 4.4.4-10)
Updated•14 years ago
Summary: [SeaMonkey 2.1, crashtest] 457362-1.xhtml segfaults on tinderbox but not locally → [Linux SeaMonkey 2.1, crashtest] 457362-1.xhtml segfaults on tinderbox but not locally
Comment 12•14 years ago
http://brasstacks.mozilla.com/topfails/tests/SeaMonkey doesn't list this test :-/
Depends on: 564234
Updated•14 years ago
Whiteboard: [orange]
Comment 13•12 years ago
Mass marking whiteboard:[orange] bugs WFM (to clean up TBPL bug suggestions) that:
* Haven't changed in > 6 months
* Whose whiteboard contains none of the strings: {disabled,marked,random,fuzzy,todo,fails,failing,annotated,leave open,time-bomb}
* Passed a (quick) manual inspection of bug summary/whiteboard to ensure they weren't a false positive.

I've also gone through and searched for cases where the whiteboard wasn't labelled correctly after test disabling, by using attachment description & basic comment searches.

However, if the test this bug was about has in fact been disabled/annotated/..., please accept my apologies & reopen/mark the whiteboard appropriately so this doesn't get re-closed in the future (and please ping me via IRC or email so I can try to tweak the saved searches to avoid more edge cases).

Sorry for the spam!

Filter on: #FFA500
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
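The first two mass-marking criteria in comment 13 can be sketched as a simple predicate. This is an illustration only, not the saved search that was actually used: `is_stale_orange` is a hypothetical name, the case-insensitive matching is an assumption, "6 months" is approximated as 183 days, and the manual inspection step is not modelled.

```python
from datetime import datetime, timedelta

# Whiteboard strings listed in comment 13 that exempt a bug from mass closing.
EXEMPT_STRINGS = (
    "disabled", "marked", "random", "fuzzy", "todo",
    "fails", "failing", "annotated", "leave open", "time-bomb",
)

# Assumption: "> 6 months" approximated as more than 183 days.
STALE_AFTER = timedelta(days=183)


def is_stale_orange(whiteboard: str, last_changed: datetime, now: datetime) -> bool:
    """Return True if a bug matches the first two criteria from comment 13."""
    wb = whiteboard.lower()  # assumption: matching is case-insensitive
    if "[orange]" not in wb:
        return False
    if any(word in wb for word in EXEMPT_STRINGS):
        return False
    return now - last_changed > STALE_AFTER


# Hypothetical dates chosen for illustration, not taken from this bug.
now = datetime(2012, 8, 1)
print(is_stale_orange("[orange]", datetime(2010, 9, 1), now))
print(is_stale_orange("[orange][test disabled]", datetime(2010, 9, 1), now))
```

A substring check like this can produce false positives, which is presumably why the comment also mentions a manual inspection.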
Assignee
Updated•12 years ago
Keywords: intermittent-failure
Assignee
Updated•12 years ago
Whiteboard: [orange]