Closed Bug 473680 Opened 16 years ago Closed 6 years ago

setTimeout-heavy crashtests time out occasionally

Categories

(Core :: DOM: Core & HTML, defect, P5)

RESOLVED WORKSFORME

People

(Reporter: roc, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [3 tests disabled][notacrash])

Attachments

(1 file, 1 obsolete file)

Similar to bug 465475, this test has a heavy reliance on setTimeouts --- we do 10 iterations, each of which requires two chained setTimeouts. This may make the test unreliable on a heavily loaded machine. I suggest we introduce a time cap, and stop after 20 iterations or 5 seconds, whichever is earlier.
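A minimal sketch of the proposed cap, for illustration only. This is not the code from the actual test or the landed patch: the names runCapped, step, done, schedule, and now are hypothetical, and the two chained timeouts per iteration are modelled only roughly from the description above.

```javascript
// Hypothetical helper: run a two-timeout iteration repeatedly, stopping after
// maxIterations iterations or maxMs milliseconds, whichever comes first.
// `schedule` and `now` are injectable (an assumption made for testability);
// by default they use setTimeout and Date.now.
function runCapped(step, done, {
  maxIterations = 20,
  maxMs = 5000,
  schedule = (fn) => setTimeout(fn, 0),
  now = Date.now,
} = {}) {
  const start = now();
  let iteration = 0;

  function firstHalf() {
    // Check both limits before starting another iteration.
    if (iteration >= maxIterations || now() - start >= maxMs) {
      done(iteration);
      return;
    }
    step(iteration, "first");
    schedule(secondHalf); // first chained timeout
  }

  function secondHalf() {
    step(iteration, "second");
    iteration++;
    schedule(firstHalf); // second chained timeout
  }

  firstHalf();
}
```

In a crashtest, done would presumably be where the test signals completion (for example by removing the reftest-wait class), so a slow machine ends the test early rather than hitting the harness timeout.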
Keywords: checkin-needed
Whiteboard: [needs landing]
Pushed 9813b594352a
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Keywords: checkin-needed
Whiteboard: [needs landing]
This seems to have fixed the unreliability of the test.
It's still timing out quite frequently on the Windows unit test box, so I don't think it has. (Although I don't understand why the test after it also times out if and only if it times out.)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
> (Although I don't understand why the test after it also times
> out if and only if it times out.)
Pardon?
Whenever the problem happens, which is pretty frequently on the Windows unit test box today, the error is:
REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-central-win32-unittest/build/content/base/crashtests/458637-1.html | timed out waiting for onload to fire
REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-central-win32-unittest/build/content/base/crashtests/472593-1.html | timed out waiting for onload to fire
In other words, whatever goes wrong here (intermittently) causes two tests to time out: exactly the same two tests, and in perfect correlation.
We're seeing this very often on the Windows tracemonkey unit test machine.
Saw this again today, on the Windows mozilla-central unit test box. Got both failures from comment 6. http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233878688.1233882723.8582.gz
(In reply to comment #6)
> In other words, whatever goes wrong here (intermittently) causes two tests to
> time out: exactly the same two tests, and in perfect correlation.
Note that the two tests are consecutive in the manifest:
34 load 458637-1.html
35 load 472593-1.html
http://mxr.mozilla.org/mozilla-central/source/content/base/crashtests/crashtests.list
The same pair failed again today:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233946292.1233950785.28536.gz
WINNT 5.2 mozilla-central unit test on 2009/02/06 10:51:32
See also bug 473968, which I filed back in mid-January on timeout-pairs in 458637-1.html and its immediate successor at that time, which was 421715-1.html.
Based on that bug and this bug, it looks like crashtest 458637-1.html often sporadically times out **and makes its successor time out**, probably through no fault of the successor.
I propose disabling 458637-1.html until it's fixed (because a few sporadic failures per day is pretty bad for a unittest), but leaving 472593-1.html enabled for now, to see if it still sporadically fails on its own.
Here's a patch to disable it for now.
(In reply to comment #11)
> The same pair failed again today:
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233946292.1233950785.28536.gz
> WINNT 5.2 mozilla-central unit test on 2009/02/06 10:51:32
Actually, on further inspection, these tests failed in 4 out of the last 8 cycles on that Windows unittest box -- the above-linked failure, plus these 3 more:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233951780.1233959536.14554.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233939339.1233947825.23911.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233922725.1233929770.15780.gz
Disable-test patch pushed: http://hg.mozilla.org/mozilla-central/rev/c3ddf1a94340 (This crashtest clearly needs investigation/fixing, but until it gets attention, it doesn't do us any good to have it enabled -- it's just making tinderbox excessively orange right now.)
Comment on attachment 360978
patch to disable crashtest 458637-1.html
>-load 458637-1.html
>+# load 458637-1.html # sporadically fails -- see bug 473680
If you want to do this, you should use the "skip" keyword:
skip load 458637-1.html
which will skip it and report it as a known failure.
Attachment #360978 - Attachment is obsolete: true
Comment on attachment 360978
patch to disable crashtest 458637-1.html
(In reply to comment #15)
> If you want to do this, you should use the "skip" keyword:
Ah, thanks! Fixed to use 'skip' now:
http://hg.mozilla.org/mozilla-central/rev/e88a0268f8ed
Another timeout-heavy crashtest failed inexplicably today, this time on Mac:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234135066.1234140905.1257.gz
layout/base/crashtests/403175-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
If it does that again, I'll mark it as skip too.
Depends on: 477409
Summary: content/base/crashtests/458637-1.html may be unreliable → setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html)
The timeout of mochitest browser/base/content/test/browser_bug321000.js makes me think this might be a machine or setTimeout issue rather than a reftest harness issue: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234333287.1234337680.5516.gz#err13
(That's also a recent change; see bug 474081.)
Maybe the 10 second reftest load timeout is just not enough for the test machines when under load?
Blocks: 438871
Whiteboard: [orange]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238095284.1238103966.30828.gz
WINNT 5.2 mozilla-1.9.1 unit test on 2009/03/26 12:21:24
REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-1.9.1-win32-unittest/build/content/base/crashtests/458637-1.html | timed out waiting for onload to fire
(In reply to comment #22)
> file:///e:/builds/moz2_slave/mozilla-1.9.1-win32-unittest/build/content/base/crashtests/458637-1.html
> | timed out waiting for onload to fire
I already disabled this test on mozilla-central due to sporadic timeouts, in comment 16 (changeset e88a0268f8ed). If it times out a lot on 1.9.1, we should probably disable it there, too.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238517486.1238525138.10617.gz
WINNT 5.2 mozilla-1.9.1 unit test on 2009/03/31 09:38:06
(In reply to comment #24)
> If it times out a lot on 1.9.1, we should probably disable it there, too.
Yes, please.
Whiteboard: [orange] → [orange] [notacrash]
peterv, you've hacked on setTimeout. Any idea what could be happening with these tests?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263823761.1263824833.18577.gz Linux mozilla-central debug test crashtest on 2010/01/18 06:09:21
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264041886.1264043373.28468.gz
Linux mozilla-central debug test crashtest on 2010/01/20 18:44:46
s: moz2-linux-slave01
REFTEST TEST-UNEXPECTED-FAIL | file:///builds/moz2_slave/mozilla-central-linux-debug-unittest-crashtest/build/reftest/tests/content/base/crashtests/399712-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264107226.1264109101.30144.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264108604.1264110108.9376.gz
REFTEST TEST-UNEXPECTED-FAIL | file:///builds/moz2_slave/mozilla-central-linux-debug-unittest-crashtest/build/reftest/tests/content/base/crashtests/399712-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
Summary: setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html) → setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html, 399712-1.html)
Assignee: roc → nobody
Component: Layout → DOM
QA Contact: layout → general
OS: Mac OS X → All
Hardware: x86 → All
I disabled 399712-1.html in http://hg.mozilla.org/mozilla-central/rev/39a7d5b8ee6d. Now three tests are disabled:
content/base/crashtests/458637-1.html
content/base/crashtests/399712-1.html
layout/base/crashtests/403175-1.html
Summary: setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html, 399712-1.html) → setTimeout-heavy crashtests time out occasionally
Whiteboard: [orange] [notacrash] → [orange][3 tests disabled][notacrash]
A fix for bug 558306 landed today. I wonder if it could have been the cause. (Do Tinderbox machines run NTP clients and occasionally go backwards in time?)
Whiteboard: [orange][3 tests disabled][notacrash] → [3 tests disabled][notacrash]
Priority: -- → P5

I stumbled upon this bug recently and decided to see what the current state of things was. Of the 3 disabled tests, only 458637-1.html still appears to cause issues (intermittent assertions on debug builds and timeouts on Android). I'm going to close this bug out and re-enable the tests (except where 458637-1.html still misbehaves, as noted above). I have opened bug 1519651 for any follow-ups on that test.

Status: REOPENED → RESOLVED
Closed: 16 years ago → 6 years ago
Resolution: --- → WORKSFORME
Component: DOM → DOM: Core & HTML