setTimeout-heavy crashtests time out occasionally

RESOLVED WORKSFORME

Status

defect
P5
normal
RESOLVED WORKSFORME
10 years ago
a month ago

People

(Reporter: roc, Unassigned)

Tracking

(Depends on 1 bug, Blocks 1 bug, {intermittent-failure})

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [3 tests disabled][notacrash])

Attachments

(1 attachment, 1 obsolete attachment)

Similar to bug 465475, this test has a heavy reliance on setTimeouts --- we do 10 iterations, each of which requires two chained setTimeouts. This may make the test unreliable on a heavily loaded machine. I suggest we introduce a time cap, and stop after 20 iterations or 5 seconds, whichever is earlier.
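A minimal sketch of the kind of cap suggested above, assuming the test body can be expressed as a single `step` callback. The names `runCapped`, `step`, `done`, `now` and `schedule` are hypothetical and not taken from the actual test or patch; the clock and scheduler are injectable only so the cap can be exercised without real delays.

```javascript
// Hypothetical helper (not from the actual patch): run `step` through chained
// timeouts and stop after `maxIterations` steps or `maxMillis` of elapsed
// time, whichever comes first. `done` receives the number of steps that ran.
function runCapped(step, done, options) {
  const maxIterations = options.maxIterations;
  const maxMillis = options.maxMillis;
  // Injectable clock and scheduler; the defaults use the real ones.
  const now = options.now || Date.now;
  const schedule = options.schedule || ((fn) => setTimeout(fn, 0));

  const start = now();
  let count = 0;

  function tick() {
    // Check both caps before doing any more work.
    if (count >= maxIterations || now() - start >= maxMillis) {
      done(count);
      return;
    }
    step(count);
    count += 1;
    schedule(tick); // chain the next iteration
  }

  schedule(tick);
}
```

In a crashtest that uses `class="reftest-wait"` on the root element, `done` would presumably be where that class is removed so the harness can move on, for example `runCapped(step, () => document.documentElement.removeAttribute("class"), { maxIterations: 20, maxMillis: 5000 })`. On a heavily loaded machine the time cap ends the test early instead of letting the remaining timeouts run into the harness timeout.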
Keywords: checkin-needed
Whiteboard: [needs landing]
Pushed 9813b594352a
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
Keywords: checkin-needed
Whiteboard: [needs landing]
This seems to have fixed the unreliability of the test.
It's still timing out quite frequently on the Windows unit test box, so I don't think it has.  (Although I don't understand why the test after it also times out if and only if it times out.)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
> (Although I don't understand why the test after it also times
> out if and only if it times out.)

Pardon?
Whenever the problem happens, which is pretty frequently on the Windows unit test box today, the error is:

REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-central-win32-unittest/build/content/base/crashtests/458637-1.html | timed out waiting for onload to fire
REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-central-win32-unittest/build/content/base/crashtests/472593-1.html | timed out waiting for onload to fire

In other words, whatever goes wrong here (intermittently) causes two tests to time out:  exactly the same two tests, and in perfect correlation.

Comment 7

10 years ago
We're seeing this very often on the Windows TraceMonkey unit test machine.
Saw this again today, on the Windows mozilla-central unit test box.  Got both failures from comment 6.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233878688.1233882723.8582.gz
(In reply to comment #6)
> In other words, whatever goes wrong here (intermittently) causes two tests to
> time out:  exactly the same two tests, and in perfect correlation.

Note that the two tests are consecutive in the manifest:
34 load 458637-1.html
35 load 472593-1.html
http://mxr.mozilla.org/mozilla-central/source/content/base/crashtests/crashtests.list
The same pair failed again today:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233946292.1233950785.28536.gz
WINNT 5.2 mozilla-central unit test on 2009/02/06 10:51:32

See also bug 473968, which I filed back in mid-January on timeout-pairs in 458637-1.html and its immediate successor at that time, which was 421715-1.html.

Based on that bug and this bug, it looks like crashtest 458637-1.html often sporadically times out **and makes its successor time out**, probably through no fault of the successor.

I propose disabling 458637-1.html until it's fixed (because a few sporadic failures per day is pretty bad for a unittest), but leaving 472593-1.html enabled for now, to see if it still sporadically fails on its own.
Here's a patch to disable it for now.
(In reply to comment #11)
> The same pair failed again today:
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233946292.1233950785.28536.gz
> WINNT 5.2 mozilla-central unit test on 2009/02/06 10:51:32

Actually, on further inspection, these tests failed in 4 out of the last 8 cycles on that Windows unittest box -- the above-linked failure, plus these 3 more:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233951780.1233959536.14554.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233939339.1233947825.23911.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1233922725.1233929770.15780.gz
Disable-test patch pushed: http://hg.mozilla.org/mozilla-central/rev/c3ddf1a94340

(This crashtest clearly needs investigation/fixing, but until it gets attention, it doesn't do us any good to have it enabled -- it's just making tinderbox excessively orange right now.)
Comment on attachment 360978 [details] [diff] [review]
patch to disable crashtest 458637-1.html

>-load 458637-1.html
>+# load 458637-1.html # sporadically fails -- see bug 473680

If you want to do this, you should use the "skip" keyword:

skip load 458637-1.html

which will skip it and report it as a known failure.
Attachment #360978 - Attachment is obsolete: true
Comment on attachment 360978 [details] [diff] [review]
patch to disable crashtest 458637-1.html

(In reply to comment #15)
> If you want to do this, you should use the "skip" keyword:

Ah, thanks!  Fixed to use 'skip' now:
http://hg.mozilla.org/mozilla-central/rev/e88a0268f8ed

Comment 17

10 years ago
Another timeout-heavy crashtest failed inexplicably today, this time on Mac:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234135066.1234140905.1257.gz

layout/base/crashtests/403175-1.html | timed out waiting for reftest-wait to be removed (after onload fired)

If it does that again, I'll mark it as skip too.

Updated

10 years ago
Depends on: 477409
Summary: content/base/crashtests/458637-1.html may be unreliable → setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html)

Comment 19

10 years ago
The timeout of mochitest browser/base/content/test/browser_bug321000.js makes me think this might be a machine or setTimeout issue rather than a reftest harness issue:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234333287.1234337680.5516.gz#err13

Comment 20

10 years ago
(That's also a recent change; see bug 474081.)
Maybe the 10 second reftest load timeout is just not enough for the test machines when under load?

Updated

10 years ago
Blocks: 438871
Whiteboard: [orange]
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238095284.1238103966.30828.gz
WINNT 5.2 mozilla-1.9.1 unit test on 2009/03/26 12:21:24

REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/mozilla-1.9.1-win32-unittest/build/content/base/crashtests/458637-1.html | timed out waiting for onload to fire
}
Duplicate of this bug: 473968
(In reply to comment #22)
> file:///e:/builds/moz2_slave/mozilla-1.9.1-win32-unittest/build/content/base/crashtests/458637-1.html
> | timed out waiting for onload to fire

I already disabled this test on mozilla-central due to sporadic timeouts, in comment 16 (changeset e88a0268f8ed).  If it times out a lot on 1.9.1, we should probably disable it there, too.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238517486.1238525138.10617.gz
WINNT 5.2 mozilla-1.9.1 unit test on 2009/03/31 09:38:06

(In reply to comment #24)
> If it times out a lot on 1.9.1, we should probably disable it there, too.

Yes, please.

Updated

10 years ago
Whiteboard: [orange] → [orange] [notacrash]

Comment 27

9 years ago
peterv, you've hacked on setTimeout. Any idea what could be happening with these tests?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263823761.1263824833.18577.gz
Linux mozilla-central debug test crashtest on 2010/01/18 06:09:21

Updated

9 years ago
Duplicate of this bug: 540467
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264041886.1264043373.28468.gz
Linux mozilla-central debug test crashtest on 2010/01/20 18:44:46
s: moz2-linux-slave01

REFTEST TEST-UNEXPECTED-FAIL | file:///builds/moz2_slave/mozilla-central-linux-debug-unittest-crashtest/build/reftest/tests/content/base/crashtests/399712-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264107226.1264109101.30144.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264108604.1264110108.9376.gz
REFTEST TEST-UNEXPECTED-FAIL | file:///builds/moz2_slave/mozilla-central-linux-debug-unittest-crashtest/build/reftest/tests/content/base/crashtests/399712-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
Summary: setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html) → setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html, 399712-1.html)

Updated

9 years ago
Assignee: roc → nobody

Updated

9 years ago
Component: Layout → DOM
QA Contact: layout → general

Updated

9 years ago
OS: Mac OS X → All
Hardware: x86 → All

Comment 32

9 years ago
I disabled 399712-1.html in http://hg.mozilla.org/mozilla-central/rev/39a7d5b8ee6d.

Now three tests are disabled:
content/base/crashtests/458637-1.html
content/base/crashtests/399712-1.html
layout/base/crashtests/403175-1.html
Summary: setTimeout-heavy crashtests time out occasionally (e.g. 458637-1.html, 403175-1.html, 399712-1.html) → setTimeout-heavy crashtests time out occasionally
Whiteboard: [orange] [notacrash] → [orange][3 tests disabled][notacrash]

Comment 33

9 years ago
A fix for bug 558306 landed today.  I wonder if it could have been the cause.  (Do Tinderbox machines run NTP clients and occasionally go backwards in time?)
Whiteboard: [orange][3 tests disabled][notacrash] → [3 tests disabled][notacrash]
Priority: -- → P5

I stumbled upon this bug recently and decided to see what the current state of things was. Of the 3 disabled tests, only 458637-1.html still appears to cause issues (intermittent assertions on debug builds and timeouts on Android). I'm going to close this bug out, re-enable the tests (except as noted above for 458637-1.html), and have opened bug 1519651 for any follow-ups there.

Status: REOPENED → RESOLVED
Last Resolved: 10 years ago → 3 months ago
Resolution: --- → WORKSFORME
Component: DOM → DOM: Core & HTML
Product: Core → Core