Closed Bug 895376 Opened 11 years ago Closed 11 years ago

Distinguish real timeouts from long running tests in mochitest-browser

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

mozilla25

People

(Reporter: mak, Assigned: mak)

References

Details

Attachments

(1 file, 3 obsolete files)

895376.diff 11 years ago Marco Bonardo [:mak] (Away Apr 25 - May 5) 5.89 KB, patch		Details \| Diff \| Splinter Review
patch v1.1 11 years ago Marco Bonardo [:mak] (Away Apr 25 - May 5) 6.80 KB, patch		Details \| Diff \| Splinter Review
patch v1.2 11 years ago Marco Bonardo [:mak] (Away Apr 25 - May 5) 6.79 KB, patch	Gavin : review+	Details \| Diff \| Splinter Review
patch v1.3 11 years ago Marco Bonardo [:mak] (Away Apr 25 - May 5) 8.95 KB, patch		Details \| Diff \| Splinter Review

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Description

•

11 years ago

After various cases, I figured our current way of handling timeouts in mochitest-browser is not very good.
The problem is that after a given amount of time (multiple of 30 seconds slots) we just kill the test and notify timeout.
This has 2 negative effects:
1. timing out a running test we may leave the next tests in a bad shape.
2. we timeout regardless the test being really hanged, or just slow. This makes very often hard to distinguish between real test hangs from tests taking longer than expected. I saw different bugs disabling timing out tests, or waiting weeks to be investigated, when would have been enough to add a requestLongerTimeout call to fix the orange in the short term.

Thus, I'd like to change the behavior so that if a test is taking longer than expected but it's not hanged, we print out an error but let the test continue.
On the other side, if the test is really hanging, we proceed with current timeout.
To detect if the test is hanging or not I'd just use output, if the test printed a success/failure/info less than 30s before the timeout, it's pretty clear we are killing an alive test.  We can still maintain a protection against a looping test that could print info in a loop, like a 5 minutes limit (some tests are using a requestLongerTimeout(10)).

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 1

•

11 years ago

Attached patch 895376.diff (obsolete) — Details — Splinter Review

I still want to get a Try run, in the meanwhile I'd like to get an opinion.

Attachment #777752 - Flags: feedback?(gavin.sharp)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 2

•

11 years ago

The only doubt I still have, is whether this may be confusing in case after some delays the test will really timeout.
something like:
ERROR: test is taking too much time, please requestLongerTimeout and investigate
ERROR: test is taking too much time, please requestLongerTimeout and investigate
ERROR: test is taking too much time, please requestLongerTimeout and investigate
ERROR: timed out

I was wondering if I should silently mark the fact, and when the test finishes, if it didn't timeout, throw the error. That would leave less space for misunderstanding.

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 3

•

11 years ago

Attached patch patch v1.1 (obsolete) — Details — Splinter Review

Yeah, this is better.

Attachment #777752 - Attachment is obsolete: true

Attachment #777752 - Flags: feedback?(gavin.sharp)

Attachment #777764 - Flags: feedback?(gavin.sharp)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 4

•

11 years ago

Attached patch patch v1.2 (obsolete) — Details — Splinter Review

cleanup, sorry for spam.
now going to try.

Attachment #777764 - Attachment is obsolete: true

Attachment #777764 - Flags: feedback?(gavin.sharp)

Attachment #777770 - Flags: review?(gavin.sharp)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 5

•

11 years ago

https://tbpl.mozilla.org/?tree=Try&rev=ac6fdefc0c44

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 6

•

11 years ago

unlucky base changeset https://tbpl.mozilla.org/?tree=Try&rev=15823d07c1a6

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 7

•

11 years ago

Comment on attachment 777770 [details] [diff] [review]
patch v1.2

>diff --git a/testing/mochitest/browser-harness.xul b/testing/mochitest/browser-harness.xul

>       addResult: function addResult(result) {
>+        this.lastOutputTime = Date.now();

Hmm, I think there are cases where we call addResult that aren't good representations of test activity (e.g. console messages), but I suppose this is probably close enough.

>diff --git a/testing/mochitest/browser-test.js b/testing/mochitest/browser-test.js

>+      // Notify a long running test problem if it didn't end up in a timeout.
>+      if (this.currentTest.unexpectedTimeouts && !this.currentTest.timedOut) {
>+        let msg = "The test took more time than expected, please " +
>+                  "requestLongerTimeout for it and file a bug to investigate " +
>+                  "why it's so slow.";

I would rephrase:
"Test execution time exceeded the timeout threshold. This test should be rewritten or split up to avoid taking so long to run. If that's not possible, use requestLongerTimeout(), but only as a last resort."

>+        if (Date.now() - self.currentTest.lastOutputTime < (gTimeoutSeconds / 2) * 1000 &&
>+            ++self.currentTest.unexpectedTimeouts <= MAX_UNEXPECTED_TIMEOUTS) {
>+            self.currentTest.scope.__waitTimer =
>+              setTimeout(arguments.callee, gTimeoutSeconds * 1000);
>+          return;

Not your bug, but I would take this as an opportunity to get rid of arguments.callee (use a named function).

Attachment #777770 - Flags: review?(gavin.sharp) → review+

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 8

•

11 years ago

Sheriff heads-up: we're going to change browser-chrome test timeout behavior to be a bit more forgiving (which will sometimes result in timeouts failing with a different error message).

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 9

•

11 years ago

(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #7)
> Hmm, I think there are cases where we call addResult that aren't good
> representations of test activity (e.g. console messages), but I suppose this
> is probably close enough.

Yes, I thought about that, though that kind of activity would not happen after 15 seconds of inactivity, would rather happen after something initiated by the test, so it's still good activity... Unless there's a loop and then we still have the protection.

cmtalbert

Comment 10

•

11 years ago

This sounds like a great change. 

One question for Ed: If we change the timeout message, then do we need to change the orange suggestion regexes in tbpl? If so, can we get a bug on file for that please (if you haven't already).

Flags: needinfo?(emorley)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 11

•

11 years ago

Attached patch patch v1.3 — Details — Splinter Review

Fix the comment, fix arguments.callee, and a strict warning about function declaration not at top level

Attachment #777770 - Attachment is obsolete: true

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 12

•

11 years ago

Notice that we are not changing the timeout message, the timeout message is still the same.
We are distinguishing real time outs from tests taking too long. The former stays the same, the latter will now get a dedicated error.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 13

•

11 years ago

To be clear, we're not changing the timeout message for tests that actually time out. We are introducing another class of "timeouts", though, that use a different message to indicate that the test took too much time despite continuing to produce output and eventually passing (i.e. to identify tests that just need a requestLongerTimeout or to be split up, most likely).

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Updated

•

11 years ago

Summary: Change timeout behavior of browser-chrome tests → Distinguish real timeouts from long running tests in mochitest-browser

Ed Morley [:emorley]

Comment 14

•

11 years ago

Looks great to me - thank you for the heads up! :-)

Flags: needinfo?(emorley)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Assignee

Comment 15

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/6d5466692056

Target Milestone: --- → mozilla25

Ryan VanderMeulen [:RyanVM]

Comment 16

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/6d5466692056

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Ed Morley [:emorley]

Comment 17

•

11 years ago

Example new intermittent: bug 896967.

Are the numerous failures (of form "This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort.") in later tests expected?

Marco Castelluccio [:marco]

Comment 18

•

11 years ago

Could this have caused:
"TEST-INFO | checking window state
TEST-UNEXPECTED-FAIL | (browser-test.js) | Found an unexpected"

I get this error while running the webapprt-test-chrome tests (and I didn't have problems in early July)

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: BrowserTest → Mochitest

You need to log in before you can comment on or make changes to this bug.