Last Comment Bug 637101 - (Try to) Ensure screenshots on test timeouts are not taken too late
: (Try to) Ensure screenshots on test timeouts are not taken too late
Status: RESOLVED FIXED
: autotest-issue
Product: Testing
Classification: Components
Component: Mochitest (show other bugs)
: Trunk
: All All
: P2 normal (vote)
: mozilla14
Assigned To: Serge Gautherie (:sgautherie)
:
Mentors:
Depends on: 414049
Blocks: 736363 593491 629996 633725 647323 679848 687972 706534
  Show dependency treegraph
 
Reported: 2011-02-26 21:46 PST by Serge Gautherie (:sgautherie)
Modified: 2012-04-25 11:33 PDT (History)
3 users (show)
bugzillamozillaorg_serge_20140323: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
wontfix
wontfix
fixed
wontfix


Attachments
Patch using setTimeout to ensure flush. (1.10 KB, patch)
2012-04-02 15:42 PDT, Sam Garrett
bugzillamozillaorg_serge_20140323: review-
Details | Diff | Review
(Bv1) Add a 1 second delay before calling killTest(), Fix bug 647323 regression, Remove extra ';' and spaces [Checked in: Comment 17] (5.87 KB, patch)
2012-04-04 12:17 PDT, Serge Gautherie (:sgautherie)
ted: review+
Details | Diff | Review
(Cv1-ab) Add a 1 second delay before calling finish(), Remove extra ';' and spaces [Checkin: Comment 20] (5.19 KB, patch)
2012-04-05 17:09 PDT, Serge Gautherie (:sgautherie)
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
Details | Diff | Review

Description Serge Gautherie (:sgautherie) 2011-02-26 21:46:46 PST
(I had already noticed a bug where the screenshot appeared to be taken too late.
I just noticed this had been happening for bug 633725. This time, the "more explicit" picture [not copied/attached here, not to bloat] gave me a hint at what the cause could be :->)

Bug 414049 added a screenshot feature at automation layer,
which probably works great when test harness has hung or the like:
http://mxr.mozilla.org/comm-central/source/mozilla/build/automation.py.in#766
727   def waitForFinish(self, proc, utilityPath, timeout, maxTime, startTime, debuggerInfo, symbolsPath):
760       while line != "" and not didTimeout:
766         if self.UNIXISH and not debuggerInfo and not self.haveDumpedScreen and "TEST-UNEXPECTED-FAIL" in line and "Test timed out" in line:
767           self.dumpScreen(utilityPath)

But, in the cases I noticed, only one test times out then mochitest harness carries on:
http://mxr.mozilla.org/comm-central/source/mozilla/testing/mochitest/tests/SimpleTest/TestRunner.js#68
62 TestRunner._checkForHangs = function() {
65     if (runtime >= TestRunner.timeout * TestRunner._timeoutFactor) {
68       frameWindow.SimpleTest.ok(false, "Test timed out.");
81       frameWindow.SimpleTest.finish();

My first thought would be to use a (not too) short setTimeout(finish) to let the right image be captured (most of the times).
Comment 1 Serge Gautherie (:sgautherie) 2011-09-07 06:40:00 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #0)
> (I had already noticed a bug where the screenshot appeared to be taken too
> late.

Adding bug 629996 dependency, which hopefully was that bug.
Comment 2 Serge Gautherie (:sgautherie) 2012-03-09 04:51:15 PST
Sam, would you be interested in fixing this bug?
Comment 3 Sam Garrett 2012-03-16 13:28:14 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #2)
> Sam, would you be interested in fixing this bug?

Sure I can take care of it.
Comment 4 Sam Garrett 2012-03-19 14:37:39 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #0)
> My first thought would be to use a (not too) short setTimeout(finish) to let
> the right image be captured (most of the times).

Do you want me to just make the timeout here (http://mxr.mozilla.org/comm-central/source/mozilla/testing/mochitest/tests/SimpleTest/TestRunner.js#116) shorter then or what?

Also can you assign the bug to me?
Comment 5 Serge Gautherie (:sgautherie) 2012-03-19 15:45:36 PDT
(In reply to Sam Garrett from comment #4)

> Do you want me to just make the timeout here
> (http://mxr.mozilla.org/comm-central/source/mozilla/testing/mochitest/tests/
> SimpleTest/TestRunner.js#116) shorter then or what?

No, the _checkForHangs() "loop" is just fine.

What is needed is:
*"flush()" the output, if that's possible.
*add a little delay so automation.py has time to notice and process "Test timed out".

> Also can you assign the bug to me?

I would rather do that when a working patch has been attached.
Comment 6 Sam Garrett 2012-03-26 18:18:18 PDT
What I have right now is just replacing:

frameWindow.SimpleTest.finish();

with:

setTimeout(frameWindow.SimpleTest.finish, 1000);

Is there a way for me to explicitly flush the output? I looked through SimpleTest.js and didn't see anything. Or even check that automation.py is done?
Comment 7 Serge Gautherie (:sgautherie) 2012-03-29 09:07:00 PDT
(In reply to Sam Garrett from comment #6)

> setTimeout(frameWindow.SimpleTest.finish, 1000);

1 second seems good (for a start).
Add a comment to explain why the delay.

> Is there a way for me to explicitly flush the output? I looked through
> SimpleTest.js and didn't see anything.

There is none, afaict.
http://mxr.mozilla.org/comm-central/source/mozilla/testing/mochitest/tests/SimpleTest/MozillaLogger.js
https://developer.mozilla.org/en/DOM/window

> Or even check that automation.py is done?

No, afaik.
Comment 8 Sam Garrett 2012-04-02 15:42:53 PDT
Created attachment 611623 [details] [diff] [review]
Patch using setTimeout to ensure flush.

Here is the patch using setTimeout. I noticed that we now have the killTest function so I added it there.
Comment 9 Serge Gautherie (:sgautherie) 2012-04-04 09:19:50 PDT
Comment on attachment 611623 [details] [diff] [review]
Patch using setTimeout to ensure flush.

Sorry, taking over: I'm spending more time trying to explain than it's needed to fix this bug :-|
Comment 10 Serge Gautherie (:sgautherie) 2012-04-04 12:17:18 PDT
Created attachment 612290 [details] [diff] [review]
(Bv1) Add a 1 second delay before calling killTest(), Fix bug 647323 regression, Remove extra ';' and spaces
[Checked in: Comment 17]

Sam, I'm told my comment 9 may seem rude and I apologize for that:
I simply thought more effective at this point to attach a patch and let you check it (by example) than try to do more explanations wrt this very bug :-|
I'm pleased you were interested in it and I hope you can fix other bugs as you fixed bug 637103 :-)
Comment 11 Serge Gautherie (:sgautherie) 2012-04-04 13:39:37 PDT
Comment on attachment 612290 [details] [diff] [review]
(Bv1) Add a 1 second delay before calling killTest(), Fix bug 647323 regression, Remove extra ';' and spaces
[Checked in: Comment 17]

Per bug 687972 comment 630, it looks like this patch did help :-)
(Unless I was just (too) lucky.)
Comment 12 Mozilla RelEng Bot 2012-04-04 14:04:37 PDT
Try run for 3844b13004f6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=3844b13004f6
Results (out of 120 total builds):
    success: 109
    warnings: 11
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sgautherie.bz@free.fr-3844b13004f6
Comment 13 Serge Gautherie (:sgautherie) 2012-04-04 17:55:39 PDT
(In reply to Mozilla RelEng Bot from comment #12)
>     https://tbpl.mozilla.org/?tree=Try&rev=3844b13004f6

Succeeded, including one random timeout.
Comment 14 Mozilla RelEng Bot 2012-04-04 18:04:52 PDT
Try run for 85e1ded6cda4 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=85e1ded6cda4
Results (out of 31 total builds):
    success: 17
    warnings: 14
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sgautherie.bz@free.fr-85e1ded6cda4
Comment 15 Mozilla RelEng Bot 2012-04-04 18:04:54 PDT
Try run for 7ae3d02e3b08 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=7ae3d02e3b08
Results (out of 31 total builds):
    exception: 1
    success: 16
    warnings: 14
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sgautherie.bz@free.fr-7ae3d02e3b08
Comment 16 Serge Gautherie (:sgautherie) 2012-04-04 18:34:18 PDT
(In reply to Mozilla RelEng Bot from comment #15)
>     https://tbpl.mozilla.org/?tree=Try&rev=7ae3d02e3b08

Forced timeout, without fix: all "too late", except "Rev3 Fedora 12 try opt test".


(In reply to Mozilla RelEng Bot from comment #14)
>     https://tbpl.mozilla.org/?tree=Try&rev=85e1ded6cda4

Forced timeout, with fix: all "good" :-)
Comment 17 Serge Gautherie (:sgautherie) 2012-04-05 12:06:16 PDT
Comment on attachment 612290 [details] [diff] [review]
(Bv1) Add a 1 second delay before calling killTest(), Fix bug 647323 regression, Remove extra ';' and spaces
[Checked in: Comment 17]

https://hg.mozilla.org/mozilla-central/rev/b7adf1bde2f6
Comment 18 Serge Gautherie (:sgautherie) 2012-04-05 17:09:03 PDT
Created attachment 612748 [details] [diff] [review]
(Cv1-ab) Add a 1 second delay before calling finish(), Remove extra ';' and spaces [Checkin: Comment 20]

Bv1, for Aurora (and Beta).


[Approval Request Comment]
Regression caused by (bug #): Bug 414049.
User impact if declined: None, but useless screenshot on test timeout.
Testing completed (on m-c, etc.): Try and this comment.
Risk to taking this patch (and alternatives if risky): None, test-only.
String changes made by this patch: None.
Comment 19 Alex Keybl [:akeybl] 2012-04-09 12:57:20 PDT
Comment on attachment 612748 [details] [diff] [review]
(Cv1-ab) Add a 1 second delay before calling finish(), Remove extra ';' and spaces [Checkin: Comment 20]

[Triage Comment]
Test-only change. Approved for Aurora 13 and Beta 12.
Comment 20 Jens Hatlak (:InvisibleSmiley) 2012-04-23 13:42:18 PDT
Comment on attachment 612748 [details] [diff] [review]
(Cv1-ab) Add a 1 second delay before calling finish(), Remove extra ';' and spaces [Checkin: Comment 20]

http://hg.mozilla.org/releases/mozilla-aurora/rev/551fcc0596d1
Comment 21 Serge Gautherie (:sgautherie) 2012-04-24 12:25:12 PDT
Missed Gecko 12 :-|
Comment 22 Serge Gautherie (:sgautherie) 2012-04-25 11:33:21 PDT
Ftr (that may be unrelated, but just in case),
I wonder if this somehow had the (positive) side-effect of letting "too slow" (!= hang) test complete:
I have recently noticed a couple "false positive" timeouts on SeaMonkey where the test actually continues after the timeout is logged (and screenshot taken) then ends with no (other) failure in about 5mn(30s).

One example:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Aurora/1335354618.1335357762.3281.gz
WINNT 5.2 comm-aurora debug test mochitests-4/5 on 2012/04/25 04:50:18
s: cn-sea-qm-win2k3-01
{
109290 INFO TEST-PASS | /tests/layout/style/test/test_value_computation.html | should not get initial value for 'visibility:collapse' on elementf. - collapse should not equal visible
109291 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_value_computation.html | Test timed out.
[...]
109292 INFO TEST-PASS | /tests/layout/style/test/test_value_computation.html | should be testing with values that compute to different things for 'white-space' - pre should not equal normal
[...]
109540 INFO TEST-END | /tests/layout/style/test/test_value_computation.html | finished in 325202ms
109541 INFO TEST-START | /tests/layout/style/test/test_value_storage.html
}

Note You need to log in before you can comment on or make changes to this bug.