Closed Bug 478165 Opened 16 years ago Closed 7 years ago

TinderboxPrint'ing both FAIL and timeout :-/

Categories

(Testing :: General, defect, P5)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: sgautherie, Unassigned)

References

()

Details

(Whiteboard: [unittest])

Attachments

(1 file)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234413647.1234416025.17818.gz Linux mozilla-central unit test on 2009/02/11 20:40:47 { TinderboxPrint: mochitest<br/><em class="testfail">FAIL</em> NEXT ERROR buildbot.slave.commands.TimeoutError: command timed out: 300 seconds without output, killing pid 868 TinderboxPrint: mochitest <em class="testfail">timeout</em><br/> } 1) (nit) Based on the other TinderboxPrints format, the timeout |<br/>| should move and replace the space. 2) (core issue) The first TinderboxPrint comes from MozillaMochitest.createSummary(), and was added by http://hg.mozilla.org/build/buildbotcustom/rev/edad22775516 (but "this" code already existed in the other tests ... so all tests should be affected.) The second TinderboxPrint comes from ShellCommandReportTimeout.evaluateCommand(). Trouble is we should have one "TinderboxPrint: mochitest[...]" only. (In this case, I think we should TinderboxPrint the timeout report only.)
Isn't a timeout a failure?
Sure, but a "FAIL" is not a "timeout", etc. Not really wrong, but quite confusing yet, I say.
Summary: TinderboxPrint'ing both FAIL and timeout now :-/ → TinderboxPrint'ing both FAIL and timeout :-/
Not every FAIL is a timeout, no, but every timeout *is* a FAIL. Have you seen a case where it prints timeout when it's not really a timeout?
(In reply to comment #3) > Not every FAIL is a timeout, no, but every timeout *is* a FAIL. Obviously ! (Then what ?) > Have you seen a case where it prints timeout when it's not really a timeout? Not that I am aware of. (Should I ?)
(In reply to comment #4) > > Have you seen a case where it prints timeout when it's not really a timeout? > > Not that I am aware of. (Should I ?) What's the bug here the? If timeout and FAIL are printed when we timeout, and only FAIL is printed on regular failures, I don't see a problem.
Current display is { mochitest FAIL mochitest timeout } Before looking closely, it looks like 2 tests suites had a failure, not 1 only. Reporting 2 keywords looks like there would have been 2 failures, not 1 only. This is different than getting "mochitest FAIL LEAK" (= a failure which caused a leak) for example. The current report is not technically wrong, but sure is not user friendly.
OK, are you going to take this on?
Ftb, I would need someone to tell me what the best way to fix this is. Currently, I plan to work on bug 473259 first.
Moving to future for now.
Component: Release Engineering → Release Engineering: Future
There's a worse case too: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234793738.1234798260.27879.gz&fulltext=1 Linux mozilla-central unit test on 2009/02/16 06:15:38 { TinderboxPrint: TUnit<br/>492/0 NEXT ERROR buildbot.slave.commands.TimeoutError: command timed out: 300 seconds without output, killing pid 31120 TinderboxPrint: check <em class="testfail">timeout</em><br/> } http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1234793739.1234798354.28329.gz&fulltext=1 WINNT 5.2 mozilla-central unit test on 2009/02/16 06:15:39 { TinderboxPrint: TUnit<br/>530/0 buildbot.slave.commands.TimeoutError: command timed out: 300 seconds without output TinderboxPrint: check <em class="testfail">timeout</em><br/> } http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1234793858.1234797392.23947.gz&fulltext=1 Linux comm-central check on 2009/02/16 06:17:38 { TinderboxPrint: TUnit<br/>375/0 buildbot.slave.commands.TimeoutError: command timed out: 1800 seconds without output, killing pid 31324 TinderboxPrint: check <em class="testfail">timeout</em><br/> } 1) 'TUnit'/'check' mismatch, which *is* confusing, as it *does* looks like 2 different tests suites. 2) 1st line seems to report success and 2nd failure: can't be more confusing than that...
Fixes comment 0 point 1 and comment 10 point 1. (untested, but obvious)
Attachment #362591 - Flags: review?
Attachment #362591 - Flags: review? → review?(ccooper)
Attachment #362591 - Flags: review?(ccooper) review?(ccooper)
Attachment #362591 - Flags: review+
Attachment #362591 - Flags: checked‑in+
Attachment #362591 - Flags: checked‑in+ review+
Comment on attachment 362591 [details] [diff] [review] (Av1) Fix timeout error format and unify test names [Checkin: Comment 12] changeset: 192:4df7f3be22bd
Comment 10 example with Mochitest: { http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1237951288.1237963779.7861.gz Linux comm-central dep unit test on 2009/03/24 20:21:28 mochitest 35015/9/1062 mochitest timeout } Note that mochitest total number should be more like 75k than 35k. It looks like: 1) the test harness aborted the test suite with |Too many test timeouts, giving up.|. 2) then printed the figures it had so far, then |SimpleTest FINISHED|. 3) then buildbot detected an additional timeout ... as if runtests.py had failed to actually exit ... which reminds of { http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py.in 474 status = automation.runApp(testURL, browserEnv, options.app, 485 server.stop() 487 automation.processLeakLog(LEAK_REPORT_FILE, options.leakThreshold) 492 # hanging due to non-halting threads is no fun; assume we hit the errors we 493 # were going to hit already and exit with a success code 494 sys.exit(0) } Note that the leak report is missing, as if server.stop() call never returned. To summarize, I see 3 issues: 1- this bug, TinderboxPrinting both the figures and timeout (separately). 2- maybe the not-run tests number should be added to the ToDo (or Fail) number by the test harness, or something. 3- the server (thread) not stoppping.
Mass move of bugs from Release Engineering:Future -> Release Engineering. See http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P3
Priority: P3 → P5
Serge: are you still planning on working on this? Still working on bug 473259 first? No one else is clamoring for a fix here, so it's all on you. I'm happy to do reviews though.
Attachment #362591 - Attachment description: (Av1) Fix timeout error format and unify test names → (Av1) Fix timeout error format and unify test names [Checkin: Comment 12]
(In reply to comment #15) > Serge: are you still planning on working on this? Still working on bug 473259 > first? I'm still interested in all these bugs, but I +/- gave up a year ago. Very recently, I got some progress on bug 427500 (as an example), so I'm not totally stalled anymore, though I've also updated real life to higher priority...
Whiteboard: [unittest]
Assignee: nobody → ccooper
(In reply to comment #13) > To summarize, I see 3 issues: > 1- this bug, TinderboxPrinting both the figures and timeout (separately). > 2- maybe the not-run tests number should be added to the ToDo (or Fail) number > by the test harness, or something. > 3- the server (thread) not stoppping. Based on this, I'm going to file a new bug for #1, and move this bug over to the Testing product to sort out #2 and #3.
Assignee: coop → nobody
Component: Release Engineering → General
Flags: checked-in+
Product: mozilla.org → Testing
QA Contact: release → general
Version: other → Trunk
Mass closing bugs with no activity in 2+ years. If this bug is important to you, please re-open.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: