Closed Bug 443090 Opened 16 years ago Closed 16 years ago

Unify logging unittest failures during a build

Categories

(Release Engineering :: General, defect)

Priority: Not set
Severity: critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mtschrep, Assigned: ted)

References

Details

(Whiteboard: [deployed, needs verification])

Attachments

(6 files, 2 obsolete files)

We already post-process to get the failed/skipped/run counts, but it is non-trivial to answer basic questions like "does test x fail only on Windows?", because we do not include the test name in every test failure and we don't log in any centralized fashion.

Lukas has been playing with Splunk to get us better visibility into the unit test failures, to separate out system issues from test issues.

I'm likely missing some really dumb stuff - so forgive me - but it seems that with a bit of work we can make the process of finding unit test failures much easier. If we make sure every test failure logs:

1) Clear indication of failure (it looks like even this varies, from UNEXPECTED to FAILURE, etc.)

2) Test Name

3) Any additional information

e.g.:

TEST FAILURE | TEST NAME | ADDITIONAL INFORMATION
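As a sketch, emitting such a line from a harness is a one-liner (hypothetical helper, not any suite's actual API):

def log_failure(testname, extra=""):
    # TEST FAILURE | TEST NAME | ADDITIONAL INFORMATION
    print("TEST FAILURE | %s | %s" % (testname, extra))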

We should be able to quickly modify the test suites to do this:

1) Reftests: we can do this automatically in the framework (reftest failures already report the file)

2) For XPCShell we can walk the stack frame (http://mxr.mozilla.org/mozilla-central/source/tools/test-harness/xpcshell-simple/head.js#95) to find the name of the file in which the test failed

3) MochiTests we should be able to get the framework to do it as well

4) C++ tests we'll just have to have some style recommendations

(junit/junitreport does this really well)

We can then simply grep the logs and, assuming we know the machine name, dump basic structured log info somewhere:

BUILDID/BUILDTIME | TESTTIME | TESTNAME | MACHINE (from this we can map to OS)
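
Roughly, that scraping step could look like this (a hypothetical Python sketch; the script, argument order, and record layout are illustrative, not from any attachment here):

import re
import sys

# Pull unexpected results out of a build log and emit the
# pipe-delimited record proposed above.
FAILURE = re.compile(r"TEST-UNEXPECTED-(?:FAIL|PASS) \| ([^|]+) \|(.*)")

def dump_failures(logpath, buildid, testtime, machine):
    for line in open(logpath):
        m = FAILURE.search(line)
        if m:
            testname = m.group(1).strip()
            # BUILDID/BUILDTIME | TESTTIME | TESTNAME | MACHINE
            print("%s | %s | %s | %s" % (buildid, testtime, testname, machine))

if __name__ == "__main__":
    dump_failures(*sys.argv[1:5])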
cc'd the original authors of each test suite to see if we can get some help getting every test failure to log the basics in a consistent fashion.
Yes, please. Note that we'll have to coordinate changes to log output with changes to the unittest buildbot log scrapers, and the Tinderbox error parser.
DBaron just educated me that there are a total of 2 test states (PASS/FAIL) and 3 expectations (PASS/FAIL/RANDOM) leading to a total of 6 output states.

ExpectedPass->Pass
ExpectedPass->Fail
ExpectedFail->Pass
ExpectedFail->Fail
Random->Pass
Random->Fail

(better explained: http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#453 )

So we just need to account for all 6 states:

TEST-PASS
TEST-UNEXPECTED-FAIL
TEST-UNEXPECTED-PASS
TEST-KNOWN-FAIL
TEST-PASS(EXPECTED RANDOM)
TEST-FAIL(EXPECTED RANDOM)

Followed by a test identifier (file name or test name), followed by additional debugging output.
Note that if we log all test results into a separate file, we don't need a unique test identifier. If we don't/can't, then we want some hard-to-mistake prefix to make parsing foolproof.

(e.g. MOZ-AUTOMATED-TEST-*)
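
For concreteness, a minimal sketch of a parser for exactly these six prefixes (the regex and function names are illustrative, not from any attached patch; harness-specific prefixes such as reftest's "REFTEST " may precede the status token, hence the search rather than a start-of-line anchor):

import re

# The six result states listed above; "(EXPECTED RANDOM)" is a suffix on
# the pass/fail states for tests whose result is expected to be random.
RESULT_RE = re.compile(
    r"(TEST-PASS(?:\(EXPECTED RANDOM\))?"
    r"|TEST-UNEXPECTED-(?:FAIL|PASS)"
    r"|TEST-KNOWN-FAIL"
    r"|TEST-FAIL\(EXPECTED RANDOM\))"
    r" \| ([^|]+?) \|(.*)")

def parse_result(line):
    """Return (status, testname, extra) for a result line, else None."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    status, testname, extra = m.groups()
    return status, testname.strip(), extra.strip()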
Ted, can you help shepherd this bug to completion?
Assignee: nobody → ted.mielczarek
Blocks: 443323
This bug is just about getting our test suites to all report their test results in a consistent manner - consolidated reporting is split out into bug 443329 
Summary: Unify logging and reporting for unittest failures → Unify logging unittest failures during a build
This reformats the output of the browser chrome tests. Gavin: the hunks at the bottom just add some semicolons to fix JS2-mode warnings; JS2-mode also apparently removed some trailing whitespace for me.

Here's some sample output:

TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/testing/mochitest/tests/browser/browser_pass.js | pass ok
TEST-PASS | chrome://mochikit/content/browser/testing/mochitest/tests/browser/browser_pass.js | pass is - Got true, expected true
TEST-UNEXPECTED-PASS | chrome://mochikit/content/browser/testing/mochitest/tests/browser/browser_pass.js | pass todo
TEST-KNOWN-FAIL | chrome://mochikit/content/browser/testing/mochitest/tests/browser/browser_pass.js | pass todo_is - Got false, expected true
Attachment #328052 - Flags: review?(gavin.sharp)
Comment on attachment 328052 [details] [diff] [review]
reformat browser-chrome test output

>diff --git a/testing/mochitest/browser-harness.xul b/testing/mochitest/browser-harness.xul

> function testResult(aCondition, aName, aDiag, aIsTodo) {

>+  if (aDiag)
>+    this.msg += " - " + aDiag;

r=me with this moved into the !this.pass branch as discussed on IRC.
Attachment #328052 - Flags: review?(gavin.sharp) → review+
Ted - this is awesome. Are you going to cover all the other test suites as well, or do you need help?
Contains the previous patch (with Gavin's suggested change) + the changes for Mochitest. Mochitest output looks like this:
*** 2 INFO TEST-PASS | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | this should pass
*** 3 ERROR TEST-UNEXPECTED-FAIL | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | this should fail
*** 4 INFO TEST-KNOWN-FAIL | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | todo
*** 5 ERROR TEST-UNEXPECTED-PASS | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | passing a todo
*** 6 INFO TEST-PASS | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | this should pass
*** 7 ERROR TEST-UNEXPECTED-FAIL | /tests/MochiKit_Unit_Tests/test_MochiKit-Async.html | this should fail - got 1, expected 2
Attachment #328052 - Attachment is obsolete: true
Attachment #328108 - Flags: review?(sayrer)
I dunno who actually owns xpcshell-simple, but robcee wfm!

Sample output:
TEST-UNEXPECTED-FAIL | ../../../../_tests/xpcshell-simple/test_harness_xpcshell_simple/unit/test_sample.js | test failed, see log
../../../../_tests/xpcshell-simple/test_harness_xpcshell_simple/unit/test_sample.js.log:
>>>>>>>
*** test pending
*** exiting
*** TEST-UNEXPECTED-FAIL | ../../../../_tests/xpcshell-simple/test_harness_xpcshell_simple/unit/test_sample.js | 57 == 58
JS frame :: d:/build/mozilla-central/tools/test-harness/xpcshell-simple/head.js :: do_throw :: line 101
Attachment #328144 - Flags: review?(rcampbell)
Comment on attachment 328144 [details] [diff] [review]
[checked in] xpcshell test output changes

Apparently I own this, I'm not sure when that happened.
Attachment #328144 - Flags: review?(rcampbell) → review?(benjamin)
Attachment #328108 - Flags: review?(sayrer) → review+
Attached patch reftest output patch (obsolete) — Splinter Review
Changes reftest output, fixes perl scripts.

Sample output:
REFTEST TEST-PASS | data:text/html,<body> | 
REFTEST TEST-PASS | data:text/plain, | 
REFTEST TEST-PASS | data:text/plain,HELLO | (!=) 
REFTEST TEST-UNEXPECTED-FAIL | data:text/plain,HELLO | 
(image data URIs elided)
REFTEST TEST-FAIL(EXPECTED RANDOM) | data:text/plain,HELLO | 
REFTEST TEST-PASS(EXPECTED RANDOM) | data:text/plain, | 
REFTEST TEST-KNOWN-FAIL | data:text/plain,HELLO | 
REFTEST TEST-UNEXPECTED-PASS | data:text/html,<body> | 
REFTEST TEST-KNOWN-FAIL | data:text/plain, | (SKIP)
Attachment #328157 - Flags: review?(dbaron)
Comment on attachment 328157 [details] [diff] [review]
reftest output patch

Could you also update reftest-analyzer.xhtml?

Did you test the updates to the perl scripts?  

(I'd note that the code inside the additional if-else case in reftest-to-html and the one after it has some redundancy, since randomresult is guaranteed false for the first and true for the second; probably worth at least leaving a comment pointing that out if not cleaning it up.)
Yeah, I realized I forgot reftest-analyzer after the fact. I did test the perl scripts with all combinations of output (I believe), and they looked correct.

Yeah, that last one is a little weird, it makes me think that we should just use TEST-UNEXPECTED-FAIL(EXPECTED RANDOM) for the output in that case. I could simplify that branch a bit. I'll get that updated when I get back to my computer...
(In reply to comment #15)
> Yeah, that last one is a little weird, it makes me think that we should just
> use TEST-UNEXPECTED-FAIL(EXPECTED RANDOM) for the output in that case. I could
> simplify that branch a bit. I'll get that updated when I get back to my
> computer...

That could be misleading if "UNEXPECTED" is what you're looking for to find the thing that turned the tinderbox orange...
Re-reading the code, I think I meant TEST-KNOWN-FAIL(EXPECTED RANDOM), which would let me join those two branches.
Reftest changes, including the reftest-analyzer changes. I changed the output in the case I mentioned to TEST-KNOWN-FAIL(EXPECTED RANDOM). I think that's more consistent with the other output. Reftest is currently our only test suite with an expected-random state, so the other patches should be fine.

dbaron: I noticed while editing reftest-analyzer that in the build_viewer function there's a variable called "rowclass" that's declared and assigned but never used. Is that just an oversight, or intentional?
Attachment #328157 - Attachment is obsolete: true
Attachment #328326 - Flags: review?(dbaron)
Attachment #328157 - Flags: review?(dbaron)
This changes the output for runtests.py's leak logging. This will fix bug 441550 as a side effect.
Attachment #328352 - Flags: review?(sayrer)
Attachment #328352 - Flags: review?(sayrer) → review+
This makes the tinderbox unittest error parser handle this new output. I didn't fix the "link to file" bits because tinderbox just links to bonsai, which doesn't work with Mercurial anyway, so it seemed sort of pointless.
Attachment #328359 - Flags: review?(cls)
Do you need to also change has_errorline for tinderbox?
has_errorline is the "link to file" bits I mentioned above - sorry, I could have been clearer there. That's just for getting links back to bonsai for filenames mentioned in an error, but links to bonsai aren't going to help us in Mercurial anyway.
Comment on attachment 328326 [details] [diff] [review]
[checked in] reftest output patch, take 2

r=dbaron

Could you make the line you added in reftest-analyzer.xhtml use tabs like the surrounding lines?
Attachment #328326 - Flags: review?(dbaron) → review+
Ok, here's the unittest buildbot bits. This hasn't been tested yet, of course; Lukas has agreed to be the guinea pig here and test this on a staging system. In the meantime I'll get the test suite changes pushed to a repo that she can pull from to test with.
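
In case it helps review, here's roughly the shape of such a step, sketched against buildbot 0.7's ShellCommand API (the class name and grading policy are illustrative, not the attached patch):

from buildbot.steps.shell import ShellCommand
from buildbot.status.builder import SUCCESS, WARNINGS, FAILURE

class UnifiedUnittestStep(ShellCommand):
    """Illustrative step that grades itself from the unified prefixes."""

    def evaluateCommand(self, cmd):
        # A nonzero exit code is a harness failure; otherwise any
        # unexpected result in the captured stdio log turns us orange.
        log = cmd.logs["stdio"].getText()
        if cmd.rc != 0:
            return FAILURE
        if "TEST-UNEXPECTED-" in log:
            return WARNINGS
        return SUCCESS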
Attachment #328527 - Flags: review?(rcampbell)
Attachment #328144 - Flags: review?(benjamin) → review+
lsblakk:
http://hg.mozilla.org/users/tmielczarek_mozilla.com/index.cgi/unittest-unified-output/

You can build from there, and apply the buildbotcustom patch above locally to test.
Attachment #328359 - Flags: review?(cls) → review+
Comment on attachment 328359 [details] [diff] [review]
[checked in] handle new error output in tinderbox error parser

This is checked in, but we'll need IT to update the production tinderbox.
Attachment #328359 - Attachment description: handle new error output in tinderbox error parser → [checked in] handle new error output in tinderbox error parser
Comment on attachment 328527 [details] [diff] [review]
[checked in] handle new error output in buildbot unittest steps

looks good to me.
Attachment #328527 - Flags: review?(rcampbell) → review+
Whiteboard: [needs testing on staging]
Blocks: 417516
Test running on http://staging-master.build.mozilla.org:2010/waterfall

First attempt at 11:08 am = Builds 37, 16, 2 (from left to right)

Backed it out and ran from mozilla-central, then set it up again - starting at Builds 44, 22, 8 at 15:34


Test runs on staging look good. Thanks to lsblakk for setting that up for me.
Whiteboard: [needs testing on staging] → [needs deployment]
Comment on attachment 328527 [details] [diff] [review]
[checked in] handle new error output in buildbot unittest steps

I've checked this in; we still need to update the master and restart it to get this live. We'll have to coordinate that with landing the unittest output changes as well.
Attachment #328527 - Attachment description: handle new error output in buildbot unittest steps → [checked in] handle new error output in buildbot unittest steps
Comment on attachment 328359 [details] [diff] [review]
[checked in] handle new error output in tinderbox error parser

Filed bug 445206 on getting production tinderbox updated to pick up this change.
Blocks: 445362
No longer blocks: 445362
Comment on attachment 328108 [details] [diff] [review]
[checked in] browser chrome + mochitest logging changes

Pushed to mozilla-central - rev:b2d1c3093a84
Attachment #328108 - Attachment description: browser chrome + mochitest logging changes → [checked in] browser chrome + mochitest logging changes
Comment on attachment 328144 [details] [diff] [review]
[checked in] xpcshell test output changes

Pushed to mozilla-central - rev:d3f2561990ab
Attachment #328144 - Attachment description: xpcshell test output changes → [checked in] xpcshell test output changes
Comment on attachment 328326 [details] [diff] [review]
[checked in] reftest output patch, take 2

Pushed to mozilla-central - rev:03a87160d7a2
Attachment #328326 - Attachment description: reftest output patch, take 2 → [checked in] reftest output patch, take 2
Comment on attachment 328352 [details] [diff] [review]
[checked in] change runtests.py leaks output

Pushed to mozilla-central - rev:2beee8ac413e
Attachment #328352 - Attachment description: change runtests.py leaks output → [checked in] change runtests.py leaks output
Comment on attachment 328527 [details] [diff] [review]
[checked in] handle new error output in buildbot unittest steps

qm-moz2-unittest01:/tools/buildbotcustom/unittest/steps.py updated to rev 1.6 to pick up this change. Master restarted.
Whiteboard: [needs deployment] → [deployed, needs verification]
Thanks Nick!
I don't think there's anything else to do here.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering