Closed Bug 499315 Opened 16 years ago Closed 16 years ago

allow "fails", "random" to be specified for "load" reftests tests

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: bc, Assigned: bc)

References

(Blocks 1 open bug)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 4 obsolete files)

proposed patch 16 years ago Bob Clary [:bc] (inactive) 4.59 KB, patch	dbaron : review-	Details \| Diff \| Splinter Review
patch v2 16 years ago Bob Clary [:bc] (inactive) 8.71 KB, patch		Details \| Diff \| Splinter Review
patch v3 16 years ago Bob Clary [:bc] (inactive) 8.71 KB, patch	dbaron : review-	Details \| Diff \| Splinter Review
patch v4 16 years ago Bob Clary [:bc] (inactive) 11.31 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
patch v5 16 years ago Bob Clary [:bc] (inactive) 11.35 KB, patch		Details \| Diff \| Splinter Review
patch v6 16 years ago Bob Clary [:bc] (inactive) 11.92 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review

Bob Clary [:bc] (inactive)

Assignee

Description

•

16 years ago

Attached patch proposed patch (obsolete) — Details — Splinter Review

I wish to use the reftest framework for running the JavaScript Tests in the browser as part of the effort to get the JavaScript Tests running on tinderbox. See bug 469718. The ability of the reftests framework to selectively run tests based upon various criteria, such as platform, processor architecture or build type, is important for running the JavaScript tests which may fail for debug but not opt or may fail on Mac OS X intel but not on other platforms. The minimal requirement to run the JavaScript Tests using reftest is to be able to specify "fails" or "random" for "load" tests. Currently, the reftest manifests preclude using "fails" or "random" with a "load" only test. A secondary requirement is the ability to report on the individual test case results from each test file. The simplest approach I can think of involves the reftest extension retrieving the pass/fail information from the test's content and using its knowledge of the manifest entry's conditional value to report expected vs. unexpected failures. This could involve just checking a single global variable or could involve retrieving all of the test results from the test's content. The attached patch implements one approach to solve this. It assumes that the test exposes a function |getTestCases()| which returns an array of test results and that each item in the array exposes two functions: |testPassed()| and |testDescription()|. By intention, the patch reuses and duplicates the |outputs| object for writing the reftest results in order to minimize changes to reftest.js. This could be changed to specify |outputs| only once, if the approach is approved. The patch has been tested by running the reftests and crashtests in opt and debug as well as the JavaScript Tests.

Flags: in-testsuite-

Attachment #384107 - Flags: review?(dbaron)

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 1

•

16 years ago

One thing I'm not sure about here is whether it's best to reuse 'load' or add a new keyword that triggers this behavior of loading a test and then poking into it for scripted test results. I think I probably prefer a new keyword, but I'd be interested in Waldo's thoughts as well. (I'm not quite sure what to call it, though... maybe 'script'?) (In reply to comment #0) > By intention, the patch reuses and duplicates the |outputs| object for writing > the reftest results in order to minimize changes to reftest.js. This could be > changed to specify |outputs| only once, if the approach is approved. Please change it so outputs is specified only once. (It's not only approved, but required.)

Bob Clary [:bc] (inactive)

Assignee

Comment 2

•

16 years ago

Waldo, did you have any opinion about the new keyword? David, would a keyword of "test" be ok? That seems more descriptive and on target than "script".

Jeff Walden [:Waldo]

Comment 3

•

16 years ago

Sorry, slipped below the top of bugmail... I agree a new keyword should be used; this is quite different from the traditional meaning of "load". I *don't* think "test" is a good name, that's entirely too overloaded and/or unmeaningful in this context. "script" is a plausible name; I don't really have any suggestions to make.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Updated

•

16 years ago

Attachment #384107 - Flags: review?(dbaron) → review-

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 4

•

16 years ago

Comment on attachment 384107 [details] [diff] [review] proposed patch Could you request review again with the comments above incorporated (merging outputs, using new "script" keyword)?

Bob Clary [:bc] (inactive)

Assignee

Comment 5

•

16 years ago

Attached patch patch v2 (obsolete) — Details — Splinter Review

Attachment #384107 - Attachment is obsolete: true

Attachment #388319 - Flags: review?(dbaron)

Bob Clary [:bc] (inactive)

Assignee

Comment 6

•

16 years ago

Comment on attachment 388319 [details] [diff] [review] patch v2 sorry, this fails in the reftest sanity tests.

Attachment #388319 - Flags: review?(dbaron)

Bob Clary [:bc] (inactive)

Assignee

Comment 7

•

16 years ago

Attached patch patch v3 (obsolete) — Details — Splinter Review

this runs reftests, crashtests and the jstests afaict.

Attachment #388319 - Attachment is obsolete: true

Attachment #388493 - Flags: review?(dbaron)

Bob Clary [:bc] (inactive)

Assignee

Comment 8

•

16 years ago

dbaron: review ping?

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Updated

•

16 years ago

Attachment #388493 - Flags: review?(dbaron) → review-

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 9

•

16 years ago

Comment on attachment 388493 [details] [diff] [review] patch v3 Adding the value EXPECTED_SCRIPT really makes things awfully confusing. EXPECTED_LOAD was a bit of a hack, but it sort of worked; with this it no longer makes sense and requires a bunch of extra complication. Instead, you should: * remove EXPECTED_LOAD and EXPECTED_SCRIPT * replace the "equal" field with a new "type" field to the objects (probably right before "expected") taking values TYPE_REFTEST_EQUAL, TYPE_REFTEST_NOTEQUAL, TYPE_LOAD, and TYPE_SCRIPT >+ getTestCases() returns an array of test result objects >+ representing the results of the tests performed by the page. >+ >+ Each test result object has two methods: >+ >+ testPassed() returns true if the test result object passed, >+ otherwise it returns false. >+ >+ testDescription() returns a string describing the test >+ result. Why are these methods rather than properties? Is the idea that the tests are actually executed when the harness calls the methods? If so, it's probably better that the tests not be executed twice (though I have some suggested changes below that would fix that). >@@ -843,2 +881,91 @@ function DocumentLoaded() > dump("REFTEST TEST-PASS | " + gURLs[0].prettyPath + " | (LOAD ONLY)\n"); >+ } >+ else if (gURLs[0].expected == EXPECTED_SCRIPT || >+ ((gURLs[0].expected == EXPECTED_FAIL || >+ gURLs[0].expected == EXPECTED_RANDOM) && !gURLs[0].url2)) { You're chopping off the last 2 lines of the EXPECTED_LOAD case (FinishTestItem and return) so that they don't return early anymore, but they need to. >+ if (!testwindow.getTestCases) >+ { >+ // "script" tests must provide a getTestCases() object. >+ // Force an unexpected failure. >+ testpassed = false; >+ ++gTestResults[outputs[EXPECTED_PASS][testpassed].n]; >+ result = "REFTEST " + outputs[expected][test_passed].s + " | " + >+ gURLs[0].prettyPath + " | " + // the URL being tested >+ "missing test results (SCRIPT).\n"; >+ >+ dump(result); >+ } >+ else >+ { >+ var testcases = testwindow.getTestCases(); >+ >+ if (testcases.length == 0) >+ { >+ // Treat empty testcases array as a failure. >+ test_passed = false; >+ >+ ++gTestResults[outputs[expected][test_passed].n]; >+ >+ result = "REFTEST " + outputs[expected][test_passed].s + " | " + >+ gURLs[0].prettyPath + " | " + // the URL being tested >+ "No test results reported.\n"; >+ >+ dump(result); I'm not entirely sure the subtle distinction here is worth it, but it's ok with me if you think it is. If you wanted, you could combine these two cases by doing: var testcases; if (!testwindow.getTestcases || !(testcases = testwindow.getTestCases())) { // code you previously had twice goes here } In both, rather than setting test_passed = false (you typo it as testpassed twice, though, leading to one reference to it when it's been uninitialized), you should just do: var output = outputs[EXPECTED_PASS][false]; or var output = outputs[expected][false]; and then refer to output.s and output.n. (I probably should have done the same at the existing site that uses outputs. Feel free to fix that if you want.) >+ for (var itest = 0; itest < testcases.length; itest++) >+ { >+ test_passed = test_passed && testcases[itest].testPassed(); >+ } >+ if (testcases.length > 1 || !test_passed) >+ { >+ outputs[EXPECTED_FAIL][true].s = "TEST-PASS"; >+ } While |outputs| isn't const, I'd really like to continue to behave as though it is. Also, (in the same code), I'd really like the numbers in gTestResults to continue to match the number of output lines we print with test results. Also, I think the tests that pass in a test that marked failing should probably be treated as though they were EXPECTED_RANDOM rather than EXPECTED_FAIL. So to fix all three of these, you could do something like: var results = testcases.map(function(test) { return test.testPassed(); }); var anyFailed = results.some(function(result) { return !result; }); var outputPair; if (anyFailed && expected == EXPECTED_FAIL) { // If we're marked as expected to fail, and some (but not all) tests // passed, treat those tests as though they were marked random // (since we can't tell whether they were really intended to be // marked failing or not). outputPair = { true: outputs[EXPECTED_RANDOM][true], false: outputs[expected][false] }; } else { outputPair = outputs[expected]; } var index = 0; results.forEach(function(result) { var output = outputPair[result]; ++gTestResults[output.n]; result = "REFTEST " + output.s + " | " + gURLs[0].prettyPath + " | " + // the URL being tested testcase.testDescription() + " item " + (++index) + "\n"; dump(result); }); With those issues fixed, I think this looks good, though I'd like to look at the new patch. Sorry for the delay in getting to this. (I should be able to get to a revised version a good bit faster; I've cleared out much of my review queue backlog in the past few weeks.)

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 10

•

16 years ago

(In reply to comment #8) > dbaron: review ping? Oh, and I just saw this review ping about a minute ago. :-)

Bob Clary [:bc] (inactive)

Assignee

Comment 11

•

16 years ago

Attached patch patch v4 — Details — Splinter Review

(In reply to comment #9) > Instead, you should: > * remove EXPECTED_LOAD and EXPECTED_SCRIPT > * replace the "equal" field with a new "type" field to the objects > (probably right before "expected") taking values TYPE_REFTEST_EQUAL, > TYPE_REFTEST_NOTEQUAL, TYPE_LOAD, and TYPE_SCRIPT > done. > > Why are these methods rather than properties? Is the idea that the > tests are actually executed when the harness calls the methods? If so, > it's probably better that the tests not be executed twice (though I have > some suggested changes below that would fix that). > The tests are expected to run at load and to use class="reftest-wait" to inform reftest that they are complete. I used methods to hopefully make it simpler for others to incorporate test suites using reftest by simply providing the required methods which could wrap and return the test-specific representations in a standard format. > >@@ -843,2 +881,91 @@ function DocumentLoaded() > > dump("REFTEST TEST-PASS | " + gURLs[0].prettyPath + " | (LOAD ONLY)\n"); > >+ } > >+ else if (gURLs[0].expected == EXPECTED_SCRIPT || > >+ ((gURLs[0].expected == EXPECTED_FAIL || > >+ gURLs[0].expected == EXPECTED_RANDOM) && !gURLs[0].url2)) { > > > You're chopping off the last 2 lines of the EXPECTED_LOAD case > (FinishTestItem and return) so that they don't return early anymore, but > they need to. > I wondered how that missed my testing. I actually had that fixed but it was mixed up in the next patch in my queue. Sorry. > > I'm not entirely sure the subtle distinction here is worth it, but it's > ok with me if you think it is. If you wanted, you could combine these > two cases by doing: > The idea was to flag the case where the test writer failed to provide the required function in the one case and to flag where the return list of tests failed to contain any results. In the JS Engine at least, regressions which cause the list to be empty due to silent script termination do occur. I cleaned it up a bit though. > > In both, rather than setting test_passed = false (you typo it as > testpassed twice, though, leading to one reference to it when it's been doh. > So to fix all three of these, you could do something like: done.

Attachment #388493 - Attachment is obsolete: true

Attachment #393537 - Flags: review?(dbaron)

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 12

•

16 years ago

Comment on attachment 393537 [details] [diff] [review] patch v4 + var missing_msg; Maybe initialize to false? (Would that avoid a strict warning?) + else if (!(testcases = testwindow.getTestCases())) { declare "var testcases" above the if-else chain? + output = outputs[expected][test_passed]; Should probably be "var output =" r=dbaron with that

Attachment #393537 - Flags: review?(dbaron) → review+

Bob Clary [:bc] (inactive)

Assignee

Comment 13

•

16 years ago

(In reply to comment #12) > (From update of attachment 393537 [details] [diff] [review]) > + var missing_msg; > > Maybe initialize to false? (Would that avoid a strict warning?) No, I don't think there is a strict warning for uninitialized variable declarations. But no harm, no foul. Ok. > > + else if (!(testcases = testwindow.getTestCases())) { > > declare "var testcases" above the if-else chain? Darn. Thanks! > > + output = outputs[expected][test_passed]; > > Should probably be "var output =" > I already did that following the definition of |outputs|. Is that ok?

Bob Clary [:bc] (inactive)

Assignee

Comment 14

•

16 years ago

Attached patch patch v5 (obsolete) — Details — Splinter Review

incorporates changes from comment 12 but leaves var output below the var outputs.

Bob Clary [:bc] (inactive)

Assignee

Comment 15

•

16 years ago

Attached patch patch v6 — Details — Splinter Review

I missed adding the new type to print-manifest-dirs.py in the previous patches. David, is it ok to carry the r+ forward to this patch?

Attachment #393943 - Attachment is obsolete: true

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Updated

•

16 years ago

Attachment #394309 - Attachment is patch: true

Attachment #394309 - Attachment mime type: application/octet-stream → text/plain

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 16

•

16 years ago

Comment on attachment 394309 [details] [diff] [review] patch v6 Well, since I just looked at the differences, I may as well carry it forward myself. :-) Though I'm not sure what this line is: + topsrcdir = os.path.normpath(topsrcdir) Is it for something different?

Attachment #394309 - Flags: review+

Bob Clary [:bc] (inactive)

Assignee

Comment 17

•

16 years ago

Oh, sorry. I didn't mean to try to slip that by you. I'm trying to shoe horn the JavaScripts tests into build's Makefiles and have a patch in progress for bug 468913 that does --- a/js/src/Makefile.in +++ b/js/src/Makefile.in @@ -61,2 +61,6 @@ endif +ifdef ENABLE_TESTS +DIRS += ../tests +endif + While working on the packaging I at first was going to use the normal approach of using print-manifest-dirs.py pointing to the top level manifest file and found that it threw on the line if d[0] == '/': with an invalid index error due to the /../ in the path. I've since decided not to use print-manifest-dirs.py in that way since it ends up copying the entire js/tests directory. I can drop the normpath call if you want.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 18

•

16 years ago

Probably better to put that in bug 468913 if you need it there, I think.

Bob Clary [:bc] (inactive)

Assignee

Comment 19

•

16 years ago

http://hg.mozilla.org/mozilla-central/rev/c01c8bea9ead http://hg.mozilla.org/tracemonkey/rev/d18db5828c6e

Assignee: nobody → bclary

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

Whiteboard: fixed-in-tracemonkey

Bob Clary [:bc] (inactive)

Assignee

Updated

•

16 years ago

Blocks: 511611

Bob Clary [:bc] (inactive)

Assignee

Comment 20

•

16 years ago

This was landed on 1.9.2 on Oct 1, 2009 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e093beeee710

You need to log in before you can comment on or make changes to this bug.