Closed Bug 1252361 Opened 9 years ago Closed 7 years ago

fuzzy() and fuzzy-if() annotations should take ranges rather than maximums

Categories

(Testing :: Reftest, defect)

defect
Not set
normal

Tracking

(firefox47 affected, firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox47 --- affected
firefox55 --- fixed

People

(Reporter: dbaron, Assigned: kats)

References

Details

Attachments

(2 files)

I think we made a somewhat substantial design mistake with fuzzy() and fuzzy-if() reftest.list annotations (which I've been thinking about for a while). I think we should attempt to fix this. fuzzy() currently takes as arguments two numbers, maxDiff and diffCount. Both of these are a *maximum* allowable difference. This means that when the condition causing a reftest to be fuzzy goes away, we can't ever get an UNEXPECTED-PASS failure that tells us that we fixed a bug and should remove the annotation. We should changes these to take ranges, like asserts() does. This will require replacing all the existing annotation numbers with "0-" at the beginning to mark the range that they now represent. This, in turn, will allow us to get UNEXPECTED-PASS results for fuzzy annotations when we give them ranges that do not include "0-".
That's looking good. I'm not totally clear on what conditions should trigger the UNEXPECTED-PASS though, I'd have to spend some more time looking at the code.
I think I'd report: * UNEXPECTED-FAIL if either number is higher than the allowed range (as today) * otherwise, UNEXPECTED-PASS if either number is lower than the allowed range * otherwise, continue with the logic as it is today (I'm not entirely sure what that is)
The logic is fairly convoluted. I think this should do what we want: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ea86f12b4bb50db5866d00c59a53c87ecb76fb6d The first two patches here are the real meat and potatoes. The third patch is something that we'll probably want to do eventually but will probably turn up some failures if we do it right away. The fourth patch is not something we want to land but it forces a couple of these UNEXPECTED-PASS scenarios to see how treeherder deals with them.
So looks like using output = {s: ["PASS", "PASS"], n: "UnexpectedPass"}; makes the test get listed under the "Unexpected" results in the log, but TreeHerder doesn't notice anything. I'll have to use output = {s: ["PASS", "FAIL"], n: "UnexpectedPass"}; instead, even though that's kind of misleading since the test isn't really expected to fail. I canceled the old try push, here's another: https://treeherder.mozilla.org/#/jobs?repo=try&revision=63d6def39a86e54db467a79fd9be71b5ca9be41b
Having to say it's expected to fail seems reasonable.
The try push seems to have worked as expected, there's a lot of overfuzzed tests so there's a lot of orange. We can land the first two patches now and then file follow-up bugs to reduce the overfuzzing. I'll get them ready for review.
Assignee: nobody → bugmail
Also a final try push just to be safe and make sure everything is still green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=66e52ec0c6e16a15cae340bdd9c4db8174e43575
Comment on attachment 8873573 [details] Bug 1252361 - Modify the fuzzy and fuzzy-if reftest annotations to accept ranges as well. https://reviewboard.mozilla.org/r/144956/#review148944 Yeah, this is a good way to start. I'd still like to change the default for single-number at some point in the future.
Attachment #8873573 - Flags: review?(dbaron) → review+
A previous try that I had done resulted in a Rs1 UNEXPECTED-PASS because the stylo reftest suite compared fuzzy.html in stylo against fuzzy.html in gecko with fuzzy(3-4,250000). I might need to do a skip-if(styloVsGecko) on some/all of the new tests I added to reftest-sanity because they're doing weird things in that configuration.
Comment on attachment 8873574 [details] Bug 1252361 - Produce UNEXPECTED-PASS results for fuzzy tests that are overfuzzed. https://reviewboard.mozilla.org/r/144958/#review148948 ::: layout/tools/reftest/reftest.jsm:1752 (Diff revision 1) > - maxDifference.value >= gURLs[0].fuzzyMinDelta && > - maxDifference.value <= gURLs[0].fuzzyMaxDelta && > - differences >= gURLs[0].fuzzyMinPixels && > - differences <= gURLs[0].fuzzyMaxPixels) { > + logger.info(`REFTEST fuzzy test ` > + + `(${gURLs[0].fuzzyMinDelta}, ${gURLs[0].fuzzyMinPixels}) <= ` > + + `(${maxDifference.value}, ${differences}) <= ` > + + `(${gURLs[0].fuzzyMaxDelta}, ${gURLs[0].fuzzyMaxPixels})`); Gecko style puts the + at end of line rather than the start, I think. ::: layout/tools/reftest/reftest.jsm:1756 (Diff revision 1) > - if (equal) { > - throw "Inconsistent result from compareCanvases."; > - } > - equal = expected == EXPECTED_FUZZY; > - logger.info(`REFTEST fuzzy match (${maxDifference.value}, ${differences}) <= (${gURLs[0].fuzzyMaxDelta}, ${gURLs[0].fuzzyMaxPixels})`); > + fuzz_exceeded = ((maxDifference.value > gURLs[0].fuzzyMaxDelta) || > + (differences > gURLs[0].fuzzyMaxPixels)); > + equal = (!fuzz_exceeded && > + (maxDifference.value >= gURLs[0].fuzzyMinDelta) && > + (differences >= gURLs[0].fuzzyMinPixels)); please drop all of the parentheses in these two statements. ::: layout/tools/reftest/reftest.jsm:1758 (Diff revision 1) > - } > - equal = expected == EXPECTED_FUZZY; > - logger.info(`REFTEST fuzzy match (${maxDifference.value}, ${differences}) <= (${gURLs[0].fuzzyMaxDelta}, ${gURLs[0].fuzzyMaxPixels})`); > + equal = (!fuzz_exceeded && > + (maxDifference.value >= gURLs[0].fuzzyMinDelta) && > + (differences >= gURLs[0].fuzzyMinPixels)); Somewhere around here (or maybe in the bit below) I think you should make it an error for a TYPE_REFTEST_NOTEQUAL test to have fuzz minimums that are greater than 0. I don't think that makes sense to have, so I think it should report a failure. (For a != test, fuzzy annotations make the test *stricter* rather than *weaker*.)
Attachment #8873574 - Flags: review?(dbaron) → review+
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #13) > > Gecko style puts the + at end of line rather than the start, I think. Changed this. > please drop all of the parentheses in these two statements. This too. > Somewhere around here (or maybe in the bit below) I think you should make it > an error for a TYPE_REFTEST_NOTEQUAL test to have fuzz minimums that are > greater than 0. I don't think that makes sense to have, so I think it > should report a failure. So I implemented this, but put it in part 1 since it makes more sense there. I had to update the changes to reftest-sanity/reftest.list in part 2 because a couple of the new things I added violate this constraint. I also added skip-if(styloVsGecko) on them. I kicked off another try run with the changes. https://treeherder.mozilla.org/#/jobs?repo=try&revision=9b159443610b2b61feea726c1dc2390f808f5b3e
Pushed by kgupta@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/bc72cc13c8f4 Modify the fuzzy and fuzzy-if reftest annotations to accept ranges as well. r=dbaron https://hg.mozilla.org/integration/mozilla-inbound/rev/4d98a42a482f Produce UNEXPECTED-PASS results for fuzzy tests that are overfuzzed. r=dbaron
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: