Closed Bug 1619393 Opened 4 years ago Closed 4 years ago

Reftest improvements for fuzzy tests

Categories

(Core :: Graphics: WebRender, enhancement, P3)

RESOLVED FIXED
mozilla75
Tracking Status
firefox75 --- fixed

People

(Reporter: bpeers, Assigned: bpeers)

Details

Attachments

(3 files)

Tests that verify whether results are "similar" at different resolutions show many small differences due to the usual quantization, plus a few large differences near edges. For example, a sharp black-white transition becomes medium grey due to bilerping.

This means the fuzzy statement needs to be something like fuzzy(120, 10000), which looks worse than it is. Because the bound is so generous, rendering could also start failing without the test catching it.

It's also not immediately obvious in reftest-analyzer how "bad" a failure is, as there is no indication of the magnitude of the difference.

So there are a few ideas to improve this:

  1. Support multiple fuzzy keywords per reftest. A non-identical pixel gets counted against the lowest fuzzy delta that allows it, and the test only fails if any of these buckets overflows its maximum allowed pixel count. So, for example, a test could use fuzzy(2, 9500) fuzzy(100,500) to tighten the bounds.
  2. In reftest-analyzer, show a histogram of the differing pixels. This can help us find good bounds for item #1 above.
  3. Perhaps color-code the "circling difference", or add a new "difference as heat map", or add diff-range sliders: anything that would help us visually investigate whether a test is totally off the rails or whether it's mostly tiny differences plus a few outliers.
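
The histogram from idea 2 could be computed roughly as below. This is only a sketch, not the reftest-analyzer code: the function name `diff_histogram` is hypothetical, pixels are assumed to be RGBA tuples, and a pixel's difference is assumed to be its largest per-channel difference.

```python
from collections import Counter


def diff_histogram(test_pixels, ref_pixels):
    """Count differing pixels, keyed by their largest per-channel difference.

    Assumption: both images are flat sequences of same-length channel tuples.
    """
    if len(test_pixels) != len(ref_pixels):
        raise ValueError("images must have the same number of pixels")
    histogram = Counter()
    for test_px, ref_px in zip(test_pixels, ref_pixels):
        delta = max(abs(a - b) for a, b in zip(test_px, ref_px))
        if delta > 0:  # identical pixels are not part of the histogram
            histogram[delta] += 1
    return dict(sorted(histogram.items()))


# Tiny made-up example: one pixel is off by 2, one edge pixel is off by 127.
test = [(0, 0, 0, 255), (10, 10, 10, 255), (255, 255, 255, 255), (128, 128, 128, 255)]
ref = [(0, 0, 0, 255), (12, 10, 10, 255), (255, 255, 255, 255), (255, 255, 255, 255)]
print(diff_histogram(test, ref))
```

A histogram like this makes it easy to see where the mass of small differences ends and the outliers begin, which is the split point a pair of fuzzy bounds would want.
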
Assignee: nobody → bpeers
Attached image problem.png

Example of the problem: the test comes with fuzzy(115,23498) (see wrench\reftests\text\reftest.list, test raster_root_C_8192).

Add support for multiple fuzzy keywords in reftest.list.
Each extra keyword allows more pixels with a difference that's too large
to be caught by other fuzzy terms.

For example, fuzzy(5,100) will allow at most 100 pixels with a
difference of at most 5, as before. Adding another fuzzy(20,10) will
allow an extra 10 pixels at most that have a difference more than 5 but
less than or equal to 20. The total number of differing pixels allowed
is thus 110, but only if 100 of those differ by <= 5 and the remaining
10 by <= 20. 110 pixels with a difference <= 5 will still fail.
This is intentional to encourage tighter bounds in tests where many
pixels are slightly off and a few outliers are off by a lot.

Any pixels that exceed the highest maximum will fail the test, same as
before.
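
The bucketing rule described above can be sketched as follows. This is an illustration of the semantics only, not the wrench implementation; the name `check_fuzzy` and the representation of the input as a list of per-pixel differences are assumptions.

```python
def check_fuzzy(diffs, fuzzy_terms):
    """Return True if the per-pixel differences fit within the fuzzy terms.

    diffs: iterable of per-pixel differences (0 means identical).
    fuzzy_terms: list of (max_difference, max_pixels) pairs, in any order.
    """
    terms = sorted(fuzzy_terms)  # order by max_difference, smallest first
    counts = [0] * len(terms)
    for delta in diffs:
        if delta == 0:
            continue
        # Charge the pixel to the lowest term whose max_difference allows it.
        for i, (max_diff, _) in enumerate(terms):
            if delta <= max_diff:
                counts[i] += 1
                break
        else:
            return False  # exceeds the highest maximum: always a failure
    # The test passes only if no bucket overflows its pixel budget.
    return all(count <= max_pixels for count, (_, max_pixels) in zip(counts, terms))


terms = [(5, 100), (20, 10)]
print(check_fuzzy([5] * 100 + [20] * 10, terms))  # 100 small + 10 larger differences
print(check_fuzzy([5] * 110, terms))              # 110 small differences
```

With fuzzy(5,100) fuzzy(20,10), the first call passes and the second fails, matching the example in the text: the extra 10 pixels are not charged to the looser bucket because a tighter one already allows them.
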

Using multiple fuzzy statements disables fuzzy parameters passed via the
options. This is because multiple fuzzy keywords are cumulative (as in
the example above), but "legacy" fuzzy and options combine into the
maximum of the two.
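
One way to picture the interaction with the options is the sketch below. It assumes that "maximum of the two" means a component-wise maximum of (max difference, max pixel count), and that options alone apply when no fuzzy keyword is present; the function name and tuple layout are hypothetical.

```python
def effective_fuzzy_terms(keyword_terms, option_term=(0, 0)):
    """Combine fuzzy keywords from reftest.list with fuzzy options.

    keyword_terms: list of (max_difference, max_pixels) from fuzzy keywords.
    option_term: (max_difference, max_pixels) passed via the options.
    """
    if len(keyword_terms) > 1:
        # Multiple fuzzy keywords are cumulative, so the options are ignored.
        return list(keyword_terms)
    if not keyword_terms:
        # Assumption: with no keyword, only the options apply.
        return [option_term]
    (max_diff, max_pixels), (opt_diff, opt_pixels) = keyword_terms[0], option_term
    # "Legacy" behaviour: take the maximum of keyword and options (assumed per field).
    return [(max(max_diff, opt_diff), max(max_pixels, opt_pixels))]


print(effective_fuzzy_terms([(5, 100)], (2, 500)))
print(effective_fuzzy_terms([(5, 100), (20, 10)], (2, 500)))
```
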

Steps tested:

  1. the same tests fail in exactly the same way before and after;
  2. reordered the fuzzy statements for raster_root_A/B/C to no longer
    be sorted by max difference, and verified that the tests pass/fail
    the same way
    (then sorted them again, which is easier to understand);
  3. tests using the new feature still fail when the ref no longer matches
    (deliberately broke the _ref version and verified test failed);
Pushed by bpeers@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8c07488c9a3f
Reftest improvements for fuzzy tests r=gw
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

Oh neat, big +1 on the potential plans for improving the information we get from the reftest analyzer. Things I have also noticed would help a lot:

  • actively suggesting fuzzy params with some basic heuristics, using the actual fuzzy(a,b) format (even if you need to edit it, that's a much nicer starting point).
  • try to provide a link for you to load up the source of the test/ref for investigation
    • also provide a link (or iframe!) to a live version of the test/ref so you can inspect the issue in a local build (may not always work, but should still be streamlined more!)
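
As a starting point for the first suggestion, fuzzy parameters could be derived from a histogram of differences along these lines. The heuristic here (an optional, manually chosen split between "small" and "outlier" differences) is purely hypothetical, as is the function name; it only shows how a suggestion in the actual fuzzy(a,b) format might be produced.

```python
def suggest_fuzzy(histogram, split=None):
    """Suggest fuzzy terms from a {difference: pixel_count} histogram.

    split: optional difference value; pixels at or below it go into a tight
    first term, the rest into a second term. Without a usable split, a single
    term covering everything is returned.
    """
    if not histogram:
        return ""
    deltas = sorted(histogram)
    if split is None or split >= deltas[-1] or split < deltas[0]:
        return f"fuzzy({deltas[-1]},{sum(histogram.values())})"
    low = [d for d in deltas if d <= split]
    high = [d for d in deltas if d > split]
    low_term = f"fuzzy({low[-1]},{sum(histogram[d] for d in low)})"
    high_term = f"fuzzy({high[-1]},{sum(histogram[d] for d in high)})"
    return f"{low_term} {high_term}"


# Made-up histogram: many tiny differences plus a few large outliers.
histogram = {1: 9000, 2: 500, 90: 300, 100: 200}
print(suggest_fuzzy(histogram))
print(suggest_fuzzy(histogram, split=2))
```

The output would still need a human to pick a sensible split and add some slack, but it is already in a form that can be pasted into reftest.list and edited.
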
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3c67e9f15e93
Increase fuzz by one to allow reftest to pass on AppVeyor. r=Bert

Re-opening for the (less important, more optional) second part, the reftest analyzer work.

Thanks for the suggestions Alexis, sounds good!

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Created Bug 1620642 for the reftest visualizer follow-up work.
