Reftest improvements for fuzzy tests
Categories
(Core :: Graphics: WebRender, enhancement, P3)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox75 | --- | fixed |
People
(Reporter: bpeers, Assigned: bpeers)
Details
Attachments
(3 files)
Tests that verify if results are "similar" at different resolutions have a lot of small differences due to the usual quantization, plus a few large differences near edges -- for example a sharp black-white transition becomes medium grey due to bilerping.
This means the fuzzy statement needs to be something like (120, 10000), which looks worse than it is. It also means rendering might start failing without the test catching it, because the bounds are so generous.
It's also not immediately obvious in reftest-analyzer how "bad" it is as there is no indication of magnitude of the difference.
So there's a few ideas to improve this:
- support multiple fuzzy keywords per reftest; a non-identical pixel gets counted against the lowest fuzzy delta that allows it, and the test only fails if any of these buckets overflows its max-allowed count. So for example a test could be fuzzy(2, 9500) fuzzy(100,500) to tighten the bounds.
- in reftest-analyzer, show a histogram of the differing pixels. This can help us find good bounds for the first item above.
- perhaps color-code the "circling difference", or add a new "difference as heat map", or add diff-range sliders -- anything that would help us visually investigate whether a test is totally off the rails or if it's mostly tiny differences plus a few outliers.
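To illustrate the histogram idea, here is a minimal Python sketch. It assumes each pixel is an RGBA tuple and that a pixel's "difference" is its largest per-channel difference. The function name `diff_histogram` and the sample data are hypothetical and not taken from reftest-analyzer.

```python
from collections import Counter

def diff_histogram(test_pixels, ref_pixels):
    """Count differing pixels, keyed by their largest per-channel difference.

    Assumption: both inputs are equally long sequences of RGBA tuples.
    """
    histogram = Counter()
    for test_px, ref_px in zip(test_pixels, ref_pixels):
        max_diff = max(abs(a - b) for a, b in zip(test_px, ref_px))
        if max_diff > 0:  # identical pixels are not counted
            histogram[max_diff] += 1
    return histogram

# Hypothetical data: one pixel off by 1 (quantization), one off by 128 (edge).
test = [(0, 0, 0, 255), (10, 10, 10, 255), (255, 255, 255, 255), (128, 128, 128, 255)]
ref = [(0, 0, 0, 255), (11, 10, 10, 255), (255, 255, 255, 255), (0, 0, 0, 255)]

for diff, count in sorted(diff_histogram(test, ref).items()):
    print(f"diff {diff:3d}: {count} pixel(s)")
```

A histogram like this separates the many small differences from the few large outliers, which is the information needed to pick bounds for several fuzzy keywords.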
Comment 1•4 years ago (Assignee)
Example of the problem: the test comes with fuzzy(115,23498) (see wrench\reftests\text\reftest.list, test raster_root_C_8192).
Comment 2•4 years ago (Assignee)
Add support for multiple fuzzy keywords in reftest.list.
Each extra keyword allows more pixels with a difference that's too large
to be caught by other fuzzy terms.
For example, fuzzy(5,100) will allow at most 100 pixels with a
difference of at most 5, as before. Adding another fuzzy(20,10) will
allow an extra 10 pixels at most that have a difference more than 5 but
less than or equal to 20. The total number of differing pixels allowed
is thus 110, but only if 100 of those differ by <= 5 and the remaining
10 by <= 20. 110 pixels with a difference <= 5 will still fail.
This is intentional to encourage tighter bounds in tests where many
pixels are slightly off and a few outliers are off by a lot.
Any pixels that exceed the highest maximum will fail the test, same as
before.
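The bucketing rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the wrench implementation: `check_fuzzy` is a hypothetical name, and each pixel is assumed to be reduced to a single difference value (0 meaning identical).

```python
def check_fuzzy(pixel_diffs, fuzzy_terms):
    """Return True if the differences fit within the given fuzzy terms.

    pixel_diffs: iterable of per-pixel differences (0 = identical pixel).
    fuzzy_terms: list of (max_difference, max_count) pairs, in any order.
    """
    # Sort by max difference so each pixel lands in the tightest bucket
    # that allows it, regardless of the order the terms were written in.
    terms = sorted(fuzzy_terms)
    counts = [0] * len(terms)
    for diff in pixel_diffs:
        if diff == 0:
            continue
        for i, (max_diff, _) in enumerate(terms):
            if diff <= max_diff:
                counts[i] += 1
                break
        else:
            # The pixel exceeds the highest maximum: fail, same as before.
            return False
    # Every bucket must stay within its own allowed count.
    return all(count <= max_count
               for count, (_, max_count) in zip(counts, terms))

terms = [(5, 100), (20, 10)]          # fuzzy(5,100) fuzzy(20,10)
print(check_fuzzy([5] * 100 + [20] * 10, terms))  # 100 small + 10 outliers
print(check_fuzzy([5] * 110, terms))              # 110 small differences
```

With these terms the first call passes and the second fails, matching the example in the text: the 10 extra pixels cannot borrow room from the fuzzy(20,10) bucket, because each pixel is counted against the lowest term that allows it.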
Using multiple fuzzy statements disables fuzzy parameters passed via the
options. This is because multiple fuzzy keywords are cumulative (as in
the example above), but "legacy" fuzzy and options combine into the
maximum of the two.
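One possible reading of this rule, as a Python sketch. The function `effective_fuzzy` is hypothetical, and both the component-wise maximum and the handling of a test with no fuzzy keyword are assumptions. The text only says that legacy fuzzy and options combine into "the maximum of the two".

```python
def effective_fuzzy(file_terms, option_term=None):
    """Pick the fuzzy terms that apply to a test.

    file_terms: (max_difference, max_count) pairs from reftest.list.
    option_term: optional (max_difference, max_count) passed via options.
    """
    if option_term is None or len(file_terms) > 1:
        # Multiple fuzzy keywords are cumulative, so the option is ignored.
        return list(file_terms)
    if not file_terms:
        # Assumption: with no fuzzy keyword, the option applies on its own.
        return [option_term]
    (file_diff, file_count), = file_terms
    # Legacy behaviour, assumed here to be a component-wise maximum.
    return [(max(file_diff, option_term[0]), max(file_count, option_term[1]))]

print(effective_fuzzy([(5, 100)], (8, 50)))            # single keyword + option
print(effective_fuzzy([(5, 100), (20, 10)], (8, 50)))  # option is disabled
```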
Steps tested:
- the same tests fail in exactly the same way before and after;
- reordered the fuzzy statements for raster_root_A/B/C to no longer
be sorted by max difference, and verified that the tests pass/fail
the same way (then sorted them again, which is easier to understand);
- tests using the new feature still fail when the ref no longer matches
(deliberately broke the _ref version and verified the test failed).
Pushed by bpeers@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8c07488c9a3f Reftest improvements for fuzzy tests r=gw
Comment 4•4 years ago
bugherder |
Comment 5•4 years ago
Comment 6•4 years ago
Oh neat, big +1 on the potential plans for improving the information we get from the reftest analyzer. Things I have also noticed would help a lot:
- actively suggesting fuzzy params with some basic heuristics, using the actual fuzzy(a,b) format (even if you need to edit it, that's a much nicer starting point).
- try to provide a link for you to load up the source of the test/ref for investigation
- also provide a link (or iframe!) to a live version of the test/ref so you can inspect the issue in a local build (may not always work, but should still be streamlined more!)
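As a rough illustration of the first suggestion, the sketch below proposes fuzzy(a,b) terms from a list of per-pixel differences. The heuristic (one term covering roughly the smallest 95% of differing pixels, plus one term for the remaining outliers) and the name `suggest_fuzzy` are assumptions made for this example, not an existing reftest-analyzer feature.

```python
def suggest_fuzzy(pixel_diffs, bulk_percent=95):
    """Suggest fuzzy(a,b) terms from per-pixel differences (0 = identical)."""
    diffs = sorted(d for d in pixel_diffs if d > 0)
    if not diffs:
        return ""  # images are identical, no fuzzy term needed
    # Largest difference among roughly the smallest bulk_percent of pixels.
    bulk_index = max(0, len(diffs) * bulk_percent // 100 - 1)
    bulk_max = diffs[bulk_index]
    bulk_count = sum(1 for d in diffs if d <= bulk_max)
    terms = [f"fuzzy({bulk_max},{bulk_count})"]
    outliers = len(diffs) - bulk_count
    if outliers:
        # A second, looser term covers only the few large differences.
        terms.append(f"fuzzy({diffs[-1]},{outliers})")
    return " ".join(terms)

# Hypothetical data: 98 pixels off by 1 or 2, plus 2 edge pixels off by 100.
print(suggest_fuzzy([1] * 90 + [2] * 8 + [100] * 2))
```

The output is meant as a starting point to paste into reftest.list and then edit by hand.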
Pushed by kgupta@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3c67e9f15e93 Increase fuzz by one to allow reftest to pass on AppVeyor. r=Bert
Comment 8•4 years ago (Assignee)
Re-open for the (less important, more optional) second part of reftest analyzer.
Thanks for the suggestions Alexis, sounds good!
Comment 9•4 years ago
bugherder |
Comment 10•4 years ago (Assignee)
Created Bug 1620642 for the reftest visualizer follow up work.