Closed Bug 472557 Opened 16 years ago Closed 14 years ago

make individual reftests fail when they assert

Categories

(Testing :: Reftest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dbaron, Assigned: dbaron)

References

(Blocks 2 open bugs)

Details

Attachments

(7 files, 4 obsolete files)

5.80 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
14.99 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.89 KB, patch
benjamin
: review+
Waldo
: review+
Details | Diff | Splinter Review
28.34 KB, text/plain
Details
3.42 KB, text/plain
Details
250 bytes, text/plain
Details
2.67 KB, text/plain
Details
In order to get assertions down, we should make reftests fail, per-test, when they assert, after exposing a global assertion counter from XPCOM.  We can annotate the manifest with the number of assertions known per test, potentially as a range:
  asserts(1) // always has one assertion
  asserts(5,7) // has 5-7 assertions
  asserts-if(MOZ_WIDGET_TOOLKIT=="gtk2",3)
  asserts-if(MOZ_WIDGET_TOOLKIT=="cocoa",0,1) // might or might not have 1 assert

Then we can report unexpected failure if the actual assertion count is too high and unexpected pass if it is too low.
(We can do this for mochitest too, but that's probably a bit more complicated and should go in a different bug.)
Benjamin suggested nsIDebug2, which makes it easier than adding a new service.

(Also a question about the next patch:  I assume I should be using getService to get the nsIDebug rather than createInstance; XPCOM already enforces it being a singleton either way.)
Attachment #355881 - Flags: review?(benjamin)
I think I'll need to land this disabled initially, both until we set up tinderboxes and perhaps until we do something about max-total-viewers assertions that affect the next 30 or so tests...

(The disabling would be trivial; comment out the line that gets the real assertion count, and substitute assignment of 0 instead.)


Note that this makes us fail the first test if there's an assertion during startup.  I think that's a good thing.
(I tested that this does report things as expected with some tests in layout/reftests/bugs/ that actually assert, and briefly tested the effects of various asserts and asserts-if annotations.)
And how about I update README.txt too?
Attachment #355882 - Attachment is obsolete: true
Attachment #355883 - Flags: review?(jwalden+bmo)
Attachment #355882 - Flags: review?(jwalden+bmo)
This should interact better with canvas caching.
Attachment #355883 - Attachment is obsolete: true
Attachment #355888 - Flags: review?(jwalden+bmo)
We could use this for crashtests too
This patch would automatically apply to crashtests too.
I know, I just mean that'd be a great place to use it.
Attachment #355888 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 355888 [details] [diff] [review]
patch 2: make reftest report failures for assertions

>diff --git a/layout/tools/reftest/README.txt b/layout/tools/reftest/README.txt

>+      asserts(count)
>+          Loading the test and reference is known to assert exactly
>+          count times.
>+          NOTE: asserts() with a non-zero count or max-count suppresses
>+          use of a cached canvas for the test with the annotation.

"suppresses" is correct here, of course, but I get cognitive dissonance with "asserts()" because I read it as a verb.  Perhaps "An asserts() annotation with a non-zero count..."?


>+      asserts(min-count,max-count)
>+          Loading the test and reference is known to assert between
>+          min-count and max-count times.

Add a ", inclusive" here.


>+          NOTE: See above regarding canvas caching.
>+
>+      asserts-if(condition,count)
>+      asserts-if(condition,min-count,max-count)
>+          Same as above, but only if condition is true.

Why use a comma instead of a hyphen for ranges?  A hyphen would be much more understandable at first glance, I think.  Looks like a two-character change at first glance to do this.


>diff --git a/layout/tools/reftest/reftest.js b/layout/tools/reftest/reftest.js

>+            } else if ((m = item.match(/^asserts\((\d+)(,\d+)?\)$/))) {
>+                cond = false;
>+                minAsserts = Number(m[1]);
>+                maxAsserts = (m[2] == undefined) ? minAsserts
>+                                                 : Number(m[2].substring(1));
>+            } else if ((m = item.match(/^asserts-if\((.*?),(\d+)(,\d+)?\)$/))) {
>+                cond = false;
>+                if (Components.utils.evalInSandbox("(" + m[1] + ")", sandbox)) {
>+                    minAsserts = Number(m[2]);
>+                    maxAsserts =
>+                      (m[3] == undefined) ? minAsserts
>+                                          : Number(m[3].substring(1));
>+                }

You could use the unary + operator rather than Number if you wanted here for less visual noise...

Please add a check somewhere that |minAsserts <= maxAsserts|; fail-fast is better, I think, than fail-when-actually-run.
(In reply to comment #10)
> You could use the unary + operator rather than Number if you wanted here for
> less visual noise...

Seems too magical to me.  But I've addressed the remaining comments.
Attachment #355881 - Flags: review?(benjamin) → review+
Landed 2 patches, but second one disabled:
http://hg.mozilla.org/mozilla-central/rev/3317170b582b
http://hg.mozilla.org/mozilla-central/rev/89847884d010

I realize I also need to add something that suppresses the assertion checks for non-DEBUG builds; otherwise we'll hit tons of TEST-UNEXPECTED-PASS for those.
Adding to nsIDebug2 again seemed like the simplest way that wouldn't get in the way of our plans to be able to run reftest as an extension.
Attachment #356081 - Flags: review?(benjamin)
Attachment #356081 - Flags: review?(jwalden+bmo) → review+
Attachment #356081 - Flags: review?(benjamin) → review+
For the record, I'm slightly uncomfortable with this attribute because it only tells you whether XPCOM was compiled with debug, not whether the app or anything else was compiled with debug. Tthey can be separate, at least in the case of XR apps... it used to be that you could compile DEBUG for only certain gecko modules, but I'm pretty sure that magic is either gone or not working. But since that's unusual, and the attribute helps solve a problem at hand, let's do it anyway.
Version: unspecified → Trunk
This bug won't land on m-1.9.1, right?
Depends on: 372581
I changed the way this code is disabled so that we can scrape the assertion failure counts out of the tinderbox logs in order to populate the manifests:
http://hg.mozilla.org/mozilla-central/rev/69bab0e75853
(In reply to comment #14)
> it used to be that you could compile DEBUG for only certain gecko modules
--enable-debug-modules=xpcom,foo,bar
[you have to compile XPCOM debug if you want to compile anything else debug]
[also comes in a ^module version should you want to override --enable-debug]
Attached file make-assertlists.sh (obsolete) —
First script for processing tinderbox logs.
Attached file unify-assertlists.sh (obsolete) —
Second script for processing tinderbox logs.
The above scripts (make-assertlists.sh and then unify-assertlists.sh) expect to be run in a directory containing only tinderbox logs named linux-Ed-*, mac-Ed-*, and win-Ed-*.
And note that the above list reflects that http://hg.mozilla.org/mozilla-central/rev/4700e3c42868 has landed (so doesn't list assertions from reftest-print for Windows and Linux), except it does *not* reflect that I need to adjust that code to reflect that there are three assertions on Mac.
Attachment #443468 - Attachment mime type: text/x-python → text/plain
One more annotation for a single failure that I think is random:
http://hg.mozilla.org/mozilla-central/rev/a93160d1e442
We hit another one of the intermittent video assertions on Linux, so I marked them all cross-platform:
http://hg.mozilla.org/mozilla-central/rev/83794e67435d
and another assertion got fixed:
http://hg.mozilla.org/mozilla-central/rev/441c99461236
Enabled:
http://hg.mozilla.org/mozilla-central/rev/36b9855dad18
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: