Closed Bug 472557 Opened 15 years ago Closed 14 years ago

make individual reftests fail when they assert


(Testing :: Reftest, defect)

Not set


(Not tracked)



(Reporter: dbaron, Assigned: dbaron)


(Blocks 2 open bugs)



(7 files, 4 obsolete files)

In order to get assertions down, we should make reftests fail, per-test, when they assert, after exposing a global assertion counter from XPCOM.  We can annotate the manifest with the number of assertions known per test, potentially as a range:
  asserts(1) // always has one assertion
  asserts(5,7) // has 5-7 assertions
  asserts-if(MOZ_WIDGET_TOOLKIT=="cocoa",0,1) // might or might not have 1 assert

Then we can report unexpected failure if the actual assertion count is too high and unexpected pass if it is too low.
(We can do this for mochitest too, but that's probably a bit more complicated and should go in a different bug.)
Benjamin suggested nsIDebug2, which makes it easier than adding a new service.

(Also a question about the next patch:  I assume I should be using getService to get the nsIDebug rather than createInstance; XPCOM already enforces it being a singleton either way.)
Attachment #355881 - Flags: review?(benjamin)
I think I'll need to land this disabled initially, both until we set up tinderboxes and perhaps until we do something about max-total-viewers assertions that affect the next 30 or so tests...

(The disabling would be trivial; comment out the line that gets the real assertion count, and substitute assignment of 0 instead.)

Note that this makes us fail the first test if there's an assertion during startup.  I think that's a good thing.
Attachment #355882 - Flags: review?(jwalden+bmo)
(I tested that this does report things as expected with some tests in layout/reftests/bugs/ that actually assert, and briefly tested the effects of various asserts and asserts-if annotations.)
And how about I update README.txt too?
Attachment #355882 - Attachment is obsolete: true
Attachment #355883 - Flags: review?(jwalden+bmo)
Attachment #355882 - Flags: review?(jwalden+bmo)
Attachment #355883 - Flags: review?(jwalden+bmo)
This should interact better with canvas caching.
Attachment #355883 - Attachment is obsolete: true
Attachment #355888 - Flags: review?(jwalden+bmo)
We could use this for crashtests too
This patch would automatically apply to crashtests too.
I know, I just mean that'd be a great place to use it.
Attachment #355888 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 355888 [details] [diff] [review]
patch 2: make reftest report failures for assertions

>diff --git a/layout/tools/reftest/README.txt b/layout/tools/reftest/README.txt

>+      asserts(count)
>+          Loading the test and reference is known to assert exactly
>+          count times.
>+          NOTE: asserts() with a non-zero count or max-count suppresses
>+          use of a cached canvas for the test with the annotation.

"suppresses" is correct here, of course, but I get cognitive dissonance with "asserts()" because I read it as a verb.  Perhaps "An asserts() annotation with a non-zero count..."?

>+      asserts(min-count,max-count)
>+          Loading the test and reference is known to assert between
>+          min-count and max-count times.

Add a ", inclusive" here.

>+          NOTE: See above regarding canvas caching.
>+      asserts-if(condition,count)
>+      asserts-if(condition,min-count,max-count)
>+          Same as above, but only if condition is true.

Why use a comma instead of a hyphen for ranges?  A hyphen would be much more understandable at first glance, I think.  Looks like a two-character change at first glance to do this.

>diff --git a/layout/tools/reftest/reftest.js b/layout/tools/reftest/reftest.js

>+            } else if ((m = item.match(/^asserts\((\d+)(,\d+)?\)$/))) {
>+                cond = false;
>+                minAsserts = Number(m[1]);
>+                maxAsserts = (m[2] == undefined) ? minAsserts
>+                                                 : Number(m[2].substring(1));
>+            } else if ((m = item.match(/^asserts-if\((.*?),(\d+)(,\d+)?\)$/))) {
>+                cond = false;
>+                if (Components.utils.evalInSandbox("(" + m[1] + ")", sandbox)) {
>+                    minAsserts = Number(m[2]);
>+                    maxAsserts =
>+                      (m[3] == undefined) ? minAsserts
>+                                          : Number(m[3].substring(1));
>+                }

You could use the unary + operator rather than Number if you wanted here for less visual noise...

Please add a check somewhere that |minAsserts <= maxAsserts|; fail-fast is better, I think, than fail-when-actually-run.
(In reply to comment #10)
> You could use the unary + operator rather than Number if you wanted here for
> less visual noise...

Seems too magical to me.  But I've addressed the remaining comments.
Attachment #355881 - Flags: review?(benjamin) → review+
Landed 2 patches, but second one disabled:

I realize I also need to add something that suppresses the assertion checks for non-DEBUG builds; otherwise we'll hit tons of TEST-UNEXPECTED-PASS for those.
Adding to nsIDebug2 again seemed like the simplest way that wouldn't get in the way of our plans to be able to run reftest as an extension.
Attachment #356081 - Flags: review?(benjamin)
Attachment #356081 - Flags: review?(jwalden+bmo)
Attachment #356081 - Flags: review?(jwalden+bmo) → review+
Attachment #356081 - Flags: review?(benjamin) → review+
For the record, I'm slightly uncomfortable with this attribute because it only tells you whether XPCOM was compiled with debug, not whether the app or anything else was compiled with debug. Tthey can be separate, at least in the case of XR apps... it used to be that you could compile DEBUG for only certain gecko modules, but I'm pretty sure that magic is either gone or not working. But since that's unusual, and the attribute helps solve a problem at hand, let's do it anyway.
Blocks: 372581
Version: unspecified → Trunk
This bug won't land on m-1.9.1, right?
Depends on: 372581
I changed the way this code is disabled so that we can scrape the assertion failure counts out of the tinderbox logs in order to populate the manifests:
(In reply to comment #14)
> it used to be that you could compile DEBUG for only certain gecko modules
[you have to compile XPCOM debug if you want to compile anything else debug]
[also comes in a ^module version should you want to override --enable-debug]
Attached file (obsolete) —
First script for processing tinderbox logs.
Attached file (obsolete) —
Second script for processing tinderbox logs.
The above scripts ( and then expect to be run in a directory containing only tinderbox logs named linux-Ed-*, mac-Ed-*, and win-Ed-*.
And note that the above list reflects that has landed (so doesn't list assertions from reftest-print for Windows and Linux), except it does *not* reflect that I need to adjust that code to reflect that there are three assertions on Mac.
Depends on: 532174
Attachment #443468 - Attachment mime type: text/x-python → text/plain
Attached file
Attachment #417516 - Attachment is obsolete: true
Attached file
Attachment #417517 - Attachment is obsolete: true
One more patch for crashtest annotation:
One more crashtest manifest adjustment:
Additional crashtest manifest adjustment:
One more annotation for a single failure that I think is random:
One more annotation of a random:
We hit another one of the intermittent video assertions on Linux, so I marked them all cross-platform:
and another assertion got fixed:
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.