Closed Bug 369330 Opened 18 years ago Closed 18 years ago

allow reftests to be marked as failing or random on a specific platform

Categories

(Testing :: Reftest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9

People

(Reporter: dbaron, Assigned: dbaron)

References

Details

(Whiteboard: [T.M.=mozilla1.9a3])

Attachments

(1 file, 1 obsolete file)

After having gone through the platform differences in reftest success, I'd like to add to reftest the ability to mark a test as currently failing or currently random based on the platform. In particular, what I propose is getting rid of the "f" at the beginning of the test operator and replacing it by putting failure or randomness status *before* the operator. (It's prominent there in the manifest, and easier to process if we extend the test types.) So I'd propose a syntax something like: == testurl ref-url # just like now fails == testurl ref-url # test that fails everywhere fails-if(platform==Windows) == testurl ref-url random-if(platform==Windows) == testurl ref-url # for things like bug 369040 Anything marked as random will have its PASS/FAIL status in the output along with "(MARKED AS RANDOM)". I'm not yet sure what I want for the syntax inside the (). I might find some way to test based on stuff in autoconf.mk and/or mozilla-config.h, although I might just hack in navigator.platform tests for now. Thoughts?
Bug 369017 would handle everything except randomness, and it'd also enable other build-type tests beyond just platform.
But that way, it would be a pain to write #if foo == ... ... #else f== ... ... #endif which is generally what we want, since we want to know if a failing test becomes a passing test so that we are forced to change the manifest and then avoid regressing it.
I think I prefer the manifest language in dbaron's original comment to Waldo's example. This is a necessary addition, imo, as the tests on windows are currently useless. Alternatively, another approach would be to include multiple manifests for each platform. Linux, Mac and Windows. This would pass the pain onto the person who has to update 3 manifest files to add a test, but marking one failing on one or more platform would be fairly easy. It has the added benefit of not adding to the manifest "language". Take your pick, David.
Assignee: nobody → dbaron
Attached patch patch (obsolete) — Splinter Review
This works, but I'd like to tweak the code a bit more, and I haven't added fails-if / random-if lines for Mac and Windows yet.
Attached patch patchSplinter Review
OK, I got it working with evalInSandbox, which solved the issue I was worrying about with the possibility of an autoconf.mk variable that conflicted with my variable -- in theory it's also more secure if we ever had untrusted reftest manifests, although I don't think we should. I haven't actually tested the security aspects of it, and I'm not worried about that possibility. In any case, this removes the "f"s from the beginning of the test operands and adds fails, random, fails-if(), and random-if() keywords that go at the beginning of a test line. The contents of fails-if and random-if are javascript expressions, to be evaluated when all of the "simple" variables in autoconf.mk are defined (all as strings).
Attachment #254211 - Attachment is obsolete: true
Attachment #254217 - Flags: review?(rcampbell)
Note that this also doesn't make the necessary annotations for Windows and Mac; I'll do that after this lands. One step at a time...
Status: NEW → ASSIGNED
Whiteboard: [patch]
Comment on attachment 254217 [details] [diff] [review] patch I might change (RESULT EXPECTED TO BE RANDOM) to something more succinct. Like RANDOM or DON'T CARE, but, r+.
Attachment #254217 - Flags: review?(rcampbell) → review+
Well, I want the randomness to be pretty obvious for two reasons: 1. having to mark something as random is something that we should try to avoid; fixing nondeterministic behavior should be a priority. 2. when something is marked as random, it will never output "UNEXPECTED" and thus turn tinderbox orange. So we should make it more noticeable in other ways.
Checked in to trunk, 2007-02-08 11:24 -0800 with various manifest followups over the course of the day. I think things should be passing now on Linux, Mac, and Windows. The linux testing tinderbox actually was green, although the Windows one (and the Linux one) went offline right before it would have picked up the last manifest change needed for Windows.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
It looks good, David. We have a nice green pair of boxes now. Mac to follow soon...
Component: Layout: Misc Code → Reftest
Product: Core → Testing
Version: Trunk → unspecified
Depends on: 492228
Whiteboard: [patch] → [T.M.=mozilla1.9a3]
Target Milestone: --- → mozilla1.9
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: