Closed Bug 571062 Opened 12 years ago Closed 12 years ago

disable slow JS test, except on tinderbox

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b3

People

(Reporter: gal, Assigned: Waldo)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 1 obsolete file)

kahoolawe:tests gal$ python jstests.py -j8 ../Darwin_DBG.OBJ/js
[ 136|   0|   8] 100% ===============================================>|    5.0s^C
PASS (partial run -- interrupted by user)
kahoolawe:tests gal$ python jstests.py -j8 ../Darwin_DBG.OBJ/js
[2824|  17| 159] 100% ===============================================>|  720.0s
REGRESSIONS
    ecma/Date/dst-offset-caching-2-of-8.js
    ecma/Date/dst-offset-caching-3-of-8.js
    ecma/Date/dst-offset-caching-4-of-8.js
    ecma/Date/dst-offset-caching-5-of-8.js
    ecma/Date/dst-offset-caching-6-of-8.js
    ecma/Date/dst-offset-caching-7-of-8.js
    ecma/Date/dst-offset-caching-8-of-8.js
    ecma_5/Object/15.2.3.6-dictionary-redefinition-1-of-8.js
    ecma_5/Object/15.2.3.6-dictionary-redefinition-2-of-8.js
    ecma_5/Object/15.2.3.6-dictionary-redefinition-3-of-8.js
    ecma_5/Object/15.2.3.6-dictionary-redefinition-4-of-8.js
    ecma_5/Object/15.2.3.6-dictionary-redefinition-5-of-8.js
    js1_5/Regress/regress-203278-1.js
    js1_8_5/regress/regress-555246-1.js
    js1_5/Array/regress-157652.js
    js1_5/Array/regress-330812.js
    js1_5/Regress/regress-422348.js
FAIL

It fails for -j, -j2, -j4 and -j8 (the latter is the fastest on my machine). This stuff fails reliably, so its worthless for pre-checkin testing. I understand these are cool tests, but they simply do not work for development work. We need --with-slow-tests or something like that.
Assignee: general → jwalden+bmo
Please!
Waldo, please take this and fix promptly. You didn't start the problem, but the date caching tests made it a lot worse. (love the patch and tests though)
Just about done with a patch, testing now.
Attached patch Patch (obsolete) — Splinter Review
Attachment #450459 - Flags: review?(dmandelin)
Attachment #450459 - Flags: review?(dbaron)
Comment on attachment 450459 [details] [diff] [review]
Patch

The Python side looks good. I have one change request, though:

>     def __call__(self):
>-        if self.test.enable or OPTIONS.run_skipped:
>+        if self.test.slow and not OPTIONS.run_slow_tests:
>+            return NullTestOutput(self.test)
>+        elif self.test.enable or OPTIONS.run_skipped:
>             return self.test.run(self.js_cmd_prefix, 

I would prefer to have this filter applied before running the tests, as another addition to the part in the 'main' script that goes like this:

    if OPTIONS.exclude_file:
        test_list = exclude_tests(test_list, OPTIONS.exclude_file)

    if not OPTIONS.random:
        test_list = [ _ for _ in test_list if not _.random ]

    if OPTIONS.run_only_skipped:
        OPTIONS.run_skipped = True
        test_list = [ _ for _ in test_list if not _.enable ]
I also noticed that not all the tests in the original bug report were marked slow.
Not all of the tests in the original report consistently, or even semi-consistently, or even inconsistently, time out in my experience.  Taking it a step at a time seems like a good idea.
I disagree : I think we should make them tinderbox only until we see them breaking where developers aren't catching locally. This code is quite self-contained, and likely to be very rarely changed, so I think we should do as this bug's initial description requests.
I should also note that I don't actually believe comment 0's list of tests to mark as slow was intended to be exact as well as illustrative.  In particular, if the list displayed were supposed to be the exact list to mark as slow, I immediately question why 15.2.3.6-dictionary-redefinition-[1-5]-of-8.js are listed but 15.2.3.6-dictionary-redefinition-[6-8]-of-8.js are not.  Also, why list dst-offset-caching-[2-8]-of-8.js but not dst-offset-caching-1-of-8.js?

Andreas, consider this a formal request: if you have a specific list of tests you would like to be marked as slow, please comment with it.
I just took a snapshot of all failing tests on my machine. I will see what fails after this check-in and comment here.
Comment on attachment 450459 [details] [diff] [review]
Patch

So the reason that this skips the slow tests by default for developers but not for tinderbox is that it makes the JS test runner's default be to skip slow tests, but the reftest harness's default not to skip slow tests?  I suppose that's ok.

In DoneTests in reftest.js, you need to add gTestResult.Slow to count right before where you dump it.  And I think I'd prefer slow after skip despite the additional blame-taking it requires, and maybe call it "skipped (slow)" rather than "slow"?

Could you document slow and slow-if in layout/tools/reftest/README.txt ?

When you land, please double-check that the tests you expect to run on tinderbox are running.

It might be good to have somebody else (ted?) review the changes to the python test harness code.
Attachment #450459 - Flags: review?(dbaron) → review+
I hadn't realized until now, but since dmandelin didn't actually flip the bit, and as there was the suggestion to get a review of the Python bits from Ted, let's try getting those before landing.

It's my birthday tomorrow \o/, and I'll be going on a quick backpacking trip from sometime tomorrow through Monday (back just in time for a summit flight).  It'd be great to land this before I head out tomorrow (wink wink), but if it doesn't happen, a little more delay won't matter much.
Attachment #450459 - Attachment is obsolete: true
Attachment #455625 - Flags: review?(ted.mielczarek)
Attachment #455625 - Flags: review?(dmandelin)
Attachment #450459 - Flags: review?(dmandelin)
Comment on attachment 455625 [details] [diff] [review]
Updated to comments so far

As the original author of the Python harness, I feel pretty qualified to review that part. If you want a second review, it might be more useful to get someone who knows about the browser harness.
Attachment #455625 - Flags: review?(dmandelin) → review+
Comment on attachment 455625 [details] [diff] [review]
Updated to comments so far

Hmm, yeah, looking back that probably was a suggestion for the jstest harness bits and not for the Python reftest bits -- and those are so minimal it's probably not necessary.  I'll land this in a bit after I rustle up a few more things to batch-push.
Attachment #455625 - Flags: review?(ted.mielczarek)
http://hg.mozilla.org/tracemonkey/rev/b17c8b926585

...and then backed out, seems to have broken since I last tested this:

http://hg.mozilla.org/tracemonkey/rev/84b1306e4431

:-\  Will investigate tomorrow.
Status: NEW → ASSIGNED
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: fixed-in-tracemonkey
Target Milestone: --- → mozilla2.0b3
Version: Other Branch → Trunk
Take two:

http://hg.mozilla.org/tracemonkey/rev/5aefe511ef9d

Tested out locally, should do fine (problem seems to have been mis-rebases at some point, the guts of the slow-parsing in reftest.js had mysteriously disappeared).
http://hg.mozilla.org/mozilla-central/rev/b17c8b926585
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.