Bugzilla

Comment 3

•

9 years ago

Attached patch bug977849-part1.patch (obsolete) — Details — Splinter Review

A new update script to pull test262 from GitHub and then pre-process each test file to be compatible with the jstests harness.

Assignee: jwalden+bmo → andrebargull

Attachment #8809649 - Flags: review?(evilpies)

Comment 4

•

9 years ago

Attached patch bug977849-part2.patch (obsolete) — Details — Splinter Review

Remove all old test262 files.

Attachment #8809650 - Flags: review?(evilpies)

Comment 5

•

9 years ago

Attached file bug977849-part3.zip (obsolete) — Details

Add the new test262 files (zipped patch because bugzilla doesn't allow 70MB patches *shocking* :-p).

Attachment #8809653 - Flags: review?(evilpies)

Comment 6

•

9 years ago

Attached patch bug977849-part4.patch (obsolete) — Details — Splinter Review

And finally the updated exclusion list for tests which don't pass in SpiderMonkey.

Attachment #8809655 - Flags: review?(evilpies)

Comment 7

•

9 years ago

I haven't yet tried to support the async Promise tests, that needs to happen next. :-)

Comment 8

•

9 years ago

Thanks for doing this work! I wish we had this two years ago :) However, we probably need to talk to different people before landing this ... Previously my idea was to use the nodejs runner instead of our own test runner, so we don't pay the cost of keeping the tests running. We also talked about having Treeherder/Taskcluster download the tests from git instead of putting the files in mercurial. 70MB is a huge repository increase, so we need to discuss this with the appropriate peers.

Comment 9

•

9 years ago

Another caveat is the increased time required to finish jstests. With the old test262 checkout (~3300 files), I need about 1 minute to run all test262 tests in my development VM (with an opt build). The new test262 checkout (~42300 files, most tests duplicated to run in non-strict and strict mode) requires about 13 minutes on the same machine.
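To illustrate why the file count roughly doubles, here is a minimal sketch of how one test262 source file can fan out into a non-strict and a strict variant. The flag names (`onlyStrict`, `noStrict`, `raw`, `module`) are test262 frontmatter flags; the function name `make_variants` and the labels are hypothetical and are not taken from the patch.

```python
def make_variants(source, flags=()):
    """Return a list of (label, source) pairs to run for one test file.

    Hypothetical helper: a sketch of the duplication idea, not the
    logic used by test262-update.py.
    """
    flags = set(flags)
    strict_source = '"use strict";\n' + source

    # "raw" tests must not be modified, and module code is always strict,
    # so neither kind gets a second copy.
    if "raw" in flags or "module" in flags:
        return [("default", source)]
    if "onlyStrict" in flags:
        return [("strict", strict_source)]
    if "noStrict" in flags:
        return [("non-strict", source)]
    # No restricting flag: the test runs once in each mode.
    return [("non-strict", source), ("strict", strict_source)]


variants = make_variants("var x = 1;")
```

Only tests without a restricting flag are emitted twice, which is consistent with "most tests duplicated" rather than all of them.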

Comment 10

•

9 years ago

Oh and thanks for classifying all the errors. Must have been a lot of work and we could use this to reduce our number of failures.

Comment 11

•

9 years ago

Statistics:

1) With test duplication to run tests in non-strict and strict mode:
- Increased run time for jstests.py (with -j5): 13x
- 47.791 objects, total size 59,4 MB

2) Without test duplication:
- Increased run time for jstests.py (with -j5): 6.5x
- 26.284 objects, total size 32,9 MB

Comment 12

•

9 years ago

Hey gps, is this an acceptable increase? Jan suggested to ping you about this.

Flags: needinfo?(gps)

Comment 13

•

9 years ago

Attached patch bug977849-part1.patch (obsolete) — Details — Splinter Review

Updated patches to skip the test duplication for strict-mode tests.

Attachment #8809649 - Attachment is obsolete: true

Attachment #8809649 - Flags: review?(evilpies)

Attachment #8809984 - Flags: review?(evilpies)

Comment 14

•

9 years ago

Attached patch bug977849-part2.patch (obsolete) — Details — Splinter Review

Attachment #8809650 - Attachment is obsolete: true

Attachment #8809650 - Flags: review?(evilpies)

Attachment #8809985 - Flags: review?(evilpies)

Comment 15

•

9 years ago

Attached file bug977849-part3.zip (obsolete) — Details

Attachment #8809653 - Attachment is obsolete: true

Attachment #8809653 - Flags: review?(evilpies)

Attachment #8809986 - Flags: review?(evilpies)

Comment 16

•

9 years ago

Attached patch bug977849-part4.patch (obsolete) — Details — Splinter Review

Attachment #8809655 - Attachment is obsolete: true

Attachment #8809655 - Flags: review?(evilpies)

Attachment #8809987 - Flags: review?(evilpies)

Updated

•

9 years ago

Attachment #8809985 - Flags: review?(evilpies) → review+

Comment 17

•

9 years ago

Comment on attachment 8809984 [details] [diff] [review]
bug977849-part1.patch

Review of attachment 8809984 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think you need to change anything here, as long as this works! \o/

::: js/src/tests/test262-harness.diff
@@ +2,5 @@
> +index 52c9021..eeabfd8 100644
> +--- a/harness/detachArrayBuffer.js
> ++++ b/harness/detachArrayBuffer.js
> +@@ -1,3 +1,3 @@
> + function $DETACHBUFFER(buffer) {

Why isn't test262 using $.detachArrayBuffer?

::: js/src/tests/test262-host.js
@@ +6,5 @@
> +this.$ = {
> + __proto__: null,
> + createRealm() {
> + var newGlobal = this.newGlobal();
> + newGlobal.evaluate(`

Couldn't we just define this once at the top level and use eval for that definition as well?

@@ +30,5 @@
> + // This function is generally called from within a Promise handler, so any
> + // exception thrown by this method will be swallowed and most likely
> + // ignored by the Promise machinery.
> + if ($mozAsyncTestDone)
> + reportFailure("$DONE() already called");

Oh we don't use that unhandled rejected promise machinery in the shell? We probably should.

::: js/src/tests/test262-update.py
@@ +32,5 @@
> + """
> + import imp
> +
> + packagingDir = os.path.join(test262Dir, "tools", "packaging")
> + return imp.load_source("test262parser", os.path.join(packagingDir, "parseTestRecord.py"))

I think that function was removed in future versions. load_module wasn't deprecated till python 3.3.

@@ +67,5 @@
> + skipIfTest = filterRefTest(refTest, "skip-if")
> + failsTest = filterRefTest(refTest, "fails")
> +
> + if skipTest:
> + comments = ", ".join(imap(itemgetter(2), skipTest))

Personally I find using tuples confusing in this case.

@@ +97,5 @@
> + # Prepend a possible "use strict" directive.
> + if directive:
> + source = directive + "\n" + source
> +
> + # Add the |reftest| is present.

if

@@ +393,5 @@
> + strictGroup = parser.add_mutually_exclusive_group()
> + strictGroup.add_argument("--strict", default=False, action="store_true", dest="strict",
> + help="Generate additional strict mode tests.")
> + strictGroup.add_argument("--no-strict", default=False, action="store_false", dest="strict",
> + help="Don't generate additional strict mode tests. This is the default mode.")

How does it know which one is the default?
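On the closing question about `--strict` / `--no-strict`: the snippet below is a small standalone sketch of the same argparse pattern, not the script itself. When two options share a `dest`, argparse seeds the namespace with the `default` of the first action registered for that `dest`; here both defaults are `False`, so omitting both flags yields `strict == False`. The function name `build_parser` is made up for the example.

```python
import argparse


def build_parser():
    """Build a parser with a --strict / --no-strict pair sharing one dest."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    # store_true sets strict=True only when --strict is passed.
    group.add_argument("--strict", default=False, action="store_true",
                       dest="strict")
    # store_false sets strict=False when --no-strict is passed; with
    # default=False it changes nothing unless a True default was in effect.
    group.add_argument("--no-strict", default=False, action="store_false",
                       dest="strict")
    return parser


parser = build_parser()
no_flags = parser.parse_args([]).strict
with_strict = parser.parse_args(["--strict"]).strict
with_no_strict = parser.parse_args(["--no-strict"]).strict
```

Passing both flags at once is rejected by the mutually exclusive group, so the effective default is simply whatever `default=` the pair agrees on.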

Attachment #8809984 - Flags: review?(evilpies) → review+

Comment 18

•

9 years ago

Comment on attachment 8809986 [details]
bug977849-part3.zip

Okay, maybe it would be possible to move some of the imports to the top level, e.g. those that provide verifyEnumerable etc. It seems like that is duplicated in a lot of shell.js files.

Comment 19

•

9 years ago

Comment on attachment 8809987 [details] [diff] [review]
bug977849-part4.patch

Review of attachment 8809987 [details] [diff] [review]:
-----------------------------------------------------------------

This is amazing and super useful. We should try to prioritize some of those failures.

Attachment #8809987 - Flags: review?(evilpies) → review+

Comment 20

•

9 years ago

Attached patch bug977849-part0.patch — Details — Splinter Review

I've separated the jstests framework changes into this new patch for clarity.

browser.js:
Test262 requires a host-provided function to evaluate source code as global script code. In the shell, we can use the evaluate() function. But in the browser, no builtin function is available to evaluate source code as global script code. We can provide this functionality through <script> elements, but there are some caveats:
1. Evaluating source code through <script> doesn't return the completion value. In the previous patch, I named the new browser.js function "evaluate" like its shell counterpart, but that broke existing jstests which expect evaluate() to return a completion value. So I needed to use a different name for this function (it's now named "evaluateScript").
2. Catching and rethrowing errors needs some extra effort, but it seems to work so far.
3. I had to learn that it's not possible to evaluate source code with <script> in an <iframe> when the <iframe> is no longer attached to the DOM (*). That means I needed to remove the call to HTMLIFramePrototypeRemove in browser.js' newGlobal() function.

(*) And I also had to learn that the same restriction applies to Workers in <iframe>s. :-)

manifest.py:
manifest.py applies different reftest terms on top of each other, but only stores the last terms string. This leads to different behaviour when running test262 in the shell compared to the browser reftests. In the shell, the reftest terms are cumulative and it's possible to use inline |reftest| comments and jstests.list entries. But in the browser, only the inline |reftest| comments were used.
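The manifest.py problem described above (only the last terms string surviving) can be sketched as follows. This is a simplified illustration under assumed names: the `TestCase` class, the `append_terms_and_comment` function, the example path, and the example terms are all hypothetical and do not reproduce the real manifest.py code.

```python
class TestCase:
    """Minimal stand-in for a jstests test case record (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.terms = None    # accumulated reftest terms, space separated
        self.comment = None  # accumulated comments, "; " separated


def append_terms_and_comment(testcase, terms, comment):
    """Accumulate terms and comments instead of overwriting them."""
    if testcase.terms is None:
        testcase.terms = terms
    else:
        testcase.terms += " " + terms

    if comment:
        if testcase.comment is None:
            testcase.comment = comment
        else:
            testcase.comment += "; " + comment


case = TestCase("test262/example/test.js")
# First source of terms, e.g. an entry in a jstests.list file.
append_terms_and_comment(case, "skip-if(!xulRuntime.shell)", "shell only")
# Second source of terms, e.g. an inline |reftest| comment in the test.
append_terms_and_comment(case, "slow", None)
```

With plain assignment, the second call would have replaced the `skip-if` term, so a consumer that only reads the stored string would see just the last source of terms.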

Attachment #8810552 - Flags: review?(evilpies)

Comment 21

•

9 years ago

Attached patch bug977849-part1.patch (obsolete) — Details — Splinter Review

Updated part 1 per review comments:
- Removed code duplication in test262-host.js
- Replaced imp.load_source with imp.find_module + imp.load_module
- Simplified reftest string creation ("fails" entries weren't actually used any more)
- Common harness files are now bundled in top level shell.js files to reduce code duplication in the generated test262 files
- And Promise tests are now also enabled for browser jstests

Carrying r+ from evilpie
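For readers on a current Python: the `imp` module used in this patch has been deprecated since Python 3.4 and was removed in Python 3.12. A rough modern equivalent of loading a module from an explicit file path uses `importlib.util`. The sketch below is an illustration only; the helper name `load_module_from_path` and the stand-in module contents are assumptions, not part of the patch.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Load the Python file at `path` as a module called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load %r from %r" % (name, path))
    module = importlib.util.module_from_spec(spec)
    # Execute the file's code inside the fresh module object.
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway stand-in file (not the real parseTestRecord.py).
with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = os.path.join(tmp_dir, "parseTestRecord.py")
    with open(module_path, "w") as handle:
        handle.write("def describe():\n    return 'stand-in parser'\n")
    parser_module = load_module_from_path("test262parser", module_path)
    description = parser_module.describe()
```

Unlike a normal import, this does not register the module in `sys.modules`, which is usually fine for a one-off helper script.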

Attachment #8809984 - Attachment is obsolete: true

Attachment #8810555 - Flags: review+

Comment 22

•

9 years ago

Attached file bug977849-part3.zip (obsolete) — Details

Regenerated part 3 again.

Attachment #8809986 - Attachment is obsolete: true

Attachment #8809986 - Flags: review?(evilpies)

Attachment #8810556 - Flags: review?(evilpies)

Comment 23

•

9 years ago

Attached patch bug977849-part4.patch (obsolete) — Details — Splinter Review

Updated part 4 to exclude additional tests when running test262 in the browser (*) and added new bugzilla links for failing tests. Carrying r+ from evilpie.

Note: I'll need to update this part again because some tests are timing out on certain platforms.

(*) Not all test262 tests are compatible with browser environments!

Attachment #8809987 - Attachment is obsolete: true

Attachment #8810561 - Flags: review+

Comment 24

•

9 years ago

It also seems to be necessary to use more chunks when jsreftest includes test262 to avoid frequent "Output exceeded 52428800 bytes, remaining output has been truncated" and "command timed out: 7200 seconds elapsed running" issues on Try [1]. [1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=cad4b1851e6d9f237e84eee688317a7d58bbc1d7

Comment 26

•

9 years ago

Comment on attachment 8810552 [details] [diff] [review]
bug977849-part0.patch

Review of attachment 8810552 [details] [diff] [review]:
-----------------------------------------------------------------

Interesting. I didn't spot anything obviously wrong. The onbeforescriptexecute stuff in evaluateScript might be a bit too complicated, but as long as it works.

::: js/src/tests/lib/manifest.py
@@ +276,5 @@
> return
>
> testcase.tag = matches.group(1)
> + _append_terms_and_comment(testcase, matches.group(2), matches.group(4))
> + _parse_one(testcase, matches.group(2), xul_tester)

So it's correct not to use testcase.terms here?

Attachment #8810552 - Flags: review?(evilpies) → review+

Comment 27

•

9 years ago

Comment on attachment 8810556 [details] bug977849-part3.zip rs=me, great to see the smaller shell.js files!

Attachment #8810556 - Flags: review?(evilpies) → review+

Comment 28

•

9 years ago

(In reply to Tom Schuster [:evilpie] from comment #26)
> ::: js/src/tests/lib/manifest.py
> @@ +276,5 @@
> > return
> >
> > testcase.tag = matches.group(1)
> > + _append_terms_and_comment(testcase, matches.group(2), matches.group(4))
> > + _parse_one(testcase, matches.group(2), xul_tester)
>
> So it's correct not to use testcase.terms here?

Yes, this simply avoids parsing any previous terms again.

Comment 29

•

9 years ago

(In reply to Tom Schuster [:evilpie] from comment #12)
> Hey gps, is this an acceptable increase? Jan suggested to ping you about
> this.

Thank you for checking for implications of landing this before landing this!

I'm a big proponent of monorepos and vendoring all of the things, like the ~30k new test files this bug wants to add. However, that mindset has to be tempered by the reality that adding tens of thousands of files has performance implications for several processes:

* clone size/time, both for developers and for automation
* working directory update time, both for developers and for automation (it is a bigger deal in automation because machines are doing large updates more frequently than humans)
* `hg status` and `git status` performance (although Mercurial has a mitigation for that via the `watchman` filesystem monitor)

Combined, this can contribute significant one-time and recurring overhead.

We want mozilla-central to be a useful monorepo and to scale to millions of files (if needed). However, before that can realistically happen:

* We need at least CI/automation performing sparse checkouts (only check out the files you need, not every file). This may require Mercurial's sparse checkout functionality being part of the core distribution (as opposed to a 3rd party extension)
* We may also want to support "narrow clones" (cloning a subset of files) so people and machines don't have to download data for thousands of files they likely don't care about.
* Git's scaling story for working directories with hundreds of thousands of files needs to improve. Without filesystem watching, performance of operations like `git status` can go off a cliff once the working directory reaches certain inode thresholds.

We have rough timelines for the first 2 items. And if push came to shove, Git client performance issues would likely not block us from adding more files to mozilla-central. Or the Git tooling would be taught to ignore certain directories to allow it to work/scale.

If we can avoid adding the ~30k new test files at this time, it would be preferred. If it needs to happen (e.g. there is a compelling developer productivity case to be made) we can do it. But I'd prefer to not press the scaling issues if we don't have to.

As for alternatives, we can have automation grab the tests from a separate repository. We don't like automation relying on 3rd party services (e.g. github.com) because we don't like outages of 3rd party services causing tree closures. Unfortunately, we don't have a good Git hosting story that scales well. So your alternate choices seem to be:

1) have automation clone from Git; make jobs tier-1 and incur a tree closure when Git host goes down
2) same as #1 except jobs are tier-2 and can be orange for long periods if Git host goes down
3) mirror Git repository to hg.mozilla.org and have automation consume it (there was precedent for this with FxOS foo and I believe we have existing software/infrastructure to throw at the problem)

Flags: needinfo?(gps)

Comment 30

•

9 years ago

Thank you gps! See bug 1285372 about the alternative approach of using the nodejs test262-harness and a separate test262 git checkout to run the tests.

Comment 31

•

9 years ago

I should also mention that this isn't the only project that may want to add tons of new files:

* vendoring servo will add ~10k files
* vendoring WPT CSS tests would add ~100k files (!!!)

mozilla-central is currently ~150k files. From what I've heard from Facebook's experience scaling their monorepo, mozilla-central is on the verge of running into some problems. With vendoring of servo being a sure thing, I really don't want to take the risk that the ~30k files in this bug jeopardize the servo vendoring. Once servo is vendored in and we have some more vcs syncing infrastructure in place around that, I'd feel much better about taking these JS test files and (eventually) the WPT CSS tests.

James Graham [:jgraham]

Comment 32

•

9 years ago

> * vendoring WPT CSS tests would add ~100k files (!!!)

I don't know where that number comes from, but I think the reality is closer to 35k files.

jgraham@luna:~/develop/csswg-test$ find . | wc -l
34660

FWIW I think that basing a stopgap solution on FxOS precedent would be a bad idea; that infrastructure was fragile and no one was sad to see the back of it. If you need a quick hack, don't want two-way sync and plan to update the tests relatively infrequently, uploading a zip of the testsuite to tooltool could work. It's not a great solution, but nothing other than "vendor the files in tree" is.

Comment 33

•

9 years ago

Attached patch bug977849-part2.patch (obsolete) — Details — Splinter Review

Updated part 2 to apply cleanly on inbound. Carrying r+ from evilpie.

Attachment #8809985 - Attachment is obsolete: true

Attachment #8816707 - Flags: review+

Comment 34

•

9 years ago

Attached patch bug977849-part4.patch (obsolete) — Details — Splinter Review

Updated part 4 to enable more test262 tests. Carrying r+.

Attachment #8810561 - Attachment is obsolete: true

Attachment #8816708 - Flags: review+

Comment 35

•

8 years ago

Attached patch bug977849-part1.patch — Details — Splinter Review

test262-harness.diff is no longer required, updated part 1 accordingly.

Attachment #8810555 - Attachment is obsolete: true

Attachment #8824405 - Flags: review+

Comment 36

•

8 years ago

Attached patch bug977849-part2.patch (obsolete) — Details — Splinter Review

Updated part 2 to apply cleanly on inbound.

Attachment #8816707 - Attachment is obsolete: true

Attachment #8824406 - Flags: review+

Comment 37

•

8 years ago

Attached file bug977849-part3.zip (obsolete) — Details

Update part 3 to use newer checkout from test262.

Attachment #8810556 - Attachment is obsolete: true

Attachment #8824407 - Flags: review+

Comment 38

•

8 years ago

Attached patch bug977849-part4.patch (obsolete) — Details — Splinter Review

Update exclusion list to match current test262 status.

Attachment #8816708 - Attachment is obsolete: true

Attachment #8824408 - Flags: review+

Comment 39

•

8 years ago

Attached patch bug977849-part2.patch — Details — Splinter Review

Attachment #8824406 - Attachment is obsolete: true

Attachment #8827947 - Flags: review+

Comment 40

•

8 years ago

Attached file bug977849-part3.zip — Details

Attachment #8824407 - Attachment is obsolete: true

Attachment #8827948 - Flags: review+

Steve Fink [:sfink] [:s:]

Comment 41

•

8 years ago

Attached patch bug977849-part4.patch — Details — Splinter Review

Attachment #8824408 - Attachment is obsolete: true

Attachment #8827949 - Flags: review+

Assignee

Comment 42

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=74ca7afe6a081d20bec23557946ab67a4570acba is a try push with SM-tc(262) jobs for running these tests.

But the patches in this bug do not reflect the bug title; these patches add the files into the gecko repo directly. I think I heard that gps has changed his opinion since comments 29 and 31? If so, it would be good to document that here. And if not, I can look into changing the job to check out from github and store a revision id file in the gecko tree instead.

Flags: needinfo?(gps)

Comment 43

•

8 years ago

Servo vendoring will add ~2k files instead of ~10k. So I'm less worried about file count and am tentatively OK with vendoring these test-262 files.

Before we commit to vendoring these files in mozilla-central, I think we should prove out the value and feasibility of running these tests in automation. I think we should stand up tier-2 or tier-3 automation that runs these tests after cloning from GitHub. Once those are running for a few weeks or months and all the major issues are ironed out (sheriffing, intermittent failures, etc), we can vendor the test files in mozilla-central, remove the GitHub dependency, and make these a tier-1 test job.

My concern with jumping straight in is that we vendor all these files (an irreversible decision) and find a showstopper or other similar issue preventing or significantly delaying the running of the tests in automation. While I assume the risk is low and we'll eventually run test-262 from mozilla-central, it is a risk. And the cost to initially running from GitHub feels low compared to the cost of having all the extra data in mozilla-central forever.

Also, the VCS syncing infrastructure I'm building for Servo in bug 1322769 can be leveraged to automatically synchronize/vendor the external Git[Hub] repo into a subdirectory of mozilla-central.

Flags: needinfo?(gps)

Steve Fink [:sfink] [:s:]

Comment 44

•

8 years ago

Oh, if you really want tier-1 automation status, we can mirror the Git repo to hg.mozilla.org and have automation clone from there instead of GitHub. We could live in that state indefinitely and vendor the files in mozilla-central at your leisure. We'd need to pin commit hashes inside mozilla-central so behavior in automation is deterministic, of course.
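As an illustration of the "pin commit hashes" point, here is a minimal sketch of how a job could turn a revision string stored in the tree into a deterministic checkout. Everything named here is an assumption made for the example: the helper `checkout_commands`, the placeholder repository URL, the destination directory, and the made-up revision. It only builds the git command lines and does not run them.

```python
import re

# A full 40-character lowercase hex SHA-1, as printed by `git rev-parse HEAD`.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def checkout_commands(repo_url, pinned_revision, dest):
    """Return the git commands for checking out exactly one pinned commit."""
    revision = pinned_revision.strip()
    # Reject branch names and abbreviated hashes: they can resolve to a
    # different commit later, which would make automation non-deterministic.
    if not FULL_SHA1.fullmatch(revision):
        raise ValueError("expected a full commit hash, got %r" % revision)
    return [
        ["git", "clone", "--no-checkout", repo_url, dest],
        ["git", "-C", dest, "checkout", "--detach", revision],
    ]


# Hypothetical values; the revision would normally be read from a file
# committed to mozilla-central.
commands = checkout_commands(
    "https://example.invalid/test262.git",
    "0123456789abcdef0123456789abcdef01234567\n",
    "test262-checkout",
)
```

Updating the test suite then becomes an ordinary reviewed change to the pinned revision, regardless of whether the clone source is GitHub or a mirror.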

Assignee

Comment 45

•

8 years ago

Darn it, my bugmail somehow ate your reply. Need to track that down.

(In reply to Gregory Szorc [:gps] from comment #44)
> Oh, if you really want tier-1 automation status, we can mirror the Git repo
> to hg.mozilla.org and have automation clone from there instead of GitHub. We
> could live in that state indefinitely and vendor the files in
> mozilla-central at your leisure. We'd need to pin commit hashes inside
> mozilla-central so behavior in automation is deterministic, of course.

Ok, that was what I originally expected your desired outcome to be. But now I'm thinking that maybe vendoring is a better idea. When I asked on dev-tech-js-internals whether people wanted to separate out these test262 tests so they wouldn't slow down regular test runs during development, the consensus seemed to be that these test262 tests should really be replacing many of our existing in-tree tests, and that we should be relying on the new tests to catch problems during development. And unless we expect people to start pulling down an external repo for testing, that seems to argue for vendoring if it doesn't cause too many other problems.

Also, I looked at the test job time increase. It wasn't as bad as expected -- the worst one is the debug job, which goes from 25 minutes to 40 minutes. For a CI job, that really doesn't seem too awful.

I get what you mean about this being irreversible, though. I'll look more closely at what anba has done here to see how hard it would be to try things out first with a github pull.