Closed Bug 1515039 Opened 11 months ago Closed 10 months ago

Running jstests is extremely slow

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox66 --- fixed

People

(Reporter: evilpie, Assigned: jorendorff)

References

Details

Attachments

(3 files)

Before running tests is seems to load all WPT test manifests. Running a single test is now extremely slow.
Same as bug 1513584, maybe?
Will try to take a look this week or next; if one of you would have time to profile where we're spending the time, that would be great.
Attached file cProfiler output
I don't really know how to profile python code, but this is the output from:

% python -mcProfile -o prof.txt jstests.py ~/src/mozilla/obj-js-opt/dist/bin/js
(Ctrl-C after it starts running tests)
% python -c 'from pstats import Stats; s = Stats("prof.txt"); s.strip_dirs(); s.sort_stats("cumulative"); s.print_stats()'
Thanks!

Some of the relevant calls are below; it seems like gathering and parsing the plain jstests takes about 5.8s, and gathering the wpt tests takes about 3.2s. One thing that may be a bug is that we seem to call load_and_update four times. jgraham, could you check if that's supposed to happen?

Wed Jan  2 15:18:56 2019    prof.txt

         20345497 function calls (20328341 primitive calls) in 18.249 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.026    0.026   18.252   18.252 jstests.py:10(<module>)
        1    0.008    0.008   18.214   18.214 jstests.py:507(main)
      353    0.049    0.000    8.654    0.025 tasks_unix.py:213(run_all_tests)
     1995    0.071    0.000    6.084    0.003 progressbar.py:56(update)
      352    0.014    0.000    6.013    0.017 results.py:234(push)
     2058    5.955    0.003    5.955    0.003 {method 'flush' of 'file' objects}
    38520    0.040    0.000    5.944    0.000 jstests.py:501(<genexpr>)
    38520    0.012    0.000    5.898    0.000 jstests.py:498(<genexpr>)
    38613    0.016    0.000    5.886    0.000 jstests.py:491(<genexpr>)
    38590    0.102    0.000    5.870    0.000 manifest.py:409(load_reftests)

        1    0.001    0.001    3.205    3.205 jstests.py:301(load_wpt_tests)
        4    0.006    0.002    2.706    0.677 manifest.py:429(load_and_update)
        1    0.000    0.000    1.830    1.830 manifestupdate.py:65(run)
        1    0.010    0.010    1.811    1.811 manifestupdate.py:178(update)
     3335    0.003    0.000    1.587    0.000 manifest.py:93(test)
        7    0.001    0.000    1.407    0.201 __init__.py:258(load)
        7    0.000    0.000    1.388    0.198 __init__.py:294(loads)
        2    0.049    0.025    1.034    0.517 manifest.py:244(update)
        1    0.000    0.000    0.906    0.906 testloader.py:391(load)
        2    0.000    0.000    0.906    0.453 testloader.py:401(load_manifest)
     6515    0.078    0.000    0.703    0.000 gitignore.py:193(filter)
     6515    0.070    0.000    0.534    0.000 vcs.py:249(walk)
        1    0.000    0.000    0.226    0.226 products.py:29(load_product)
        1    0.000    0.000    0.226    0.226 products.py:13(product_module)
        1    0.002    0.002    0.226    0.226 firefox.py:1(<module>)
        4    0.000    0.000    0.203    0.051 manifest.py:372(from_json)
        4    0.060    0.015    0.203    0.051 manifest.py:382(<dictcomp>)
Flags: needinfo?(james)
(In reply to :Ms2ger (he/him; ⌚ UTC+1/+2) from comment #5)
> [...] and gathering the wpt tests takes about 3.2s.

It can take much longer, depending on how recent WPT was, I assume, synch-ed last. Any subsequent runs are much faster (it only takes 4-5 additional seconds to call |load_wpt_tests| in jstests.py), though.


For example after a restart of the VM I'm using for SpiderMonkey builds, I get the following numbers:
---
401501982 function calls (391767205 primitive calls) in 415.579 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.040    0.040  415.654  415.654 jstests.py:10(<module>)
        1    0.002    0.002  415.589  415.589 jstests.py:508(main)
        1    0.004    0.004  407.359  407.359 jstests.py:412(load_tests)
        1    0.007    0.007  405.120  405.120 jstests.py:301(load_wpt_tests)
        1    0.000    0.000  402.702  402.702 manifestupdate.py:65(run)
        4    0.003    0.001  401.042  100.260 manifest.py:429(load_and_update)
        1    0.016    0.016  400.062  400.062 manifestupdate.py:178(update)
        2    0.561    0.280  391.453  195.727 manifest.py:244(update)
720475/293120    2.560    0.000  377.683    0.001 utils.py:44(__get__)
    61335    0.853    0.000  356.684    0.006 sourcefile.py:593(manifest_items)
    42127    0.203    0.000  340.376    0.008 sourcefile.py:553(content_is_css_manual)
    43667    0.389    0.000  336.810    0.008 sourcefile.py:358(root)
    23685    0.053    0.000  331.582    0.014 sourcefile.py:144(<lambda>)
    23685    0.133    0.000  331.529    0.014 html5parser.py:27(parse)
    23685    0.084    0.000  311.337    0.013 html5parser.py:267(parse)
    23685    0.179    0.000  311.139    0.013 html5parser.py:125(_parse)
    23687   12.331    0.001  188.732    0.008 html5parser.py:196(mainLoop)
  2729367    3.233    0.000  137.485    0.000 html5parser.py:263(normalizedTokens)
  ...
---


And for a subsequent run:
---
5000366 function calls (4983536 primitive calls) in 9.356 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    9.361    9.361 jstests.py:10(<module>)
        1    0.001    0.001    9.338    9.338 jstests.py:508(main)
        1    0.000    0.000    5.641    5.641 jstests.py:412(load_tests)
        1    0.001    0.001    4.911    4.911 jstests.py:301(load_wpt_tests)
        4    0.010    0.002    4.240    1.060 manifest.py:429(load_and_update)
       35    0.008    0.000    3.673    0.105 tasks_unix.py:213(run_all_tests)
        1    0.000    0.000    3.143    3.143 manifestupdate.py:65(run)
        1    0.015    0.015    3.110    3.110 manifestupdate.py:178(update)
     2573    0.004    0.000    3.049    0.001 jstests.py:502(<genexpr>)
     2573    0.002    0.000    3.043    0.001 jstests.py:499(<genexpr>)
     2663    0.003    0.000    3.041    0.001 jstests.py:492(<genexpr>)
     2667    0.080    0.000    3.039    0.001 manifest.py:409(load_reftests)
        7    0.001    0.000    1.948    0.278 __init__.py:258(load)
        4    0.062    0.016    1.946    0.486 manifest.py:403(_load)
        2    0.079    0.039    1.923    0.961 manifest.py:244(update)
        7    0.000    0.000    1.917    0.274 __init__.py:294(loads)
        7    0.000    0.000    1.917    0.274 decoder.py:359(decode)
        7    1.917    0.274    1.917    0.274 decoder.py:370(raw_decode)
      373    0.002    0.000    1.866    0.005 manifest.py:93(test)
     2666    0.010    0.000    1.833    0.001 manifest.py:291(_parse_test_header)
      450    0.003    0.000    1.782    0.004 manifest.py:127(_parse_one)
    61337    0.113    0.000    1.658    0.000 vcs.py:131(__iter__)
       46    0.000    0.000    1.449    0.031 subprocess.py:769(communicate)
       42    0.001    0.000    1.443    0.034 subprocess.py:1408(_communicate)
       42    0.004    0.000    1.441    0.034 subprocess.py:1441(_communicate_with_poll)
       84    1.434    0.017    1.434    0.017 {built-in method poll}
     6524    0.168    0.000    1.321    0.000 gitignore.py:193(filter)
        1    0.000    0.000    1.145    1.145 testloader.py:391(load)
   ...
---
415 seconds suggests that we're not managing to download the cached manifest from taskcluster. Are you running this in a normal gecko tree? If not that could be the reason, otherwise we need to dig into that more.

The multiple calls to load_and_update are known; different bits of code aren't very aware of when other code has previously loaded the manifest. Upstream added a cache to optimise this case but for various reasons we're a little behind with updates so I don't think that's landed yet. Hopefully that change will shave a second or so off the best-case time.
Flags: needinfo?(james)

I've spoken with a few other JS engine folks and it's no exaggeration to say that everyone finds this situation basically unacceptable. There are a bunch of mitigating factors here, but still, running a 100-ms test shouldn't take 10 seconds in the best case.

This is P1. We're going to comment stuff out or put it behind flags until jstests.py is acceptable for ordinary JS developers doing basic engine development.

Assignee: nobody → jorendorff
Priority: -- → P1

This changes the default behavior of jstests.py to skip loading the wpt manifest if the user specified a particular test or tests to run on the command line. This should not affect anything that runs on our CI infrastructure, while making individual tests run much faster for engine hackers.

With this change, running an individual test on my machine takes 0.5s (down from 5s).

Depends on: 1519174
Depends on: 1519347

I spent a bit of time profiling the fast case here and there are a few low-hanging fruit, but nothing is going to make a big difference:

  • The meta_filter optimistation is broken (bug 1519347)
  • Importing mozinfo is slow (bug 1519398)
  • We were reading the manifest json twice for no good reason

With fixes for those things, I still find that importing wpt is 85% of the time for loading all the tests. About 40% of the total time is updating the manifest with most of that being iterating over the full set of wpt tests. It may be possible to do slightly better there; I see 68,000 files under testing/web-platform/tests compared to 42000 under js/src/tests/ but the time spent in walk() is a factor of 2 rather 1.5. However wpt also does more work here; it applies .gitignore rules which I think accounts for much of the difference between the time under walk() and the total time iterating over the tests. Loading the firefox.py module from wptrunner is still pretty slow (9% of total) and does a bunch of unnecessary work loading modules that won't be used; we could refactor that to avoid this work in the jsshell case. There are also possible optimistations with the TestLoader (about 4% of total).

Ultimately, though, if we update the wpt manifest it seems unlikely that we can do better than a 3.5x perf regression in startup time here (without significant work) because loading the manifest takes about as long as scanning for the existing js tests, and updating it takes at least as long. Without updating it each time we could maybe get as low as a 2x performance regression (with some additional work), but the obvious implementation of that would be to have a --wpt-update flag which people would have to remember to pass if they updated a wpt test. I don't know if that's a better solution than requiring people to pass a flag to run the wpt tests at all.

Why do we need to depend on WPT for running the JS shell?
Could we only do these import if-and-only-if we need to do these imports?

It's only for the jstest.py tests (I'm sure you know that, just being clear)

Some of the js tests are shared with wpt (we run a test derived from the same source file in the browser and in the shell). Because the jstest test matching is substring based rather than prefix based it's hard to know ahead of time whether any tests in the wpt set will be matched by the command arguments. Therefore we have to either load the wpt stuff every time or require an explicit opt-in to enable the wpt test subset.

(In reply to James Graham [:jgraham] from comment #11)

About 40% of the total time is updating the manifest.

Could we have a pre-computed index generated each time we import or add tests to js/tests?

it applies .gitignore rules which I think accounts for much of the difference between the time under walk() and the total time iterating over the tests.

I think as a JS engine developer we do not care about any of these, if you have an extra file it might as well be checked too. As we do for any tests under js/jit-test.

Loading the firefox.py module from wptrunner is still pretty slow (9% of total) and does a bunch of unnecessary work loading modules that won't be used; we could refactor that to avoid this work in the jsshell case.

Definitely.

(In reply to Nicolas B. Pierron [:nbp] from comment #14)

(In reply to James Graham [:jgraham] from comment #11)

About 40% of the total time is updating the manifest.

Could we have a pre-computed index generated each time we import or add tests to js/tests?

That's effectively how things already work with the proviso that there's no step where the tests are imported to js/tests; tests in wpt can directly declare that they're suitable to run in a jsshell and that's the set that we pick up to run with jstests.py. The "pre-computed index" is the MANIFEST.json file that we download and update. Generating that from scratch takes several minutes, checking it's up to date takes a couple of seconds. It's the load time for the large json file plus that couple of seconds that's the performance regression here.

it applies .gitignore rules which I think accounts for much of the difference between the time under walk() and the total time iterating over the tests.

I think as a JS engine developer we do not care about any of these, if you have an extra file it might as well be checked too. As we do for any tests under js/jit-test.

I guess it would be possible to have a flag saying "don't apply the gitignore rules here", but it guarantees that you will generate an invalid manifest (since some files are checked in which actually match a rule). That might not matter for the jsshell case but could matter if you're sharing a source tree with other gecko work (but I'm not sure how much it matters).

Loading the firefox.py module from wptrunner is still pretty slow (9% of total) and does a bunch of unnecessary work loading modules that won't be used; we could refactor that to avoid this work in the jsshell case.

Definitely.

Well except it's a win of a few percent in the case that we fix a number of other things some of which aren't as easy to land as as I hoped (in particular making mozinfo more efficient to load turns out to be non-trivial). So it's not going to make a qualitative difference to the outcome here (i.e. it won't change the decision to not load these tests by default).

(In reply to James Graham [:jgraham] from comment #13)

Because the jstest test matching is substring based rather than prefix based

I don't think that has to be a hard requirement. I don't know for sure how people use jstests.py, but at least for me, I only use prefixes anyway. (And if that functionality is important, perhaps we could replace substring testing with pattern matching like 'substr'.)

Honestly, though, I suspect a lot of people have kind of stopped using jstests for latency-sensitive smoke testing ever since we imported all of test262. ("latency-sensitive" meaning the developer is going to sit and wait for the results, rather than firing off the whole suite and checking back later.) I didn't; I just run it with 'non262' as an argument instead (because the test262 tests are mostly useless for GC testing; I need it to do enough work to trigger GCs.) That has gotten a lot less useful with the wpt slowdown, to the point where I usually run jit-tests now instead (and I don't like jit_test.py's command line interface as much.)

I know that removing substring matching functionality might be scope creep here, given that you'd need to establish some level of consensus, but I just didn't want that to be an artificial barrier if it doesn't need to be.

(In reply to Steve Fink [:sfink] [:s:] from comment #16)

I don't think that has to be a hard requirement. I don't know for sure how people use jstests.py, but at least for me, I only use prefixes anyway.

I use substring matches all the time. When adding a new test or debugging a failure it's nice that you can just use (part of) the filename. Or when I make some changes to, say, Date objects, I'll run |jstests.py Date| as a quick smoke test.

Pushed by jorendorff@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d945f4dd088b
Running jstests is extremely slow. r=jandem

(In reply to Jan de Mooij [:jandem] from comment #17)

(In reply to Steve Fink [:sfink] [:s:] from comment #16)

I don't think that has to be a hard requirement. I don't know for sure how people use jstests.py, but at least for me, I only use prefixes anyway.

I use substring matches all the time. When adding a new test or debugging a failure it's nice that you can just use (part of) the filename. Or when I make some changes to, say, Date objects, I'll run |jstests.py Date| as a quick smoke test.

Ok, fair. And that seems useful enough that I should give it a try.

Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in before you can comment on or make changes to this bug.