Bug 686240 - jit_test.py should run tests in parallel
Status: RESOLVED DUPLICATE of bug 638219
Whiteboard: [buildfaster:?]
Product: Core
Classification: Components
Component: JavaScript Engine
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: ---
Assigned To: general
QA Contact: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks:
Reported: 2011-09-11 16:05 PDT by Gregory Szorc [:gps]
Modified: 2012-12-14 10:46 PST (History)
CC: 8 users
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Description Gregory Szorc [:gps] 2011-09-11 16:05:44 PDT
The JavaScript JIT test harness currently executes very inefficiently, spawning thousands of new `js` processes, each of which creates its own JS runtime and context. New processes are expensive, especially on Windows (where process creation is roughly 10x slower than on *NIX OSes), and the overhead adds up. On my personal desktop, the current test harness uses only about 25-50% of a single core's capacity during much of the test suite; it only reaches 100% when executing tests that require many CPU cycles to complete.

For something that should be CPU bound, we are wasting a lot of wall clock time.

For kicks, I hacked up a standalone C++ program that takes a list of JS files and executes them under separate contexts (each file is executed in 18 separate contexts, covering the various combinations of engine/context-related options). So, instead of thousands of processes, we have 1 process running thousands of contexts.

The results are very promising! The (single-threaded) process consistently maxes out a full core. On my reasonably fast i7-2600K, wall time goes from 8:40 to 7:00, a healthy ~20% reduction. Of course, this was only on a single core. Since I'm now maxing out a core, I could theoretically get near-linear gains by making things multi-threaded and executing on multiple cores. Assuming linear gains, going to 4 cores would yield an overall wall time of 1:45, and 8 cores would be ~0:52. Not too shabby, considering we started with 8:40 of wall time.
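For the record, the projections above are just back-of-envelope arithmetic, assuming perfectly linear scaling across cores (which real suites rarely achieve):

```python
# Scaling math for the measured numbers: 8:40 baseline, 7:00 single-core.
baseline = 8 * 60 + 40      # 520 s, thousands-of-processes harness
single_core = 7 * 60        # 420 s, one process / many contexts

reduction = 1 - single_core / baseline
print(f"single-core reduction: {reduction:.0%}")   # ~19%, quoted as ~20%

# Hypothetical linear scaling to more cores.
for cores in (4, 8):
    secs = single_core / cores
    print(f"{cores} cores: ~{int(secs // 60)}:{int(secs % 60):02d}")
# 4 cores: ~1:45
# 8 cores: ~0:52
```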

I'm not sure how long it takes our current build machines to run the JIT test suite (I couldn't find good timing data in the logs). But my current-model MBP takes ~22 minutes. A 20% savings on a single core would net ~4.5 minutes of wall time. If I utilized 4 cores, I might be looking at ~17 minutes of wall time savings. Now that's going faster!

Now, this approach isn't all rosy. One gotcha is that some failures cause the process to segfault. The current execution method works around this by isolating every test in its own process. So, if we're serious about going faster and shaving minutes off of build times, we'll need to work around this drawback. There are various solutions: the jit_test.py driver could restart the master execution process from the point where it crashed; the C++ process could fork and do the work in children (so the parent process can catch a crash and recover gracefully); or Python could start up a C++ worker pool, dispatch tests via IPC, and respawn workers after failures. The new world would likely be a little more complicated than the current one, but I think the potential gains are worth it.

For this type of solution, I'm assuming that misbehaving JIT code won't corrupt the underlying JSRuntime and that JSContext instances are completely isolated. I have no clue if these are valid assumptions. (Can someone in JS land validate?)
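To make the "driver restarts the worker after a crash" option concrete, here is a rough Python sketch. The worker protocol (a batch worker that prints `DONE <test>` for each completed test and dies on a crasher) is hypothetical, not jit_test.py's actual interface, and it assumes the worker runs tests in argv order:

```python
import subprocess

def run_batch(worker_cmd, tests):
    """Run one batch in a worker process; return (completed, crashed_test)."""
    proc = subprocess.Popen(worker_cmd + tests,
                            stdout=subprocess.PIPE, text=True)
    completed = []
    for line in proc.stdout:
        if line.startswith("DONE "):
            completed.append(line.split(None, 1)[1].strip())
    proc.wait()
    if proc.returncode != 0 and len(completed) < len(tests):
        # Worker died partway through; the next unfinished test killed it.
        return completed, tests[len(completed)]
    return completed, None

def run_all(worker_cmd, tests):
    """Drive workers to completion, restarting past any crashing test."""
    results, remaining = {}, list(tests)
    while remaining:
        done, crashed = run_batch(worker_cmd, remaining)
        for t in done:
            results[t] = "PASS"
        if crashed is None:
            break
        results[crashed] = "CRASH"
        remaining = remaining[len(done) + 1:]   # resume past the crasher
    return results
```

The same loop could be run once per core with a shared queue to get the multi-process parallelism this bug asks for.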

My proof of concept code is located at https://github.com/indygreg/mozilla-central/tree/jit-test-speedup. The main diff from m-c can be found at https://github.com/indygreg/mozilla-central/commit/ddfd6caacb8c779cb0eea44af8a0a881689a0318. I'm fully aware that the code is crap and jit_test.py is horribly broken. My main objective in writing it was to prove my hunch that the current implementation was far from optimal and that we could go much faster. I think I've made the case on both points and now relinquish this bug to the JS and RelEng teams for further action.
Comment 1 Gregory Szorc [:gps] 2012-12-14 10:24:24 PST
I was looking at the test-running code in js/src last night over a beer. It seems to me the "easy" solution here would be to refactor jit_test.py to use the same test-running "framework" as jstests.py. The crazy single-process model described in the initial comment could be deferred to a follow-up bug if things aren't fast enough.

FWIW, the timings for jit_test.py on my MBP are as follows:

real	18m49.003s
user	8m39.230s
sys	4m18.158s

We have ~13 minutes of CPU time running tests with one core. Assuming we could max out all 4 physical cores plus the 4 hyperthreading threads and yield 25% from hyperthreading, we'd get a nice 5x speedup and would execute tests in about 2.5 minutes! 
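Checking that estimate mechanically, under the stated assumption of 4 physical cores at full yield plus 4 hyperthread siblings at 25%:

```python
# user + sys from the `time` output above, in seconds.
user = 8 * 60 + 39.230   # 8m39.230s
sys_ = 4 * 60 + 18.158   # 4m18.158s
cpu_minutes = (user + sys_) / 60
effective_cores = 4 + 4 * 0.25        # assumed 5x effective parallelism

print(round(cpu_minutes, 1))                    # 13.0 minutes of CPU time
print(round(cpu_minutes / effective_cores, 1))  # 2.6 minutes projected
```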

|make check| takes about 30 minutes on buildbot machines (this executes jit_test.py). Parallel jit_test.py execution would shave a *lot* of time off of |make check| and free up build machines to perform more builds.
Comment 2 Ted Mielczarek [:ted.mielczarek] 2012-12-14 10:41:26 PST
FYI bug 638219 covers merging those two harnesses.
Comment 3 Ted Mielczarek [:ted.mielczarek] 2012-12-14 10:42:35 PST
Also I'd love to get these harnesses packaged up with the rest of the tests and run on the test slaves instead of the build slaves. Then we could fix them to run on mobile as well and gain the ARM test coverage we're currently lacking.
Comment 4 Gregory Szorc [:gps] 2012-12-14 10:46:58 PST
Well then.

*** This bug has been marked as a duplicate of bug 638219 ***
