Bug 660788 - Run xpcshell tests in parallel
Opened 14 years ago • Closed 11 years ago
Categories: Testing :: XPCShell Harness (defect)
Tracking: (Not tracked)
Status: RESOLVED FIXED
Target Milestone: mozilla26
People
(Reporter: ted, Assigned: mihneadb)
Attachments (1 file, 1 obsolete file)
patch, 1006 bytes, ted: review+
Currently the xpcshell harness runs each test in a new xpcshell process, one at a time. This used to be necessary because component registration would write compreg.dat in the app dir. Nowadays we have manifest-driven component registration, so running tests in parallel should be feasible.
Comment 1•14 years ago
Do we want to run them in parallel, or just use the same process? I ask because running tests in parallel would produce logs that are difficult to read when there are failures.
Reporter
Comment 2•14 years ago
I think we need to continue to use a fresh process for each test, to ensure that each test gets a clean environment. The harness should be able to run tests in parallel but serialize the output from them to make it less confusing. (It already buffers all test output anyway.)
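To sketch the idea (toy code, not harness code; the command line is just a placeholder): each worker runs its test in a fresh child process, buffers the whole log, and only prints it under a lock, so output from parallel tests never interleaves.

import subprocess
import threading

print_lock = threading.Lock()

def run_and_report(cmd):
    # Fresh child process per test; capture everything it writes.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    log, _ = proc.communicate()
    # Print the buffered log in one piece so it cannot interleave with
    # output from other tests running at the same time.
    with print_lock:
        print(log.decode("utf-8", "replace"))

threads = [threading.Thread(target=run_and_report,
                            args=(["echo", name],))  # placeholder command
           for name in ("test_a.js", "test_b.js")]
for t in threads:
    t.start()
for t in threads:
    t.join()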
Reporter
Updated•13 years ago
Assignee: ted.mielczarek → nobody
Comment 3•12 years ago
I've been fooling around with some Python and made this. It's far from shippable, but on my (2-core, non-SSD) machine it's giving a ~40% speedup. If anyone can help finish it off, please go for it.
Comment 4•12 years ago
Comment on attachment 654928 [details] [diff] [review]
WIP
Review of attachment 654928 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you for doing this! It's definitely on the right road.
I would refactor this to use multiprocessing.Pool. That way, you have true separate processes and you won't run into the GIL (the threading module has GIL issues). It will also automagically use the number of processors your machine has, likely yielding optimal results.
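A minimal, self-contained sketch of that structure (run_one_test here is a stand-in, not the real per-test code in runxpcshelltests.py):

import multiprocessing

def run_one_test(test_path):
    # Stand-in for "launch one xpcshell process and buffer its log";
    # each call runs in a separate worker process, so the GIL is not
    # a bottleneck for the harness.
    worker = multiprocessing.current_process().name
    return test_path, True, "ran in %s" % worker

if __name__ == "__main__":
    tests = ["test_a.js", "test_b.js", "test_c.js"]  # placeholder list
    pool = multiprocessing.Pool()  # defaults to cpu_count() workers
    try:
        # Results arrive as tests finish, one at a time, so each test's
        # log can still be printed as a single block.
        for name, passed, note in pool.imap_unordered(run_one_test, tests):
            print("%s %s (%s)" % (name, "PASS" if passed else "FAIL", note))
    finally:
        pool.close()
        pool.join()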
That being said, I'm not sure if every xpcshell test runner in the build infra has Python 2.6+ and thus access to multiprocessing. You can always add an "import multiprocessing", push to try, and see what happens.
There will also be real-world issues with tests in some test suites running in parallel. For example, I'm pretty sure you will get random failures in services/sync/tests/unit because many of those tests establish an HTTP server on a hard-coded port. We definitely need support in the xpcshell.ini manifests to control whether parallel execution is allowed. The optimist in me says to default to parallel and only disable if we see issues. Over time, we can identify pain points preventing parallel execution (e.g. we need a way for Python to tell the xpcshell environment what network ports it can listen/bind on and tests need to use that rather than hard-coded port numbers).
I'm not sure how a flag in xpcshell.ini will play with the master manifest file (e.g. |make xpcshell-tests| from topobjdir). We obviously want the "no parallel" flag to be as fine-grained as we can make it; otherwise we'll be limiting execution too much. I'm not sure if the manifest parser has a way of intelligently representing this.
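To make the manifest idea concrete, a hedged sketch of how the harness side could split tests; the key name ("run-sequentially") and the dict shape are invented for this example, not a decided schema:

def partition_tests(tests):
    # "tests" is assumed to be the list of dicts the manifest parser
    # produces, e.g. {"path": "...", "run-sequentially": "hard-coded port"}.
    parallel, sequential = [], []
    for test in tests:
        if test.get("run-sequentially"):
            sequential.append(test)
        else:
            parallel.append(test)
    return parallel, sequential

The flagged bucket would run after the pool drains, so a handful of sequential tests doesn't serialize the whole suite.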
Since you will be touching a lot of logic in the test execution bits, I'd appreciate a part 0 that factors out the "execute an individual test process" code into a standalone method and have the for loop or your pool worker call into that. I think that will result in cleaner code in the end. runxpcshelltests.py is hard enough to read as it is...
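Roughly something like this (names are illustrative, not the actual runxpcshelltests.py methods):

import subprocess

def run_single_test(cmd):
    # The factored-out "execute an individual test process" piece:
    # run one xpcshell invocation, return (exit code, buffered log).
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out

def run_serial(commands):
    # The existing for loop just calls the helper...
    return [run_single_test(cmd) for cmd in commands]

def run_parallel(commands, pool):
    # ...and a multiprocessing pool worker calls the very same helper.
    return pool.map(run_single_test, commands)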
::: testing/xpcshell/head.js
@@ +339,5 @@
> + " if (isHttps || isWebSocket || isWebSocketSSL)" +
> + " return 'PROXY localhost:" + port + "';" +
> + " return 'DIRECT';" +
> + "}");
> + }
Did this sneak in?
Attachment #654928 - Flags: feedback+
Reporter
Comment 5•12 years ago
I would love for the xpcshell harness and the JS harnesses to be merged someday. The JS harness already has this capability, FWIW.
(In reply to Gregory Szorc [:gps] from comment #4)
> That being said, I'm not sure if every xpcshell test runner in the build
> infra has Python 2.6+ and thus access to multiprocessing. You can always add
> an "import multiprocessing", push to try, and see what happens.
We only require 2.5 to build, still. If the build infra is up to the task we could require 2.6, but I would bet against that.
> There will also be real-world issues with tests in some test suites running
> in parallel. e.g. I'm pretty sure you will get random failures in
> services/sync/tests/unit because many of those tests establish a HTTP server
> on a hard-coded port. We definitely need support in the xpcshell.ini
> manifests to control whether parallel execution is allowed. The optimist in
> me says to default to parallel and only disable if we see issues. Over time,
> we can identify pain points preventing parallel execution (e.g. we need a
> way for Python to tell the xpcshell environment what network ports it can
> listen/bind on and tests need to use that rather than hard-coded port
> numbers).
I'd like to see (as a separate bug) a way for all the test servers we use to bind to an arbitrary free port and report that to tests so that we don't have to use hardcoded ports at all.
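The underlying trick is just binding to port 0 and asking the OS which port it picked; a hedged Python sketch of the shape of the idea (the real servers are httpd.js-based, so this is illustrative only):

import socket

def listen_on_free_port():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick any free port
    server.listen(5)
    _, port = server.getsockname()
    return server, port

server, port = listen_on_free_port()
# The harness would report this port to the test (e.g. via an environment
# variable) instead of the test assuming a hard-coded number.
print("listening on port %d" % port)
server.close()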
Comment 6•12 years ago
> We only require 2.5 to build, still. If the build infra is up to the task we
> could require 2.6, but I would bet against that.
We now require Python 2.6 to build, see bug 800614.
Comment 7•12 years ago
Building != running tests.
We still have test runners running older Python than what the build itself requires.
Work towards running Python 2.7 everywhere is ongoing and tracked in bug 724191.
Comment 8•12 years ago
According to bug 883983, httpd.js can choose its own port. This will go a long way towards abating the resource contention in xpcshell tests.
Assignee
Comment 9•12 years ago
Comment on attachment 654928 [details] [diff] [review]
WIP
Marking older WIP patch as obsolete, hope it's fine.
After bug 887054 lands, I'll upload a patch that turns on parallel runs on automation here.
Attachment #654928 - Attachment is obsolete: true
Assignee
Comment 10•12 years ago
So this patch will turn on parxpc (parallel xpcshell runs) in automation. Putting it here so we'll have it ready to try landing soon.
As discussed in bug 898658, I'll do some "stress test" runs locally trying to identify broken tests, and after that we can try landing this.
Attachment #790940 - Flags: review?(ted)
Assignee
Updated•12 years ago
Assignee: nobody → mihneadb
Status: NEW → ASSIGNED
Reporter
Updated•12 years ago
Attachment #790940 - Flags: review?(ted) → review+
Assignee
Updated•11 years ago
Keywords: checkin-needed
Comment 11•11 years ago
Keywords: checkin-needed
Comment 12•11 years ago
\o/
Please blog this. Break your arm doing it.
Assignee
Comment 13•11 years ago
(In reply to Gregory Szorc [:gps] from comment #12)
> \o/
>
> Please blog this. Break your arm doing it.
I'm waiting for it to reach central. But yes.
Assignee
Comment 14•11 years ago
Ted, it seems that killPid in the Windows version of automation.py hung[1] once.
It's clearly intermittent (see the retrigger[2] and the runs here[3]), but I don't know what could cause it. Any ideas?
[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27209828&tree=Mozilla-Inbound&full=1#error0
[2] https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=14619f24a8a8
[3] https://tbpl.mozilla.org/?tree=Try&rev=6c50d2ebcada
Flags: needinfo?(ted)
Reporter
Comment 15•11 years ago
I don't have any idea about that. I have a patch floating around to replace all that code with something less horrible; I should just finish that.
Flags: needinfo?(ted)
Comment 16•11 years ago
(In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #14)
> Ted, it seems that killPid in the windows version of automation.py hung[1]
> once.
> It's clearly intermittent (see the retrigger[2] and the runs here[3]), but I
> don't know what could cause it. Any ideas?
>
> [1] https://tbpl.mozilla.org/php/getParsedLog.php?id=27209828&tree=Mozilla-Inbound&full=1#error0
> [2] https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=14619f24a8a8
> [3] https://tbpl.mozilla.org/?tree=Try&rev=6c50d2ebcada
Filed as bug 911249.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> I don't have any idea about that. I have a patch floating around to replace
> all that code with something less horrible, I should just finish that.
Guessing this is bug 890026? :-)
Comment 17•11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Assignee
Comment 18•11 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #16)
>
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> > I don't have any idea about that. I have a patch floating around to replace
> > all that code with something less horrible, I should just finish that.
>
> Guessing this is bug 890026 ? :-)
That one. Also, trying bug 911262 as a quick(er) fix.
Assignee
Comment 19•11 years ago
(In reply to Gregory Szorc [:gps] from comment #12)
> \o/
>
> Please blog this. Break your arm doing it.
Blog post with details about the process of parallelizing the harness:
http://www.mihneadb.net/post/parallelizing-a-test-harness/