Bug 906525
Opened 11 years ago
Closed 11 years ago
Resolve timeouts in jit-test parallel suite
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
FIXED
mozilla31
People
(Reporter: dminor, Unassigned)
In the process of working on bug 858622, I ran into a couple of tests in the parallel jit-test suite that fail intermittently when run on my pandaboard:
/home/dminor/mozilla-central/js/src/jit-test/tests/parallel/ic-getelement.js
/home/dminor/mozilla-central/js/src/jit-test/tests/parallel/ic-getproperty.js
These need to be resolved or marked as intermittent/skippable before the tests can be scheduled on tbpl.
Updated•11 years ago
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Testing → Core
Reporter
Comment 1•11 years ago
The parallel suite has recently begun failing (possibly intermittently) on Mac OS X 10.6 and Windows XP:
13:27:33 INFO - TIMEOUT - parallel\timeout-gc.js
13:27:33 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js |
13:27:33 INFO - INFO exit-status : -1
13:27:33 INFO - INFO timed-out : True
13:27:33 INFO - INFO stdout >
13:27:33 INFO - INFO stderr 2>
13:27:33 INFO - TIMEOUT - parallel\timeout-gc.js
13:27:33 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js | --ion-eager --ion-parallel-compile=off
13:27:33 INFO - INFO exit-status : -1
13:27:33 INFO - INFO timed-out : True
13:27:33 INFO - INFO stdout >
13:27:33 INFO - INFO stderr 2>
13:27:33 INFO - TIMEOUT - parallel\timeout-gc.js
13:27:35 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js | --ion-eager --ion-parallel-compile=off --ion-check-range-analysis --no-sse3
13:27:35 INFO - INFO exit-status : -1
13:27:35 INFO - INFO timed-out : True
13:27:35 INFO - INFO stdout >
13:27:35 INFO - INFO stderr 2>
13:27:35 INFO - TIMEOUT - parallel\timeout-gc.js
13:27:35 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js | --baseline-eager
13:27:35 INFO - INFO exit-status : -1
13:27:35 INFO - INFO timed-out : True
13:27:35 INFO - INFO stdout >
13:27:35 INFO - INFO stderr 2>
13:27:35 INFO - TIMEOUT - parallel\timeout.js
13:27:35 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout.js |
13:27:35 INFO - INFO exit-status : -1
13:27:35 INFO - INFO timed-out : True
13:27:35 INFO - INFO stdout >
13:27:35 INFO - INFO stderr 2>
Blocks: 973900
Summary: Resolve intermittent failures in jit-test parallel suite on Panda → Resolve failures in jit-test parallel suite
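To make it easier to see which flag combinations are affected, the log above can be condensed with a short script. This is only a sketch: `summarize_failures` and `FAIL_RE` are hypothetical names, and the sample input is a subset of the lines quoted in this comment.

```python
import re
from collections import defaultdict

# Matches harness lines such as:
#   TEST-UNEXPECTED-FAIL | tests\jit-test\...\timeout-gc.js | --baseline-eager
FAIL_RE = re.compile(r"TEST-UNEXPECTED-FAIL \| (?P<path>\S+) \|(?P<flags>.*)$")

def summarize_failures(log_text):
    """Map each failing test's file name to the list of flag sets it failed under."""
    failures = defaultdict(list)
    for line in log_text.splitlines():
        match = FAIL_RE.search(line)
        if match is None:
            continue
        # Log paths use Windows separators; keep only the file name.
        name = match.group("path").replace("\\", "/").rsplit("/", 1)[-1]
        flags = match.group("flags").strip() or "(default flags)"
        failures[name].append(flags)
    return dict(failures)

# A few lines copied from the log in this comment.
SAMPLE = r"""
13:27:33 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js |
13:27:33 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout-gc.js | --ion-eager --ion-parallel-compile=off
13:27:35 WARNING - TEST-UNEXPECTED-FAIL | tests\jit-test\jit-test\tests\parallel\timeout.js |
"""

summary = summarize_failures(SAMPLE)
for name, flag_sets in summary.items():
    print(name, len(flag_sets), flag_sets)
```

Run against the full log, this would show timeout-gc.js failing under every flag set listed, which suggests the timeout does not depend on a particular JIT configuration.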
Reporter
Comment 2•11 years ago
Giving these more time (20 minutes) does not help.
Reporter
Comment 3•11 years ago
These pass on the Linux slaves, which are single core, so I tried running them sequentially (--worker-count=1), but that did not help.
Summary: Resolve failures in jit-test parallel suite → Resolve timeouts in jit-test parallel suite
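For anyone who wants to repeat the two experiments from comments 2 and 3, the invocation might be assembled roughly as follows. This is a sketch, not the exact command that was used: `build_jit_test_command` is a hypothetical helper, the paths are placeholders, and the `--timeout` spelling (in seconds) and the `parallel` test selector are assumptions. Only `--worker-count=1` is taken from this bug.

```python
def build_jit_test_command(harness, shell, test_dir, worker_count=None, timeout_seconds=None):
    """Assemble an argv list for a jit-test harness run (nothing is executed here)."""
    argv = ["python", harness]
    if worker_count is not None:
        # --worker-count=1 is the sequential setting quoted in comment 3.
        argv.append("--worker-count=%d" % worker_count)
    if timeout_seconds is not None:
        # ASSUMPTION: the per-test timeout flag is spelled --timeout and takes seconds.
        argv.append("--timeout=%d" % timeout_seconds)
    # ASSUMPTION: the shell path comes first, followed by a test selector.
    argv += [shell, test_dir]
    return argv

# Hypothetical paths; substitute the ones from your own checkout or test package.
cmd = build_jit_test_command(
    "js/src/jit-test/jit_test.py", "path/to/js", "parallel",
    worker_count=1, timeout_seconds=20 * 60)
print(" ".join(cmd))
```

Building the list separately from running it keeps the sketch safe to paste, and the result can be handed to `subprocess.run` once the paths and flags have been checked against the harness's own `--help` output.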
Reporter
Comment 4•11 years ago
I don't see anything obvious left to try with these test cases.
Any more suggestions, or are we down to bisecting this? I'm almost certain these were OK back in January when I was going through the first batch of test-machine-specific failures.
Flags: needinfo?(terrence)
Comment 7•11 years ago
I'm not sure what's going on here. There seem to be two distinct problems collected in this bug:
1. `ic-setelement` and `ic-getelement`
2. `timeout` and `timeout-gc`
For the second one, I also started seeing problems locally (but apparently not on tbpl?). This confuses me. Those tests exercise an infinite loop that is supposed to be interrupted by the shell equivalent of the slow script dialog. It seems like there is a problem! I know that Shu was looking at that at one point, so I will flag him with needinfo as well.
Regarding the IC tests, I have no idea what could be going on there.
I'm leaving my needinfo since I don't think this comment really provides a lot of info yet. ;)
Flags: needinfo?(shu)
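As a rough illustration of the mechanism described in comment 7, here is a generic watchdog pattern: a timer sets a flag, and the running loop polls that flag and bails out. This is not SpiderMonkey's implementation; `run_with_watchdog` and `InterruptRequested` are hypothetical names used only for this sketch.

```python
import threading
import time

class InterruptRequested(Exception):
    """Raised inside the worker when the watchdog asks it to stop."""

def run_with_watchdog(budget_seconds):
    """Spin in an unbounded loop until a watchdog timer requests an interrupt."""
    interrupt = threading.Event()
    watchdog = threading.Timer(budget_seconds, interrupt.set)
    watchdog.daemon = True
    watchdog.start()
    iterations = 0
    try:
        while True:  # stands in for the test's infinite loop
            iterations += 1
            # Poll the flag periodically, like an interrupt check on a loop back-edge.
            if iterations % 1000 == 0 and interrupt.is_set():
                raise InterruptRequested()
    except InterruptRequested:
        return "interrupted", iterations
    finally:
        watchdog.cancel()

status, count = run_with_watchdog(0.05)
print(status)
```

In a scheme like this, if the flag were never set or never polled, the loop would spin until the harness killed it. That is one way a hang like the one reported could look, though nothing in this bug establishes that as the cause.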
Reporter
Comment 8•11 years ago
To provide a bit more context, we're running these from the test package on Cedar: https://tbpl.mozilla.org/?tree=Cedar.
They are still running as part of make check, which means that they pass on the WinXP builder but fail on the WinXP test machine, and pass on the OS X 10.8 builder but fail on the OS X 10.6 test machine.
The only difference between the WinXP test machine and the build machine that I'm certain about is that the test machines are sensitive to large memory allocations. Multiple small allocations will work fine, but single large allocations that work on the build machine will fail on the test machine. I don't think this is relevant here, but just in case.
I have a WinXP test machine loaner from releng if you have a patch you would like me to test. I can also ask for a 10.6 loaner if that would be useful, but I was going to see if solving it on WinXP fixes it on 10.6 before asking for one.
Comment 9•11 years ago
Dan -- can you point me at an instance where they actually fail?
Flags: needinfo?(nmatsakis) → needinfo?(dminor)
Reporter
Comment 10•11 years ago
Niko, here is an instance from a recent run on cedar:
https://tbpl.mozilla.org/php/getParsedLog.php?id=36344679&tree=Cedar&full=1
Flags: needinfo?(dminor)
Comment 11•11 years ago
How rare are the timeouts? It is possible that that test hits a nasty corner case in the new scheduler that causes a deadlock or something, but without reliable STR, I can't find any bugs from just auditing the scheduler code. :(
Comment 12•11 years ago
Shu -- I think I am seeing timeouts in these two tests (timeout, timeout-gc) on my local machine quite regularly, actually. I have never seen any problems with the IC tests. The link to TBPL that dminor provided is also for timeout and timeout-gc, so presumably those tests are the actual issue.
Reporter
Comment 13•11 years ago
(In reply to Shu-yu Guo [:shu] from comment #11)
> How rare are the timeouts? It is possible that that test hits a nasty corner
> case in the new scheduler that causes a deadlock or something, but without
> reliable STR, I can't find any bugs from just auditing the scheduler code. :(
The timeouts always occur when running on the test machines.
Comment 14•11 years ago
(In reply to Dan Minor [:dminor] from comment #13)
> (In reply to Shu-yu Guo [:shu] from comment #11)
> > How rare are the timeouts? It is possible that that test hits a nasty corner
> > case in the new scheduler that causes a deadlock or something, but without
> > reliable STR, I can't find any bugs from just auditing the scheduler code. :(
>
> The timeouts always occur when running on the test machines.
Is it possible for me to ssh in to these test machines to debug?
Reporter
Comment 15•11 years ago
(In reply to Shu-yu Guo [:shu] from comment #14)
> (In reply to Dan Minor [:dminor] from comment #13)
> > (In reply to Shu-yu Guo [:shu] from comment #11)
> > > How rare are the timeouts? It is possible that that test hits a nasty corner
> > > case in the new scheduler that causes a deadlock or something, but without
> > > reliable STR, I can't find any bugs from just auditing the scheduler code. :(
> >
> > The timeouts always occur when running on the test machines.
>
> Is it possible for me to ssh in to these test machines to debug?
Shu, you can request a test machine by filing a bug under Release Engineering - Loan Requests (you can clone Bug 977711). Thanks for looking at this!
Reporter
Comment 16•11 years ago
These tests have been disabled until they are fixed, so they no longer block removing jit-tests from make check.
Comment 17•11 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/31b79b2c4a7a
Re-enabled. r=nmatsakis over IRC
Comment 18•11 years ago
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31