Bug 648379 - Add emscripten automatic tests to js engine
Status: RESOLVED WONTFIX
Opened: 14 years ago • Closed: 1 year ago
Component: Core :: JavaScript: WebAssembly (defect)
Reporter: azakai • Assignee: Unassigned
Emscripten has 500+ automatic tests that have uncovered various bugs in SpiderMonkey (in particular, recently in TI). It has been suggested to add some or all of those tests to the JS engine test suite, hence this bug.
The tests cover both compiling LLVM to JS and running the generated JS. Some of the generated scripts are very large (multiple megabytes), and running the entire suite takes 1-2 hours, which is the downside of adding them all (though the large scripts have tended to find the most JS engine bugs).
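For a sense of what these tests exercise, here is a minimal hand-written sketch in the style of the code emscripten generated at the time (illustrative only; names like HEAP and _add are made up, not actual emscripten output) - a flat heap plus many small numeric functions, which is why the scripts grow large and stress the engine differently from typical hand-written JS:

  // illustrative sketch only - not actual emscripten output
  var HEAP = new Array(1024);        // flat "memory", indexed by integer addresses
  function _add(a, b) {
    var r = ((a | 0) + (b | 0)) | 0; // integer arithmetic via |0 coercions
    HEAP[0] = r;                     // store the result into the heap
    return r;
  }
  print(_add(3, 4));                 // prints 7 in the SpiderMonkey js shell

A real compiled test has the same shape, just with far more such functions, hence the multi-megabyte scripts mentioned above.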
Comment 1 • azakai (Reporter) • 14 years ago
To move forward on this, the relevant people (I am not sure who that is) need to decide whether we want this, and if so, how many of the tests.
Comment 2 • 14 years ago
The criterion we want to use is "the tests which increase our test coverage". Does this help or hinder?
Comment 3 • azakai (Reporter) • 14 years ago
We have found several bugs in TI using this, including recent regressions of bugs that had already been fixed (the reduced test cases that were added to the js tests still passed, but the original emscripten tests they were reduced from did not). So I believe these tests do increase coverage.
However, it might be that some of this could be achieved with fuzzing, which AFAIK is not yet done on JaegerMonkey, so maybe once fuzzing is in place these tests will not add that much coverage. I don't know the answer to that, but I did ask Jesse about one specific case - a bug involving more than 50 local variables - and he said his fuzzer would not have found that one.
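As a rough illustration of that case (a hypothetical sketch, not the actual reduced test case): emscripten-compiled functions routinely declare dozens of local variables, a pattern random fuzzers rarely generate.

  // hypothetical sketch - the real function declared 50+ locals; abbreviated here
  function manyLocals(x) {
    var v0 = x + 0, v1 = x + 1, v2 = x + 2, v3 = x + 3,
        /* ...v4 through v49 elided... */
        v50 = x + 50;
    return v0 + v1 + v2 + v3 + v50;
  }
  print(manyLocals(1)); // prints 61 in the SpiderMonkey js shell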
Comment 4 • 14 years ago
Just to clarify, I wasn't trying to shoot down including the tests, but trying to provide a criterion so that you can find some subset which is useful.
But in fact, :decoder tells me that coverage is probably not a good metric for including tests, because we're doing pretty well on both line- and branch-coverage, and are finding bugs with fuzz-tests which
However, I believe we can't add 2 hours of tests, no matter what the benefits. I could be wrong - it may be worth checking with RelEng to see what they say.
Assuming for a moment that I'm right, then we need to find some subset of the tests that are useful to add to the testsuite. I don't know a way to identify that subset.
Comment 5 • 14 years ago
(In reply to comment #4)
> But in fact, :decoder tells me that coverage is probably not a good metric for
> including tests, because we're doing pretty well on both line- and
> branch-coverage, and are finding bugs with fuzz-tests which
... are not improving coverage stats.
Comment 6 • 14 years ago
How about running it once per day on TM, instead of once per push?
Comment 7 • azakai (Reporter) • 14 years ago
I generally run the tests about once a week, so I guess there is already some coverage in that sense, if we are happy with just that (and not with preventing badness from landing in the first place).
The risk is that I might miss a few weeks now and then. For example, I just found another bug on TraceMonkey with this, bug 648769, which looks like a regression from almost two weeks ago.
Comment 8 • 14 years ago
(In reply to comment #6)
> How about running it once per day on TM, instead of once per push?
If we do this, we can also use it for Kraken and other long-running scripts. Bug 624792 is about adding Kraken to jit-tests, but I had to tweak the tests to run faster. It would be good to check correctness of the full version too.
Comment 9 • 14 years ago
(In reply to Alon Zakai (:azakai) from comment #7)
> The risk is that I might miss a few weeks now and then. For example, I just
> found another bug on tracemonkey with this, bug 648769, which looks like a
> regression from almost two weeks ago.
Bug 679878 is another regression, from about a month ago, on the TI branch.
What about Jesse's suggestion of running these tests once per day? Ideally on several branches (TI, MC, Ion). It doesn't have to be fancy, just an email to some people if there's a new failure.
Updated • 11 years ago
Assignee: general → nobody
Updated • 3 years ago
Severity: normal → S3
Comment 11 • 1 year ago
I think we're unlikely to do this anytime soon, especially given the issues around the size of the test suite. Emscripten has also reported regressions to us, so I think the current arrangement works well.
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WONTFIX