Closed Bug 512592 Opened 15 years ago Closed 15 years ago

New valgrind testsuite / unit-test capability for JS engine

Categories

(Release Engineering :: General, defect, P4)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: graydon, Assigned: catlee)

References

Details

Attachments

(2 files, 4 obsolete files)

We (the JS team) are adding the ability to run some of the JS engine unit tests under valgrind. Once we have this working, we'd like to extend the JS engine unit testing that occurs on each checkin to include valgrind testing.

I talked to joduinn a bit about this today, and he asked me to include a few notes here to clarify the requirements:

  - No new builds are required; a regular debug build of firefox as we are producing now contains the javascript shell in its dist directory, and we'll be using that.

  - Valgrind only runs on linux and OSX, and the linux one is likely the faster and less buggy version at the moment, so for now we can confine our attention to the linux unit-test hosts. We might ask for OSX later.

  - The main new requirement is getting a copy of the valgrind tool installed on the unit tester slaves. Valgrind is a normal, unprivileged user-mode tool like gcc or make, and it should already exist as a package for every major linux distribution. If a suitable package can't be found, I or another JS team member would be happy to build it from source as required in your linux VM image. It's pretty easy to build.

  - If you like, this can be considered as a "new suite", though it's really just a new capability we're adding to the existing unit-test suite, that the harness will detect and use when it's running on a host that has valgrind.

  - This can initially be confined to the tracemonkey branch, though it's likely it'll spread to all branches eventually.

  - This is only going to involve testing the js shell for now, not firefox itself, and the tests should be reasonably quick. John asked me if a 5 minute window is likely to cover it, and I think that's likely. Future followup bugs from *other* teams might involve enabling valgrind-testing firefox, but valgrind slows execution down to a crawl and uses significant amounts of memory, so I'd suggest leaving that for another day. More complicated resource allocation considerations will apply in that case. For this bug, the normal unit-test slaves should be ok.


tm branch only

VM
OS: Windows Vista → Linux
Er, the two lines at the bottom there were just notes I jotted down when talking to John and intended to delete. Oops.
From IRC discussion these js tests run under "make check". The current unit test builds are optimized+refcount rather than debug, so this depends on bug 372581.
Depends on: 372581
Moving to future until bug 372581 is fixed
Component: Release Engineering → Release Engineering: Future
nope, no need for that bug to be fixed.
Component: Release Engineering: Future → Release Engineering
Much Memcheck-detectable badness in js can be shaken out by the
following simple test:

  valgrind-3.5.0 --smc-check=all --leak-check=full   \
     /path/to/js -j -e 'gSkipSlowTests=true' ./trace-test.js

where trace-test.js is the old monolithic trace-tests.js
(need mandelbrot-results.js and math-trace-tests.js in same dir)

Test runs for 63 seconds on a 2.4GHz Core 2 E6600, for a release
build js.

Debug build (-g -O) takes 2m44 user.  Plain -g is liable to be
about 5 mins.
(In reply to comment #0)
> window is likely to cover it, and I think that's likely. Future followup bugs
> from *other* teams might involve enabling valgrind-testing firefox, but
> valgrind slows execution down to a crawl and uses significant amounts of
> memory, so I'd suggest leaving that for another day.

Another day, another bug, but note ..

I've been working on running the entire mochitest on Valgrind.
I believe it to be feasible, requiring very roughly 8 hours CPU
and 2GB memory.  So far I have got about 1/3 way through it
(limited by available cpu time).

Any interest in that -- gimme a shout.
(In reply to comment #0)
> We (the JS team) are adding the ability to run some of the JS engine unit tests
> under valgrind. Once we have this working, we'd like to extend the JS engine
> unit testing that occurs on each checkin to include valgrind testing.
> 
> I talked to joduinn a bit about this today, and he asked me to include a few
> notes here to clarify the requirements:
> 
>   - No new builds are required; a regular debug build of firefox as we are
> producing now contains the javascript shell in its dist directory, and we'll be
> using that.

Exactly which commands do you need to be run under valgrind?  One of the
existing tests?  The mochitest harness understands how to run tests under
valgrind via the --debugger=valgrind option iirc (which launches the tests with
--leak-check=full).  Running the equivalent of 'make check' on debug builds
would require that bug 490223 is fixed so the tests required are contained the
in packaged tests archive.  I'm not sure that we want to be running this test
step as part of the actual build process, so if it could operate on the
packaged debug build, that would be great.

>   - Valgrind only runs on linux and OSX, and the linux one is likely the faster
> and less buggy version at the moment, so for now we can confine our attention
> to the linux unit-test hosts. We might ask for OSX later.
> 
>   - The main new requirement is getting a copy of the valgrind tool installed
> on the unit tester slaves. Valgrind is a normal, unprivileged user-mode tool
> like gcc or make, and it should already exist as a package for every major
> linux distribution. If a suitable package can't be found, I or another JS team
> member would be happy to build it from source as required in your linux VM
> image. It's pretty easy to build.

valgrind 3.2.1 is currently installed on the linux slaves.  Is this version
sufficient, or do you need a newer/different version?

>   - If you like, this can be considered as a "new suite", though it's really
> just a new capability we're adding to the existing unit-test suite, that the
> harness will detect and use when it's running on a host that has valgrind.
> 
>   - This can initially be confined to the tracemonkey branch, though it's
> likely it'll spread to all branches eventually.
> 
>   - This is only going to involve testing the js shell for now, not firefox
> itself, and the tests should be reasonably quick. John asked me if a 5 minute
> window is likely to cover it, and I think that's likely. Future followup bugs
> from *other* teams might involve enabling valgrind-testing firefox, but
> valgrind slows execution down to a crawl and uses significant amounts of
> memory, so I'd suggest leaving that for another day. More complicated resource
> allocation considerations will apply in that case. For this bug, the normal
> unit-test slaves should be ok.

A few other questions:

- How are we going to be managing the results?  What constitutes a "green",
  "orange", or "red" run?

- Do we have a suppresions file that we should be using?  If so, who should be
  updating this file as required?  If not, do we need one?
(In reply to comment #7)

> Exactly which commands do you need to be run under valgrind?  One of the
> existing tests?

I've just checked in bug 512591 which added a new 'make check-valgrind' target inside the js/src/Makefile.in, that parallels the existing 'make check' target, but runs a subset of the suite under valgrind (if it can find it in $PATH). 

If you can arrange to have that run on any linux-x86 or linux-x64 host (preferably both) I think this bug can close. Whether this is on the build hosts or -- presumably after some bugs relating to packaging and moving tests get resolved -- on a separate test host doesn't concern me. 

At the moment 'make check-valgrind' should add at most a minute to the 'make check' time.

(As for mochitests ... eh, I wouldn't even aim for valgrinding them. They're very detailed, excruciatingly slow and mostly hit the same code paths over and over, with subtle variations. I'd go with tp4 pageset if anything; much better bang for buck. You want breadth on a valgrinded suite; valgrind does the hard work of turning "random grab-bag of breadth" into "precise failure-analysis". That's what makes it so wonderful.)

> Running the equivalent of 'make check' on debug builds
> would require that bug 490223 is fixed so the tests required are contained the
> in packaged tests archive.  I'm not sure that we want to be running this test
> step as part of the actual build process, so if it could operate on the
> packaged debug build, that would be great.

If and when the tests associated with 'make check' and/or 'make check-valgrind' are packaged and move with it to a build host, sure. We discussed this today on IRC, it seemed like you have some of that work in progress on bug 516983. I'll watch that and try to help out as best I can.

> valgrind 3.2.1 is currently installed on the linux slaves.  Is this version
> sufficient, or do you need a newer/different version?

That's pretty old. It'll probably work, but if you could move them to 3.5 that would be better (and will probably run a fair bit faster too).

> A few other questions:
> 
> - How are we going to be managing the results?  What constitutes a "green",
>   "orange", or "red" run?

The reporting is already handled by the trace-test.py harness script. Prints the same sort of TEST-UNEXPECTED-FAIL output on a failure.

> - Do we have a suppresions file that we should be using?  If so, who should be
>   updating this file as required?  If not, do we need one?

Nope.

Thanks for helping see this through.
'make check-valgrind' in objdir/js/src took 38 seconds on one of our build slaves.
(In reply to comment #9)
> 'make check-valgrind' in objdir/js/src took 38 seconds on one of our build
> slaves.

Was that with 3.2.1?  I second the recommendation to upgrade to 3.5.0, I don't expect it will be much faster but it should be more stable.  3.2.1 is three years old.
(In reply to comment #10)
> (In reply to comment #9)
> > 'make check-valgrind' in objdir/js/src took 38 seconds on one of our build
> > slaves.
> 
> Was that with 3.2.1?  I second the recommendation to upgrade to 3.5.0, I don't
> expect it will be much faster but it should be more stable.  3.2.1 is three
> years old.

Yes, that was with 3.2.1.

Depends on how quickly you want this bug fixed.  Adding an extra command to our leak test build is a pretty minor thing (1-2 days), upgrading valgrind is a bigger deal (better part of a week, and won't happen this quarter).
So I got a test run of this going in staging.  With one failure in 'make check-valgrind', Tinderbox reports 'check-valgrind T-FAIL'.  Is that the expected output?

The log is here http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1253588693.1253589490.4448.gz for the next little while.
(In reply to comment #12)
> So I got a test run of this going in staging.  With one failure in 'make
> check-valgrind', Tinderbox reports 'check-valgrind T-FAIL'.  Is that the
> expected output?

That is the expected output, yes. In the sense that I see the same (correct) valgrind-caught error report on my machine. Thanks!

I wonder, though: the error-formatter doesn't seem to have picked up and linked-to the TEST-UNEXPECTED-FAIL line, as it often does. Do you know why?

Either way, it's pretty great. I'll take a look at the actual failing case more closely tomorrow.
(In reply to comment #13)
> I wonder, though: the error-formatter doesn't seem to have picked up and
> linked-to the TEST-UNEXPECTED-FAIL line, as it often does. Do you know why?

Hmmmm...I suspect this is because we don't send the output from debug builds through the "unittest" tinderbox log parser.  I'll look into if turning that on would have any adverse effects.
I believe the unittest error parser is a superset of the default 'unix' one:
http://mxr.mozilla.org/mozilla/source/webtools/tinderbox/ep_unittest.pl
http://mxr.mozilla.org/mozilla/source/webtools/tinderbox/ep_unix.pl

When I wrote it I'm fairly certain I started with ep_unix as the base and just added on, so switching shouldn't make us catch any less errors in the summary.
Assignee: nobody → catlee
Priority: -- → P4
Blocks: 516983
Comment on attachment 403482 [details] [diff] [review]
buildbot-configs to turn on 'make check' and 'make check-valgrind'

I'm not sure why do_check_test is controlled by a flag when it seems like we want it globally. Am I missing something?
Attachment #403482 - Attachment is obsolete: true
Attachment #403747 - Flags: review?(bhearsum)
Attachment #403482 - Flags: review?(bhearsum)
Attached patch buildbotcustom w/o do_check_test (obsolete) — Splinter Review
Attachment #403480 - Attachment is obsolete: true
Attachment #403748 - Flags: review?(bhearsum)
Attachment #403480 - Flags: review?(bhearsum)
(In reply to comment #18)
> (From update of attachment 403482 [details] [diff] [review])
> I'm not sure why do_check_test is controlled by a flag when it seems like we
> want it globally. Am I missing something?

Probably not.  I've attached new patches where 'make check' is always run.
Attachment #403747 - Flags: review?(bhearsum) → review+
Attachment #403748 - Flags: review?(bhearsum) → review+
back to having a config option to turn this on per branch, since this won't work unless we have --enable-tests on, which we won't have on mozilla-1.9.1.

I renamed the parameter to enable_checktests, but other than that should be the same as before.
Attachment #403748 - Attachment is obsolete: true
Attachment #404687 - Flags: review?(bhearsum)
- Turn on 'enable_checktests' on m-c and project branches
- Turn on 'enable_valgrind_checktests' on tracemonkey
- Turn on test packaging on linux64 opt builds, otherwise 'make check' will fail on opt builds.
Attachment #403747 - Attachment is obsolete: true
Attachment #404696 - Flags: review?(bhearsum)
Comment on attachment 404687 [details] [diff] [review]
buildbotcustom w/ enable_checktests

>@@ -135,16 +138,25 @@ def generateBranchObjects(config, name):
>     branchObjects['status'].append(TinderboxMailNotifier(
>         fromaddr="mozilla2.buildbot@build.mozilla.org",
>         tree=config['tinderbox_tree'],
>         extraRecipients=["tinderbox-daemon@tinderbox.mozilla.org"],
>         relayhost="mail.build.mozilla.org",
>         builders=builders + nightlyBuilders,
>         logCompression="bzip2"
>     ))
>+    branchObjects['status'].append(TinderboxMailNotifier(
>+        fromaddr="mozilla2.buildbot@build.mozilla.org",
>+        tree=config['tinderbox_tree'],
>+        extraRecipients=["tinderbox-daemon@tinderbox.mozilla.org"],
>+        relayhost="mail.build.mozilla.org",
>+        builders=debugBuilders,
>+        logCompression="bzip2",
>+        errorparser="unittest"
>+    ))

May as well just add these to the TinderboxMailNotifier a few lines down. builders=unittest + debugBuilders should do it.

r=me with that change.

Looks like this paves the way to getting rid of UnittestBuildFactory, too!
Attachment #404687 - Flags: review?(bhearsum) → review-
Attachment #404696 - Flags: review?(bhearsum) → review+
Comment on attachment 404687 [details] [diff] [review]
buildbotcustom w/ enable_checktests

changeset:   446:5827037de0f9

checked in with ben's suggested changes
Attachment #404687 - Flags: checked-in+
Comment on attachment 404696 [details] [diff] [review]
buildbot-configs w/ enable_checktests

changeset:   1634:93f05637fbe2
Attachment #404696 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Depends on: 534456
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: