Closed Bug 474666 Opened 15 years ago Closed 15 years ago

Do code coverage builds periodically

Categories

(Release Engineering :: General, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: catlee)

Attachments

(5 files, 4 obsolete files)

Do automated code coverage build+unittest run once per week.

This requires a different mozconfig, detailed here:
https://wiki.mozilla.org/QA:CodeCoverage

The resulting files then need to be uploaded somewhere Murali can get them.

Do we run all the same unittest suites that we do for our regular builds?

Do we need to run lcov afterwards, or can we just upload the *.gcda and *.gcno files?
From offline discussions with Tim & Murali, we'll start by doing this once a week, probably over the weekend, while we figure out how long it takes to run and how disruptive such a long job is on the available Linux slaves.
Priority: -- → P3
Summary: Do code coverage builds → Do code coverage builds periodically
From Murali's email:

The following steps need to be executed in order to prep the CentOS VM for instrumented Firefox build.


==============

Please download 'lcov' from http://downloads.sourceforge.net/ltp/lcov-1.7-1.noarch.rpm

Please install 'lcov' using the following command:

rpm -ivh lcov-1.7-1.noarch.rpm

'lcov' should have been installed in the /usr/bin directory.

You need to make the following modifications [ as super user ]

vi /usr/bin/geninfo

Press the following key sequence:

Esc -> : -> %s/die/print/g -> Enter

vi /usr/bin/genhtml

Press the following key sequence:

Esc -> : -> %s/die/print/g -> Enter
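
The two vi edits above apply the same global substitution to both scripts. A minimal Python sketch of that substitution follows, assuming it is run as super user and that replacing every "die" (including inside longer words, as the vi command does) is really what is wanted. The helper name replace_die_with_print is hypothetical.

```python
from pathlib import Path


def replace_die_with_print(script_path):
    """Replace every 'die' with 'print' in a file, like vi's :%s/die/print/g.

    Returns the number of occurrences that were replaced.
    """
    path = Path(script_path)
    original = path.read_text()
    # str.replace substitutes every occurrence, matching the /g flag in vi.
    patched = original.replace("die", "print")
    path.write_text(patched)
    return original.count("die")
```

It could be called once for /usr/bin/geninfo and once for /usr/bin/genhtml. Keeping a backup copy of each script first would make the change easy to revert.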

=============

Please go to /home/cltbld/BLD_PREREQUISITES on murali-experiment.build.mozilla.org

Copy the jscoverage* files from this directory to /usr/local/bin or /usr/bin as super user and make sure they have 755 permissions.

chmod 755 /usr/bin/jscoverage
chmod 755 /usr/bin/jscoverage-server

==============

Please use the following .mozconfig file for the build of instrumented Firefox.

export CFLAGS="-fprofile-arcs -ftest-coverage"
export CXXFLAGS="-fprofile-arcs -ftest-coverage"
export LDFLAGS="-lgcov -static-libgcc"
export EXTRA_DSO_LDFLAGS="-lgcov -static-libgcc"
ac_add_options --enable-application=browser
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/../mozcentral-dbg


ac_add_options --enable-debug
ac_add_options --enable-tests	
ac_add_options --enable-mochitest
ac_add_options --disable-optimize
ac_add_options --enable-chrome-format=flat


mk_add_options MOZ_MAKE_FLAGS="-j3"

mk_add_options AUTOCONF=autoconf-2.13

================
Also need to:
cd $OBJ_DIR/dist
mv bin bin-original
jscoverage \
         --mozilla \
         --no-instrument=defaults \
         --no-instrument=greprefs \
         bin-original bin
Here are the actual commands

cd OBJ_DIR/dist
mv bin bin-original
jscoverage \
--mozilla \
--no-instrument=defaults --no-instrument=greprefs \
--no-instrument=chrome/browser/content/browser/places/treeView.js \
bin-original bin

Then you can start all the usual tests.
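
The jscoverage invocation above grows by one --no-instrument flag per excluded path, so it can help to build the argument list from a list of exclusions. This is only a sketch: the function name jscoverage_command is hypothetical, and the flags are the ones quoted in this comment.

```python
def jscoverage_command(excludes, source_dir="bin-original", dest_dir="bin"):
    """Build the jscoverage argument list used to instrument dist/bin.

    excludes: paths (relative to source_dir) that should not be instrumented.
    """
    cmd = ["jscoverage", "--mozilla"]
    # One --no-instrument flag per excluded path.
    cmd += ["--no-instrument=%s" % path for path in excludes]
    cmd += [source_dir, dest_dir]
    return cmd


EXCLUDES = [
    "defaults",
    "greprefs",
    "chrome/browser/content/browser/places/treeView.js",
]
```

The resulting list could be passed to subprocess.check_call with cwd set to OBJ_DIR/dist, after bin has been renamed to bin-original.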

Once testing is complete, the following commands generate the code coverage results (index.html page):

cd OBJ_DIR

lcov -c -i -d . -o app.info

genhtml app.info
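
The two report commands above can be sketched as argument lists. Note that the lcov line in this email includes -i, while the patch reviewed later in this bug runs lcov without it. lcov documents -i as --initial, which captures baseline data, so which form is wanted after a test run should be verified. The sketch below makes it a parameter; the function name coverage_report_commands is hypothetical.

```python
def coverage_report_commands(info_file="app.info", initial=False):
    """Return the lcov and genhtml argument lists, to be run from OBJ_DIR."""
    lcov = ["lcov", "-c"]
    if initial:
        # Matches the "-i" form quoted in the email above.
        lcov.append("-i")
    lcov += ["-d", ".", "-o", info_file]
    genhtml = ["genhtml", info_file]
    return [lcov, genhtml]
```

Each list could then be run in order with subprocess.check_call(cmd, cwd=objdir), where objdir is the build's object directory.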


Thanks
Murali
Most of the tests failed, and I know one reason why.
In the .mozconfig we defined the build destination as @TOPSRCDIR@/../mozcentral-dbg, but the actual build destination is apparently different, so runtests.py was not found.

Can you please check the issue?

Thanks
Murali
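
One way to catch this kind of mismatch early is to read the object directory out of the mozconfig instead of assuming it. The following is a minimal sketch, assuming the mozconfig contains a single mk_add_options MOZ_OBJDIR= line and POSIX-style paths. The function name resolve_objdir and the /builds/src path in the usage note are hypothetical.

```python
import posixpath
import re


def resolve_objdir(mozconfig_text, topsrcdir):
    """Return the MOZ_OBJDIR path named in a mozconfig, or None if absent.

    @TOPSRCDIR@ is replaced with the given source directory and the
    result is normalized (so "src/../obj" becomes "obj").
    """
    match = re.search(r"^mk_add_options\s+MOZ_OBJDIR=(\S+)",
                      mozconfig_text, re.MULTILINE)
    if match is None:
        return None
    raw = match.group(1).replace("@TOPSRCDIR@", topsrcdir)
    return posixpath.normpath(raw)
```

With the mozconfig from the earlier comment and a source checkout in /builds/src, this resolves to /builds/mozcentral-dbg, which can be compared against the directory the test steps actually use.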
The latest build resulted in this error when trying to run jscoverage:

jscoverage: chrome/browser/content/browser/places/treeView.js: unknown operator (172) in file

Any ideas?
Please check comment #4. I have included that file in the exclude list of the jscoverage command.

Thanks
Murali
The lcov rpm should be added to the mofo repo before this bug can be considered fixed.
Attached patch: Create CodeCoverageFactory (obsolete)
Attachment #367760 - Flags: review?(bhearsum)
Attachment #367761 - Flags: review?(bhearsum)
Attachment #367760 - Flags: review?(bhearsum) → review-
Comment on attachment 367760
Create CodeCoverageFactory

>diff --git a/process/factory.py b/process/factory.py
>--- a/process/factory.py
>+++ b/process/factory.py
>@@ -1744,6 +1744,7 @@
>         )
> 
> class UnittestBuildFactory(MozillaBuildFactory):
>+    timeout = 5*60 # 5 minute timeout by default

I don't think we should lower the timeouts as an incidental part of this bug. Some of the mochitest style ones can take over 5 minutes to start on Windows IIRC. How well did you test this particular change?


>+class CodeCoverageFactory(UnittestBuildFactory):
>+    timeout = 24*3600 # 24 hour timeout

I think I remember you mentioning hitting a shutdown hang and this timeout causing huge delays. Is that correct? What happened there?

>+    def __init__(self, platform, brand_name, config_repo_path, config_dir,
>+                 stageUsername=None, stageServer=None, stageSshKey=None,
>+                 **kwargs):

style nit: don't need to write out platform, brand_name, config_repo_path, or config_dir here. Let's not do that here either for consistency with the rest of the file.

>+        self.stageServer = stageServer
>+        self.stageUsername = stageUsername
>+        self.stageSshKey = stageSshKey
>+
>+        UnittestBuildFactory.__init__(self, platform, brand_name,
>+                config_repo_path, config_dir , **kwargs)
>+
>+    def addCopyMozconfigStep(self):
>+        config_dir_map = {
>+                'linux': 'linux/%s/codecoverage' % self.branchName,
>+                'macosx': 'macosx/%s/codecoverage' % self.branchName,
>+                'win32': 'win32/%s/codecoverage' % self.branchName,
>+                }
>+        mozconfig = 'mozconfigs/%s/%s/mozconfig' % \
>+            (self.config_dir, config_dir_map[self.platform])
>+
>+        self.addStep(ShellCommand, name="copy mozconfig",
>+         command=['cp', mozconfig, 'build/.mozconfig'],
>+         description=['copy mozconfig'],
>+         workdir='.'
>+        )
>+
>+    def addPreBuildCleanupSteps(self):
>+        MozillaBuildFactory.addPreBuildCleanupSteps(self)
>+        # Always clobber code coverage builds
>+        self.addStep(ShellCommand,
>+         command=['rm', '-rf', 'build'],
>+         workdir=".",
>+         timeout=30*60,
>+        )
>+
>+    def addPreTestSteps(self):
>+        self.addStep(ShellCommand,
>+         command=['mv','bin','bin-original'],
>+         workdir="build/objdir/dist",
>+        )

I assume this is being done because you need the directory from the previous run and the current one. If that's the case, this doesn't work in a pool of slaves model. We solve this for codesighs/leaktest by uploading the necessary files to stage, in tinderbox-builds/mozilla-central-linux, for example.

>+        self.addStep(ShellCommand,
>+         command=['jscoverage', '--mozilla',
>+                  '--no-instrument=defaults',
>+                  '--no-instrument=greprefs',
>+                  '--no-instrument=chrome/browser/content/browser/places/treeView.js',
>+                  'bin-original', 'bin'],
>+         workdir="build/objdir/dist",
>+        )
>+
>+    def addPostTestSteps(self):
>+        self.addStep(ShellCommand,
>+         command=['lcov', '-c', '-d', '.', '-o', 'app.info'],
>+         workdir="build/objdir",
>+         timeout=self.timeout,
>+        )
>+        self.addStep(ShellCommand,
>+         command=['rm', '-rf', 'codecoverage_html'],
>+         workdir="build",
>+         timeout=self.timeout,
>+        )
>+        self.addStep(ShellCommand,
>+         command=['mkdir', 'codecoverage_html'],
>+         workdir="build",
>+         timeout=self.timeout,
>+        )
>+        self.addStep(ShellCommand,
>+         command=['genhtml', '../objdir/app.info'],
>+         workdir="build/codecoverage_html",
>+         timeout=self.timeout,
>+        )
>+        self.addStep(ShellCommand,
>+         command=['cp', 'objdir/dist/bin/application.ini', 'codecoverage_html'],

Please pass in objdir and use that, don't hardcode.

>+         workdir="build",
>+        )
>+        self.addStep(ShellCommand,
>+         command=['tar', 'jcvf', 'codecoverage.tar.bz2', 'codecoverage_html'],
>+         workdir="build",
>+         timeout=self.timeout,
>+        )
>+
>+        uploadEnv = self.env.copy()
>+        uploadEnv.update({'UPLOAD_HOST': self.stageServer,
>+                          'UPLOAD_USER': self.stageUsername,
>+                          'UPLOAD_PATH': '/home/ftp/pub/firefox/nightly/experimental/codecoverage'})
>+        if self.stageSshKey:
>+            uploadEnv['UPLOAD_SSH_KEY'] = '~/.ssh/%s' % self.stageSshKey
>+        if 'POST_UPLOAD_CMD' in uploadEnv:
>+            del uploadEnv['POST_UPLOAD_CMD']
>+        self.addStep(ShellCommand,
>+         env=uploadEnv,
>+         command=['python', 'build/upload.py', 'codecoverage.tar.bz2'],
>+         workdir="build",
>+         timeout=self.timeout,
>+        )

I think that only the actual running of the unittests needs a high timeout. I don't think it's great to use 24 hours for the other steps, especially the uploading ones. Please remove it where it's not needed.
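
The point about timeouts can be illustrated with a small sketch in which only the steps named as long-running get the 24 hour value. This is plain Python rather than Buildbot code, and the step names and the function step_timeouts are hypothetical. The two constants reuse the values from the patch, although the 5 minute default is itself questioned earlier in this review.

```python
DEFAULT_TIMEOUT = 5 * 60      # seconds; default value from the patch
TEST_TIMEOUT = 24 * 3600      # seconds; only for the unittest run


def step_timeouts(step_names, long_running):
    """Map each step name to a timeout, long only where explicitly requested."""
    return {
        name: (TEST_TIMEOUT if name in long_running else DEFAULT_TIMEOUT)
        for name in step_names
    }
```

For example, step_timeouts(["lcov", "run unittests", "upload"], {"run unittests"}) leaves the lcov and upload steps on the short default.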

Overall this is OK. Let me know about the shutdown hang, and please fix the other minor issues.
Attachment #367761 - Flags: review?(bhearsum) → review+
Comment on attachment 367761
Do code coverage build on staging

>+        if branch['enable_codecoverage']:
>+            # We only do code coverage builds on linux right now
>+            if platform == 'linux':

At some point we probably need to come up with a better system than this for adding miscellaneous jobs to Buildbot, but this is totally fine for now.

>+                codecoverage_factory = CodeCoverageFactory(
>+                    platform=platform,
>+                    brand_name=branch['brand_name'],
>+                    config_repo_path=CONFIG_REPO_PATH,
>+                    config_dir=CONFIG_SUBDIR,
>+                    hgHost=HGHOST,
>+                    repoPath=branch['repo_path'],
>+                    buildToolsRepoPath=BUILD_TOOLS_REPO_PATH,
>+                    buildSpace=5,

paranoid check: do code coverage builds take up more disk space?

>+                codecoverage_scheduler=Nightly(
>+                    name=codecoverage_builder['name'],
>+                    branch=branch['repo_path'],
>+                    dayOfWeek=6, # Saturday
>+                    hour=[3], minute=[02],
>+                    builderNames=[codecoverage_builder['name']]
>+                )
>+                c['schedulers'].append(codecoverage_scheduler)
>+

style nit: please keep the Scheduler up with the other Schedulers.

r=bhearsum with the Scheduler moved up.
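
To sanity-check when a weekly schedule like the one in this patch would fire, the next run time can be computed with the standard library. This is a sketch, not Buildbot's implementation, and the function name next_weekly_run is hypothetical. Python's datetime.weekday() numbers Monday as 0 and Sunday as 6. If Buildbot's Nightly scheduler follows the same convention, dayOfWeek=6 would fire on Sunday rather than Saturday, which is worth checking against the Buildbot documentation.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, day_of_week, hour, minute):
    """Return the first datetime strictly after `now` that falls on
    `day_of_week` (Monday=0 ... Sunday=6) at hour:minute."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    days_ahead = (day_of_week - now.weekday()) % 7
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        # Today's slot has already passed, so wait a full week.
        candidate += timedelta(days=7)
    return candidate
```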
Code coverage builds take more space for two reasons: the build itself is a debug build, and for each file compiled we get two additional files, with .gcno and .gcda extensions. Those files can get big, depending on how much coverage the test runs provide for the file.

A safe assumption is that we will end up using an additional 200MB of disk space for each build+run+lcov+genhtml cycle.
Attached patch: Create CodeCoverageFactory (obsolete)
Attachment #367760 - Attachment is obsolete: true
Attachment #368336 - Flags: review?(bhearsum)
The last code coverage run I did took 3.3 GB, so 5 should be plenty.
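
A free-space check along these lines can be sketched with the standard library. This assumes buildSpace is expressed in gigabytes, which is an assumption here rather than something stated in this bug, and the function name has_enough_space is hypothetical.

```python
import shutil

BYTES_PER_GB = 1024 ** 3


def has_enough_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least
    `required_gb` gigabytes free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB
```

With the numbers above, has_enough_space(build_dir, 5) would leave headroom over the 3.3 GB observed for one run, where build_dir stands for the slave's build directory.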

Since UnittestBuildFactory now takes an objdir parameter, we need to update all the mozconfigs in both master and staging.
Attachment #367761 - Attachment is obsolete: true
Attachment #368338 - Flags: review?(bhearsum)
Comment on attachment 368336
Create CodeCoverageFactory

Looks good to me...
Attachment #368336 - Flags: review?(bhearsum) → review+
Comment on attachment 368338
Do code coverage build on staging

This one, too.
Attachment #368338 - Flags: review?(bhearsum) → review+
Since we are planning weekly cycles, can you kindly add the JS test suite to the test cycle as well?

Commands are given here.

In order to run jstests, please go to src/js/tests and execute the following commands

   ./jsDriver.pl -e smdebug -L lc2 -L lc3 -L spidermonkey-n-1.9.1.tests -s <OBJ_DIR>/dist/bin/js

./jsDriver.pl -e smdebug -L lc2 -L lc3 -L spidermonkey-n-1.9.1.tests -s <OBJ_DIR>/dist/bin/js -o '-j'

These are two separate runs, and together they would provide good coverage of the new JS engine.

The JS engine is one of the top areas for QA/DEV study from code coverage.

Thanks
Murali
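
The two jsDriver.pl runs requested above differ only in the trailing -o '-j' option, so they can be generated from one base command. This is a minimal sketch: the function name jsdriver_commands is hypothetical, and the flags are copied from the comment above.

```python
def jsdriver_commands(objdir):
    """Return the argument lists for the two jsDriver.pl runs.

    Both are meant to be executed from the js/tests directory named in the
    comment above; the second run adds -o -j.
    """
    js_shell = "%s/dist/bin/js" % objdir
    base = [
        "./jsDriver.pl", "-e", "smdebug",
        "-L", "lc2", "-L", "lc3", "-L", "spidermonkey-n-1.9.1.tests",
        "-s", js_shell,
    ]
    return [base, base + ["-o", "-j"]]
```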
Priority: P3 → P2
Attachment #368338 - Attachment is obsolete: true
Attachment #369735 - Flags: review?(bhearsum)
Attachment #368336 - Attachment is obsolete: true
Attachment #369736 - Flags: review?(bhearsum)
Attachment #369735 - Flags: review?(bhearsum) → review+
Attachment #369736 - Flags: review?(bhearsum) → review+
Comment on attachment 369735
Do code coverage build on staging

changeset:   1046:1ce4874f5efd
Attachment #369735 - Flags: checked-in+
Comment on attachment 369736
Create CodeCoverageFactory

changeset:   232:21614788d635
Attachment #369736 - Flags: checked-in+
Attachment #370068 - Flags: review?(bhearsum)
Comment on attachment 370068
Do code coverage builds on production

Looks fine to me
Attachment #370068 - Flags: review?(bhearsum) → review+
Attachment #370183 - Flags: review?(bhearsum) → review+
Comment on attachment 370183
Add branch name to tarball; clobber after finishing tests

changeset:   236:0ed07fe1e2b4
Attachment #370183 - Flags: checked-in+
Comment on attachment 370068
Do code coverage builds on production

changeset:   1047:2a1d850fec52
Attachment #370068 - Flags: checked-in+
This is now live on production.  Code coverage reports are being placed in
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/codecoverage/

Murali, the two extra steps you mentioned in comment #18 will be added later.  I wonder if it would be better to have a script or makefile target in mozilla's repository to run these steps?
We're also going to wait for bug 469718 to be completed to do the additional JS tests.
Attachment #371760 - Flags: review?(bhearsum) → review+
Attachment #371760 - Flags: checked-in+
Comment on attachment 371760
Have code coverage builds report to tinderbox trees

changeset:   1084:f44b608dc099
This is all done.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering