Closed Bug 1372324 Opened 3 years ago Closed 3 years ago

Enable Talos on linux64-ccov.

Categories

(Testing :: Code Coverage, enhancement)

enhancement
Not set

Tracking

(firefox56 fixed)

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: sparky, Assigned: sparky)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Currently, linux64-ccov does not run talos tests but we would like to collect code coverage from those suites.

There has been some work done to get this working in bug 1368115, but there is much more to do.

The needed steps can be broken down into:
  1) Run talos on VM instead of HM when running code coverage.
  2) Run only 1 cycle instead of the 20/25 currently done.

Here's some work that has already been done: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bca2d12abe12d9cd54ba53afd9fdb56cda18a561
I've made some progress with getting this working. Here's the most recent changes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b90a2b0ec4a404cdc360ebd2ae223d0e437949fe

Currently, the majority of the work needed is in getting linux_config.py to be correct for the VM.
So I managed to get talos working on a VM and running on linux64-ccov. I'm also gathering coverage: https://treeherder.mozilla.org/#/jobs?repo=try&revision=67d52ceeabcc72db8c1295bf002074d3cd023716

There are just some timeout issues that should be fixed by changing the number of iterations and also slightly increasing the test_timeout time.
Here's the latest try run with all suites passing and gathering coverage: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4868495a06541ecff16bfe1c57fc688146cbd066
Comment on attachment 8880457 [details]
Bug 1372324 - Enable talos tests on linux64-ccov.

https://reviewboard.mozilla.org/r/151800/#review156828

pretty close!

::: browser/config/mozconfigs/linux64/code-coverage:11
(Diff revision 1)
>  ac_add_options --disable-jemalloc
>  ac_add_options --disable-crashreporter
>  ac_add_options --disable-elf-hack
>  ac_add_options --enable-debug
>  ac_add_options --disable-sandbox
> +ac_add_options --disable-profiling

this needs more testing, and more eyes- might be ok.

::: taskcluster/taskgraph/transforms/tests.py:529
(Diff revision 1)
> +                test['mozharness']['extra-options'].append('--add-option')
> +                test['mozharness']['extra-options'].append('--test_timeout,1200')
> +                test['mozharness']['extra-options'].append('--add-option')
> +                test['mozharness']['extra-options'].append('--cycles,1')
> +                test['mozharness']['extra-options'].append('--add-option')
> +                test['mozharness']['extra-options'].append('--tppagecycles,1')

while this is hacky, I think it is ok.  I would like to add an option to talos: --no-upload-results which we can set here as extra-options.  When --no-upload-results is set, we would not print out PERFHERDER_DATA in talos.

::: testing/mozharness/configs/talos/linux64_ccov_config.py:1
(Diff revision 1)
> +import os

please rename to: linux64_config_taskcluster.py

::: testing/mozharness/mozharness/mozilla/testing/talos.py:34
(Diff revision 1)
>  from mozharness.mozilla.buildbot import TBPL_SUCCESS, TBPL_WORST_LEVEL_TUPLE
>  from mozharness.mozilla.buildbot import TBPL_RETRY, TBPL_FAILURE, TBPL_WARNING
>  from mozharness.mozilla.tooltool import TooltoolMixin
> +from mozharness.mozilla.testing.codecoverage import (
> +    CodeCoverageMixin,
> +    code_coverage_config_options

please ensure that we do not run code coverage pre/post tasks on non ccov builds.
Attachment #8880457 - Flags: review?(jmaher) → review-
Comment on attachment 8880457 [details]
Bug 1372324 - Enable talos tests on linux64-ccov.

https://reviewboard.mozilla.org/r/151800/#review156828

> this needs more testing, and more eyes- might be ok.

Spoke with :marco and he said that without profiling there will be some code that isn't compiled but it's likely that it's nothing too significant for it to be an issue. Here's a test run with all ccov test suites running to test this disable and see if there are any unexpected problems: https://treeherder.mozilla.org/#/jobs?repo=try&revision=621b30e5ec17b863eafb77b13fb696e808c7d058

> while this is hacky, I think it is ok.  I would like to add an option to talos: --no-upload-results which we can set here as extra-options.  When --no-upload-results is set, we would not print out PERFHERDER_DATA in talos.

The only other way (that I know of) to do this configuration change is to go through the talos test definitions in taskcluster/ci/tests/tests.yml and add a by-test-platform option to every talos test. I felt that since all the talos test suites would get the same option the transform was a nicer place to make the change. That said, we can do it the other way. What do you prefer?

> please ensure that we do not run code coverage pre/post tasks on non ccov builds.

So the pre and post tasks never run on non-ccov builds but their pre-/post-action listeners always run whenever a script named 'run_tests' is called. However, it only starts code coverage if '--code-coverage' was given as a CL option. And that flag is only given when linux64-ccov is the platform in use as it is the only build which appends all test suite the needed command.

For example, consider this test run with WD: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&selectedJob=109283462
It doesn't have any code coverage collected on it because '--code-coverage' is not enabled. Here's the log file for this test run: https://public-artifacts.taskcluster.net/IGrShSjoQByN9zgpLPBDLg/0/public/logs/live_backing.log

If you search for 'set_gcov_prefix' (which should be renamed in the near future) you'll see that the listener is triggered but code coverage is never started because it needs 'code-coverage' to start. And this means that we don't have to worry about the pre/post tasks being triggered on non-ccov builds unless other builds start using '--code-coverage' in their test suite extra options. Although, even if they do try to use it, their builds would break immediately because they lack a lot of the specific changes (like disabling jemalloc).
Here's the latest test run on the patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4fd34dea4473f1631d2a8689ea7021435bd9bb72

And here's a test run on linux64 to see if any unexpected problems popped up: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e74cbebee765501b9e0d664ddfbbf759daf57b4c

No additional problems, and the flake8 errors are coming from a whitespace addition in the try push message patch.
Comment on attachment 8880457 [details]
Bug 1372324 - Enable talos tests on linux64-ccov.

https://reviewboard.mozilla.org/r/151800/#review158714

looking good
Attachment #8880457 - Flags: review?(jmaher) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/53bbc6fabb65
Enable talos tests on linux64-ccov. r=jmaher
https://hg.mozilla.org/mozilla-central/rev/53bbc6fabb65
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Depends on: 1377915
Depends on: 1377982
Assignee: nobody → gmierz2
You need to log in before you can comment on or make changes to this bug.