Closed Bug 1431753 Opened 2 years ago Closed 2 years ago

Support generating per-test coverage information

Categories

(Testing :: Code Coverage, enhancement)

enhancement
Not set

Tracking

(firefox61 fixed)

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: marco, Assigned: marco)

References

(Blocks 3 open bugs)

Details

Attachments

(2 files, 4 obsolete files)

This will be a special mode which will probably be run weekly.
Depends on: 1431388, 1380659, 1380661
Blocks: 1438500
Attached patch Proposal (obsolete) — Splinter Review
This is not really implementing much, just moving some things around and adding placeholder comments where some operations should be implemented.
I'm uploading it to because I'd like to get some feedback before moving forward, especially on the integration between verify and coverage.
Attachment #8958796 - Flags: feedback?(gbrown)
Attachment #8958796 - Flags: feedback?(ahalberstadt)
Comment on attachment 8958796 [details] [diff] [review]
Proposal

Review of attachment 8958796 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, I don't know much about coverage and I don't understand the intent here. I'd be glad to comment further if provided with more context.

::: testing/mochitest/runtests.py
@@ +3071,5 @@
>          options.runByManifest = True
>  
>      if options.verify:
> +        if options.code_coverage:
> +            result = runner.runTestSeparateBrowser(options)

If you are just going to run the specified test once in --verify --coverage mode, what does it have to do with test verification? Couldn't you just run the harness against the specified test with no special options and get the same result?
Attachment #8958796 - Flags: feedback?(gbrown)
(In reply to Geoff Brown [:gbrown] from comment #2)
> Comment on attachment 8958796 [details] [diff] [review]
> Proposal
> 
> Review of attachment 8958796 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry, I don't know much about coverage and I don't understand the intent
> here. I'd be glad to comment further if provided with more context.

There's an email thread about started by Joel a few days ago. You should be CCed too.

> 
> ::: testing/mochitest/runtests.py
> @@ +3071,5 @@
> >          options.runByManifest = True
> >  
> >      if options.verify:
> > +        if options.code_coverage:
> > +            result = runner.runTestSeparateBrowser(options)
> 
> If you are just going to run the specified test once in --verify --coverage
> mode, what does it have to do with test verification? Couldn't you just run
> the harness against the specified test with no special options and get the
> same result?

Yeah, my plan would be to extract the parts of verify we need from verify_tools.py (the parts that find tests from changed files and that allow you to run one test at a time) and put them somewhere which is accessible both by VerifyToolsMixin and CodeCoverageMixin. Maybe I can create a base class for them which has this code and let them extend it with their specific needs.
Comment on attachment 8958796 [details] [diff] [review]
Proposal

Review of attachment 8958796 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me. I confess as long as it works, I don't have strong opinions on how the mozharness setup is implemented. The mochitest changes also look good, but since you're piggy-backing off the test-verify code, :gbrown should have the final say there.

::: testing/mochitest/mochitest_options.py
@@ +598,5 @@
>           {"type": int,
>            "default": 3600,
>            "help": "Maximum time, in seconds, to run in --verify mode.",
>            }],
> +        [["--code-coverage"],

Is this something you'd expect developers to want to specify manually? If not please add "suppress": True.

::: testing/mochitest/runtests.py
@@ +3071,5 @@
>          options.runByManifest = True
>  
>      if options.verify:
> +        if options.code_coverage:
> +            result = runner.runTestSeparateBrowser(options)

I'll defer to :gbrown for this section.

::: testing/mozharness/mozharness/mozilla/testing/testbase.py
@@ +270,5 @@
>              return urllib2.build_opener(authhandler).open(url, **kwargs)
>  
>          # If we have the developer_run flag enabled then we will switch
>          # URLs to the right place and enable http authentication
> +        # if "developer_config.py" in self.config["config_files"]:

Why was this commented out? I think :armenzg implemented it awhile ago to facilitate running mozharness locally. It may be broken now I guess.
Attachment #8958796 - Flags: feedback?(ahalberstadt) → feedback+
(In reply to Andrew Halberstadt [:ahal] from comment #4)
> Comment on attachment 8958796 [details] [diff] [review]
> Proposal
> 
> Review of attachment 8958796 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good to me. I confess as long as it works, I don't have strong
> opinions on how the mozharness setup is implemented. The mochitest changes
> also look good, but since you're piggy-backing off the test-verify code,
> :gbrown should have the final say there.
> 
> ::: testing/mochitest/mochitest_options.py
> @@ +598,5 @@
> >           {"type": int,
> >            "default": 3600,
> >            "help": "Maximum time, in seconds, to run in --verify mode.",
> >            }],
> > +        [["--code-coverage"],
> 
> Is this something you'd expect developers to want to specify manually? If
> not please add "suppress": True.
> 
> ::: testing/mochitest/runtests.py
> @@ +3071,5 @@
> >          options.runByManifest = True
> >  
> >      if options.verify:
> > +        if options.code_coverage:
> > +            result = runner.runTestSeparateBrowser(options)
> 
> I'll defer to :gbrown for this section.

I've refactored it a little bit in my current WIP patch, using "--verify" for verify and "--per-test-coverage" for per-test coverage.

> 
> ::: testing/mozharness/mozharness/mozilla/testing/testbase.py
> @@ +270,5 @@
> >              return urllib2.build_opener(authhandler).open(url, **kwargs)
> >  
> >          # If we have the developer_run flag enabled then we will switch
> >          # URLs to the right place and enable http authentication
> > +        # if "developer_config.py" in self.config["config_files"]:
> 
> Why was this commented out? I think :armenzg implemented it awhile ago to
> facilitate running mozharness locally. It may be broken now I guess.

This was just a temporary hack to run mozharness locally (which I've found really hard to do), it wasn't supposed to be part of the patch.
This is just a patch to move verify_tools.py to per_test_base.py (which implements a PerTestMixin that VerifyToolsMixin and CodeCoverageMixin derive from).
It's useful to do this so that the next patch will only show the actual changes made.
Attachment #8958796 - Attachment is obsolete: true
Attached patch Proposal (obsolete) — Splinter Review
This is the refined proposal. The per-test-coverage mode is still relying on the verify taskcluster definition.
Attachment #8960278 - Flags: feedback?(gbrown)
Blocks: 1447179
Comment on attachment 8960278 [details] [diff] [review]
Proposal

Review of attachment 8960278 [details] [diff] [review]:
-----------------------------------------------------------------

> > Sorry, I don't know much about coverage and I don't understand the intent
> > here. I'd be glad to comment further if provided with more context.
> 
> There's an email thread about started by Joel a few days ago. You should be
> CCed too.

I saw (and re-read!) that email, but it really didn't help me understand the intent of these changes. I still don't understand what you are trying to build, or why, and I don't see much conceptual commonality with test verification. That makes it hard for me to provide useful feedback.

::: taskcluster/ci/test/misc.yml
@@ +93,5 @@
>          extra-options:
> +            by-test-platform:
> +                linux64-ccov/.*: [--per-test-coverage]
> +                windows10-64-ccov/debug: [--per-test-coverage]
> +                default: [--verify]

This seems wrong: On ccov builds you'll run a task showing as TV and labelled as test-verify but running code that has nothing to do with test verification. Wouldn't it be better to have a mutually exclusive task with its own name and configuration?

::: testing/mochitest/runtests.py
@@ +3074,5 @@
>  
>      if options.verify:
>          result = runner.verifyTests(options)
> +    elif options.per_test_coverage:
> +        result = runner.runTestSeparateBrowser(options)

I still think that if you just need to run the test once, you don't need any special options or handling in the harness: just run runtests.py <normal-options> <test>

If you do end up doing something special in the harness, you'll also possibly need to update the xpcshell, reftest, and wpt harnesses.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py
@@ +162,5 @@
> +        if not self.per_test_coverage:
> +            return
> +
> +        self.find_modified_tests()
> +        # TODO: Add tests that haven't been run for a while (a week? N pushes?)

This is very confusing for me. 

find_modified_tests() will find the tests modified on this push. 

Do you want to run those tests + tests that haven't been run for a while? Or just tests that haven't been run for a while?

Tests that haven't been run for a while...in this code coverage mode, or...??

How will you find tests that haven't been run for a while? activedata??!

::: testing/mozharness/mozharness/mozilla/testing/per_test_base.py
@@ +229,5 @@
> +            cur = []
> +            cur.extend(self.coverage_args)
> +            cur.extend(self.verify_args)
> +            cur.append(file)
> +            args.append(cur)

This seems awkward. Maybe it would be better to have separate implementations of query_args for coverage and verification?

Will all of this produce the same arguments that query_verify_args produced? Even for multiple files?
Attachment #8960278 - Flags: feedback?(gbrown)
(In reply to Geoff Brown [:gbrown] from comment #8)
> Comment on attachment 8960278 [details] [diff] [review]
> Proposal
> 
> Review of attachment 8960278 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> > > Sorry, I don't know much about coverage and I don't understand the intent
> > > here. I'd be glad to comment further if provided with more context.
> > 
> > There's an email thread about started by Joel a few days ago. You should be
> > CCed too.
> 
> I saw (and re-read!) that email, but it really didn't help me understand the
> intent of these changes. I still don't understand what you are trying to
> build, or why, and I don't see much conceptual commonality with test
> verification. That makes it hard for me to provide useful feedback.

The goal is to collect code coverage data per test. We want to 1) collect coverage data for tests that have been modified in a push and 2) for tests that haven't been run for a while.
The first part is in common with test verify. The second part can, at least in my proposal, share most of the code with test verify with just the addition of "query some database to get list of tests not run in the past week".

> 
> ::: taskcluster/ci/test/misc.yml
> @@ +93,5 @@
> >          extra-options:
> > +            by-test-platform:
> > +                linux64-ccov/.*: [--per-test-coverage]
> > +                windows10-64-ccov/debug: [--per-test-coverage]
> > +                default: [--verify]
> 
> This seems wrong: On ccov builds you'll run a task showing as TV and
> labelled as test-verify but running code that has nothing to do with test
> verification. Wouldn't it be better to have a mutually exclusive task with
> its own name and configuration?

Yes, we could create a separate task that only runs for coverage builds. It would add more maintenance burden, but we can do it if you think it's beneficial.


> 
> ::: testing/mochitest/runtests.py
> @@ +3074,5 @@
> >  
> >      if options.verify:
> >          result = runner.verifyTests(options)
> > +    elif options.per_test_coverage:
> > +        result = runner.runTestSeparateBrowser(options)
> 
> I still think that if you just need to run the test once, you don't need any
> special options or handling in the harness: just run runtests.py
> <normal-options> <test>
> 
> If you do end up doing something special in the harness, you'll also
> possibly need to update the xpcshell, reftest, and wpt harnesses.

Sounds good. Note the patch is just a proposal to discuss, I'm not sure yet if I will need something specific as I haven't looked at all the harnesses yet but just mochitest. If I find that I don't need to do anything special, I can just execute runtests.py with the single test.

> 
> ::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py
> @@ +162,5 @@
> > +        if not self.per_test_coverage:
> > +            return
> > +
> > +        self.find_modified_tests()
> > +        # TODO: Add tests that haven't been run for a while (a week? N pushes?)
> 
> This is very confusing for me. 
> 
> find_modified_tests() will find the tests modified on this push. 
> 
> Do you want to run those tests + tests that haven't been run for a while? Or
> just tests that haven't been run for a while?

The intent is to run modified tests and tests that haven't been run for a while.

> 
> Tests that haven't been run for a while...in this code coverage mode, or...??

Sure, in the coverage mode.

> 
> How will you find tests that haven't been run for a while? activedata??!

Either activedata or some other service, we haven't decided yet.


> 
> ::: testing/mozharness/mozharness/mozilla/testing/per_test_base.py
> @@ +229,5 @@
> > +            cur = []
> > +            cur.extend(self.coverage_args)
> > +            cur.extend(self.verify_args)
> > +            cur.append(file)
> > +            args.append(cur)
> 
> This seems awkward. Maybe it would be better to have separate
> implementations of query_args for coverage and verification?
> 
> Will all of this produce the same arguments that query_verify_args produced?
> Even for multiple files?

Most of the code is the same between the coverage and verification, with the only exception being the addition of one argument in coverage (--per-test-coverage) and one in verification (--verify).
We can copy the implementation from verify to coverage, but I thought it would make more sense to share it, since it's basically the same.

I'm going to be in PTO for the next week, so if you need more clarification about the intent here maybe you can chat a bit with Joel?
Depends on: 1453056
I've updated the proposal to add a "test-coverage" suite (very similar to "test-verify") and to remove the unneeded changes to the mochitest runtests.py.

:gbrown, what do you think of this? I'm back from PTO, so if there's still something unclear about what we're trying to do feel free to ask here or ping me directly on IRC.
Attachment #8960278 - Attachment is obsolete: true
Attachment #8966779 - Flags: feedback?(gbrown)
Comment on attachment 8966779 [details] [diff] [review]
Proposal (whitespace changes ignored)

Review of attachment 8966779 [details] [diff] [review]:
-----------------------------------------------------------------

some drive by comments...

::: taskcluster/ci/test/web-platform.yml
@@ +166,5 @@
> +
> +test-coverage-wpt:
> +    description: "Per web-platform test coverage"
> +    suite: test-coverage-wpt
> +    treeherder-symbol: PTCw

I might be more interested in TCw but wouldn't argue with PTCw as stated.

::: taskcluster/taskgraph/transforms/tests.py
@@ +660,5 @@
>          test['attributes']['unittest_flavor'] = flavor
>  
>          script = test['mozharness']['script']
>          category_arg = None
> +        if suite == 'test-verify' or suite == 'test-coverage':

I believe we will need to use suite.startswith('test-[verify|coverage]) as we have test-coverage-e10s-wpt as a job name.  Also to support gpu jobs (reftest, mochitest-webgl, mochitest-gpu, mochitest-media, etc.), we could end up with test-coverage-e10s-gpu.

::: testing/mozharness/mozharness/mozilla/testing/per_test_base.py
@@ +206,1 @@
>              references = re.compile(r"(-ref|-notref|-noref|-noref.)\.")

this looks indented incorrectly- before there was a else and in this patch it is missing.

@@ +223,5 @@
>                                  file = nonref
> +            elif (self.config.get('verify_category') != "web-platform" and
> +                  suite == 'jsreftest'):
> +                file = os.path.relpath(file, jsreftest_extra_dir)
> +                file = os.path.join(self.jsreftest_test_dir, file)

why is jsreftest moved after the for loop?  Are there issues with it in coverage mode?

@@ +243,5 @@
> +            else:
> +                cur = ['--verify-max-time=%d' % MAX_TIME_PER_TEST]
> +
> +            cur.extend(self.coverage_args)
> +            cur.extend(self.verify_args)

would we want to extend both for coverage and verify?

::: testing/mozharness/scripts/desktop_unittest.py
@@ +850,5 @@
>                  cmd_timeout = self.get_timeout_for_category(suite_category)
>  
> +                if self.per_test_coverage:
> +                    gcov_dir, jsvm_dir = self.set_coverage_env(env)
> +                    # TODO: Run basic startup/shutdown test to collect baseline coverage.

Greg is working on this TODO- his initial work is to add a "no-op" test case that runs, not sure if that would work in this model- we should get some feedback from Greg on that.

::: testing/mozharness/scripts/web_platform_tests.py
@@ +321,5 @@
>          max_verify_time = timedelta(minutes=60)
>          max_verify_tests = 10
>          verified_tests = 0
>  
> +        if self.per_test_coverage or self.verify_enabled:

I noticed in desktop_unittest.py there is a lot of code for supporting grcov and setting up the environment/tools.  should we do that here as well?
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #11)
> ::: testing/mozharness/mozharness/mozilla/testing/per_test_base.py
> @@ +206,1 @@
> >              references = re.compile(r"(-ref|-notref|-noref|-noref.)\.")
> 
> this looks indented incorrectly- before there was a else and in this patch
> it is missing.

I've generated the patch ignoring the whitespace changes to make it more clear what is actually changing (without ignoring the whitespace changes, it looked like everything was changing here, but actually only a few things were changing).

> 
> @@ +223,5 @@
> >                                  file = nonref
> > +            elif (self.config.get('verify_category') != "web-platform" and
> > +                  suite == 'jsreftest'):
> > +                file = os.path.relpath(file, jsreftest_extra_dir)
> > +                file = os.path.join(self.jsreftest_test_dir, file)
> 
> why is jsreftest moved after the for loop?  Are there issues with it in
> coverage mode?

It wasn't actually moved, it only looks like it because of the same reason as before. I'll upload another patch without ignoring whitespace changes.

> 
> @@ +243,5 @@
> > +            else:
> > +                cur = ['--verify-max-time=%d' % MAX_TIME_PER_TEST]
> > +
> > +            cur.extend(self.coverage_args)
> > +            cur.extend(self.verify_args)
> 
> would we want to extend both for coverage and verify?

Initially self.coverage_args was actually used to pass things to the harness, now it's returning an empty array so it is basically a no-op. I can remove it.

> 
> ::: testing/mozharness/scripts/desktop_unittest.py
> @@ +850,5 @@
> >                  cmd_timeout = self.get_timeout_for_category(suite_category)
> >  
> > +                if self.per_test_coverage:
> > +                    gcov_dir, jsvm_dir = self.set_coverage_env(env)
> > +                    # TODO: Run basic startup/shutdown test to collect baseline coverage.
> 
> Greg is working on this TODO- his initial work is to add a "no-op" test case
> that runs, not sure if that would work in this model- we should get some
> feedback from Greg on that.

Yes, his patch applies on top of mine :)

> 
> ::: testing/mozharness/scripts/web_platform_tests.py
> @@ +321,5 @@
> >          max_verify_time = timedelta(minutes=60)
> >          max_verify_tests = 10
> >          verified_tests = 0
> >  
> > +        if self.per_test_coverage or self.verify_enabled:
> 
> I noticed in desktop_unittest.py there is a lot of code for supporting grcov
> and setting up the environment/tools.  should we do that here as well?

Yes, we'll need the same there. I haven't done it yet because I want to make sure what I did is OK before doing it for web_platform_tests.py too.
Attachment #8966779 - Attachment description: Patch → Proposal (whitespace changes ignored)
Comment on attachment 8966779 [details] [diff] [review]
Proposal (whitespace changes ignored)

Review of attachment 8966779 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for splitting out the separate taskcluster config, etc. -- I feel better about that.

Lots of nits and minor suggestions here, nothing too major. Probably the biggest issue is that I think restructuring query_args would be beneficial.

I would be happy to see try runs of verification and coverage, with just one file changed, multiple files changed, different test suites, etc -- perhaps at review time.

::: taskcluster/ci/test/misc.yml
@@ +107,5 @@
> +    max-run-time: 10800
> +    allow-software-gl-layers: false
> +    run-on-projects:
> +        by-test-platform:
> +            # do not run on beta or release: usually just confirms earlier results

Is the run-on-trunk-only restriction appropriate for coverage? (I don't know.)

::: taskcluster/ci/test/test-sets.yml
@@ +38,5 @@
>      - reftest-no-accel
>      - telemetry-tests-client
>      - test-verify
>      - test-verify-wpt
> +    - test-coverage

nit: These lists are usually sorted alphabetically.

::: taskcluster/taskgraph/transforms/tests.py
@@ +660,5 @@
>          test['attributes']['unittest_flavor'] = flavor
>  
>          script = test['mozharness']['script']
>          category_arg = None
> +        if suite == 'test-verify' or suite == 'test-coverage':

I'm not sure, but I think this is right the way it is coded: e10s-ness is tracked separately.

::: testing/mozharness/mozharness/mozilla/testing/per_test_base.py
@@ +17,2 @@
>  
> +class PerTestMixin(object):

Is it just me, or is "PerTest" awkward? Consider an alternative. "SingleTestMixin", "TestIteratorMixin"??

@@ +19,1 @@
>      """Utility functions for test verification."""

Comment needs updating: "Utility functions for per-test testing like test verification and per-test coverage", maybe?

@@ +19,4 @@
>      """Utility functions for test verification."""
>  
>      def __init__(self):
>          self.verify_suites = {}

nit: Variables should be renamed without "verify". self.suites might be okay here.

@@ +208,5 @@
>              jsreftest_extra_dir = os.path.join('js', 'src', 'tests')
>              # For some suites, the test path needs to be updated before passing to
>              # the test harness.
>              for file in self.verify_suites.get(suite):
>                  if (self.config.get('verify_category') != "web-platform" and

"verify_category" is a mozharness config parameter that probably applies to both verification and PTC -- it should probably be renamed to something more generic ("per_test_category"??).

@@ +235,5 @@
> +        args = []
> +        for file in files:
> +            # Limit each test harness run to 15 minutes, to avoid task timeouts
> +            # when executing long-running tests.
> +            MAX_TIME_PER_TEST = 900

MAX_TIME_PER_TEST only applies to verification -- it "shouldn't" be in the base class.

@@ +243,5 @@
> +            else:
> +                cur = ['--verify-max-time=%d' % MAX_TIME_PER_TEST]
> +
> +            cur.extend(self.coverage_args)
> +            cur.extend(self.verify_args)

You are swimming against the current here: Trying to implement two separate things in one place. Wouldn't it be better to have separate PTC and verify implementations of query_args(), possibly sharing some common code in the base class?

@@ +258,5 @@
>             In verify mode, determine which suites are active, for the given
>             suite category.
>          """
>          suites = None
> +        if self.config.get('verify') == True or self.config.get('per_test_coverage') == True:

self.verify_enabled or self.per_test_coverage?

::: testing/mozharness/mozharness/mozilla/testing/verify_tools.py
@@ +9,5 @@
> +import os
> +import posixpath
> +import re
> +import sys
> +import mozinfo

Probably these imports can be cleaned up.

::: testing/mozharness/scripts/desktop_unittest.py
@@ +856,5 @@
> +                    # shutil.move(grcov_output_file, 'baseline_grcov')
> +                    # shutil.move(jsvm_output_file, 'baseline_jsvm')
> +                    # TODO: Parse coverage report
> +
> +                for verify_args in self.query_args(suite):

nit: rename all the "verify" variables to something more generic.

::: testing/mozharness/scripts/web_platform_tests.py
@@ +335,5 @@
>              verify_suites = [None]
>          for verify_suite in verify_suites:
>              if verify_suite:
>                  test_types = [verify_suite]
> +            for verify_args in self.query_args(verify_suite):

nit: rename all the "verify" variables to something more generic.
Attachment #8966779 - Flags: feedback?(gbrown) → feedback+
(In reply to Geoff Brown [:gbrown] from comment #14)
> @@ +235,5 @@
> > +        args = []
> > +        for file in files:
> > +            # Limit each test harness run to 15 minutes, to avoid task timeouts
> > +            # when executing long-running tests.
> > +            MAX_TIME_PER_TEST = 900
> 
> MAX_TIME_PER_TEST only applies to verification -- it "shouldn't" be in the
> base class.
> 
> @@ +243,5 @@
> > +            else:
> > +                cur = ['--verify-max-time=%d' % MAX_TIME_PER_TEST]
> > +
> > +            cur.extend(self.coverage_args)
> > +            cur.extend(self.verify_args)
> 
> You are swimming against the current here: Trying to implement two separate
> things in one place. Wouldn't it be better to have separate PTC and verify
> implementations of query_args(), possibly sharing some common code in the
> base class?

I was forgetting that DesktopUnittest and other scripts need to be both a CodeCoverageMixin and a VerifyToolMixin. I see why you are doing things like this now. Let me see if I can find a better way...
Sorry for the long delay!

Another way of structuring this would be to put most of the shared code -- stuff to find modified tests, etc -- in a  PerTest class which isn't part of any inheritance hierarchy. Then have each VerifyMixin and CoverageMixin, when activated by configuration, create and use a PerTest. PerTest takes a parameter that points back to its mixin owner, which it can query for mixin-specific needs (like command arguments):

   class VerifyMixin:
       def __init__:
           self.pertest = PerTest(self)

Scripts could call into the mixins to access the PerTest:

   class DesktopUnittest(...):

        if self.verify_enabled:
            pertest = self.get_verify_per_test()
        elif self.coverage:
            pertest = self.get_coverage_per_test()
        for arg in pertest.query_args():

The script still has some awkwardness to it...maybe that could be addressed with a factory?

This structure feels more flexible to me, but it may not agree with you, and I'm not sure it's worth the additional effort. If we are just dealing with verify and coverage and not expanding this much, it might be best to just go ahead as you have started: your choice!
Attached patch PatchSplinter Review
I think this should be ready for review now.

There are still two TODOs, but I think we can solve them in follow-ups:
1) Diff baseline coverage report with test coverage report and upload it;
2) Not only run modified tests, but also stale tests (this will likely require the work jmaher is doing in bug 1453056).

(In reply to Geoff Brown [:gbrown] from comment #16)
> This structure feels more flexible to me, but it may not agree with you, and
> I'm not sure it's worth the additional effort. If we are just dealing with
> verify and coverage and not expanding this much, it might be best to just go
> ahead as you have started: your choice!

I chose not to change it, as it feels not great to me anyway. I don't think we are going to expand this other than verify and coverage, so as you say it doesn't matter that much.
If we do decide to expand it in the future, we can refactor and maybe think about a better solution.
Assignee: nobody → mcastelluccio
Attachment #8966779 - Attachment is obsolete: true
Attachment #8966999 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8969121 - Flags: review?(jmaher)
Attachment #8969121 - Flags: review?(gbrown)
This patch doesn't apply cleanly for me:

gbrown@mozpad:~/src$ hg qimport https://bugzilla.mozilla.org/attachment.cgi?id=8969121 -n marco
adding marco to series file
gbrown@mozpad:~/src$ hg qpush
applying marco
unable to find 'testing/mozharness/mozharness/mozilla/testing/per_test_base.py' for patching
(use '--prefix' to apply patch relative to the current directory)
4 out of 4 hunks FAILED -- saving rejects to file testing/mozharness/mozharness/mozilla/testing/per_test_base.py.rej
file testing/mozharness/mozharness/mozilla/testing/verify_tools.py already exists
1 out of 1 hunks FAILED -- saving rejects to file testing/mozharness/mozharness/mozilla/testing/verify_tools.py.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working directory
errors during apply, please fix and qrefresh marco
Oh, I see the first patch now -- that's fine.
(In reply to Geoff Brown [:gbrown] from comment #19)
> Oh, I see the first patch now -- that's fine.

Sorry, forgot to ask review on that one :)
Comment on attachment 8969121 [details] [diff] [review]
Patch

Review of attachment 8969121 [details] [diff] [review]:
-----------------------------------------------------------------

this seems to respect the intentions of test-verify and keep it focused on test-verification without opening the door to crazy ideas.  I have a few nits that I would like to see addressed.

I will work on rebasing my patches on top of these- I have additional work to do on the chunking patch.

::: taskcluster/ci/test/test-sets.yml
@@ +248,5 @@
>      - mochitest-media
>      - mochitest-webgl
>      - reftest
> +    - test-coverage
> +    - test-coverage-wpt

we don't have coverage on osx.

@@ +335,5 @@
>      - mochitest-clipboard
>      - mochitest-gpu
>      - mochitest-media
>      - reftest
> +    - test-coverage

we don't have a method for coverage on android

::: testing/mozharness/scripts/web_platform_tests.py
@@ +339,5 @@
> +                test_types = [suite]
> +
> +            if self.per_test_coverage:
> +                gcov_dir, jsvm_dir = self.set_coverage_env(env)
> +                # TODO: Run basic startup/shutdown test to collect baseline coverage.

can you put a comment at the top of this if statement to indicate we are running a baseline test?

@@ +381,5 @@
> +                    shutil.rmtree(gcov_dir)
> +                    shutil.rmtree(jsvm_dir)
> +                    # TODO: Parse coverage report
> +                    # TODO: Diff this coverage report with the baseline one
> +

one quirk here is that we will evaluate_parser which finds errors/warnings and reports things to treeherder- is there a chance we could have errors from shutil, parsing reports, or diffing coverage?
Attachment #8969121 - Flags: review?(jmaher) → review+
Comment on attachment 8969121 [details] [diff] [review]
Patch

Review of attachment 8969121 [details] [diff] [review]:
-----------------------------------------------------------------

Be sure to review https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy before check-in. In particular, I think you need to add new documentation.
Attachment #8969121 - Flags: review?(gbrown) → review+
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/381cc9c492f3
Move verify_tools.py to per_test_base.py. r=gbrown,jmaher
https://hg.mozilla.org/integration/mozilla-inbound/rev/df038cacff14
Add a test-coverage test suite, similar to test-verify, that runs tests in isolation in coverage mode. r=gbrown,jmaher
Blocks: 1455403
Blocks: 1455401
Blocks: 1455400
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc062fa71ca7
Don't check per_test_coverage on Android, where we don't have a coverage build. r=me CLOSED TREE
Backed out 3 changesets (bug 1431753) for android mochitest failures on mozharness/base/script.py

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/efab974f4b4afd3991d6988a59fb44f84c0ffb7f

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=bc062fa71ca7c868d5219b1ddcbe5d9afa86bff3&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=success&filter-resultStatus=pending&filter-resultStatus=running&filter-resultStatus=superseded&selectedJob=174610333

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=df038cacff14882546fd674fae17c8998732fe19&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=pending&filter-resultStatus=running&filter-resultStatus=superseded&selectedJob=174603375

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=174610333&repo=mozilla-inbound&lineNumber=1440

[task 2018-04-19T19:15:29.721Z] 19:15:29     INFO - >> Install app APK: Attempt #1 of 3
[task 2018-04-19T19:15:29.722Z] 19:15:29     INFO - Running timeout 300 /builds/worker/workspace/build/android-sdk-linux/platform-tools/adb -s emulator-5554 install -r /builds/worker/workspace/build/target.apk
[task 2018-04-19T19:16:08.857Z] 19:16:08     INFO - Finished installing apps for test-1
[task 2018-04-19T19:16:08.859Z] 19:16:08     INFO - Running post-action listener: _resource_record_post_action
[task 2018-04-19T19:16:08.859Z] 19:16:08     INFO - [mozharness: 2018-04-19 19:16:08.858386Z] Finished install step (success)
[task 2018-04-19T19:16:08.859Z] 19:16:08     INFO - [mozharness: 2018-04-19 19:16:08.858625Z] Running run-tests step.
[task 2018-04-19T19:16:08.859Z] 19:16:08     INFO - Running pre-action listener: _resource_record_pre_action
[task 2018-04-19T19:16:08.859Z] 19:16:08     INFO - Running main action method: run_tests
[task 2018-04-19T19:16:08.860Z] 19:16:08     INFO - Running command: ['unzip', '-q', '-o', '/builds/worker/workspace/build/target.apk'] in /builds/worker/workspace/build
[task 2018-04-19T19:16:08.860Z] 19:16:08     INFO - Copy/paste: unzip -q -o /builds/worker/workspace/build/target.apk
[task 2018-04-19T19:16:09.236Z] 19:16:09     INFO - Return code: 0
[task 2018-04-19T19:16:09.237Z] 19:16:09     INFO - Reading from file /builds/worker/workspace/build/package-name.txt
[task 2018-04-19T19:16:09.237Z] 19:16:09     INFO - Contents:
[task 2018-04-19T19:16:09.237Z] 19:16:09     INFO -  org.mozilla.fennec_aurora
[task 2018-04-19T19:16:09.239Z] 19:16:09     INFO - Running post-action listener: _resource_record_post_action
[task 2018-04-19T19:16:09.239Z] 19:16:09     INFO - Running post-action listener: stop_emulator
[task 2018-04-19T19:16:09.245Z] 19:16:09     INFO - Let's kill every process called emulator64-arm
[task 2018-04-19T19:16:09.245Z] 19:16:09     INFO - Killing pid 494.
[task 2018-04-19T19:16:09.251Z] 19:16:09     INFO - [mozharness: 2018-04-19 19:16:09.251104Z] Finished run-tests step (failed)
[task 2018-04-19T19:16:09.252Z] 19:16:09    FATAL - Uncaught exception: Traceback (most recent call last):
[task 2018-04-19T19:16:09.253Z] 19:16:09    FATAL -   File "/builds/worker/workspace/mozharness/mozharness/base/script.py", line 2076, in run
[task 2018-04-19T19:16:09.253Z] 19:16:09    FATAL -     self.run_action(action)
[task 2018-04-19T19:16:09.255Z] 19:16:09    FATAL -   File "/builds/worker/workspace/mozharness/mozharness/base/script.py", line 2015, in run_action
[task 2018-04-19T19:16:09.255Z] 19:16:09    FATAL -     self._possibly_run_method(method_name, error_if_missing=True)
[task 2018-04-19T19:16:09.255Z] 19:16:09    FATAL -   File "/builds/worker/workspace/mozharness/mozharness/base/script.py", line 1955, in _possibly_run_method
[task 2018-04-19T19:16:09.255Z] 19:16:09    FATAL -     return getattr(self, method_name)()
[task 2018-04-19T19:16:09.256Z] 19:16:09    FATAL -   File "/builds/worker/workspace/mozharness/scripts/android_emulator_unittest.py", line 791, in run_tests
[task 2018-04-19T19:16:09.257Z] 19:16:09    FATAL -     for per_test_args in self.query_args(per_test_suite):
[task 2018-04-19T19:16:09.257Z] 19:16:09    FATAL -   File "/builds/worker/workspace/mozharness/mozharness/mozilla/testing/per_test_base.py", line 203, in query_args
[task 2018-04-19T19:16:09.258Z] 19:16:09    FATAL -     if not self.per_test_coverage and not self.verify_enabled:
[task 2018-04-19T19:16:09.258Z] 19:16:09    FATAL - AttributeError: 'AndroidEmulatorTest' object has no attribute 'per_test_coverage'
[task 2018-04-19T19:16:09.259Z] 19:16:09    FATAL - Running post_fatal callback...
[task 2018-04-19T19:16:09.260Z] 19:16:09    FATAL - Exiting -1
Flags: needinfo?(mcastelluccio)
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8df07cc7125
Make AndroidEmulatorTest extend CodeCoverageMixin. r=me CLOSED TREE
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8ce15c227c66
Move verify_tools.py to per_test_base.py. r=gbrown,jmaher
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ab22db3a416
Add a test-coverage test suite, similar to test-verify, that runs tests in isolation in coverage mode. r=gbrown,jmaher
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0953370890fa
Fix condition when verification and per-test coverage are not enabled. r=me
https://hg.mozilla.org/mozilla-central/rev/8ce15c227c66
https://hg.mozilla.org/mozilla-central/rev/4ab22db3a416
https://hg.mozilla.org/mozilla-central/rev/0953370890fa
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Flags: needinfo?(mcastelluccio)
Depends on: 1470151
No longer depends on: 1380661, 1470151
Depends on: 1473392
Depends on: 1503232
You need to log in before you can comment on or make changes to this bug.