Closed Bug 1455401 Opened 2 years ago Closed 2 years ago

Generate baseline coverage reports for diffing with full coverage reports

Categories

(Testing :: Code Coverage, enhancement)

enhancement
Not set

Tracking

(firefox62 fixed)

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: marco, Assigned: sparky)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

The coverage reports we have per-test are going to include useless coverage info too (e.g. coverage info related to the startup of the browser).
We should collect baseline coverage information, then "subtract" it from the per-test coverage info to have reports that are actually specific to the tests.
Depends on: 1431753
No longer depends on: 1454629
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248206


Code analysis found 3 defects in this patch:
 - 3 defects found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: testing/mochitest/baselinecoverage/browser_chrome/browser_baselinecoverage.js:6
(Diff revision 1)
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +
> +"use strict";
> +
> +add_task(async function() {

Error: 'add_task' is not defined. [eslint: no-undef]

::: testing/mochitest/baselinecoverage/browser_chrome/browser_baselinecoverage.js:7
(Diff revision 1)
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +
> +"use strict";
> +
> +add_task(async function() {
> +  ok(true, "Collecting baseline coverage for javascript (.js) file types.");

Error: 'ok' is not defined. [eslint: no-undef]

::: testing/mochitest/baselinecoverage/plain/test_baselinecoverage.html:10
(Diff revision 1)
> +  <script type="application/javascript" src="/tests/SimpleTest/SimpleTest.js"></script>
> +  <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css"/>
> +</head>
> +<body>
> +<script type="application/javascript">
> +  SimpleTest.ok(true, "Collecting  baseline coverage for HTML (.html) file types.");

Error: 'simpletest' is not defined. [eslint: no-undef]
Here is a try push for this patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e54110e2ffbea9c20e95614e0b6d2823ffc2f31

ESLint is failing but I can't figure out why. It says that certain functions are undefined but they are definitely defined, otherwise I wouldn't be passing and getting coverage in TC.

Furthermore, the WPT test is only skipped on non-ccov builds for the time being until we fix bug 1459926 to get a 'verfiy' field in mozinfo (or something similar).
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248284

small nits

::: testing/mochitest/baselinecoverage/browser_chrome/browser.ini:4
(Diff revision 2)
> +[DEFAULT]
> +
> +[browser_baselinecoverage.js]
> +skip-if = !(ccov && verify)

ccov && verify

should this be:
ccov || verify

or just:
ccov

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:170
(Diff revision 2)
>          """
>          if not self.per_test_coverage:
>              return
>  
>          self.find_modified_tests()
> +

an accidental blank line added?

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:189
(Diff revision 2)
> +                'test': 'testing/mochitest/baselinecoverage/chrome/test_baselinecoverage.xul',
> +                'suite': 'chrome'
> +            }
> +        }
> +
> +        wpt_baseline_test = 'tests/web-platform/tests/baselinecoverage/wpt_baselinecoverage.html'

for the wpt test, do we need to sync this upstream?

::: testing/web-platform/meta/MANIFEST.json:553617
(Diff revision 2)
>    "dom/events/event-global.worker.js": [
>     "084a6f752edee6578113035fece6d0eb85a2fdf7",
>     "testharness"
>    ],
>    "dom/events/relatedTarget.window.js": [
> -   "0426d2ecae3f3562be175e4364353d979365ed1c",
> +   "3dd9ddfa4d8d584f6ddf6db52e4e9020491088d8",

this is changed, not sure why...another hash is changed in this patch as well.
Attachment #8974029 - Flags: review?(jmaher) → review+
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248284

> ccov && verify
> 
> should this be:
> ccov || verify
> 
> or just:
> ccov

If we use ccov || verify, the test will run in the standard ccov test suites (plain, browser-chrome, chrome, wpt), we can allow this if you want? Using ccov && verify checks to make sure we are in a test-coverage run.

> an accidental blank line added?

This was added so that the code is a bit nicer to read. I can remove it if you don't want it added, what do you think?

> this is changed, not sure why...another hash is changed in this patch as well.

Hmm, I missed this...I'll undo those changes and check if everything still works.
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248284

> for the wpt test, do we need to sync this upstream?

Looking at this doc: https://developer.mozilla.org/en-US/docs/Mozilla/QA/web-platform-tests

It seems to say that this syncing is done regularly (I don't know if it's automatic or manual). However, I don't think we need to sync it since it doesn't really test anything. Is there a way we could do that?
I would get that part reviewed with :jgraham, while it gets sync'd automatically, we might have some quirks as upstream doesn't have ability for |skip-if = ccov|
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248302

::: testing/mochitest/baselinecoverage/plain/mochitest.ini:4
(Diff revision 2)
> +[DEFAULT]
> +
> +[test_baselinecoverage.html]
> +skip-if = !(ccov && verify)

Maybe using run-if would make this clearer (as it would avoid the double negation).
Attachment #8974029 - Flags: review?(mcastelluccio) → review+
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248288

a few small changes, overall this is looking good.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:357
(Diff revision 2)
> +                    if 'baselinecoverage' not in test:
> +                        continue
> +
> +                    baseline_filetype = "." + test.split('.')[-1]
> +                    with open(grcov_file, 'r') as f:
> +                        baseline_tests_cov[baseline_filetype] = json.load(f)

the baselines are 40MB of .json, and we will be loading 3 of them into memory; this seems like it is ripe for optimization- maybe add a comment here indicating a future optimization.  of course if we settle on a single baseline it wouldn't matter.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:382
(Diff revision 2)
> +                                baseline_coverage = {}
> +                                found_a_baseline = False
> +                                for file_type in baseline_tests_cov:
> +                                    if file_type not in test:
> +                                        continue
> +                                    baseline_coverage = baseline_tests_cov[file_type]

what happens if the file type isn't here?  I am thinking of .xht, .htm, .jsm, and there are a few odd ones.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:445
(Diff revision 2)
> +            # File has line number differences due to gcov bug:
> +            #  https://bugzilla.mozilla.org/show_bug.cgi?id=1410217
> +            continue
> +
> +        # Get line numbers and the differences
> +        file_coverage = {i+1 for i, cov in enumerate(test_files[test_file]['coverage']) \

why i+1 ?
Attachment #8974030 - Flags: review?(jmaher) → review-
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248288

> the baselines are 40MB of .json, and we will be loading 3 of them into memory; this seems like it is ripe for optimization- maybe add a comment here indicating a future optimization.  of course if we settle on a single baseline it wouldn't matter.

Will do.

> what happens if the file type isn't here?  I am thinking of .xht, .htm, .jsm, and there are a few odd ones.

Right now, it would result in an assertion failure. We could give those the 'js' baseline (since it has the most coverage) for the moment until we settle on how many baselines we will have? I'll add a TODO comment there if so.

> why i+1 ?

No specific reason, I'll change this to 'i' for clarity.
Comment on attachment 8974029 [details]
Bug 1455401 - Generate baseline code coverage reports.

https://reviewboard.mozilla.org/r/242352/#review248312

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:208
(Diff revision 2)
> +                found = False
> +
> +                # Find a test with this baseline file type and add
> +                # the baseline test to the list of tests to run.
> +                for test in tests:
> +                    if file_type in test:

Nit: missed this before, but you could do `test.endswith(file_type)`.
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #11)
> I would get that part reviewed with :jgraham, while it gets sync'd
> automatically, we might have some quirks as upstream doesn't have ability
> for |skip-if = ccov|

Hi :jgraham, we are looking to add a test for baseline code coverage to wpt. It's simple and doesn't test anything but it is skipped by using the mozinfo field 'ccov'. Do you know how this will work out with upstream? Is there a way we could disable it from being synced or some other solution you could suggest to us?

To be specific, this is the test we are adding: https://hg.mozilla.org/try/file/ea5ca5e5ae8b/testing/web-platform/tests/baselinecoverage/wpt_baselinecoverage.html

And this is it's manifest: https://hg.mozilla.org/try/file/ea5ca5e5ae8b/testing/web-platform/meta/baselinecoverage/wpt_baselinecoverage.html.ini
Flags: needinfo?(james)
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248314

Some nits and comments about minor improvements you could make to improve perf.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:355
(Diff revision 2)
> +            for suite, data in self.per_test_reports.items():
> +                for test, grcov_file in data.items():
> +                    if 'baselinecoverage' not in test:
> +                        continue
> +
> +                    baseline_filetype = "." + test.split('.')[-1]

Nit: you can use os.path.splitext(test) here

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:364
(Diff revision 2)
>              dest = os.path.join(dirs['abs_blob_upload_dir'], 'per-test-coverage-reports.zip')
>              with zipfile.ZipFile(dest, 'w', zipfile.ZIP_DEFLATED) as z:
>                  for suite, data in self.per_test_reports.items():
>                      for test, grcov_file in data.items():
> +                        if 'baselinecoverage' in test:
> +                            # Don't diff the baseline coverage with itself

We should also not upload the baseline reports.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:386
(Diff revision 2)
> +                                        continue
> +                                    baseline_coverage = baseline_tests_cov[file_type]
> +                                    found_a_baseline = True
> +                                    break
> +
> +                            assert found_a_baseline == True

You could remove `found_a_baseline` and just `assert baseline_coverage`.
Also, we should add a message here so we can quickly see in which case a baseline was not found (e.g. you can print the test here).

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:427
(Diff revision 2)
> +    baseline_coverage, such that what is returned
> +    is the unique coverage for the test in question.
> +    '''
> +
> +    # Get all files into a quicker search format
> +    unique_test_coverage = copy.deepcopy(test_coverage)

I think we don't need to copy this, but we can just modify it directly.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:444
(Diff revision 2)
> +        if len(test_files[test_file]['coverage']) != len(baseline_files[test_file]['coverage']):
> +            # File has line number differences due to gcov bug:
> +            #  https://bugzilla.mozilla.org/show_bug.cgi?id=1410217
> +            continue
> +
> +        # Get line numbers and the differences

I think you could rewrite this part of the function to avoid iterating the test_file `coverage` array twice.
We can do it in a follow-up though.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:451
(Diff revision 2)
> +                                 if cov is not None and cov > 0}
> +
> +        baseline = {i+1 for i, cov in enumerate(baseline_files[test_file]['coverage']) \
> +                            if cov is not None and cov > 0}
> +
> +        unique_coverage = file_coverage - baseline

Maybe we should take into account hit counts too?
Attachment #8974030 - Flags: review?(mcastelluccio) → review-
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248314

> We should also not upload the baseline reports.

Ok, I'll make sure they are not uploaded.

> You could remove `found_a_baseline` and just `assert baseline_coverage`.
> Also, we should add a message here so we can quickly see in which case a baseline was not found (e.g. you can print the test here).

Based on :jmaher's comment above, I'll default to the .js baseline when we have tests whose extensions can't be found but I'll print a message also here. Does that work for you?

> I think you could rewrite this part of the function to avoid iterating the test_file `coverage` array twice.
> We can do it in a follow-up though.

I have an idea but I'm not sure if we'll gain much out of it because of files with no differences so I'll file a follow up bug and add a TODO statement here. We only iterate over the coverage array twice ~25% of the time because ~75% of the files have no differences and are caught by len(unique_coverage).

> Maybe we should take into account hit counts too?

What are your thoughts about taking them into account?

The baselines are variable over time, which would mean that the tests likely have some variability in their browser start and stop coverage. So if we difference the hit counts, we'd likely start picking up coverage that is not unique to the test itself. But on the other hand, if we don't take it into account, we might be missing important test coverage.
(In reply to Greg Mierzwinski [:sparky] from comment #18)
> > You could remove `found_a_baseline` and just `assert baseline_coverage`.
> > Also, we should add a message here so we can quickly see in which case a baseline was not found (e.g. you can print the test here).
> 
> Based on :jmaher's comment above, I'll default to the .js baseline when we
> have tests whose extensions can't be found but I'll print a message also
> here. Does that work for you?

Sounds good!

> > Maybe we should take into account hit counts too?
> 
> What are your thoughts about taking them into account?
> 
> The baselines are variable over time, which would mean that the tests likely
> have some variability in their browser start and stop coverage. So if we
> difference the hit counts, we'd likely start picking up coverage that is not
> unique to the test itself. But on the other hand, if we don't take it into
> account, we might be missing important test coverage.

Maybe let's ignore them for now, but add a comment in the code to explain we are and file a bug so that we remember to investigate more once we have everything in place.
> It's simple and doesn't test anything but it is skipped by using the mozinfo field 'ccov'. Do you know how this will work out with upstream? Is there a way we could disable it from being synced or some other solution you could suggest to us?

Put it in testing/web-platform/mozilla/tests and it won't be synced.
Flags: needinfo?(james)
(In reply to James Graham [:jgraham] from comment #20)
> > It's simple and doesn't test anything but it is skipped by using the mozinfo field 'ccov'. Do you know how this will work out with upstream? Is there a way we could disable it from being synced or some other solution you could suggest to us?
> 
> Put it in testing/web-platform/mozilla/tests and it won't be synced.

Ok great, thanks :jgraham!
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248668

some small details, while I want r+, I would like to have a good place to start with.

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:208
(Diff revisions 2 - 3)
> +
> +                if test_ext not in baseline_tests:
> +                    # Add the '.js' test as a default baseline
> +                    # if none other exists.
> +                    baseline_test_suite = baseline_tests['.js']['suite']
> +                    baseline_test_name = baseline_tests['.js']['test']

this could be simplified:
if test_ext not in baseline_tests:
    # Add '.js' as default baseline if none exists
    test_ext = '.js'

baseline_test_suite = baseline_tests[test_ext]['suite']
baseline_test_name = baseline_tests[test_ext]['test']

::: testing/mozharness/mozharness/mozilla/testing/codecoverage.py:6
(Diff revision 3)
>  #!/usr/bin/env python
>  # This Source Code Form is subject to the terms of the Mozilla Public
>  # License, v. 2.0. If a copy of the MPL was not distributed with this file,
>  # You can obtain one at http://mozilla.org/MPL/2.0/.
>  
> +import copy

I don't think we use this anymore?
Attachment #8974030 - Flags: review?(jmaher) → review-
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review248756

thanks that is cleaner.  I appreciate the TODO comments and related bugs- this seems like a good balance of getting something fairly stable in, yet not solving the whole problem.
Attachment #8974030 - Flags: review?(jmaher) → review+
Comment on attachment 8974030 [details]
Bug 1455401 - Diff baseline coverage reports with test coverage reports.

https://reviewboard.mozilla.org/r/242360/#review249066
Attachment #8974030 - Flags: review?(mcastelluccio) → review+
(In reply to Greg Mierzwinski [:sparky] from comment #24)
> Try run of the latest iteration:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=97a99b4da8fdcd8544329238e520830ecc238408

Can you make some changes to a couple of tests? This way the per-test-coverage parsing code would get triggered. You can apply this patch: https://hg.mozilla.org/try/rev/be75f030be3e213e850256794a036549b15483d8.
I have a slightly larger test set that was run here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e6d954d68de3facbb0d0d12ed3eda5221e208aad

This is the current revision running just before the wpt baseline test was moved to it's current folder. What I find interesting though is that the wpt test I modified ended up failing but we still got coverage from it - which to me is great.

Would you like to see the most recent revsion (with the files moved) being run?
Flags: needinfo?(mcastelluccio)
No, that's fine, I was just curious to see the size of the reports with the baseline subtracted. They are way smaller, but still not so small (~6 MB each).
I see your patch applies on top of a revision before bug 1459311 landed, so after rebasing they will probably get even smaller.
Flags: needinfo?(mcastelluccio)
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3d404958db3c
Generate baseline code coverage reports. r=jmaher,marco
https://hg.mozilla.org/integration/autoland/rev/3525c92cb5f6
Diff baseline coverage reports with test coverage reports. r=jmaher,marco
Greg, I think the removal of uncovered files should happen after the baseline subtraction, not before. This way we will have more files to remove and the reports will be smaller.
Could you file a bug for that?
Flags: needinfo?(gmierz2)
https://hg.mozilla.org/mozilla-central/rev/3d404958db3c
https://hg.mozilla.org/mozilla-central/rev/3525c92cb5f6
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Blocks: 1461201
:marco, sounds good it's filed: https://bugzilla.mozilla.org/show_bug.cgi?id=1461201

Although I have a feeling it won't make a difference because when rm_baseline_cov removes the baseline coverage from the test coverage we prevent any files with 0 uniquely covered lines from entering the new report here: https://hg.mozilla.org/integration/autoland/rev/3525c92cb5f6#l1.127
Flags: needinfo?(gmierz2)
Depends on: 1461202
(In reply to Greg Mierzwinski [:sparky] from comment #38)
> :marco, sounds good it's filed:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1461201
> 
> Although I have a feeling it won't make a difference because when
> rm_baseline_cov removes the baseline coverage from the test coverage we
> prevent any files with 0 uniquely covered lines from entering the new report
> here: https://hg.mozilla.org/integration/autoland/rev/3525c92cb5f6#l1.127

Yeah probably not, but it's unneeded in the place where it is now.
You need to log in before you can comment on or make changes to this bug.