Update gcov code coverage flags for the linux64-ccov build.

RESOLVED FIXED in Firefox 55

Status

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: sparky, Assigned: mitchhentges)

Tracking

(Blocks: 1 bug)

unspecified
mozilla55
Points:
---

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
Recently, some changes have been made to gcov that allow us to use the '--coverage' flag instead. The changes should be made to the code-coverage mozconfig for linux64-ccov.
(Reporter)

Updated

2 years ago
Blocks: 1301170
(Assignee)

Updated

2 years ago
Attachment #8829251 - Flags: review?(gmierz2)
(Reporter)

Comment 2

2 years ago
mozreview-review
Comment on attachment 8829251 [details]
Bug 1332917: Replace linux64-ccov compile/link gcov flags with the more-consistent '--coverage' +560562

https://reviewboard.mozilla.org/r/106368/#review107306

Good stuff! Before we have this landed, let's see if a try run on linux64-ccov with all the test suites currently available still runs as expected.
Attachment #8829251 - Flags: review?(gmierz2) → review+
(Reporter)

Comment 5

2 years ago
Interesting...the size of the gcno file dropped from 200kB to 72kB. Maybe this is a side effect of using the --coverage flag because it may be better or it could be we are missing certain parts. I think the next step should be to see if the coverage data is actually good/unchanged. So as one possible step we can check the difference between info files within a single test that uses c++.

Another thing to look at would be the flags themselves. I am finding conflicting statements as to what --coverage replaces. Does it replace "-fprofile-arcs -ftest-coverage" or "-fprofile-arcs -ftest-coverage -lgcov" or just "-lgcov" at times? See [1], & [2].

[1]: http://stackoverflow.com/questions/28840261/gcov-what-is-the-difference-between-coverage-and-ftest-coverage-when-buildi
[2]: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#Instrumentation-Options
Flags: needinfo?(mitch9654)
(Assignee)

Comment 6

2 years ago
Regarding the flags: "--coverage" implies the following, according to the gcc docs [1]:
During compile: "-fprofile-arcs" and "-ftest-coverage"
During link   : "-lgcov"

Now, the "mozconfig" used to have "-fprofile-arcs" and "-ftest-coverage" for its linking flags (LDFLAGS). According to [2], these have no effect: only "-lgcov" affects linking.

Prior to these changes, we've recognized other builds where the ".gcno" file archive is incredibly small. I'm becoming confident that it's a "flaky" symptom. Besides, when re-running this build with the /exact same/ mozconfig, the ".gcno" archive dropped from 200kB to 72kB? If the build system configuration didn't change, but the archive size did, then perhaps the configuration didn't cause the initial drop from ~70mB to 200kB

I'll build locally and investigate what difference there is between the proper ~70mB ".gcno" archive, and the small archive being produced.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#Instrumentation-Options
[2]: http://stackoverflow.com/a/23511191
(Assignee)

Comment 7

2 years ago
See this build, by Joel, which doesn't have my changes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8002f88014f819ce34055db8ecf72c0e1ceb2b4a&selectedJob=68153184
The "target.code-coverage-gcno.zip" is 232kB.

I looked into the 72kB ".gcno" file archive from my build, and it has the following root-level folders:
- browser
- gfx
- ipc
- js
- modules
- security
- toolkit

Joel's 232kB archive has:
- browser
- dom
- ipc
- js
- media
- modules
- security
- toolkit

I think that there's a larger underlying problem with the build system, and I don't believe that my changes are affecting it.
Flags: needinfo?(mitch9654)
this is interesting that you are not getting dom/ or media/ coverage data- are we comparing the same tests?  Do you have a try push that we can examine coverage for identical bits?  Ideally a push with the same commit using |try: -b o -p linux64-ccov -u none -t none|, then apply your changes and try again.

The coverage you get here will be for the make check tests, possibly we could try for cppunit tests or gtests as well?

Another avenue to investigate is looking in more detail at coverage data in a certain root directory, say toolkit.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Assignee)

Comment 10

2 years ago
Ok, so the Treeherder "Similar Jobs" tool's "Same options" flag doesn't quite work as expected, Greg's build from the last comment isn't "-u none".

My build (--coverage): https://treeherder.mozilla.org/#/jobs?repo=try&revision=9bf452937f870f959ee20082f3c48df628559ed8
My build (--fprofile-arcs, etc): https://treeherder.mozilla.org/#/jobs?repo=try&revision=aaba4c16298a67df1d449eb9a774abce1fab64ad
this looks like good progress
(Assignee)

Comment 12

2 years ago
The above build that _doesn't_ have my "--coverage" changes also produced a ".gcno" archive that is undersized: 189kB.
(Reporter)

Comment 13

2 years ago
Excellent progress so far Mitch!

Another thing that is interesting is that the gcno's have no consistency in size, even for just a build with no tests. One thing to do would be to see if a one-click loaner would give us a gcno with the same size more than once.

Do we know if these small gcno's can still be used to generate good c++ code coverage data or is it bad data?
(Assignee)

Comment 14

2 years ago
1.
> One thing to do would be to see if a one-click loaner would give us a gcno with the same size more than once.
Though I do think that this is a worthwhile exercise to figure out why we aren't getting all the data, I don't believe that this problem should block this bug, or stop the patch from landing. Perhaps a new bugzilla ticket should be reported?

2.
> Do we know if these small gcno's can still be used to generate good c++ code coverage data or is it bad data?
Small gcno's will not generate good c++ coverage data, because each gcno file has critical information required for processing coverage data: "It contains information to reconstruct the basic block graphs and assign source line numbers to blocks" [1]. If "lcov" is executed on code coverage data where the gcno archive is incomplete, it will throw an error (like it did on Friday in Calgary, when I was initially running "lcov" to understand how it worked, but the gcno archive we used was incomplete)

[1] https://gcc.gnu.org/onlinedocs/gcc/Gcov-Data-Files.html
(Reporter)

Comment 15

2 years ago
Unfortunately, this problem does block this bug from landing because I have yet to see these changes produce gcnos that are actually usable.

With the old mozconfig, at least we have good gcno's being produced even though it's inconsistent. We should certainly investigate this more using this new mozconfig since it consistently breaks unlike the old mozconfig, maybe gcov itself is causing these problems since it was only a change of flags that caused a problem. I think opening a new bug - as a blocker for this bug - would be a good idea.
this is excellent data, I know in the integration branches we do not do clobber builds, so you can think of them as partial builds- I would really like to see this on a test suite (like gtest or cppunittests), not just the build job to determine if the .gcna files are identical.  On try we do full clobber builds, that is the confusing part.

I would like to stress that ensuring we have the exact same base revision is important.  If source files change between comparisons we can have a much different set of data.

One other thought is that the build is done in parallel, possibly this is causing us some confusion or files to be overwritten.

:chris, could you think of ways that when compiling a build with --coverage (or other relate flags) would yield different .gcno files from build to build?  Any ideas on how to validate this?
Flags: needinfo?(cmanchester)
I can see from one of Mitch's try builds with retriggers in comment 10 that our current set of options will produce differently sized gcno archives if retriggered, so I think this is unrelated to that patch at hand, and that this patch should land.

I think I see why we're getting non-deterministic gcno archive sizes, and I think it's due to a race between packaging the archive and compiling some test binaries. I'll push a prospective fix to try.
Flags: needinfo?(cmanchester)
(In reply to Chris Manchester (:chmanchester) from comment #17)
> I can see from one of Mitch's try builds with retriggers in comment 10 that
> our current set of options will produce differently sized gcno archives if
> retriggered, so I think this is unrelated to that patch at hand, and that
> this patch should land.
> 
> I think I see why we're getting non-deterministic gcno archive sizes, and I
> think it's due to a race between packaging the archive and compiling some
> test binaries. I'll push a prospective fix to try.

Aha, so this wasn't due to a race with compiling test binaries, it looks like it was due to a bad interaction with sccache, our distributed compiler cache we use on try. When we pass the coverage flags to gcc it produces an extra output file (gcno) that sccache doesn't known about, so it doesn't become part of the cached result of that compilation call. The larger gcno bundles on try corresponded to cases we were getting more cache misses, so these calls were generating the gcno files locally.

If I add `no_sccache=1` to the top of the coverage mozconfig I see consistent gcno archive sizes.
(Assignee)

Comment 19

2 years ago
Thanks for your investigation Chris!

I started a new build with |--coverage| and |no_sccache=1| https://hg.mozilla.org/try/rev/26e504484a5e0a7e692863d23c257ad6f6b7d4df

|try: -b o -p linux64-ccov -u all -t none|
https://treeherder.mozilla.org/#/jobs?repo=try&revision=26e504484a5e0a7e692863d23c257ad6f6b7d4df
(Assignee)

Comment 20

2 years ago
Hmm, that build's "gcno" archive is 189kB. I might be missing something. By the way, the diff I linked doesn't show the sccache changes, see it's parent: https://hg.mozilla.org/try/rev/936a235842a9
(The first time I pushed to try with |no_sccache=1|, I improperly specified the "o" in the "try command")
(In reply to Mitchell Hentges [:mitchhentges] from comment #20)
> Hmm, that build's "gcno" archive is 189kB. I might be missing something. By
> the way, the diff I linked doesn't show the sccache changes, see it's
> parent: https://hg.mozilla.org/try/rev/936a235842a9
> (The first time I pushed to try with |no_sccache=1|, I improperly specified
> the "o" in the "try command")

The `no_sccache=1` may need to go on the first line of the mozconfig, it influences things that are included early in the file. At any rate, this investigation should probably be moved to a separate bug. Thanks for looking into this!
20:01 <@ted> mitchhentges: are the code coverage builds in taskcluster?
20:02 <@ted> if so you can just set an attribute on the task to disable sccache
20:02 < jmaher> ted: interesting- they are in taskcluster
20:02 <@ted> we don't currently use it anywhere, but the build task transform sets it by default:
             https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/build.py#22
20:03 < jmaher> linux64-ccov
20:03 < jmaher> ted: would you prefer task hacking vs mozconfig hacking?
20:03 <@ted> jmaher: i think that'd be nicer, yeah
20:03 < jmaher> cool
20:04 <@ted> https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#135
20:04 <@ted> that's where it lives in the task schema
(Assignee)

Comment 23

2 years ago
I'll report the relevant bug and submit a patch tonight
(Reporter)

Comment 24

2 years ago
As Chris suggests in comment 17, we can land this change whenever we want now. Bug 1334402 fixes the problem with the gcno sizes and is also ready to go.

Comment 25

2 years ago
mozreview-review
Comment on attachment 8829251 [details]
Bug 1332917: Replace linux64-ccov compile/link gcov flags with the more-consistent '--coverage' +560562

https://reviewboard.mozilla.org/r/106368/#review125158
Attachment #8829251 - Flags: review+
did a brief test and found a couple failures:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd0cdc4741d03a936dd96cb02301b8be6861ff06

the dt1 failure I believe is fixed (not sure if my source tree has the fix or not), but the other two failures are new.  I am not sure why this is the case, but the bc6 and bc7 failures are consistent.

Looking at the base revision used, this is from when the patch was originally created, I have a new patch from the tip of the tree, lets see if this helps out:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a65eee90c6b270acafd8a674f95e4d39b2d1e04
Assignee: nobody → mitch9654

Comment 27

2 years ago
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f9eed66241c3
Replace linux64-ccov compile/link gcov flags with the more-consistent '--coverage' r=gmierz,jmaher+560562

Comment 28

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f9eed66241c3
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.