Closed Bug 1363469 Opened 3 years ago Closed 2 years ago

enable gcov to write data without a valid exit

Categories

(Testing :: Code Coverage, enhancement)

Version 3
enhancement
Not set

Tracking

(firefox56 fixed)

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: rforbes, Assigned: marco)

References

Details

Attachments

(2 files, 5 obsolete files)

fuzzing tests don't complete like a normal test. They either crash or just keep running. We want to be able to get code coverage data without the browser exiting.

Here is a stack overflow article with an implementation

http://stackoverflow.com/questions/14977285/dumping-gcov-data-at-runtime

:choller made a patch to do something similar for asan coverage. We could remove the asan stuff and reduce it down to one signal handler.
Attached file 9022443.txt (obsolete) —
:marco I have included a similar patch :decoder made for asancov. Is this something you could take on or would you like us to?
Flags: needinfo?(mcastelluccio)
Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED
Flags: needinfo?(mcastelluccio)
Attached patch Patch (obsolete) — Splinter Review
The patch is basically the same as the asancov one.

I've verified that coverage counters are written when sending the signal.

The tool running the tests will have to periodically send USR1 to the Firefox process (parent and children).

Do you need a signal to reset the counters too?
Attached patch Patch (obsolete) — Splinter Review
This implements both dumping and resetting the coverage counters.

Looking at the GCC source code, it looks like, while __gcov_flush is protected by a mutex, __gcov_dump and __gcov_reset aren't. So I'm using a CrossProcessMutex here to avoid issues.
Attachment #8873226 - Attachment is obsolete: true
Attachment #8874130 - Flags: feedback?(choller)
Comment on attachment 8874130 [details] [diff] [review]
Patch

This looks really good! Thanks :)

Nathan, could you help reviewing this? We need this change for coverage measurements in fuzzing and I think the team working on CI coverage also will make use of this (e.g. for Mochitests, right now if one crashes, they lose all coverage for the previous ones, with this change you can signal the process to write out coverage).
Attachment #8874130 - Flags: review?(nfroyd)
Attachment #8874130 - Flags: feedback?(choller)
Attachment #8874130 - Flags: feedback+
Comment on attachment 8874130 [details] [diff] [review]
Patch

Review of attachment 8874130 [details] [diff] [review]:
-----------------------------------------------------------------

Somebody who knows more about CrossProcessMutex than I do (kanru?) should probably take a look; a couple of comments.

::: ipc/glue/CodeCoverageHandler.cpp
@@ +12,5 @@
> +using namespace mozilla;
> +using namespace mozilla::ipc;
> +
> +extern "C" void __gcov_dump();
> +extern "C" void __gcov_reset();

Some comments about what these functions do would be appropriate.  This is also as good a place as any to explain why a CrossProcesMutex needs to be locked when we're calling these functions.

@@ +20,5 @@
> +void CodeCoverageHandler::DumpCounters(int)
> +{
> +  CrossProcessMutexAutoLock lock(*CodeCoverageHandler::Get()->GetMutex());
> +
> +  fprintf(stderr, "[CodeCoverage] Requested dump.\n");

Should probably use fprintf_stderr for cross-platform compat, here and elsewhere, even though this code coverage thing is basically Linux64 only at this point.

@@ +38,5 @@
> +void CodeCoverageHandler::SetSignalHandlers()
> +{
> +  fprintf(stderr, "[CodeCoverage] Setting handlers for process %d.\n", getpid());
> +  signal(SIGUSR1, CodeCoverageHandler::DumpCounters);
> +  signal(SIGUSR2, CodeCoverageHandler::ResetCounters);

I think we want to use sigaction so we can specify the SA_RESTART flag.

@@ +57,5 @@
> +void CodeCoverageHandler::Init()
> +{
> +  if (instance == nullptr) {
> +    instance = new CodeCoverageHandler();
> +  }

May want to structure this:

MOZ_ASSERT(!instance);
instance = ...

to guard against Init being called multiple times.

@@ +64,5 @@
> +void CodeCoverageHandler::Init(const CrossProcessMutexHandle& aHandle)
> +{
> +  if (instance == nullptr) {
> +      instance = new CodeCoverageHandler(aHandle);
> +  }

Likewise here.

@@ +69,5 @@
> +}
> +
> +CodeCoverageHandler* CodeCoverageHandler::Get()
> +{
> +  return instance;

MOZ_ASSERT(instance) prior to returning it, I think.

::: ipc/glue/CodeCoverageHandler.h
@@ +14,5 @@
> +namespace mozilla {
> +
> +class CodeCoverageHandler {
> +public:
> +  static void Init();

This function...

@@ +21,5 @@
> +  CrossProcessMutex* GetMutex();
> +  CrossProcessMutexHandle GetMutexHandle(int aProcId);
> +
> +private:
> +  CodeCoverageHandler();

...and the corresponding handler look unused.  Shouldn't we be initializing something in ContentParent.cpp somewhere?  Otherwise ISTM that ContentParent will be touching a null CodeCoverageHandler instance...
Attachment #8874130 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] from comment #7)
> ::: ipc/glue/CodeCoverageHandler.h
> @@ +14,5 @@
> > +namespace mozilla {
> > +
> > +class CodeCoverageHandler {
> > +public:
> > +  static void Init();
> 
> This function...
> 
> @@ +21,5 @@
> > +  CrossProcessMutex* GetMutex();
> > +  CrossProcessMutexHandle GetMutexHandle(int aProcId);
> > +
> > +private:
> > +  CodeCoverageHandler();
> 
> ...and the corresponding handler look unused.  Shouldn't we be initializing
> something in ContentParent.cpp somewhere?  Otherwise ISTM that ContentParent
> will be touching a null CodeCoverageHandler instance...

The function is called in XRE_main in toolkit/xre/nsAppRunner.cpp.
(In reply to Marco Castelluccio [:marco] from comment #8)
> The function is called in XRE_main in toolkit/xre/nsAppRunner.cpp.

Oh, sorry!  It hid from me in splinter on a line all its own. :)

Also worth considering if you want to assert on the process type somehow in each of those initialization functions.
Attached patch Patch (obsolete) — Splinter Review
Attachment #8870363 - Attachment is obsolete: true
Attachment #8874130 - Attachment is obsolete: true
Attachment #8875238 - Flags: review?(nfroyd)
Attachment #8875238 - Flags: review?(kchen)
Comment on attachment 8875238 [details] [diff] [review]
Patch

Review of attachment 8875238 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you!
Attachment #8875238 - Flags: review?(nfroyd) → review+
Attachment #8875238 - Flags: review?(kchen) → review+
Attached patch Patch (obsolete) — Splinter Review
There are some test failures: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c2f60de80c35177a9c5307d5c42879ea9cb786a&selectedJob=106481801.

They are due to failing assertions:
> Assertion failure: instance, at /home/worker/workspace/build/src/ipc/glue/CodeCoverageHandler.cpp:89.

This means the CodeCoverageHandler::Init() function wasn't called in XRE_main in toolkit/xre/nsAppRunner.cpp.
To fix this problem for xpcshell tests, I'm calling CodeCoverageHandler::Init() in XRE_XPCShellMain too.
Attachment #8875238 - Attachment is obsolete: true
Attachment #8877145 - Flags: review?(nfroyd)
Attachment #8877145 - Flags: review?(nfroyd) → review+
Attached patch PatchSplinter Review
There were memory leaks because the singleton was not deleted: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1e44a6d001b87663cde814bc7943427531b813a7&selectedJob=106715914.

So I switched to use a StaticAutoPtr instead of a raw pointer and used ClearOnShutdown to delete the singleton: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a99de2605b8c3db9eb5b0db38f85cce546e9d4ae.

Carrying r+ since the changes seem straightforward.
Attachment #8877145 - Attachment is obsolete: true
Attached patch InterdiffSplinter Review
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f704631f7ab5
Define signal handlers to dump and reset coverage counters. r=froydnj,kanru
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee9634a6f12c
Mark RecvShareCodeCoverageMutex as 'override'. r=me
https://hg.mozilla.org/mozilla-central/rev/f704631f7ab5
https://hg.mozilla.org/mozilla-central/rev/ee9634a6f12c
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
I just noticed today that this patch seems to have a bug:

The calls to sigaction are wrapped in MOZ_ASSERT which causes these calls to not happen in --disable-debug builds. I guess this was not intended, looking at the original patch. We should probably use regular if statements instead and print a warning or terminate when sigaction failed for some reason.
Status: RESOLVED → REOPENED
Flags: needinfo?(mcastelluccio)
Resolution: FIXED → ---
Depends on: 1390945
Thanks for fixing the problem while I was on PTO ;)
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Flags: needinfo?(mcastelluccio)
Resolution: --- → FIXED
Depends on: 1406081
You need to log in before you can comment on or make changes to this bug.