Closed Bug 1380661 Opened 2 years ago Closed Last year

When dumping/resetting counters on demand, we should dump/reset JS counters as well and not only C++ ones

Categories

(Testing :: Code Coverage, enhancement)

Version 3
enhancement
Not set

Tracking

(firefox65 fixed)

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: marco, Assigned: marco)

References

(Blocks 3 open bugs)

Details

Attachments

(6 files, 5 obsolete files)

1.20 KB, patch
nbp
: review+
Details | Diff | Splinter Review
1.93 KB, patch
nbp
: review+
Details | Diff | Splinter Review
5.62 KB, patch
jmaher
: review+
Details | Diff | Splinter Review
5.50 KB, patch
marco
: review+
Details | Diff | Splinter Review
10.82 KB, patch
marco
: review+
Details | Diff | Splinter Review
2.20 KB, patch
marco
: review+
Details | Diff | Splinter Review
After bug 1380659 is fixed, we will have a way to dump coverage counters on demand using JavaScript.
We will want to dump the JavaScript counters too.
Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED
Duplicate of this bug: 1243522
Blocks: 1431753
We'd need to do something similar to https://dxr.mozilla.org/mozilla-central/rev/e4107773cffb1baefd5446666fce22c4d6eb0517/js/src/jsopcode.cpp#2970 that works for JSScripts from all contexts.
nbp, any idea how to do this?

We already have a component that handles dumping/resetting C++ counters on demand. We need to make it dump/reset JavaScript counters too.
The component offers an interface for the parent process to dump/reset counters (https://dxr.mozilla.org/mozilla-central/rev/69f9ae4f6e8943d24569a9e4380f6df869f56b73/tools/code-coverage/nsCodeCoverage.cpp#25). When used, it also notifies the content processes to do the same.

So we would need each process to dump/reset coverage info for all its JSContexts.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Marco Castelluccio [:marco] from comment #3)
> nbp, any idea how to do this?

Yes, what you want is roughtly to duplicate everything under:
  js::GetCodeCoverageSummary, GenerateLcovInfo

And replace all the dump logic by a loop over each Script to do:

    if (script->hasScriptCounts())
        script->getScriptCounts()->reset();

Where reset will iterate over the PCCountsVectors[1] and reset the numExec_ field.

Note, we should not remove the ScriptCounts allocations, because these might still be used by the baseline compiled code, and removing these would caused some UAF.

[1] https://searchfox.org/mozilla-central/rev/bffd3e0225b65943364be721881470590b9377c1/js/src/vm/JSScript.h#181,184,226,230
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] {backlog: ~36} from comment #4)
> (In reply to Marco Castelluccio [:marco] from comment #3)
> > nbp, any idea how to do this?
> 
> Yes, what you want is roughtly to duplicate everything under:
>   js::GetCodeCoverageSummary, GenerateLcovInfo
> 
> And replace all the dump logic by a loop over each Script to do:
> 
>     if (script->hasScriptCounts())
>         script->getScriptCounts()->reset();
> 
> Where reset will iterate over the PCCountsVectors[1] and reset the numExec_
> field.
> 
> Note, we should not remove the ScriptCounts allocations, because these might
> still be used by the baseline compiled code, and removing these would caused
> some UAF.
> 
> [1]
> https://searchfox.org/mozilla-central/rev/
> bffd3e0225b65943364be721881470590b9377c1/js/src/vm/JSScript.h#181,184,226,230

Implementing the reset should be feasible.
My problem is more about how to access the JSContexts of the current process.
No longer blocks: 1431753
Blocks: 1471572
(In reply to Marco Castelluccio [:marco] from comment #5)
> Implementing the reset should be feasible.
> My problem is more about how to access the JSContexts of the current process.

It sounds like this is not so easy? Would it be easier if I triggered the reset/dump from JS instead of C++?
Flags: needinfo?(nicolas.b.pierron)
I guess we can add a Testing function for that:
https://searchfox.org/mozilla-central/rev/97d488a17a848ce3bebbfc83dc916cf20b88451c/js/src/builtin/TestingFunctions.cpp#5297

Doing it in JS would only save the part of the code which has to find the JSRuntime, and you would still have to iterate over the Realms (old JSContext) to reset them.
However, you will have to do the cross-process dispatch in JS.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] {backlog: 39} from comment #7)
> I guess we can add a Testing function for that:
> https://searchfox.org/mozilla-central/rev/
> 97d488a17a848ce3bebbfc83dc916cf20b88451c/js/src/builtin/TestingFunctions.
> cpp#5297
> 
> Doing it in JS would only save the part of the code which has to find the
> JSRuntime, and you would still have to iterate over the Realms (old
> JSContext) to reset them.
> However, you will have to do the cross-process dispatch in JS.

Yeah, I'll have to figure out how to do (and if it is possible) the cross-process dispatch in JS.
Is iterating over the Realms easier with JS, or is it exactly the same as in C++?
(In reply to Marco Castelluccio [:marco] from comment #8)
> Is iterating over the Realms easier with JS, or is it exactly the same as in
> C++?

I do not think this is even possible in JS, and I have no idea how to do that in C++, except maybe looking at the GC code.  Maybe Jan would know better since he has been revisiting all these code path lately.
Flags: needinfo?(jdemooij)
You can iterate over all realms belonging to the current context/thread/runtime, like this:

    for (RealmsIter realm(cx->runtime()); !realm.done(); realm.next()) {
        ...
    }

However it sounds like you want to do this for all contexts, so including DOM workers. That's not possible in the JS engine without racy hacks, you'll want to trigger that in Gecko somewhere. I'd look at how the memory reporting code (for about:memory) does this and then do something similar - it probably uses runnables.
Flags: needinfo?(jdemooij)
Here's the code the memory reporter uses to gather memory stats for workers: https://searchfox.org/mozilla-central/rev/c296d5b2391c8b37374b118180b64cca66c0aa16/dom/workers/WorkerPrivate.cpp#1365.

Anyway, I think the first step would be to implement resetting/dumping for all contexts except DOM workers. Then DOM workers will be handled ina follow-up.
Attachment #9018285 - Flags: review?(nicolas.b.pierron)
Marco, can you describe the design of what you are looking for?

I looked at the Part 1 and I do not understand. LCov* data structures are made as a way to collect the actual counters at a given time, when sweeping ScriptSources, not to be a persistent structure.  Seeing a reset function on the LCovRealm, let me think that you want to make it persist, can you elaborate on the design direction?
Flags: needinfo?(mcastelluccio)
Yeah you're right, that was not needed. I should have finished the rest before asking for review :)
Attachment #9018285 - Attachment is obsolete: true
Attachment #9018285 - Flags: review?(nicolas.b.pierron)
Flags: needinfo?(mcastelluccio)
Attachment #9019288 - Flags: review?(nicolas.b.pierron)
There's a problem here. In some cases after flushing, there are some scripts that have coverage data but are not "complete", that is there's no top level script for them.

For example, say this is script.js:
function test() {
}

async function main() {
  await flushCounters();
  // Here the report will contain script.js and a 'test' FN entry.

  test();

  await flushCounters();
  // Here the report, probably depending on timing, can either contain script.js and a 'test' FN entry and a 'test' FNDA entry, or can contain nothing because there's a 'test' JSScript but no script.js top-level script.
}

At the first flushCounters we have a top-level JSScript for script.js. At the second flushCounters we might not have it.

One possible solution to this is removing the hasTopLevelScript_ check from CodeCoverage.cpp and let it report "incomplete" JSScripts and making GenerateLcovInfo generate data for all JSScripts and not only starting from top-level ones. I'm not sure if this is correct though, why was the top-level check there?

If you want to reproduce this locally I can upload a test to reproduce it.
Attachment #9019290 - Flags: feedback?(nicolas.b.pierron)
Attachment #9019288 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 9019290 [details] [diff] [review]
Part 2: Add JS API to reset script counts

Review of attachment 9019290 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/BytecodeUtil.cpp
@@ +3156,5 @@
> +JS_FRIEND_API(bool)
> +js::FlushCodeCoverage(JSContext* cx)
> +{
> +    for (RealmsIter realm(cx->runtime()); !realm.done(); realm.next()) {
> +        if (!GenerateLcovInfo(cx, realm, nullptr)) {

Question: My understanding is that every time this function is called, it would append the code coverage output into an existing file named after the process id. Thus to consume the data produced by this function, you would have to read this file right after.

It sounds to me that it might be better, in terms of API, if this function were to return the LCov output as a char* buffer, using the Sprinter class [1]. This way you have control over the storage and file manipulation, and you are free to name the files after the test cases.

Another solution would be to give a filename to FlushCodeCoverage, such that using Fprinter [1], it can dump the content into a specific file.

[1] https://searchfox.org/mozilla-central/source/js/src/vm/Printer.h
Attachment #9019290 - Flags: feedback?(nicolas.b.pierron) → feedback+
This is needed as the API to request the flushing will use the LCovSource objects stored in the Realm.
Attachment #9019290 - Attachment is obsolete: true
Attachment #9020705 - Flags: review?(nicolas.b.pierron)
This makes the API:
- Dump counters also for scripts that are not top-level, as sometimes the top-level can die before one of its children;
- Reset script counters after dumping them;
- Flush counters for all realms instead of just the one requesting it;
- Use the LCovRealm object stored in the Realm instead of creating a temporary one for the flushing, as this will allow us to use the scripts which have been recorded but have died before the flushing was requested.
Attachment #9020706 - Flags: review?(nicolas.b.pierron)
This is how the API is going to be used, not requesting review yet on this part as it might change depending on the other reviews.
These are tests which perform flushing and then parse and check the LCOV output, both with a single process and with multiple processes.
Attachment #9020708 - Flags: review?(nicolas.b.pierron)
Attachment #9020705 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 9020706 [details] [diff] [review]
Part 3: Add JS API to reset script counts

Review of attachment 9020706 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/BytecodeUtil.cpp
@@ +3068,5 @@
>      }
>  
> +    // Hold the scripts that we have already flushed, to avoid flushing them twice.
> +    typedef HashSet<JSScript*, DefaultHasher<JSScript*>, SystemAllocPolicy> JSScriptSet;
> +    JSScriptSet scriptsDone;

Asking jonco for confirmation, but I think you should probably do something along the lines of:

  Rooted<GCHashSet<JSScript*>> scriptDone(cx);
Attachment #9020706 - Flags: review?(nicolas.b.pierron)
Attachment #9020706 - Flags: review+
Attachment #9020706 - Flags: feedback?(jcoppeard)
Comment on attachment 9020708 [details] [diff] [review]
Part 5: Test JS counters are being flushed

Review of attachment 9020708 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/code-coverage/tests/xpcshell/test_basic.js
@@ +30,5 @@
> +  let fnRecords = first_flush_records.get("test_basic.js").filter(record => record.type == "FN");
> +  let fndaRecords = first_flush_records.get("test_basic.js").filter(record => record.type == "FNDA");
> +  Assert.ok(fnRecords.some(record => record.name == "top-level"));
> +  Assert.ok(fnRecords.some(record => record.name == "run_test"));
> +  Assert.ok(fnRecords.some(record => record.name == "test_code_coverage_func"));

nit: I would prefer if we do not have any expectations if any of the non-executed function are not listed here.

This might be true today, only because we have some work-around work to disable the lazy parsing feature when code coverage is enabled.

@@ +50,5 @@
> +  Assert.ok(fnRecords.some(record => record.name == "top-level"));
> +  Assert.ok(fnRecords.some(record => record.name == "run_test"));
> +  Assert.ok(fnRecords.some(record => record.name == "test_code_coverage_func"));
> +  Assert.ok(fndaRecords.some(record => record.name == "run_test" && record.hits == 0));
> +  Assert.ok(!fndaRecords.some(record => record.name == "run_test" && record.hits != 0));

This sounds like a miss-conception.

One would expect that all the calls of Assert.ok and test_code_coverage_func to be recorded here.
This value should change if you were to add a branch after "await codeCoverage.flushCounters();" calls.

SpiderMonkey count the entry of each sequence of code, inserting the following should split the code sequence, and add a new counter which should report "run_test" as having hit-counts.

  await codeCoverage.flushCounters();
  if (Math.random() >= 2) { Assert.ok(false); }
Attachment #9020708 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 9020706 [details] [diff] [review]
Part 3: Add JS API to reset script counts

Review of attachment 9020706 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/BytecodeUtil.cpp
@@ +3068,5 @@
>      }
>  
> +    // Hold the scripts that we have already flushed, to avoid flushing them twice.
> +    typedef HashSet<JSScript*, DefaultHasher<JSScript*>, SystemAllocPolicy> JSScriptSet;
> +    JSScriptSet scriptsDone;

Yes,

I didn't try it but I think this is what you want:

using ScriptSet = GCHashSet<JSScript*>;
Rooted<ScriptSet> scriptsDone(cx, ScriptSet(cx));
Attachment #9020706 - Flags: feedback?(jcoppeard) → feedback+
Attachment #9020707 - Flags: review?(nfroyd)
Carrying r+.
Attachment #9020706 - Attachment is obsolete: true
Attachment #9021195 - Flags: review+
Carrying r+. I've removed checks on non-executed function records, and made fndaRecord assertions that were checking run_test check another function which is not doing anything.
Attachment #9020708 - Attachment is obsolete: true
Attachment #9021196 - Flags: review+
Keywords: leave-open
Comment on attachment 9020707 [details] [diff] [review]
Part 4: Flush JS counters too when flushing GCOV counters

Review of attachment 9020707 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the changes below.

::: tools/code-coverage/CodeCoverageHandler.cpp
@@ +23,3 @@
>  
>  using namespace mozilla;
> +using namespace mozilla::dom;

Nit: I think we shouldn't use this namespace, and just write dom::AutoJSAPI when we need to.  Up to you.

@@ +44,5 @@
> +  if (outDir && *outDir != 0) {
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +    size_t length;
> +    char* result = js::GetCodeCoverageSummary(jsapi.cx(), &length);

It looks like you need to free this string when you're done with it.  (Alternatively, you can file a bug on getting GetCodeCoverageSummary to return a JSUniquePtr or similar.)

@@ +45,5 @@
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +    size_t length;
> +    char* result = js::GetCodeCoverageSummary(jsapi.cx(), &length);
> +    if (result != nullptr) {

!result, please.

@@ +74,1 @@
>    printf_stderr("[CodeCoverage] flush completed.\n");

Maybe we should move this up right after __gcov_flush(), so the logic of the JS code coverage dump can look like:

if (!outDir || *outDir == 0) {
  return;
}

...
result = ...
if (!result) {
  return;
}

...

which is a little nicer.  Throw in a "[CodeCoverage] JS counters flushed\n" after all that if you like.
Attachment #9020707 - Flags: review?(nfroyd) → review+
Attachment #9020707 - Attachment is obsolete: true
Blocks: 1503417
Keywords: leave-open
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/49ee481c49de
Add JS API to reset script counts. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/eec591369f86
Flush JS counters too when flushing GCOV counters. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/32f60b4620cb
Test JS counters are being flushed. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef709c413b3c
Make use of JS engine coverage flushing functionality to reset/dump counters in per-test mode. r=jmaher
You need to log in before you can comment on or make changes to this bug.