Closed Bug 1019611 Opened 7 years ago Closed 6 years ago

Add a way to get the max GC pause during a given period of time

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: avih, Assigned: terrence)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

Description is identical as the cloned bug, only this one's about GC rather than CC. Would this kind of interface work for GC?

+++ This bug was initially created as a clone of Bug #1019101 +++

It would be nice to know how much mainthread animations are interrupted by the CC.  This has to be callable from JS.  Avi and I discussed this, and I think an interface that would work would be to have two methods (on Component.utils or something), something like startRecordingPauses() and getMaxPause().

startRecordingPauses() could just clear a maxRecordingPause variable in nsJSEnvironment.  Whenever we compute a slice pause, just set maxRecordingPause to the new value if it is larger.  Then stopRecording() would return that value.
Bug 1019101 (same as this but for CC) is in inbound with the following API:

void clearMaxCCTime();
long getMaxCCSliceTimeSinceClear();

To keep the implementation simple, this pair is not reentrant and therefore only supports a single call site calling clear and then get, but it can be easily wrapped to support multiple call sites with partially overlapping clear/get pairs.
(In reply to Avi Halachmi (:avih) from comment #1)
? ... but it can be
> easily wrapped to support multiple call sites with partially overlapping
> clear/get pairs.

Turns out that such wrapper could use a more atomic getMax + clear sequence. I suggested that the clear function also returns the previous max value, such that we won't get an unlucky max value which hides/happened just between the calls to getMax and clear.
Attached patch expose_max_gc_pause-v0.diff (obsolete) — Splinter Review
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Attachment #8455564 - Flags: review?(jcoppeard)
Thanks. Will this be accessible from JS code?
Comment on attachment 8455564 [details] [diff] [review]
expose_max_gc_pause-v0.diff

Review of attachment 8455564 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: js/public/GCAPI.h
@@ +268,5 @@
> +/* Return the largest GC pause since ClearMaxGCPauseAccumulator was called. */
> +extern JS_FRIEND_API(int64_t)
> +GetMaxGCPauseSinceClear(JSRuntime *rt);
> +
> +/* Clear the GC pause accumulator. */

Can we comment that it returns the previous value too?
Attachment #8455564 - Flags: review?(jcoppeard) → review+
Comment on attachment 8455564 [details] [diff] [review]
expose_max_gc_pause-v0.diff

Review of attachment 8455564 [details] [diff] [review]:
-----------------------------------------------------------------

Apologies for maybe not mentioning this earlier, but like with the CC measurements API, it'll be much more useful if it's accessible via JS.
Attachment #8455564 - Flags: feedback-
The attached patch should make the two functions:

getMaxGCPauseSinceClear() -> Number
clearMaxGCPauseAccumulator() -> Number

available on chrome globals.

Could you build with this patch applied and check that it does what you need?
Attachment #8455564 - Attachment is obsolete: true
Attachment #8456468 - Flags: feedback?(avihpit)
The patch doesn't apply cleanly to m-c, but regardless, I'm unable to compile recent m-c due to bug 1037667. Hopefully it'll get fixed in a day or two.

So I reverted to m-c from 2014-07-02 which I can compile - the patch also doesn't apply cleanly to this version (different minor failure than when applying it to latest m-c).

I fixed it and applied all of it, but then it failed to compile with "mozmake.EXE[4]: *** [js/src/compile] Error 2", which I'm guessing is due to the two weeks old changeset I'm using (so I don't see a point in trying to trace this failure).

So the patch looks good, but I can't test it yet.

Is exposing the functions as chrome globals better/worse than exposing it at Component.utils like bug 1019101 does for GC?
(In reply to Avi Halachmi (:avih) from comment #8)
> The patch doesn't apply cleanly to m-c, but regardless, I'm unable to
> compile recent m-c due to bug 1037667. Hopefully it'll get fixed in a day or
> two.

My qparent is m-i: 67373fe39c53, so will probably apply fine tomorrow.
 
> Is exposing the functions as chrome globals better/worse than exposing it at
> Component.utils like bug 1019101 does for GC?

Probably worse, but it requires a lot less plumbing and no new JSAPI.
I worked around bug 1037667 and made sure it builds cleanly.

Your patch then applied cleanly to m-c from today (qparent 194796:330ba968ed61).

It still fails to compile with your patch.

Would appreciate if you could test it with latest m-c and confirm that it should work.
Comment on attachment 8456468 [details] [diff] [review]
expose_max_gc_pause-v1.diff

Review of attachment 8456468 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Terrence Cole [:terrence] from comment #7)
> Created attachment 8456468 [details] [diff] [review]
> expose_max_gc_pause-v1.diff
> 
> The attached patch should make the two functions:
> 
> getMaxGCPauseSinceClear() -> Number
> clearMaxGCPauseAccumulator() -> Number
> 
> available on chrome globals.
> 
> Could you build with this patch applied and check that it does what you need?

I still can't compile this patch with today's m-c. It still _might_ be an issue on my side, possibly at my mozconfig, but latest m-c builds just fine without this patch.

Feedback- for now. I'd appreciate some help with building with the patch and then I'll revisit the feedback.
Attachment #8456468 - Flags: feedback?(avihpit) → feedback-
avih - sorry, got pulled away. Yes, the build completed for me, same version. I'm guessing it's a Windows thing. The error I see in your log is:

 5:42.71 e:/dev/moz/src/central/js/src\builtin/Profilers.cpp(337) : error C2039: 'runtimeFromMainThread' : is not a member of 'JSContext'
 5:42.71         e:/dev/moz/src/central/js/src\jscntxt.h(394) : see declaration of 'JSContext'
 5:42.71 e:/dev/moz/src/central/js/src\builtin/Profilers.cpp(337) : error C2227: left of '->gc' must point to class/struct/union/generic type
 5:42.71 e:/dev/moz/src/central/js/src\builtin/Profilers.cpp(337) : error C2228: left of '.stats' must have class/struct/union
 5:42.71 e:/dev/moz/src/central/js/src\builtin/Profilers.cpp(337) : error C2228: left of '.getMaxGCPauseSinceClear' must have class/struct/union
 5:42.71 e:/dev/moz/src/central/js/src\builtin/Profilers.cpp(337) : fatal error C1903: unable to recover from previous error(s); stopping compilation
Assignee: terrence → sphink
Comment on attachment 8460641 [details] [diff] [review]
Expose the GC's max-pause statistic to chrome JS; NOT_REVIEWED

Review of attachment 8460641 [details] [diff] [review]:
-----------------------------------------------------------------

This compiles, thanks.

I did try it a bit, and noticed it returns values like 24770 or 39070. I'm guessing it's microseconds? Might be useful to add a comment about the units.
Attachment #8460641 - Flags: feedback+
(In reply to Avi Halachmi (:avih) from comment #14)
> I did try it a bit, and noticed it returns values like 24770 or 39070. I'm
> guessing it's microseconds? Might be useful to add a comment about the units.

?
Flags: needinfo?(sphink)
Oops, bzexport assigned this to me. I'm leaving on vacation; you'd best continue with terrence.
Assignee: sphink → terrence
Flags: needinfo?(sphink) → needinfo?(terrence)
Yes, will do.

(In reply to Avi Halachmi (:avih) from comment #15)
> (In reply to Avi Halachmi (:avih) from comment #14)
> > I did try it a bit, and noticed it returns values like 24770 or 39070. I'm
> > guessing it's microseconds? Might be useful to add a comment about the units.
> 
> ?

Yes, it is microseconds.
Flags: needinfo?(terrence)
Is it waiting for something before it can land?
Avi, do you prefer microseconds or milliseconds for this?  It seems like the GC and CC interfaces should be consistent, or at least, not inconsistent for no reason.
(In reply to Andrew McCreight [:mccr8] from comment #19)
> Avi, do you prefer microseconds or milliseconds for this?  It seems like the
> GC and CC interfaces should be consistent, or at least, not inconsistent for
> no reason.

I don't have a strong preference either way - as long as the resolution results in reasonable accuracy and as long as it's documented. I also don't have a strong preference that all measurements use the same units, thought it could be nice if they did.

I've seen sub 20ms GC measurements (with the patch from comment 13), so ms resolution would give us with 5% or worse accuracy (or better in case of longer lags). Not terrible but not too great either, so I think microseconds makes sense here.

From my experiments with your CC measurements patch at bug 1019101, I noticed numbers between 50ms - 100+ ms IIRC (but I didn't experiment much), so ms resolution is about 2% or better accuracy - probably good enough.

Once we get things working, we could refine them at later stages IMO. For now, what we have from CC/GC (assuming the patch from comment 13 lands) is good enough IMO.
Ok, great.  Let me know if not having bug 1046271 gets in the way and I'll up my priority on fixing it.
(In reply to Avi Halachmi (:avih) from comment #18)
> Is it waiting for something before it can land?

Terrence?
Flags: needinfo?(terrence)
See Also: → GC.60fps
Attachment #8460641 - Flags: review?(sphink)
Flags: needinfo?(terrence)
Comment on attachment 8460641 [details] [diff] [review]
Expose the GC's max-pause statistic to chrome JS; NOT_REVIEWED

Review of attachment 8460641 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.h
@@ +2085,5 @@
>  
>      /* Amount of bytes allocated by the GC. */
>      JSGC_BYTES = 3,
>  
> +    /* Number of times GC has been invoked. */

As long as you're updating the comment... does this include minor GCs?
Attachment #8460641 - Flags: review?(sphink) → review+
What's "everything"? Any plans to unbitrot/fix and land?
Avi: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=cf159b0581df has results: 

Build errors on all opt builds:

/builds/slave/m-in-l64-000000000000000000000/build/js/src/builtin/Profilers.cpp:337:26: error: 'struct JSContext' has no member named 'runtimeFromMainThread'

Mochitest-1 error on a debug build:
TEST-UNEXPECTED-FAIL | leakcheck | 9508 bytes leaked (AsyncTransactionTrackersHolder, BufferRecycleBin, CompositableClient, CondVar, GfxTextureWasteTracker, ...)

tcole, do you have an ETA for when this can be fixed/reland? It's blocking the bigger project that avi is trying to measure.
Flags: needinfo?(terrence)
Ah, I didn't know it had turned into a blocker! I'll try to figure out what's going on today -- the error is totally ridiculous and I haven't been able to repro in my local builds.
Flags: needinfo?(terrence)
I think I found it. Somehow I was building without --enable-profiling.

https://tbpl.mozilla.org/?tree=Try&rev=5bb349a15a7c
https://hg.mozilla.org/mozilla-central/rev/e81f72f0710f
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.