Add telemetry to track the longest leaf phase in GC's that are catastrophically slow

RESOLVED FIXED in Firefox 41

Status

()

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: terrence, Assigned: terrence)

Tracking

(Blocks 1 bug)

Trunk
mozilla41
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox41 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

This will tell us if the majority of our catastrophically long max-pause GC's (e.g. >400-500ms) are caused by some specific thing(s) and help us narrow down on further diagnostics.
Blocks: GC.60fps
No longer blocks: GC.performance
Assignee: nobody → terrence
Blocks: 1164973
No longer blocks: GC.60fps
If we blow out the slice budget by more than 2x, record the longest phase. In theory we could get a bunch of evenly distributed phases that add up to a blown budget, but that seems unlikely to get to 2x, so this is probably sufficient.
Attachment #8615549 - Flags: review?(sphink)
Comment on attachment 8615549 [details] [diff] [review]
telemetry_record_long_phases-v0.diff

Review of attachment 8615549 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm... I think we need to discuss this. The phase ordering is semantically meaningful, so you can't just append phases to the end. Which means that the meaning of these telemetry values is going to be very version-dependent. Are we ok with that? Or should we add a mapping layer?
Comment on attachment 8615549 [details] [diff] [review]
telemetry_record_long_phases-v0.diff

Review of attachment 8615549 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing review for now, since it sounds like we want a version with a mapping table.
Attachment #8615549 - Flags: review?(sphink)
I made a big 'ol table for this, then realized that we already have a table that maps from Phase to some static data: phases. So I ended up just tacking telemetryBucket onto the end of that struct. This puts the mapping table inline and is, actually, much easier to read.
Attachment #8615549 - Attachment is obsolete: true
Attachment #8622565 - Flags: review?(sphink)
Comment on attachment 8622565 [details] [diff] [review]
telemetry_record_long_phases-v3.diff

Review of attachment 8622565 [details] [diff] [review]:
-----------------------------------------------------------------

Good idea, I like having it mixed in instead of a separate table.

::: js/src/gc/Statistics.cpp
@@ +982,5 @@
>          if (slices.back().budget.isTimeBudget()) {
>              int64_t budget = slices.back().budget.timeBudget.budget;
>              runtime->addTelemetry(JS_TELEMETRY_GC_BUDGET_MS, t(budget));
>              if (budget == runtime->gc.defaultSliceBudget())
>                  runtime->addTelemetry(JS_TELEMETRY_GC_ANIMATION_MS, t(sliceTime));

Rereading this makes me think it's a little funky for embeddings, especially those without any concept of animation, but oh well.

@@ +984,5 @@
>              runtime->addTelemetry(JS_TELEMETRY_GC_BUDGET_MS, t(budget));
>              if (budget == runtime->gc.defaultSliceBudget())
>                  runtime->addTelemetry(JS_TELEMETRY_GC_ANIMATION_MS, t(sliceTime));
> +
> +            if (sliceTime > 2 * budget * 1000) {

I sort of want this to be sliceTime_ms > 2 * budget_sec * 1000, or sliceTime > 2 * MsPerSec(budget), or whatever, but never mind -- instead, can you just comment this with something like "record telemetry for any slice that goes over 2x the budget"?

This does make me realize that we will be lumping together the slow phases for 10ms slices with the slow phases for 40ms slices, but I guess that's fine.

::: js/src/gc/Statistics.h
@@ +24,5 @@
>  class GCParallelTask;
>  
>  namespace gcstats {
>  
> +enum Phase : uint8_t {

640K phases is enough for anyone.
Attachment #8622565 - Flags: review?(sphink) → review+
Landed on mozilla-central this morning but didn't got marked for some reason.
https://hg.mozilla.org/mozilla-central/rev/f131bd3a61c1
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.