Closed Bug 1412729 Opened 7 years ago Closed 7 years ago

Reduce the size of GC Telemetry JSON

Categories

(Core :: JavaScript: GC, defect, P2)

55 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- wontfix
firefox59 --- fixed

People

(Reporter: pbone, Assigned: pbone)

References

Details

Attachments

(4 files, 2 obsolete files)

See bug 1386511 comment 33.  We could probably reduce the size of the JSON used in the GC telemetry data.  We can probably further improve the json format used for both telemetry and perf.html.
Priority: -- → P3
We should try to land this in 58. See Bug 1386511 Comment 38.
Assignee: nobody → pbone
Status: NEW → ASSIGNED
Priority: P3 → P2
Depends on: 1386511
Also once Bug 1298018 lands document lazy_capacity in the correct places.
Depends on: 1298018
Attachment #8927302 - Flags: review?(gfritzsche)
Attachment #8927304 - Flags: review?(sphink)
Attachment #8927302 - Flags: review?(gfritzsche) → review+
Comment on attachment 8927303 [details] [diff] [review]
Bug 1412729 (part 2) - Reduce the size of most GCMinor telemetry objects

Review of attachment 8927303 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Nursery.cpp
@@ +550,5 @@
> +    if (newCapacity != previousGC.nurseryCapacity)
> +        json.property("new_capacity", newCapacity);
> +    const size_t lazyCapacity = spaceToEnd(allocatedChunkCount());
> +    if (lazyCapacity != previousGC.nurseryCapacity)
> +        json.property("lazy_capacity", lazyCapacity);

I'm ok with this, but mostly because I haven't needed these values.

My concern in general is that the profiler only has a rolling window of minor GC markers, so if you need to infer a value from previous markers then you may not have the information you need. Plus, I tend to look at one marker at a time, and unless it's something I really need to know, I probably won't bother to extract it from previous ones.

I don't actually know what these values are; I assume they're from the lazy background chunk allocation? Would new_capacity and/or lazy_capacity commonly be the same as cur_capacity? Because if so, that seems like a nicer way of stripping this down -- omit new_capacity if it's the same as cur_capacity, and omit lazy_capacity if it's the same as new_capacity. You still have to infer things, but it's all local to one json blob.
Attachment #8927303 - Flags: review?(sphink) → review+
Comment on attachment 8927304 [details] [diff] [review]
Bug 1412729 (part 3) - Reduce the size of GCSlice markers

Review of attachment 8927304 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not going to miss the 'when' field, but 'pause'... I don't like not having that. That's the single most valuable piece of information in these.

I assume the idea is that 'pause' is redundant with end_timestamp - start_timestamp. In that case, can we drop end_timestamp instead of pause? (It's more bytes, too!)

Also, can you document the units of start_timestamp in main-ping.rst?

::: js/src/gc/Statistics.cpp
@@ +634,5 @@
> +    // If you change JSON properties here, please update:
> +    // Telemetry ping code: toolkit/components/telemetry/GCTelemetry.jsm
> +    // Telemetry documentation: toolkit/components/telemetry/docs/data/main-ping.rst
> +    // Telemetry tests: toolkit/components/telemetry/tests/browser/browser_TelemetryGC.js
> +    // Perf.html: https://github.com/devtools-html/perf.html

Now I'm scared of changing anything! :-) But thank you very much for these pointers; I didn't know about all this stuff.
Attachment #8927304 - Flags: review?(sphink) → review-
Comment on attachment 8927305 [details] [diff] [review]
Bug 1412729 (part 4) - Attempt to reduce the size of GCMajor markers

Review of attachment 8927305 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Statistics.cpp
@@ +627,5 @@
>      json.property("scc_sweep_total", sccTotal, JSONPrinter::MILLISECONDS);
>      json.property("scc_sweep_max_pause", sccLongest, JSONPrinter::MILLISECONDS);
>  
> +    if (nonincrementalReason_ != AbortReason::None)
> +        json.property("nonincremental_reason", ExplainAbortReason(nonincrementalReason_));

Nice, I've found it confusing to have this field present when it didn't apply.

@@ +634,5 @@
> +    if (addedChunks)
> +        json.property("added_chunks", addedChunks);
> +    uint32_t removedChunks = getCount(STAT_DESTROY_CHUNK);
> +    if (removedChunks)
> +        json.property("removed_chunks", removedChunks);

Heh. Almost makes you want to have a json.nonzeroProperty() or json.propertyIfNonzero(). But... no.
Attachment #8927305 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink] [:s:] from comment #8)
> Comment on attachment 8927303 [details] [diff] [review]
> Bug 1412729 (part 2) - Reduce the size of most GCMinor telemetry objects
> 
> Review of attachment 8927303 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/gc/Nursery.cpp
> @@ +550,5 @@
> > +    if (newCapacity != previousGC.nurseryCapacity)
> > +        json.property("new_capacity", newCapacity);
> > +    const size_t lazyCapacity = spaceToEnd(allocatedChunkCount());
> > +    if (lazyCapacity != previousGC.nurseryCapacity)
> > +        json.property("lazy_capacity", lazyCapacity);
> 
> I'm ok with this, but mostly because I haven't needed these values.
> 
> My concern in general is that the profiler only has a rolling window of
> minor GC markers, so if you need to infer a value from previous markers then
> you may not have the information you need. Plus, I tend to look at one
> marker at a time, and unless it's something I really need to know, I
> probably won't bother to extract it from previous ones.

I'm going to encourage more analysis/inference of things like this in the profiler.  Except that...

> I don't actually know what these values are; I assume they're from the lazy
> background chunk allocation? Would new_capacity and/or lazy_capacity
> commonly be the same as cur_capacity? Because if so, that seems like a nicer
> way of stripping this down -- omit new_capacity if it's the same as
> cur_capacity, and omit lazy_capacity if it's the same as new_capacity. You
> still have to infer things, but it's all local to one json blob.

This is exactly what this code does.  At this point in the collection previousGC has just completed, and the name (particularly when isolated in a diff) is misleading.  It is the most-recently-completed GC.  And so perviousGC.nurseryCapacity is the current nursery size.  newCapacity is what it will be enlarged to.

Yes lazy_capacity is from the lazy chunk allocation change.  But new_capacity was there before and refers to the resize decision made at the end of the most recent collection.

Thanks for the review.
(In reply to Steve Fink [:sfink] [:s:] from comment #9)
> Comment on attachment 8927304 [details] [diff] [review]
> Bug 1412729 (part 3) - Reduce the size of GCSlice markers
> 
> Review of attachment 8927304 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not going to miss the 'when' field, but 'pause'... I don't like not
> having that. That's the single most valuable piece of information in these.
> 
> I assume the idea is that 'pause' is redundant with end_timestamp -
> start_timestamp. In that case, can we drop end_timestamp instead of pause?
> (It's more bytes, too!)

Yes and yes.  I'll make that change.

> Also, can you document the units of start_timestamp in main-ping.rst?

Okay.

> ::: js/src/gc/Statistics.cpp
> @@ +634,5 @@
> > +    // If you change JSON properties here, please update:
> > +    // Telemetry ping code: toolkit/components/telemetry/GCTelemetry.jsm
> > +    // Telemetry documentation: toolkit/components/telemetry/docs/data/main-ping.rst
> > +    // Telemetry tests: toolkit/components/telemetry/tests/browser/browser_TelemetryGC.js
> > +    // Perf.html: https://github.com/devtools-html/perf.html
> 
> Now I'm scared of changing anything! :-) But thank you very much for these
> pointers; I didn't know about all this stuff.

You're welcome, it's as much for future-me as anyone else.
Updated based on review feedback. Thanks.
Attachment #8927304 - Attachment is obsolete: true
Attachment #8927630 - Flags: review?(sphink)
Attachment #8927630 - Flags: review?(sphink) → review+
Blocks: 1417315
Comment on attachment 8927305 [details] [diff] [review]
Bug 1412729 (part 4) - Attempt to reduce the size of GCMajor markers

Review of attachment 8927305 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Statistics.cpp
@@ +604,5 @@
>      TimeDuration total, longest;
>      gcDuration(&total, &longest);
>      json.property("max_pause", longest, JSONPrinter::MILLISECONDS);
>      json.property("total_time", total, JSONPrinter::MILLISECONDS);
> +    // We might be able to omit reason if perf.html was able to retrive it

minor typo: retrieve :)
Rebased patch so it applies cleanly. r+ carried forward.
Attachment #8927303 - Attachment is obsolete: true
Attachment #8929936 - Flags: review+
Keywords: checkin-needed
Pushed by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8fa1f53e62a2
(part 1) - Update out-of-date comments r=gfritzsche
https://hg.mozilla.org/integration/mozilla-inbound/rev/1b2ced05f1f4
(part 2) - Reduce the size of most GCMinor telemetry objects r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/8ce97fd74941
(part 3) - Reduce the size of GCSlice markers r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/fece9cf66e86
(part 4) - Attempt to reduce the size of GCMajor markers r=sfink
Keywords: checkin-needed
If you still want this on 58 please request uplift when you get a chance.
Flags: needinfo?(pbone)
Comment on attachment 8929936 [details] [diff] [review]
Bug 1412729 (part 2) - Reduce the size of most GCMinor telemetry objects r=sfink

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1298018
[User impact if declined]: Slightly increased bandwidth usage for telemetry.
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: I Don't think so
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: No.
[Why is the change risky/not risky?]: No impact on users, tested well in Nightly.
[String changes made/needed]:
Flags: needinfo?(pbone)
Attachment #8929936 - Flags: approval-mozilla-beta?
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Comment on attachment 8927630 [details] [diff] [review]
Bug 1412729 (part 3) - Reduce the size of GCSlice markers

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1386511
[User impact if declined]: Slightly increased telemetry bandwidth usage
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: No
[Why is the change risky/not risky?]: No direct impact on users, tested well in Nightly.
[String changes made/needed]:
Attachment #8927630 - Flags: approval-mozilla-beta?
Comment on attachment 8927305 [details] [diff] [review]
Bug 1412729 (part 4) - Attempt to reduce the size of GCMajor markers

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1386511
[User impact if declined]: Slightly increased bandwidth usage for telemetry
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: Don't think so
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: No
[Why is the change risky/not risky?]: Tested for some time in Nightly, No direct impact on users.
[String changes made/needed]:
Attachment #8927305 - Flags: approval-mozilla-beta?
Thanks jcristau,

I forgot that uplift was an option for these, thanks for the reminder.
Hi :pbone,
I don't see the major impact on users. Can we let it ride the train?
Flags: needinfo?(pbone)
That's fine with me, I'm fairly ambivilent. jcristau suggested that we might like to uplift it. I think they imagined that bug 1386511 had a larger impact on telemetry size.
Flags: needinfo?(pbone) → needinfo?(jcristau)
The reason I thought you wanted this in 58 was comment 1.
Flags: needinfo?(jcristau)
Ah, my comment was based on a some of the code review by gfritzsche on bug 1386511 comment 32. and gfritzsche later suggested that it should land on the same train if possible. But that wasn't a strong comment either, and Georg is now on leave so I agree with "wontfix".
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Keywords: leave-open
Resolution: --- → FIXED
Attachment #8927305 - Flags: approval-mozilla-beta?
Attachment #8927630 - Flags: approval-mozilla-beta?
Attachment #8929936 - Flags: approval-mozilla-beta?
Blocks: 1424760
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: