Add a GC mark rate telemetry probe

RESOLVED FIXED in Firefox 66

Status

()

enhancement
RESOLVED FIXED
11 months ago
5 months ago

People

(Reporter: pbone, Assigned: allstars.chh)

Tracking

61 Branch
mozilla66
Points:
---

Firefox Tracking Flags

(firefox66 fixed)

Details

Attachments

(4 attachments, 4 obsolete attachments)

Reporter

Description

11 months ago
It'd be good to see how many objects the GC can mark per second in telemetry:

This can help us measure the impact of changes we make to the marking code and is also useful for Bug 1475822 as a measurement of memory bandwidth and latency.
Reporter

Comment 1

6 months ago
This refers to the marking code called in the main part of the GC:

Most code paths I can find go through:
https://searchfox.org/mozilla-central/source/js/src/gc/GC.cpp#6254

But there may be others, and it mightn't count marking objects directly referenced by roots and such, I'm not sure if that's going to be a problem

All this code uses the Marker class, so I'd add the counting code there:

https://searchfox.org/mozilla-central/source/js/src/gc/GCMarker.h#234

Or in JSTracer if we want it to be more general.  In that case we probably want the timing code in incrementalSlice():

https://searchfox.org/mozilla-central/source/js/src/gc/GC.cpp#7437
We calcuate the total mark time in Statistics::endGC() - see 'markTotal'.  We just need a count of objects marked.  The GCMarker::mark() method would be a good place to put this.
Assignee: nobody → allstars.chh
Status: NEW → ASSIGNED
Posted patch Patch (obsolete) — Splinter Review
Jonco, can you check this is what you have in mind?
I'll upload another data review request when I send r?.
Attachment #9026098 - Flags: feedback?(jcoppeard)
Comment on attachment 9026098 [details] [diff] [review]
Patch

Review of attachment 9026098 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just a few minor comments.

::: js/src/gc/Marking.cpp
@@ +954,3 @@
>      TenuredCell* cell = TenuredCell::fromPointer(thing);
> +    MarkColor color = TypeParticipatesInCC<T>::value ?
> +                          markColor() : MarkColor::Black;

nit: this should either be all on one line, or like this:

TypeParticipatesInCC<T>::value ? markColor()
                               : MarkColor::Black;

@@ +957,3 @@
>  
> +    if (color == MarkColor::Black) {
> +        runtime()->gc.stats().markCount++;

Do we only want to count black marking?  I'd say always increment this.

::: js/src/gc/Statistics.h
@@ +298,5 @@
>      void writeLogMessage(const char* fmt, ...) { };
>  #endif
>  
> +    // The count of marked objects during GC.
> +    uint32_t markCount;

We want this change to have the minimum effect on performance so I think this should live in GCMarker, to save the indirection when incrementing it.

Also, this should probably be a size_t.

::: toolkit/components/telemetry/Histograms.json
@@ +831,5 @@
> +    "record_in_processes": ["main", "content"],
> +    "alert_emails": ["dev-telemetry-gc-alerts@mozilla.org", "jcoppeard@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "linear",
> +    "high": 10000,

Did you test this in a browser?  What kind of results do we get?  I don't have much of a feel for what the limit should be.

@@ +834,5 @@
> +    "kind": "linear",
> +    "high": 10000,
> +    "n_buckets": 100,
> +    "bug_numbers": [1475896],
> +    "description": "The rate of marking objects per ms during GC."

nit: should be "The number of objects marked per ms during GC"
Attachment #9026098 - Flags: feedback?(jcoppeard) → feedback+
(In reply to Jon Coppeard (:jonco) from comment #4)
> ::: toolkit/components/telemetry/Histograms.json
> @@ +831,5 @@
> > +    "record_in_processes": ["main", "content"],
> > +    "alert_emails": ["dev-telemetry-gc-alerts@mozilla.org", "jcoppeard@mozilla.com"],
> > +    "expires_in_version": "never",
> > +    "kind": "linear",
> > +    "high": 10000,
> 
> Did you test this in a browser?  What kind of results do we get?  I don't
> have much of a feel for what the limit should be.
> 
Now with your comments addressed (inc for marking gray and black), the rate goes from 600 to 1200. average is 1400.
I'll try to run the browser longer and have a more accurate high number.
Posted patch Patch. v2 (obsolete) — Splinter Review
Updated patch.
I've lowered the 'high' in Histograms to 3000, as I found on my old Linux laptop the average rate is like 3~500. 
I've requested a Windows laptop to see the score on Windows.
Attachment #9026098 - Attachment is obsolete: true
Posted file Data review request (obsolete) —
Hi Jonco, can you help to check this data collection review request?

Thanks
Attachment #9026384 - Flags: feedback?(jcoppeard)
Comment on attachment 9026384 [details]
Data review request

>1) What questions will you answer with this data?

"We will use this to measure improvements to the mark rate (e.g. for bug 721236) and catch regressions."

>2) Why does Mozilla need to answer these questions?  Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:
>- To ensure the GC is working efficiently.
>- To tune and improve GC memory usage and performance

This one is really about performance only.

>4) Can current instrumentation answer these questions?
>- No, current instrumentation only keeps track of the marking rate.

You mean *does not* keep track of the marking rate.
Attachment #9026384 - Flags: feedback?(jcoppeard) → feedback+
(In reply to Jon Coppeard (:jonco) from comment #8)
> 
> >4) Can current instrumentation answer these questions?
> >- No, current instrumentation only keeps track of the marking rate.
> 
> You mean *does not* keep track of the marking rate.
Ah, originally what I have in mind is 'only keep track of the marking *_time_*', but somehow my fingers think differently. :P

Thanks for catching that.
Attachment #9026384 - Attachment is obsolete: true
Attachment #9026403 - Flags: feedback+
Comment on attachment 9026348 [details] [diff] [review]
Patch. v2

Review of attachment 9026348 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +831,5 @@
> +    "record_in_processes": ["main", "content"],
> +    "alert_emails": ["dev-telemetry-gc-alerts@mozilla.org", "jcoppeard@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "linear",
> +    "high": 3000,

I'd make this higher.  We want this number to increase after all.  I think 10000 was fine as it was.
Posted patch Patch. v2.1 (obsolete) — Splinter Review
change 'high' in Histogram to 10000.
Attachment #9026348 - Attachment is obsolete: true
Attachment #9026898 - Flags: review?(jcoppeard)
Attachment #9026403 - Flags: review?(chutten)
Attachment #9026898 - Flags: review?(jcoppeard) → review+
Comment on attachment 9026403 [details]
Data review request v2

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? 

Yes. As this collection is Telemetry it is documented in its definition file (Histograms.json), in the Probe Dictionary, and in the telemetry.mozilla.org Measurement Dashboard.

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. As this collection is Telemetry it can be controlled in Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes. :jcoppeard.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

    Is the data collection request for default-on or default-off?

Default on, all channels.

    Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No.

    Is the data collection covered by the existing Firefox privacy notice?

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.

---
Result: datareview+
Attachment #9026403 - Flags: review?(chutten) → review+
Comment on attachment 9026898 [details] [diff] [review]
Patch. v2.1

Review of attachment 9026898 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +835,5 @@
> +    "high": 10000,
> +    "n_buckets": 100,
> +    "bug_numbers": [1475896],
> +    "description": "The number of objects marked per ms during GC."
> +  },

I have some drive-by comments about this collection.

You may wish to consider using kind: exponential so that lower buckets have higher resolution. You can see a preview of what the bucket assignments will be by using the Histogram Simulator[1].

You may also wish to calculate the rate at the time of marking. A large number of marked objects might not correspond to a large duration in MARK_MS. If you are interested in the rate, would it not make more sense to record the objects per millisecond (or some other rate) instead of the number of objects?

[1]: https://telemetry.mozilla.org/histogram-simulator/#low=1&high=10000&n_buckets=100&kind=linear&generate=normal
(In reply to Chris H-C :chutten from comment #14)
> You may also wish to calculate the rate at the time of marking. A large
> number of marked objects might not correspond to a large duration in
> MARK_MS. If you are interested in the rate, would it not make more sense to
> record the objects per millisecond (or some other rate) instead of the
> number of objects?
> 


Wow, apparently I cannot read to the end of the description where it says quite clearly "per ms during GC". I apologize for that.
chutten suggested me could add a low in histograms
https://mozilla.logbot.info/developers/20181123#c15648650

I'll try this telemetry on some general laptop to see if we should have a low here.
My previous numbers for marking rate is from my *debug* build, so the number is much smaller.

Just try a window opt build from [1], the number is quite high, the average is 223,790.
perhaps we should redefine the unit? or change the high and low values.


[1] https://queue.taskcluster.net/v1/task/dXweKvg-Rc-Q-HIfkxo9bQ/runs/0/artifacts/public/build/install/sea/target.installer.exe
(In reply to Yoshi Huang [:allstars.chh] from comment #17)
Ah yes, in that case we do want to change the range.  Maybe make the range 1000 to 1,000,000 to give us enough headroom, and make this exponential?
(In reply to Jon Coppeard (:jonco) from comment #18)
> (In reply to Yoshi Huang [:allstars.chh] from comment #17)
> Ah yes, in that case we do want to change the range.  Maybe make the range
> 1000 to 1,000,000 to give us enough headroom, and make this exponential?
Yeah, this is a good idea, will try this. Thanks
(In reply to Jon Coppeard (:jonco) from comment #18)
> (In reply to Yoshi Huang [:allstars.chh] from comment #17)
> Ah yes, in that case we do want to change the range.  Maybe make the range
> 1000 to 1,000,000 to give us enough headroom, and make this exponential?

Hi Jonco
I generated two builds to show the result of kind is "linear" and kind is "exponential".

But for the exponential one, from the documentation, 
https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/collection/histograms.html#exponential

"An exponential histogram fits this requirement since it has “narrow” buckets near the minimum value and significantly “wider” buckets near the maximum value."

But in our case, we want the opposite way -> "wider" buckets near minimum, and "narrow" buckets near maximum.
I am not sure about the benefits for using 'exponential' as kind.

Or should we just use linear?

Thanks
(In reply to Yoshi Cheng-Hao Huang [:allstars.chh] from comment #22)
Looking at those graphs I'm surprised at the high variation.

I'm not sure what I expected from the exponential histogram.  Let's make this linear then.
Posted patch Patch. v3.Splinter Review
change low to 1000, high to 1000000, and kind is still linear.
Attachment #9026898 - Attachment is obsolete: true
Attachment #9031451 - Flags: review+

Comment 25

5 months ago
Pushed by allstars.chh@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9670d2fdc166
Add telemetry for GC marking rate. r=jonco, data-review=chutten

Comment 26

5 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/9670d2fdc166
Status: ASSIGNED → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in before you can comment on or make changes to this bug.