Closed Bug 1204994 Opened 9 years ago Closed 9 years ago

make use counters more efficient to send to/process on the telemetry server

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla43

Tracking Flags:

Tracking

Status

firefox43

---

fixed

People

(Reporter: froydnj, Assigned: froydnj)

References

Details

Attachments

(5 files)

part 0 - separate debugging use counters check from value check 9 years ago Nathan Froyd [:froydnj] 2.31 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms 9 years ago Nathan Froyd [:froydnj] 3.29 KB, patch	bzbarsky : review+ vladan : review+	Details \| Diff \| Splinter Review
part 2 - rename all USE_COUNTER_* histograms to USE_COUNTER2_* histograms 9 years ago Nathan Froyd [:froydnj] 6.75 KB, patch	bzbarsky : review+ vladan : review+	Details \| Diff \| Splinter Review
part 2b - move histogram accumulation around in nsDocument::ReportUseCounters 9 years ago Nathan Froyd [:froydnj] 1.42 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
part 3 - update use counter tests for new CONTENT_DOCUMENT histograms 9 years ago Nathan Froyd [:froydnj] 7.90 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review

Nathan Froyd [:froydnj]

Assignee

Description

•

9 years ago

For a given web feature F, the current implementation of use counters declares a boolean histogram for F, whose buckets indicate whether a given document used F or not. (I'm ignoring the "top-level" document histograms here and below for clarity of presentation.) So the percentage of documents on which F is used is: # of "true" values recorded in F's histogram / # of total values recorded in F's histogram One problem with this scheme is that for infrequently used features, we're always accumulating data in histograms, sending those histograms over the wire, and storing those histograms on the telemetry server. It would be better to only pay for transmission and storage of those features which actually get used. This bug is for transitioning to a new scheme, where we have a single histogram denoting how many content documents have been visited/destroyed. Use counters are then mere "count" histograms, which simply record the number of times a given feature was used in a document. So the percentage of documents that use a feature F is now: # of values in F's histogram / # of documents seen bsmedberg, this is somewhat akin to what you had suggested, yes? Roberto, do you think this scheme will help the server-side storage requirements?

Flags: needinfo?(rvitillo)

Flags: needinfo?(benjamin)

Nathan Froyd [:froydnj]

Assignee

Comment 1

•

9 years ago

Attached patch part 0 - separate debugging use counters check from value check — Details — Splinter Review

This change is only here to make future changes more obvious. It is a no-op on its own: just adding a block and shifting whitespace around.

Attachment #8661375 - Flags: review?(bzbarsky)

Nathan Froyd [:froydnj]

Assignee

Comment 2

•

9 years ago

Attached patch part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms — Details — Splinter Review

These histograms will be used in conjunction with the revised use counter histograms to ascertain the number of pages that don't use particular features. The associated comment in nsDocument.cpp explains how things work. bz for the dom/, vladan for telemetry review.

Attachment #8661376 - Flags: review?(vladan.bugzilla)

Attachment #8661376 - Flags: review?(bzbarsky)

Nathan Froyd [:froydnj]

Assignee

Comment 3

•

9 years ago

Attached patch part 2 - rename all USE_COUNTER_* histograms to USE_COUNTER2_* histograms — Details — Splinter Review

We're changing the definition of each of the use counter histograms. Therefore, they need new names, so as to not throw wrenches into the server-side machinery. This renaming is the most straightforward way to do things and similar to how we have renamed other histograms before. bz for the dom/ changes, vladan for histogram review, etc. (We do need to change how we accumulate values into the histograms; I separated that part out into its own commit and will squash prior to landing.)

Attachment #8661378 - Flags: review?(vladan.bugzilla)

Attachment #8661378 - Flags: review?(bzbarsky)

Nathan Froyd [:froydnj]

Assignee

Comment 4

•

9 years ago

Attached patch part 2b - move histogram accumulation around in nsDocument::ReportUseCounters — Details — Splinter Review

Since we no longer have boolean histograms, we need to only accumulate into the histograms when a particular use counter has actually been used.

Attachment #8661379 - Flags: review?(bzbarsky)

Vladan Djeric (:vladan)

Comment 5

•

9 years ago

Comment on attachment 8661376 [details] [diff] [review] part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms Review of attachment 8661376 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/telemetry/Histograms.json @@ +9147,5 @@ > + "expires_in_version": "never", > + "kind": "count", > + "description": "Number of content documents destroyed; used in conjunction with use counter histograms" > + }, > + "TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED": { Will this be different from FX_PAGE_LOAD_MS? @@ +9149,5 @@ > + "description": "Number of content documents destroyed; used in conjunction with use counter histograms" > + }, > + "TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED": { > + "expires_in_version": "never", > + "kind": "count", add alert_emails fields so you or your team can be notified in case the measurements somehow get broken and the distribution of data changes suddenly

Attachment #8661376 - Flags: review?(vladan.bugzilla)

Vladan Djeric (:vladan)

Updated

•

9 years ago

Attachment #8661378 - Flags: review?(vladan.bugzilla) → review+

Nathan Froyd [:froydnj]

Assignee

Comment 6

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #5) > Comment on attachment 8661376 [details] [diff] [review] > part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms > > Review of attachment 8661376 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: toolkit/components/telemetry/Histograms.json > @@ +9147,5 @@ > > + "expires_in_version": "never", > > + "kind": "count", > > + "description": "Number of content documents destroyed; used in conjunction with use counter histograms" > > + }, > > + "TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED": { > > Will this be different from FX_PAGE_LOAD_MS? Yes, since FX_PAGE_LOAD_MS is the load time for a page, and this histogram is (effectively) the number of documents the browser loads. Very different units, for one thing. > @@ +9149,5 @@ > > + "description": "Number of content documents destroyed; used in conjunction with use counter histograms" > > + }, > > + "TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED": { > > + "expires_in_version": "never", > > + "kind": "count", > > add alert_emails fields so you or your team can be notified in case the > measurements somehow get broken and the distribution of data changes suddenly I don't know what emails to add here, as it's not my team per se that cares about these numbers. Is it absolutely critical that we have emails here?

Flags: needinfo?(vladan.bugzilla)

Boris Zbarsky [:bzbarsky]

Comment 7

•

9 years ago

Comment on attachment 8661375 [details] [diff] [review] part 0 - separate debugging use counters check from value check r=me

Attachment #8661375 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 8

•

9 years ago

Comment on attachment 8661376 [details] [diff] [review] part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms r=me

Attachment #8661376 - Flags: review?(bzbarsky) → review+

Nathan Froyd [:froydnj]

Assignee

Comment 9

•

9 years ago

After talking with rvitillo, it looks like I completely misunderstood how count histograms work. So we're going to leave the histograms as boolean ones and only ever aggregate true values into them.

Boris Zbarsky [:bzbarsky]

Comment 10

•

9 years ago

Comment on attachment 8661378 [details] [diff] [review] part 2 - rename all USE_COUNTER_* histograms to USE_COUNTER2_* histograms r=me

Attachment #8661378 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 11

•

9 years ago

Comment on attachment 8661379 [details] [diff] [review] part 2b - move histogram accumulation around in nsDocument::ReportUseCounters r=me

Attachment #8661379 - Flags: review?(bzbarsky) → review+

Vladan Djeric (:vladan)

Comment 12

•

9 years ago

(In reply to Nathan Froyd [:froydnj] from comment #6) > Yes, since FX_PAGE_LOAD_MS is the load time for a page, and this histogram > is (effectively) the number of documents the browser loads. Very different > units, for one thing. I was wondering whether "a page" in the FX_PAGE_LOAD_MS histogram and "a document" in your histogram represent roughly the same thing? I'm asking because I'd like to use one of these #s in Telemetry analyses. > I don't know what emails to add here, as it's not my team per se that cares > about these numbers. Is it absolutely critical that we have emails here? It's not critical, but it's easier to maintain Histograms.json when we know who "owns" or owned the histogram. Alternately, you can just tell us (in the description field) which component to file bugs related to this histogram

Flags: needinfo?(vladan.bugzilla)

Roberto Agostino Vitillo (:rvitillo)

Comment 13

•

9 years ago

(In reply to Nathan Froyd [:froydnj] from comment #0) > For a given web feature F, the current implementation of use counters > declares a boolean histogram for F, whose buckets indicate whether a given > document used F or not. (I'm ignoring the "top-level" document histograms > here and below for clarity of presentation.) So the percentage of documents > on which F is used is: > > # of "true" values recorded in F's histogram / # of total values recorded in > F's histogram > > One problem with this scheme is that for infrequently used features, we're > always accumulating data in histograms, sending those histograms over the > wire, and storing those histograms on the telemetry server. It would be > better to only pay for transmission and storage of those features which > actually get used. > > This bug is for transitioning to a new scheme, where we have a single > histogram denoting how many content documents have been visited/destroyed. > Use counters are then mere "count" histograms, which simply record the > number of times a given feature was used in a document. So the percentage > of documents that use a feature F is now: > > # of values in F's histogram / # of documents seen > > bsmedberg, this is somewhat akin to what you had suggested, yes? Roberto, > do you think this scheme will help the server-side storage requirements? This is going to help, thanks. The aggregator job has to be modified accordingly. On the server side count histograms, which are ultimately just scalars, are transformed into distributions and this is not what we want in this case. I made some changes to ignore USE_COUNTER2_ histograms for now; when the patch has landed I am going to properly add support for it to the aggregator.

Flags: needinfo?(rvitillo)

Nathan Froyd [:froydnj]

Assignee

Comment 14

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #12) > (In reply to Nathan Froyd [:froydnj] from comment #6) > > Yes, since FX_PAGE_LOAD_MS is the load time for a page, and this histogram > > is (effectively) the number of documents the browser loads. Very different > > units, for one thing. > > I was wondering whether "a page" in the FX_PAGE_LOAD_MS histogram and "a > document" in your histogram represent roughly the same thing? I'm asking > because I'd like to use one of these #s in Telemetry analyses. They do not. The top-level document #s might be the same thing, but I'd have to investigate more closely to see whether that's the case.

Nathan Froyd [:froydnj]

Assignee

Comment 15

•

9 years ago

Comment on attachment 8661376 [details] [diff] [review] part 1 - add *CONTENT_DOCUMENTS_DESTROYED histograms Since the alert_emails field isn't critical, let's get this reviewed.

Attachment #8661376 - Flags: review?(vladan.bugzilla)

Nathan Froyd [:froydnj]

Assignee

Updated

•

9 years ago

Assignee: nobody → nfroyd

Vladan Djeric (:vladan)

Updated

•

9 years ago

Attachment #8661376 - Flags: review?(vladan.bugzilla) → review+

Nathan Froyd [:froydnj]

Assignee

Comment 16

•

9 years ago

Attached patch part 3 - update use counter tests for new CONTENT_DOCUMENT histograms — Details — Splinter Review

Tests are always a good idea; apologies for not writing these yesterday.

Attachment #8661998 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 17

•

9 years ago

Comment on attachment 8661998 [details] [diff] [review] part 3 - update use counter tests for new CONTENT_DOCUMENT histograms r=me

Attachment #8661998 - Flags: review?(bzbarsky) → review+

Pulsebot

Comment 18

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b217dfd70e22 https://hg.mozilla.org/integration/mozilla-inbound/rev/06bb41ed2935 https://hg.mozilla.org/integration/mozilla-inbound/rev/5f676d511599 https://hg.mozilla.org/integration/mozilla-inbound/rev/c5b25c30befc

Pulsebot

Comment 19

•

9 years ago

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/42e8d6b514a9

Roberto Agostino Vitillo (:rvitillo)

Comment 20

•

9 years ago

As discussed on IRC, using boolean histograms instead of count histograms would simplify the processing on the backend.

Pulsebot

Comment 21

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/bad1e6a9f005 https://hg.mozilla.org/integration/mozilla-inbound/rev/e17c2061c0c8 https://hg.mozilla.org/integration/mozilla-inbound/rev/7d7181321bf1 https://hg.mozilla.org/integration/mozilla-inbound/rev/e8bb18a02a56

Carsten Book [:Tomcat]

Comment 22

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/bad1e6a9f005 https://hg.mozilla.org/mozilla-central/rev/e17c2061c0c8 https://hg.mozilla.org/mozilla-central/rev/7d7181321bf1 https://hg.mozilla.org/mozilla-central/rev/e8bb18a02a56

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox43: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla43

Benjamin Smedberg

Updated

•

9 years ago

Flags: needinfo?(benjamin)

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

You need to log in before you can comment on or make changes to this bug.