Closed Bug 1539262 Opened 7 months ago Closed 7 months ago

Support origins lists longer than 2046 elements in Origin Telemetry

Categories

(Toolkit :: Telemetry, enhancement, P1)

Points:
3

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: chutten|PTO, Assigned: chutten|PTO)

Attachments

(6 files)

We expect the first (and for the foreseeable future only) user of Origin Telemetry (Content Blocking) to have a list of origins that has about 2500 elements in it (the Disconnect list).

We're going to have to support that.

This bug will:

  • Split the origins list when it is longer than 2046 elements, in a way that is transparent to API consumers.
  • Add support for a meta-origin that is set to 1 if a consumer tries to record an unknown origin to a known metric at least once. It might get a value like "__OTHER__", reserved for internal use.
  • Notify the "origin-telemetry-storage-limit-reached" topic when the number of encoded prioData items would reach or exceed the value in the toolkit.telemetry.prioping.dataLimit preference.
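
A minimal sketch of the meta-origin idea from the second bullet, in Python rather than the actual Telemetry code. The function names and the set-based storage are hypothetical, "__OTHER__" is only the tentative name floated above, and the check that the metric itself is known is left out.

```python
from typing import Dict, Set

# Tentative name from this bug; assumed to be reserved for internal use.
UNKNOWN_ORIGIN = "__OTHER__"


def record_origin(storage: Dict[str, Set[str]], known_origins: Set[str],
                  metric: str, origin: str) -> None:
    """Record `origin` for `metric`, folding unknown origins into the meta-origin."""
    if origin not in known_origins:
        origin = UNKNOWN_ORIGIN
    storage.setdefault(metric, set()).add(origin)


def meta_origin_value(storage: Dict[str, Set[str]], metric: str) -> int:
    """Return 1 if any unknown origin was recorded for `metric`, else 0."""
    return 1 if UNKNOWN_ORIGIN in storage.get(metric, set()) else 0
```

Because the storage here is a set, the meta-origin ends up as 1 no matter how many unknown origins a consumer tried to record.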

In order to notify the "prio" ping when we reach the data limit, we need to
keep an accounting of how many prioData elements we'd need to encode what's in
storage.

This also adds the pref reading and topic notification code for the
"origin-telemetry-storage-limit-reached" topic that the "prio" ping observes.
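
A rough Python sketch of the accounting and notification described above, not the patch itself. The class, its constructor arguments, and the counting rule (one prioData per shard for every metric that has any recording) are assumptions for illustration. `data_limit` stands in for the value of the toolkit.telemetry.prioping.dataLimit pref and `notify` stands in for the observer service.

```python
from typing import Callable, Dict, Set

LIMIT_TOPIC = "origin-telemetry-storage-limit-reached"


class OriginStorage:
    """Hypothetical storage that tracks how many prioDatas an encoding would need."""

    def __init__(self, data_limit: int, shards_per_metric: int,
                 notify: Callable[[str], None]) -> None:
        self._data_limit = data_limit            # stand-in for the pref value
        self._shards_per_metric = shards_per_metric
        self._notify = notify                    # stand-in for topic notification
        self._recorded: Dict[str, Set[str]] = {}

    def prio_data_count(self) -> int:
        # Assumption: each metric with data costs one prioData per shard.
        return len(self._recorded) * self._shards_per_metric

    def record(self, metric: str, origin: str) -> None:
        self._recorded.setdefault(metric, set()).add(origin)
        # Notify once the count reaches or exceeds the limit.
        if self.prio_data_count() >= self._data_limit:
            self._notify(LIMIT_TOPIC)
```

In this sketch every recording made at or above the limit notifies again. Whether the real code notifies repeatedly is not something this bug states.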

Content Blocking's list is longer than the largest bool vector size supported
by PrioEncoder, so we need to split the list into shards before encoding.

This means we need to use the metric name and shard number together to identify
the encoding so it's possible to decode it later.
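
The sharding can be sketched roughly as follows. This is Python for illustration only. The 2046 limit comes from this bug, but the function names are hypothetical and the "encoding" is just the plain bool vector that would be handed to PrioEncoder, not real Prio output.

```python
from typing import Dict, List, Set, Tuple

SHARD_SIZE = 2046  # largest bool vector size, per this bug


def shard_origins(origins: List[str], shard_size: int = SHARD_SIZE) -> List[List[str]]:
    """Split the ordered origins list into consecutive shards."""
    return [origins[i:i + shard_size] for i in range(0, len(origins), shard_size)]


def encode_shards(metric: str, origins: List[str],
                  recorded: Set[str]) -> Dict[Tuple[str, int], List[bool]]:
    """Build one bool vector per shard, keyed by (metric name, shard number)."""
    return {
        (metric, index): [origin in recorded for origin in shard]
        for index, shard in enumerate(shard_origins(origins))
    }


def origin_at(origins: List[str], shard_index: int, position: int,
              shard_size: int = SHARD_SIZE) -> str:
    """Map a (shard number, position) pair back to the origin it stands for."""
    return origins[shard_index * shard_size + position]
```

With a list of about 2500 origins this gives two shards, and the (metric, shard number) key is what lets a decoder work out which origin each position in a vector refers to.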

While I'm here, restructure GetEncodedSnapshots to make my life easier when I
eventually try to put the heavy lifting on its own thread. There's a clearer
split now between JS stuff and non-JS stuff.

Depends on D25128

Since reporting intervals are ~1 day/1 session, the Origin Telemetry prototype
must support the possibility that the same origin will be recorded more than
once for the same metric.

For example, if the user is sampled to record two pageloads where the same
ultra-common tracker is present and blocked we must record that tracker as
having been blocked twice.

This requires a bit of a shift in storage and plaintext snapshot. Instead of
being an array of origins with duplicates, now we're storing origins as a bag
(aka multiset, aka hashtable of origin->count).
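
As an illustration of the bag idea (Python, not the actual storage code), a hashtable of origin->count per metric could look like this. The names `storage`, `record`, and `snapshot` are made up for the sketch.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict

# metric name -> bag of origins (origin -> number of times recorded)
storage: DefaultDict[str, Counter] = defaultdict(Counter)


def record(metric: str, origin: str) -> None:
    """Record one more occurrence of `origin` for `metric`."""
    storage[metric][origin] += 1


def snapshot() -> Dict[str, Dict[str, int]]:
    """Return a plain-dict copy of the bags, e.g. for a plaintext snapshot."""
    return {metric: dict(bag) for metric, bag in storage.items()}
```

Recording the same tracker for the same metric twice leaves a count of 2 for that origin instead of two array entries.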

Depends on D25131

Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/43ee3d28167f
Count the number of prioDatas needed to encode the recorded Origin Telemetry. r=janerik
https://hg.mozilla.org/integration/autoland/rev/e3c8dbb5cb1d
Test that Origin Telemetry notifies when it reaches the data limit. r=janerik
https://hg.mozilla.org/integration/autoland/rev/c3872bfb8197
Origin Telemetry support for origins lists exceeding PrioEncoder's limit. r=janerik
https://hg.mozilla.org/integration/autoland/rev/0b750c9fbbdc
Record if Origin Telemetry was used with an unknown origin. r=janerik
https://hg.mozilla.org/integration/autoland/rev/a2f60534ffdb
Test 'unknown origin' support in Origin Telemetry. r=janerik
https://hg.mozilla.org/integration/autoland/rev/7c940e9caee9
Support multiple origins in the same metric in Origin Telemetry r=janerik

I ran Windows 2012 x64 builds as part of my try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=650218d94c06de247005c0bc404d3903bfc2d519

But apparently it's a 32-bit-only failure? I dunno. I've reopened the revisions and will use try to figure out what's needed to make Windows builds happy with my gtest.

First attempt: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3aad737f6fa46dae1ad0ae37f58c02f851503fee

Flags: needinfo?(chutten)
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c15f491f127a
Count the number of prioDatas needed to encode the recorded Origin Telemetry. r=janerik
https://hg.mozilla.org/integration/autoland/rev/47eadc1c796b
Test that Origin Telemetry notifies when it reaches the data limit. r=janerik
https://hg.mozilla.org/integration/autoland/rev/683118ef1afa
Origin Telemetry support for origins lists exceeding PrioEncoder's limit. r=janerik
https://hg.mozilla.org/integration/autoland/rev/4049a4b00a49
Record if Origin Telemetry was used with an unknown origin. r=janerik
https://hg.mozilla.org/integration/autoland/rev/80e1339ed57f
Test 'unknown origin' support in Origin Telemetry. r=janerik
https://hg.mozilla.org/integration/autoland/rev/d13bf5b0872c
Support multiple origins in the same metric in Origin Telemetry r=janerik