Add a dummy pageload origin to origin Telemetry on every page unload
Categories
(Core :: Privacy: Anti-Tracking, task, P2)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox69 | --- | fixed |
People
(Reporter: englehardt, Assigned: xeonchen)
Attachments
(1 file)
Origin Telemetry doesn't currently have a notion of pageloads. recordOrigin will place origin events from different pageloads within the same buffer, so the total number of buffers returned is not a reliable measure of the total number of pageloads.
We can work around this by adding a special "pageload" dummy origin that is sent on every sampled pageload, irrespective of the actual set of origins blocked/exempt (if any).
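To make the counting problem concrete, here is a toy Python model. It is only a sketch: `record_pageloads` and the per-origin counter are hypothetical stand-ins, not the real recordOrigin or its buffer format, and the example origins are made up.

```python
from collections import defaultdict

PAGELOAD = "PAGELOAD"  # dummy origin name suggested in this bug


def record_pageloads(pageloads, add_dummy):
    """Toy stand-in for one client's storage: a per-origin counter.

    `pageloads` is a list of origin lists, one list per sampled pageload.
    This is NOT the real recordOrigin; it only models the counting.
    """
    buffer = defaultdict(int)
    for origins in pageloads:
        if add_dummy:
            # Recorded on every sampled pageload, even if nothing was blocked.
            buffer[PAGELOAD] += 1
        for origin in origins:
            buffer[origin] += 1
    return dict(buffer)


# Three sampled pageloads; the second one blocks nothing.
loads = [["tracker.example"], [], ["tracker.example", "ads.example"]]
without_dummy = record_pageloads(loads, add_dummy=False)
with_dummy = record_pageloads(loads, add_dummy=True)
```

In this model `without_dummy` carries no trace of the pageload that blocked nothing, so the number of pageloads cannot be recovered from it, while `with_dummy["PAGELOAD"]` equals the number of sampled pageloads.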
We can do this in two parts:
- Add a dummy origin to https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/core/TelemetryOriginData.inc. Perhaps ORIGIN("PAGELOAD", "PAGELOAD")?
- Call recordOrigin with a metric id of OriginMetricID::ContentBlocking_Blocked_TestOnly or OriginMetricID::ContentBlocking_Blocked (depending on the mode) and origin/hash "PAGELOAD" once for every sampled pageload (i.e., pageloads where IsReportingEnabled() is true). We'll want to call this even when no cookies are blocked. I suspect we can hard code this call here https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/dom/base/ContentBlockingLog.cpp#114, before we loop through the log.
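The control flow of the second part could be sketched as follows. This is a Python model rather than the actual C++ in ContentBlockingLog.cpp: `report_log`, its parameters, the shape of `log`, and the enum values are all assumptions made for illustration. Only the metric id names and the "PAGELOAD" origin come from the description above.

```python
from enum import Enum


class OriginMetricID(Enum):
    # Names mirror the metric ids mentioned above; the values are arbitrary.
    ContentBlocking_Blocked = 1
    ContentBlocking_Blocked_TestOnly = 2


def report_log(log, reporting_enabled, test_mode, record_origin):
    """Hypothetical model of the reporting step on page unload.

    log: dict mapping origin -> True if it was blocked (toy log shape)
    record_origin: callable(metric_id, origin), standing in for recordOrigin
    """
    if not reporting_enabled:  # models IsReportingEnabled() returning false
        return
    metric = (OriginMetricID.ContentBlocking_Blocked_TestOnly
              if test_mode else OriginMetricID.ContentBlocking_Blocked)
    # Dummy origin first, before looping through the log, even if it is empty.
    record_origin(metric, "PAGELOAD")
    for origin, blocked in log.items():
        if blocked:
            record_origin(metric, origin)


calls = []
record = lambda metric, origin: calls.append((metric, origin))
report_log({}, True, False, record)                         # sampled, nothing blocked
report_log({"tracker.example": True}, True, True, record)   # sampled, test mode
report_log({"tracker.example": True}, False, False, record) # not sampled
```

Only the two sampled pageloads produce a "PAGELOAD" record, and the unsampled one produces nothing at all.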
I think it's sufficient to only call the dummy origin with the "blocked" metric ID. Origin Telemetry only prepares buffers for metric IDs that actually have at least one true value and the blocked metric will be the most common.
Comment 1•5 years ago
Comment 2•5 years ago
Depending on the design of the eventual dataset, I'm not sure it'll be trivial to take blocked's pageloads count and use it in analyses of exempted rules. I'd double-check that with... oh geez, who's working that part of this... Anthony, is it you?
Comment 3•5 years ago
From my perspective, this is fine since I'm also dealing with raw bit-vectors at the end. I have a script to help map from the bit-vector into something human-readable using the TelemetryOriginData.inc file. I can't speak strongly about the analysis, but I imagine it would look something like this:
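aggregates = {
    "PAGELOAD": 1000,
    "some.origin.com": 2,
    # ...
}
# The dummy PAGELOAD count is the number of sampled pageloads.
total = aggregates["PAGELOAD"]
del aggregates["PAGELOAD"]
# Fraction of sampled pageloads on which each origin was recorded.
normalized = {origin: count / total for origin, count in aggregates.items()}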
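A slightly more defensive variant of the same normalization, as a sketch only: the function name `normalize_by_pageloads` and its `pageload_key` parameter are hypothetical, and it assumes the aggregates arrive as a plain origin-to-count mapping as in the snippet above. It leaves the input untouched and fails loudly when no pageloads were recorded instead of dividing by zero.

```python
def normalize_by_pageloads(aggregates, pageload_key="PAGELOAD"):
    """Return each origin's count divided by the dummy pageload count.

    Does not mutate `aggregates`; the dummy key is excluded from the result.
    """
    total = aggregates.get(pageload_key, 0)
    if total <= 0:
        # Without the dummy origin there is no denominator to normalize by.
        raise ValueError("no sampled pageloads recorded")
    return {
        origin: count / total
        for origin, count in aggregates.items()
        if origin != pageload_key
    }


sample = {"PAGELOAD": 1000, "some.origin.com": 2}
rates = normalize_by_pageloads(sample)
```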
Pushed by xeonchen@gmail.com: https://hg.mozilla.org/integration/autoland/rev/c77c46ac90a5 add dummy page load origin; r=chutten
Comment 5•5 years ago
Backed out changeset c77c46ac90a5 (Bug 1552536) by xeonchen's request
Backout link: https://hg.mozilla.org/integration/autoland/rev/82b9eaa4679754eb3ff38e6472a3d9f77600affa
Pushed by xeonchen@gmail.com: https://hg.mozilla.org/integration/autoland/rev/c63967f172ee add dummy page load origin; r=Ehsan,chutten
Comment 7•5 years ago