Create new telemetry ping for TopSites
Categories
(Firefox :: Top Sites, task, P1)
People
(Reporter: rachel, Assigned: nanj)
Attachments (4 files, 1 obsolete file)
- 69 bytes, text/x-github-pull-request
- 48 bytes, text/x-phabricator-request (ryanvm: approval-mozilla-beta+)
- 3.29 KB, text/plain (teon: data-review-)
- 69 bytes, text/x-github-pull-request
Description
This is part of the upcoming TopSites changes, and specifically refers to the discussion from Jan 22, 2021.
Some outstanding TODOs here:
- Agree on schema.
- Determine if we need/want a client id (or can use some other identifier).
Timeline
We are targeting Fx 87 for all of this work (TopSites Phase 1). However, as this piece lays the groundwork for much of the data pipeline integration, it would be good to have this in ... within the next few weeks or so.
Comment 1•4 years ago (Assignee)
(In reply to Rachel Tublitz [:rachel] from comment #0)
Some outstanding TODOs here:
- Determine if we need/want a client id (or can use some other identifier).
Just wanted to note that there is already a user ID available. It's stored as the pref browser.newtabpage.activity-stream.impressionId (visible in about:config), and is used by various components that can't include the regular client_id in their telemetry. For example, Pocket uses it for impressions/clicks.
Comment 2•4 years ago (Assignee)
Here is some context on the existing telemetry of ActivityStream (AS)/TopSites. Hope it helps :)
- AS telemetry schemas are defined here
- AS telemetry document
- All pings are sent by the PingCentre client from the browser. AS doesn't use the standard Firefox Telemetry client or the Telemetry Event client because:
  - Certain AS pings (such as Pocket impressions/clicks) require minimum latency, and PingCentre sends pings right away without batching
  - Certain AS pings are not allowed to include the client_id, whereas both of those clients report client_id by default
- All AS pings are routed through TelemetryFeed before being sent off to the data pipeline. It provides the following functionality:
  - Lifecycle management, used for newtab session tracking
  - Viewability handling, i.e. telemetry is only submitted when the measured objects are visible. This is necessary because preloading and lazy rendering are prevalent on the newtab page
  - Telemetry policy enforcement, for instance, not including client_id for certain events
- Once pings reach the data pipeline and pass validation, they can be republished within the pipeline for other uses. As an example, bug 1559411 republishes the Pocket pings for the external integration
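To make the TelemetryFeed responsibilities above concrete, here is a minimal sketch of viewability gating plus client_id policy enforcement. It is not the actual TelemetryFeed code: the class FeedSketch, the set PINGS_WITHOUT_CLIENT_ID, and the ping type names are all hypothetical and only illustrate the behavior described in this comment.

```python
from dataclasses import dataclass, field

# Hypothetical policy table: ping types that must never carry client_id.
PINGS_WITHOUT_CLIENT_ID = {"pocket-impression", "pocket-click"}


@dataclass
class FeedSketch:
    """Illustrative stand-in for a feed that gates pings before sending."""

    sent: list = field(default_factory=list)

    def submit(self, ping_type: str, payload: dict, visible: bool) -> bool:
        # Viewability handling: drop telemetry for objects that are not
        # visible (e.g. a preloaded or lazily rendered newtab page).
        if not visible:
            return False

        # Copy so the caller's payload is never mutated.
        outgoing = dict(payload)

        # Policy enforcement: strip client_id where it is not allowed.
        if ping_type in PINGS_WITHOUT_CLIENT_ID:
            outgoing.pop("client_id", None)

        # "Send" right away, with no batching (recorded in memory here).
        self.sent.append((ping_type, outgoing))
        return True
```

For example, submitting {"client_id": "abc", "position": 1} as a "pocket-click" ping records only {"position": 1}, while any ping submitted with visible=False is dropped.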
Comment 3•4 years ago (Assignee)
I've spoken to :chutten about using Glean for this data collection instead of PingCentre. So far, we've found that Glean fits the bill pretty well. It supports:
- Payloads without "client-id"
- Low latency ping delivery
- Dedicated table for each ping type
- Stellar documentation and dev support
As we're going to phase out the PingCentre client in favor of Glean, I'd highly recommend adopting Glean for this project.
Note that Glean's size limitation might not be enough for certain ping types, but the Glean team is open to adding new types (such as bug 1613944) to meet our needs.
Comment 4•4 years ago
Thanks for the details on the existing AS pipeline – that's really helpful context!
Re: Glean, my understanding is that FOG isn't quite ready for new customers yet, the target for this project is to ship by Firefox 89, and of course implementation is starting ASAP. Unless something has changed, I don't think the FOG timeline quite works out here.
FWIW, custom doctype telemetry also offers the advantages listed, so we wouldn't be giving much up if that ends up being our route.
Comment 5•4 years ago
:nanj and I had a chat yesterday and, depending on the sizes of the strings involved, FOG could be a fine choice as far as the client's concerned. The biggest missing pieces in FOG at the moment are the builtin pings ("metrics" and "baseline" and "events") so if the collection is a Custom Ping we're already ready to go. (( We're also a little light on Migration Documentation. It'd be written to a Telemetry-using audience, though, so it might not actually be that big of a loss for PingCentre folks ))
However, Glean is rather specific about how long Strings and Event Extra values can be (100 and 50 characters of utf8, respectively), so if the values are too long and the timelines are too tight to go through the Add or Change a Glean Metric Type Process then this will indeed need to be something else for 89.
Though 89 doesn't hit beta until April 19, so that seems very reachable if we nail down our requirements soonish.
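One quick way to check whether candidate values fit within the limits quoted above (100 for Strings, 50 for Event Extra values) is sketched below. This is not Glean code. The helper names are hypothetical, the sample values are taken from the ping shown later in this bug, and measuring length in UTF-8 bytes rather than characters is a conservative assumption, since the comment says "characters of utf8".

```python
from typing import Dict

STRING_LIMIT = 100      # limit for Glean String values, as quoted above
EVENT_EXTRA_LIMIT = 50  # limit for Glean Event Extra values, as quoted above


def utf8_len(value: str) -> int:
    """Length of the value once encoded as UTF-8 (assumed unit of the limit)."""
    return len(value.encode("utf-8"))


def oversized_fields(fields: Dict[str, str], limit: int) -> Dict[str, int]:
    """Return {field name: measured length} for every value over the limit."""
    return {
        name: utf8_len(value)
        for name, value in fields.items()
        if utf8_len(value) > limit
    }


# String-valued fields from the sample ping in this bug.
SAMPLE_FIELDS = {
    "advertiser": "amazon",
    "source": "newtab",
    "context_id": "{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}",
}

if __name__ == "__main__":
    # An empty dict means every sample value fits in an Event Extra.
    print(oversized_fields(SAMPLE_FIELDS, EVENT_EXTRA_LIMIT))
```

Any field that shows up in the result would need either a different metric type or the Add or Change a Glean Metric Type Process mentioned above.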
Comment 6•4 years ago (Assignee)
Thank you both for the input!
Looks like we need a sync-up to work out those details and figure out the best plan. Will send an invite to you soon.
Comment 7•4 years ago (Assignee)
Just a quick update on my previous recommendation on FOG: as we are now striving to get the whole system up and running soon (Firefox 87), it's a little risky to adopt FOG given the tight timeline. As such, please disregard my recommendation in Comment 3.
Let's stick to the plan that :sunahsuh proposed above. We will revisit the adoption of FOG in future iterations.
Thanks!
Updated•4 years ago (Reporter)
Comment 8•4 years ago
Comment 9•4 years ago (Assignee)
Updated•4 years ago
Comment 10•4 years ago (Assignee)
Hey Teon,
This patch only implements the data collection for the sponsored TopSites, in particular the impression and click pings without client_id. The ones with client_id will be added separately later, in order to limit the complexity of this patch.
I didn't list the whole payload schema in the request form. You can reference either the "Data Collection Proposal" doc or the telemetry documents in the patch for more details of the collection. Please let me know if you want to see the whole schema documented in the request.
Thank you!
Comment 11•4 years ago (Assignee)
After confirming with Ryan Harter that we'd like to collect the sponsored TopSites impressions/clicks with client_id as scalar counters, I added that collection to this implementation, as it didn't add much complexity to this patch or to the data review request. Please disregard my previous comment.
Comment 12•4 years ago
Comment 13•4 years ago
Comment on attachment 9203545 [details]
data_review_request_contextual_services_topsites.md
DATA COLLECTION REVIEW RESPONSE:
Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes. This collection is Telemetry so is documented in its definitions file Events.yaml and the Probe Dictionary.
Is there a control mechanism that allows the user to turn the data collection on and off?
Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.
If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, :nanj is responsible.
Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 3
Is the data collection request for default-on or default-off?
Default on for all channels.
Does the instrumentation include the addition of any new identifiers?
Yes.
Is the data collection covered by the existing Firefox privacy notice?
No.
Does there need to be a check-in in the future to determine whether to renew the data?
No. This collection is permanent.
Result: datareview-
Comment 14•4 years ago (Assignee)
Teon, thanks for your feedback!
Re: "Is the data collection covered by the existing Firefox privacy notice?", I believe the answer is "no", and we will update the privacy notice to reflect the changes that come with this feature.
/cc Mika, as I promised to tag you in all the data reviews for this project :)
Comment 15•4 years ago (Assignee)
[Tracking Requested - why for this release]:
We need this telemetry for the feature rollout in 87.
Updated•4 years ago
Comment 16•4 years ago
Responding to comment 13:
(1) The Privacy Notice does need to be changed for this data request. We will post this to our public governance forum for comment prior to making that change.
(2) Because this involves category 3 data, mitigations have been identified and these will also be published. As part of that, and because this is related to a feature, there will also be a future check-in to evaluate the continued need for this data.
Comment 17•4 years ago (Assignee)
STR for QA:
- Make sure browser.topsites.useRemoteSetting is set to true
- Create a new boolean pref browser.topsites.experiment.ebay-2020-1 and set it to true
- Enable PingCentre logging via browser.ping-centre.log
- Open the browser console to view the PingCentre logs
- Open a newtab page
Test points:
- Impression
You should be able to see the impression ping in the browser console as follows (note: the context_id will be different):
TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US","version":"88.0a1","release_channel":"default","position":1,"tile_id":-1,"advertiser":"amazon","source":"newtab","context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}
Now open "about:telemetry#keyed-scalars-tab". You should see a keyed scalar called contextual.services.topsites.impression, with the name newtab_1 and the value 1 (or however many impressions you have seen).
- Click
You should be able to see the click ping in the browser console as follows (note: the context_id will be different). Note that the impression ping and the click ping currently share the same payload. You can use the network activity monitor to confirm that the two pings are sent to different endpoints (impression to */topsites-impression/* and click to */topsites-click/*).
TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US","version":"88.0a1","release_channel":"default","position":1,"tile_id":-1,"advertiser":"amazon","source":"newtab","context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}
Now open "about:telemetry#keyed-scalars-tab". You should see a keyed scalar called contextual.services.topsites.click, with the name newtab_1 and the value 1 (or however many times you have clicked).
- TopSites collapse & expand
Collapsing the TopSites section on newtab should silence the impression ping, as the section is no longer visible. Expanding it again should send the impression ping again.
Caveats
- Newtab caches the last rendered page for fast page loading during the next browser startup, so it will not send the impression ping if any sponsored TopSites are shown on the first newtab. However, the click ping should still be sent.
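When checking the test points above, it can help to parse the console log line and work out which keyed scalar entry to look for. The sketch below does that for the sample line from this comment. The helper names are hypothetical, and deriving the scalar name as "<source>_<position>" is an assumption inferred from the newtab_1 example, not something confirmed by the patch.

```python
import json

LOG_PREFIX = "TELEMETRY PING (activity-stream): "

# Sample log line copied from the test points above.
SAMPLE_LINE = (
    'TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US",'
    '"version":"88.0a1","release_channel":"default","position":1,'
    '"tile_id":-1,"advertiser":"amazon","source":"newtab",'
    '"context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}'
)


def parse_ping_log(line: str) -> dict:
    """Extract the JSON payload from a PingCentre console log line."""
    if not line.startswith(LOG_PREFIX):
        raise ValueError("not an activity-stream PingCentre log line")
    return json.loads(line[len(LOG_PREFIX):])


def expected_scalar_name(ping: dict) -> str:
    """Assumed keyed scalar name: '<source>_<position>' (e.g. newtab_1)."""
    return f"{ping['source']}_{ping['position']}"


if __name__ == "__main__":
    ping = parse_ping_log(SAMPLE_LINE)
    # These pings are expected to carry context_id rather than client_id.
    print("client_id" in ping)         # prints: False
    print(expected_scalar_name(ping))  # prints: newtab_1
```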
Comment 18•4 years ago
Comment 19•4 years ago
bugherder
Comment 20•4 years ago (Assignee)
[Tracking Requested - why for this release]:
Reset the tracking flag that was unset in comment 16.
Comment 21•4 years ago
I have verified this bug on the latest Nightly 88.0a1 build (Build ID: 20210223230332) on Windows 10 x64, macOS 10.15.7 and Linux Mint 20. In order to verify this, I have used the instructions described in Comment 17.
- The impression and click pings are correctly displayed in the browser console.
- The impression and click keyed scalars are correctly registered in "about:telemetry" page.
However, it seems that the "click" telemetry is not registered if the Sponsored Top Sites are opened using middle click or the context menu options (Open in New Tab or Open in New Window). I have logged this behavior in Bug 1694629.
Updated•4 years ago
Comment 22•4 years ago (Assignee)
(In reply to Cosmin Muntean [:cmuntean], Ecosystem QA from comment #21)
However, it seems that the "click" telemetry is not registered if the Sponsored Top Sites are opened using middle click or the context menu options (Open in New Tab or Open in New Window). I have logged this behavior in Bug 1694629.
Thanks for the validation!
Re: the middle click and the context menu click, those are "known issues" as we don't have any telemetry coverage for those user interactions in the newtab. While I don't think this would be a release blocker, let's track those missing pieces in Bug 1694629 for future work.
Comment 23•4 years ago (Assignee)
Comment on attachment 9202631 [details]
Bug 1688698 - Add telemetry for sponsored TopSites
Beta/Release Uplift Approval Request
- User impact if declined: We need this for the Contextual Services/Sponsored Topsites experiments & rollout in 87.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This is a telemetry patch; it doesn't change any user-facing code.
- String changes made/needed: None
Updated•4 years ago
Comment 24•4 years ago
Comment on attachment 9202631 [details]
Bug 1688698 - Add telemetry for sponsored TopSites
Approved for 87.0b3.
Comment 25•4 years ago
bugherder uplift
Comment 26•4 years ago
Comment on attachment 9203595 [details]
data_review_request_contextual_services_topsites.md
For the full discussion of contextual services, please see https://bugzilla.mozilla.org/show_bug.cgi?id=1694693
Comment 27•4 years ago
I have verified this bug on the latest Beta 87.0b3 build (Build ID: 20210225185804) on Windows 10 x64, macOS 10.15.7 and Linux Mint 20.
- The impression and click pings for the Sponsored Top Sites are correctly displayed in browser console.
- The impression and click keyed scalars for the Sponsored Top Sites are correctly registered in "about:telemetry" page.