Closed Bug 1688698 Opened 4 years ago Closed 4 years ago

Create new telemetry ping for TopSites

Categories

(Firefox :: Top Sites, task, P1)

Tracking

VERIFIED FIXED
88 Branch
Tracking Status
firefox87 + verified
firefox88 + verified

People

(Reporter: rachel, Assigned: nanj)

Attachments

(4 files, 1 obsolete file)

Description

This is part of the upcoming TopSites changes; it specifically refers to the discussion from Jan 22, 2021.

Some outstanding TODOs here:

  1. Agree on schema.
  2. Determine if we need/want a client id (or can use some other identifier).

Timeline

We are targeting FX 87 for all of this work (TopSites Phase 1). However, as this piece lays the groundwork for much of the data pipeline integration, it would be good to have this in ...within the next few weeks or so.

(In reply to Rachel Tublitz [:rachel] from comment #0)

Some outstanding TODOs here:

  1. Determine if we need/want a client id (or can use some other identifier).

Just wanted to note that there is already a user id available. It's stored in about:prefs as browser.newtabpage.activity-stream.impressionId, and used by various components that couldn't include the regular client_id in the telemetry. For example, Pocket uses it for impressions/clicks.

Here is some context on the existing telemetry of ActivityStream/TopSites. Hope it helps :)

  • AS telemetry schemas are defined here
  • AS telemetry document
  • All pings are sent by the PingCentre client from the browser. AS doesn't use the standard Firefox Telemetry client or the Telemetry Event client because:
    • Certain AS pings (such as Pocket impressions/clicks) require minimal latency, and PingCentre sends pings right away without batching
    • Certain AS pings are not allowed to include the client_id, whereas both of those clients report client_id by default
  • All AS pings are routed through TelemetryFeed before sending off to the data pipeline. It provides the following functionalities:
    • Lifecycle management, used for newtab session tracking
    • Viewability handling, i.e. only submitting telemetry when the measured objects are visible. This is necessary because preloading and lazy rendering prevail on the newtab page
    • Telemetry policy enforcement, for instance, do not include client_id for certain events
  • Once pings get to the data pipeline and pass the validation, they can be republished within the pipeline for other uses. As an example, bug 1559411 republishes the Pocket pings for the external integration
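
To make the viewability and policy-enforcement bullets above concrete, here is a minimal sketch in Python (not the actual TelemetryFeed code, which lives in the browser). The ping type names in `NO_CLIENT_ID_PINGS` and the `prepare_ping` helper are hypothetical and only illustrate the idea: a ping is dropped while its object is not visible, and client_id is stripped for ping types that are not allowed to carry it.

```python
# Hypothetical set of ping types that must not carry client_id.
NO_CLIENT_ID_PINGS = {"topsites-impression", "topsites-click"}


def prepare_ping(ping_type, payload, visible):
    """Return the payload to submit, or None if nothing should be sent.

    Assumptions (not taken from the real TelemetryFeed):
    - `visible` says whether the measured object is currently on screen.
    - `payload` is a flat dict of ping fields.
    """
    # Viewability handling: never submit for objects that are not visible.
    if not visible:
        return None

    # Policy enforcement: work on a copy and drop client_id when required.
    cleaned = dict(payload)
    if ping_type in NO_CLIENT_ID_PINGS:
        cleaned.pop("client_id", None)
    return cleaned
```

The function returns a copy rather than mutating its input, so the caller's original payload stays untouched if it is reused for another ping type.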

I've spoken to :chutten about using Glean for this data collection instead of PingCentre. So far, we've found that Glean fits the bill pretty well. It supports:

  • Payloads without "client-id"
  • Low latency ping delivery
  • Dedicated table for each ping type
  • Stellar documentation and dev support

As we're going to phase out the PingCentre client in favor of Glean, I'd highly recommend adopting Glean for this project.

Note that Glean's size limitation might not be enough for certain ping types, but the Glean team is open to adding new types (such as bug 1613944) to meet our needs.

Thanks for the details on the existing AS pipeline – that's really helpful context!

Re: Glean, my understanding is that FOG isn't quite ready for new customers yet and the target for this project is to ship by Firefox 89, and of course implementation is starting asap. Unless something has changed I don't think the FOG timeline quite works out here.

Fwiw, custom doctype telemetry also offers the advantages listed, so we wouldn't be giving much up if that ends up being our route.

:nanj and I had a chat yesterday and, depending on the sizes of the strings involved, FOG could be a fine choice as far as the client's concerned. The biggest missing pieces in FOG at the moment are the builtin pings ("metrics" and "baseline" and "events") so if the collection is a Custom Ping we're already ready to go. (( We're also a little light on Migration Documentation. It'd be written to a Telemetry-using audience, though, so it might not actually be that big of a loss for PingCentre folks ))

However, Glean is rather specific about how long Strings and Event Extra values can be (100 and 50 characters of utf8, respectively), so if the values are too long and the timelines are too tight to go through the Add or Change a Glean Metric Type Process then this will indeed need to be something else for 89.
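
A quick way to see whether a candidate payload fits those limits is to measure each string value before committing to Glean. The sketch below is Python and only illustrative: the limits are the ones quoted in this comment, the assumption that length is measured in UTF-8 encoded bytes is mine, and `oversized_fields` is a hypothetical helper, not part of any Glean SDK.

```python
# Limits as quoted in the comment above.
STRING_LIMIT = 100
EVENT_EXTRA_LIMIT = 50


def utf8_len(value):
    """Length of a string once encoded as UTF-8 (assumed unit of the limit)."""
    return len(value.encode("utf-8"))


def oversized_fields(payload, limit):
    """Map field name -> UTF-8 length for every string value over `limit`."""
    return {
        name: utf8_len(value)
        for name, value in payload.items()
        if isinstance(value, str) and utf8_len(value) > limit
    }


# Hypothetical payload: one short value and one that is too long.
candidate = {"advertiser": "example", "landing_url": "x" * 120, "position": 1}
too_long_for_string = oversized_fields(candidate, STRING_LIMIT)
too_long_for_extra = oversized_fields(candidate, EVENT_EXTRA_LIMIT)
```

If either dictionary comes back non-empty, the field would need to be shortened or would need a different metric type.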

Though 89 doesn't hit beta until April 19, so that seems very reachable if we nail down our requirements soonish.

Thank you both for the inputs!

Looks like we need a sync-up to work out those details and figure out the best plan. Will send an invite to you soon.

A quick update on my previous recommendation of FOG: as we are now striving to get the whole system up and running soon (Firefox 87), it's a little risky to adopt FOG given the tight timeline. As such, please disregard my recommendation in Comment 3.

Let's stick to the plan that :sunahsuh proposed above; we will revisit the adoption of FOG in future iterations.

Thanks!

Assignee: nobody → najiang
Status: NEW → ASSIGNED
Priority: -- → P1
See Also: → 1689365
Attachment #9202631 - Attachment description: Bug 1688698 - WIP → Bug 1688698 - Add telemetry for sponsored TopSites

Hey Teon,

This patch only implements the data collection for the sponsored TopSites, in particular, the impression and click pings without client_id. The ones with client_id will be added later separately in order to limit the complexity of this patch.

I didn't list the whole payload schema in the request form; you can reference either the "Data Collection Proposal" doc or the telemetry documents in the patch for more details of the collection. Please let me know if you want to see the whole schema documented in the request.

Thank you!

Attachment #9203545 - Flags: data-review?(teon)

After confirming with Ryan Harter that we'd like to collect the sponsored TopSites impressions/clicks with client_id as scalar counters, I added that collection to this implementation, as it didn't add too much complexity to this patch or to the data review request. Please disregard my previous comment.

Attachment #9203545 - Attachment is obsolete: true
Attachment #9203545 - Flags: data-review?(teon)
Attachment #9203595 - Flags: data-review?(teon)
Blocks: 1693393

Comment on attachment 9203545 [details]
data_review_request_contextual_services_topsites.md

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is Telemetry so is documented in its definitions file Events.yaml and the Probe Dictionary.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, :nanj is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 3

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

Yes.

Is the data collection covered by the existing Firefox privacy notice?

No.

Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.


Result: datareview-

Flags: needinfo?(emily)
Attachment #9203545 - Flags: data-review-

Teon, thanks for your feedback!

re: "Is the data collection covered by the existing Firefox privacy notice?", I believe the answer is "no", and we will update the policy notice to reflect the changes with this feature.

/cc Mika, as I promised to tag you in all the data reviews for this project :)

[Tracking Requested - why for this release]:

We need this telemetry for the feature rollout in 87.

Responding to comment 13:
(1) The Privacy Notice does need to be changed for this data request. We will post this to our public governance forum for comment prior to making that change.

(2) Because this involves category 3 data, mitigations have been identified, and these will also be published. As part of that, and because this is related to a feature, there will also be a future check-in to evaluate the continued need for this data.

Flags: needinfo?(emily)

STR for QA:

  • Make sure browser.topsites.useRemoteSetting is set to true
  • Create a new boolean pref browser.topsites.experiment.ebay-2020-1 and set it to true
  • Enable PingCentre logging (browser.ping-centre.log)
  • Open the browser console to view the PingCentre logs
  • Open a newtab page

Test points:

  • Impression

You should be able to see the impression ping in the browser console as follows (note: the context_id will be different):

TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US","version":"88.0a1","release_channel":"default","position":1,"tile_id":-1,"advertiser":"amazon","source":"newtab","context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}

Now open "about:telemetry#keyed-scalars-tab". You should see a keyed scalar called contextual.services.topsites.impression, with the key newtab_1 and the value 1 (or however many impressions you have seen).

  • Click

You should be able to see the click ping in the browser console as follows (note: the context_id will be different). Note that the impression ping and click ping happen to share the same payload now; you can use the network activity monitor to confirm that the two pings are sent to different endpoints (impression to */topsites-impression/* and click to */topsites-click/*).

TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US","version":"88.0a1","release_channel":"default","position":1,"tile_id":-1,"advertiser":"amazon","source":"newtab","context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}

Now open "about:telemetry#keyed-scalars-tab". You should see a keyed scalar called contextual.services.topsites.click, with the key newtab_1 and the value 1 (or however many times you have clicked).

  • TopSites collapse & expand

Collapsing the TopSites section on newtab should silence the impression ping, as the section is no longer visible. Expanding it again should send the impression ping again.
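
For anyone who wants to check the logged pings from the test points above with a script rather than by eye, here is a rough Python sketch. It parses a line copied from the browser console, confirms that the payload carries context_id and no client_id, and derives the keyed-scalar key to look for. The `<source>_<position>` key format is an assumption inferred from the newtab_1 example in these steps, and the helper names are hypothetical.

```python
import json

LOG_PREFIX = "TELEMETRY PING (activity-stream): "

# Sample line taken from the STR in this comment.
SAMPLE = (
    'TELEMETRY PING (activity-stream): {"experiments":{},"locale":"en-US",'
    '"version":"88.0a1","release_channel":"default","position":1,'
    '"tile_id":-1,"advertiser":"amazon","source":"newtab",'
    '"context_id":"{61c00d4f-a3a9-8c4f-8b07-d9b3c7214554}"}'
)


def parse_ping(log_line):
    """Extract the JSON payload from a PingCentre console log line."""
    if not log_line.startswith(LOG_PREFIX):
        raise ValueError("not an activity-stream PingCentre log line")
    return json.loads(log_line[len(LOG_PREFIX):])


def check_ping(ping):
    """Return a list of problems found in a sponsored TopSites ping."""
    problems = []
    if "client_id" in ping:
        problems.append("ping must not contain client_id")
    if "context_id" not in ping:
        problems.append("ping is missing context_id")
    return problems


def expected_scalar_key(ping):
    """Assumed key format for the keyed scalars: '<source>_<position>'."""
    return "{}_{}".format(ping["source"], ping["position"])


ping = parse_ping(SAMPLE)
problems = check_ping(ping)
key = expected_scalar_key(ping)
```

Since the impression and click pings currently share the same payload, the same checks apply to both; only the endpoint differs.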

Caveats

  • Since newtab caches the last rendered page for fast page loading during the next browser startup, it will not send the impression ping for any sponsored TopSites shown on the first newtab. However, the click ping should still be sent.
Pushed by najiang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c572be4456be Add telemetry for sponsored TopSites r=thecount
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 88 Branch

[Tracking Requested - why for this release]:

Reset the tracking flag that was unset in comment 16.

I have verified this bug on the latest Nightly 88.0a1 build (Build ID: 20210223230332) on Windows 10 x64, macOS 10.15.7 and Linux Mint 20. In order to verify this, I have used the instructions described in Comment 17.

  • The impression and click pings are correctly displayed in the browser console.
  • The impression and click keyed scalars are correctly registered in "about:telemetry" page.

However, it seems that the "click" telemetry is not registered if the Sponsored Top Sites are opened using middle click or the context menu options (Open in New Tab or Open in New Window). I have logged this behavior in Bug 1694629.

Status: RESOLVED → VERIFIED

(In reply to Cosmin Muntean [:cmuntean], Ecosystem QA from comment #21)

However, it seems that the "click" telemetry is not registered if the Sponsored Top Sites are opened using middle click or the context menu options (Open in New Tab or Open in New Window). I have logged this behavior in Bug 1694629.

Thanks for the validation!

Re: the middle click and the context menu click, those are "known issues" as we don't have any telemetry coverage for those user interactions in the newtab. While I don't think this would be a release blocker, let's track those missing pieces in Bug 1694629 for future work.

Comment on attachment 9202631 [details]
Bug 1688698 - Add telemetry for sponsored TopSites

Beta/Release Uplift Approval Request

  • User impact if declined: We need this for the Contextual Services/Sponsored Topsites experiments & rollout in 87.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This is a telemetry patch; it doesn't change user-facing code.
  • String changes made/needed: None
Attachment #9202631 - Flags: approval-mozilla-beta?

Comment on attachment 9202631 [details]
Bug 1688698 - Add telemetry for sponsored TopSites

Approved for 87.0b3.

Attachment #9202631 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9203595 [details]
data_review_request_contextual_services_topsites.md

For the full discussion on contextual services, please see https://bugzilla.mozilla.org/show_bug.cgi?id=1694693

Attachment #9203595 - Flags: data-review?(teon) → data-review-

I have verified this bug on the latest Beta 87.0b3 build (Build ID: 20210225185804) on Windows 10 x64, macOS 10.15.7 and Linux Mint 20.

  • The impression and click pings for the Sponsored Top Sites are correctly displayed in browser console.
  • The impression and click keyed scalars for the Sponsored Top Sites are correctly registered in "about:telemetry" page.