Closed Bug 1382735 Opened 8 years ago Closed 8 years ago

Review data collection for Top Stories in Activity Stream

Categories

(Firefox :: New Tab Page, enhancement, P2)

Tracking

RESOLVED FIXED
Firefox 57
Tracking Status
firefox57 --- fixed
firefox58 --- unaffected

People

(Reporter: nanj, Unassigned)

Details

# Motivation

As Top Stories (powered by Pocket) will ship as one of the features in the Activity Stream system addon, we're looking to collect various metrics to:

* Analyze users' engagement with the recommended stories in Activity Stream
* Allow the Pocket team to optimize the recommended content based on that engagement

# Target Audience

All users of Activity Stream

# Opt-out and User Control

All Activity Stream users can opt out of this data collection by either disabling the Top Stories feature in Activity Stream or disabling telemetry in Activity Stream.

# What to Collect

Similar to what we currently collect from the newtab page (i.e. Tiles), there are two types of pings that we're interested in if the Top Stories feature is enabled in Activity Stream. For more details, see https://github.com/mozilla/activity-stream/pull/2911/files#diff-f92d07e23fd9de1eb2f4ad96bd041287R99

1) Impression ping. The browser reports an impression ping to the metrics server (i.e. Onyx for Tiles) whenever the user opens a newtab. The payload contains the following fields:

```js
{
  "action": "activity_stream_impression_stats",
  "client_id": "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
  "session_id": "005deed0-e3e4-4c02-a041-17405fd703f6",
  "addon_version": "1.0.12",
  "locale": "en-US",
  "source": "pocket",
  "page": "about:newtab",
  // "id" is a GUID generated by Pocket to record the recommended story served in the newtab
  "tiles": [{"id": 10000}, {"id": 10001}, {"id": 10002}]
}
```

2) Click/Block/Save_to_Pocket ping. The browser reports a ping when the user interacts with the Top Stories tiles.

```js
{
  "action": "activity_stream_impression_stats",
  "client_id": "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
  "session_id": "005deed0-e3e4-4c02-a041-17405fd703f6",
  "addon_version": "1.0.12",
  "locale": "en-US",
  "source": "pocket",
  "page": "about:newtab",
  // "pos" is the 0-based index to record the tile's position in the Top Stories section.
  "tiles": [{"id": 10000, "pos": 0}],
  // A 0-based index to record which tile in the "tiles" list the user just interacted with.
  "click|block|pocket": 0
}
```

# Correctness, Monitoring, and Usage

The Activity Stream engineering team is responsible for ensuring the correctness of these metrics, which in turn will be used by both the A-S team and the Pocket team to analyze user engagement and optimize content.

This data collection reuses the data pipeline for Tiles, which is managed and monitored by the Cloud-Services team. We will apply the same data retention policy (6 months) to it as to Tiles.
Per Francois, this is [Category 3](https://wiki.mozilla.org/Firefox/Data_Collection#Data_Collection_Categories) data, which requires a full review. bsmedberg, could you review this please? Note that the PR is on GH (https://github.com/mozilla/activity-stream/pull/2911), we merge code into m-c regularly.
Flags: needinfo?(benjamin)
Component: Activity Streams: General → Activity Streams: Newtab
I'll send an email about the process here under separate cover. I think I understand the data, but I want to clarify the following points:

* Mozilla can map the article IDs back to the particular topsite URLs that we provided (via Pocket)
* We are recording
** each time we show the link
** each time the user clicks the link
** if/when the user blocks that link
* This data collection is tied to a client ID: is this the same telemetry client ID used for other telemetry?
* A user can disable this feature entirely by unchecking top stories in the newtab page
* Unchecking the "Allow Nightly to automatically send technical and interaction data to Mozilla" box in preferences will keep the feature enabled but will disable the data collection.

Assuming these things are true, this
1) is type 3 data
2) is probably not ok as-is because we're tying it to the telemetry clientID
3) may be ok with some changes, but we need to talk through the possible changes which would allow you to solve your business problem while providing as much risk reduction as possible
4) is ultimately a product decision: whether to accept the risk here
5) will almost certainly require changes in the Firefox privacy notice
Flags: needinfo?(benjamin)
Flags: needinfo?(najiang)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #2)
> * Mozilla can map the article IDs back to the particular topsite URLs that we
> provided (via Pocket)

True.

> * We are recording
> ** each time we show the link
> ** each time the user clicks the link
> ** if/when the user blocks that link

True.

> * This data collection is tied to a client ID: is this the same telemetry
> client ID used for other telemetry?

Yes.

> * A user can disable this feature entirely by unchecking top stories in the
> newtab page

Yes.

> * Unchecking the "Allow Nightly to automatically send technical and
> interaction data to Mozilla" box in preferences will keep the feature
> enabled but will disable the data collection.

Just to be clear, Activity Stream telemetry is controlled by "datareporting.healthreport.uploadEnabled" *AND* "browser.newtabpage.activity-stream.telemetry". Beyond Nightly, we're eventually looking to enable this telemetry on all channels.
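As a rough illustration of the *AND* condition in the comment above, the gating could be expressed like this. It is a sketch under assumptions: `prefs` is modeled as a plain `Map` rather than Firefox's real preferences service, and `isActivityStreamTelemetryEnabled` is a hypothetical name; only the two pref names come from the comment.

```js
// The two pref names are quoted from the comment above.
const UPLOAD_PREF = "datareporting.healthreport.uploadEnabled";
const AS_TELEMETRY_PREF = "browser.newtabpage.activity-stream.telemetry";

// Hypothetical helper: `prefs` is a plain Map standing in for the browser's
// preference store (an assumption made to keep the sketch self-contained).
// Pings should only be sent when BOTH prefs are true.
function isActivityStreamTelemetryEnabled(prefs) {
  return prefs.get(UPLOAD_PREF) === true && prefs.get(AS_TELEMETRY_PREF) === true;
}
```

Treating a missing pref as disabled is a design choice of this sketch, not something stated in the thread.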
Flags: needinfo?(najiang)
Thanks everyone for the feedback! Here are a few notes from that meeting:

* The impression ping is Category 2.
* The ClientID should not be included in the click/block/save-to-pocket ping. Rather, Activity-Stream and Pocket team will come up with a solution to ensure the uniqueness without relying on any client GUID. This also makes this ping downgrade to be Category 2, correct?

:francois, :ellee, does that make sense?
Flags: needinfo?(francois)
Flags: needinfo?(ellee)
(In reply to Nan Jiang [:nanj] from comment #4)
> * The ClientID should not be included in the click/block/save-to-pocket
> ping. Rather, Activity-Stream and Pocket team will come up with a solution
> to ensure the uniqueness without relying on any client GUID. This also makes
> this ping downgrade to be Category 2, correct?

In my own notes from that meeting I have Elvin saying that removing the client_id would not suffice because we're still recording what URLs they are visiting. My understanding is that removing user identifiers doesn't change the fact that partial browsing history is Category 3 data.
Flags: needinfo?(francois)
Hi, it doesn't cleanly 'downgrade' the data to Category 2 but it makes it much less concerning. I did say that removing clientid may not be enough, but after discussing it further at the meeting I revised my assessment towards the end, and I believe where we left it was that Nate and Tim were going to explore how difficult it would be to create a separate ping, with no identifiers, that just communicated the click/link information.
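To make the "separate ping, with no identifiers" idea concrete, one possible shape is sketched below. This is not the solution the teams settled on; the helper name `stripIdentifiers` is hypothetical, and treating `session_id` as an identifier alongside `client_id` is an assumption (the thread only names the client ID explicitly).

```js
// Assumption: both of these fields count as identifiers. The thread only
// explicitly discusses removing client_id.
const IDENTIFIER_FIELDS = ["client_id", "session_id"];

// Hypothetical helper: returns a copy of the ping without identifier fields,
// leaving the original object untouched.
function stripIdentifiers(ping) {
  const copy = { ...ping };
  for (const field of IDENTIFIER_FIELDS) {
    delete copy[field];
  }
  return copy;
}
```

Note that, per the preceding comments, dropping identifiers alone may not change the data category, since the ping still records which stories were clicked.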
Flags: needinfo?(ellee)
Priority: -- → P2
Did this review happen as part of https://github.com/mozilla/activity-stream/pull/2911 or elsewhere?
Flags: needinfo?(najiang)
Yes, this is closed by PR 2911.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(najiang)
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
Component: Activity Streams: Newtab → New Tab Page