Bug 1382735
Opened 8 years ago
Closed 8 years ago
Review data collection for Top Stories in Activity Stream
Categories
(Firefox :: New Tab Page, enhancement, P2)
Tracking
RESOLVED
FIXED
Firefox 57
| Version | Tracking | Status |
|---|---|---|
| firefox57 | --- | fixed |
| firefox58 | --- | unaffected |
People
(Reporter: nanj, Unassigned)
# Motivation
Top Stories (powered by Pocket) will ship as one of the features in the Activity Stream system add-on. We would like to collect metrics in order to:
* Analyze users' engagement with the recommended stories in Activity Stream
* Allow the Pocket team to optimize the recommended content based on that engagement
# Target Audience
All users of Activity Stream
# Opt-out and User Control
All Activity Stream users can opt out of this data collection by either disabling the Top Stories feature in Activity Stream or disabling telemetry in Activity Stream.
# What to Collect
Similar to what we currently collect from the newtab page (i.e., Tiles), there are two types of pings that we're interested in when the Top Stories feature is enabled in Activity Stream. For more details, see https://github.com/mozilla/activity-stream/pull/2911/files#diff-f92d07e23fd9de1eb2f4ad96bd041287R99
1) Impression ping. The browser reports an impression ping to the metrics server (i.e., Onyx for Tiles) whenever the user opens a newtab. The payload contains the following fields:
```js
{
"action": "activity_stream_impression_stats",
"client_id": "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
"session_id": "005deed0-e3e4-4c02-a041-17405fd703f6",
"addon_version": "1.0.12",
"locale": "en-US",
"source": "pocket",
"page": "about:newtab",
// "id" is a GUID generated by Pocket to record the recommended story served in the newtab
"tiles": [{"id": 10000}, {"id": 10001}, {"id": 10002}]
}
```
2) Click/Block/Save_to_Pocket ping. The browser reports a ping when the user interacts with the Top Stories tiles.
```js
{
"action": "activity_stream_impression_stats",
"client_id": "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
"session_id": "005deed0-e3e4-4c02-a041-17405fd703f6",
"addon_version": "1.0.12",
"locale": "en-US",
"source": "pocket",
"page": "about:newtab",
// "pos" is the 0-based index to record the tile's position in the Top Story section.
"tiles": [{"id": 10000, "pos": 0}],
// 0-based index into the "tiles" list of the tile the user just interacted with.
// The key is exactly one of "click", "block", or "pocket".
"click|block|pocket": 0
}
```
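As a rough illustration of how the two payloads above relate, here is a minimal sketch of helpers that assemble them from a shared set of fields. The function names (`buildImpressionPing`, `buildInteractionPing`) and the `base` argument are hypothetical and are not taken from the Activity Stream code; only the field names and values come from the examples above.

```js
// Hypothetical helpers for illustration only; not the Activity Stream implementation.
// `base` is assumed to carry the shared fields shown above
// (client_id, session_id, addon_version, locale, page).
const INTERACTIONS = ["click", "block", "pocket"];

function buildImpressionPing(base, storyIds) {
  return {
    ...base,
    action: "activity_stream_impression_stats",
    source: "pocket",
    // One entry per story served on the newtab page.
    tiles: storyIds.map(id => ({id})),
  };
}

function buildInteractionPing(base, interaction, storyId, pos) {
  if (!INTERACTIONS.includes(interaction)) {
    throw new Error(`Unknown interaction: ${interaction}`);
  }
  return {
    ...base,
    action: "activity_stream_impression_stats",
    source: "pocket",
    // "pos" is the 0-based position of the tile in the Top Stories section.
    tiles: [{id: storyId, pos}],
    // Index into "tiles"; the list has a single entry here, so it is 0.
    [interaction]: 0,
  };
}

const exampleBase = {
  client_id: "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
  session_id: "005deed0-e3e4-4c02-a041-17405fd703f6",
  addon_version: "1.0.12",
  locale: "en-US",
  page: "about:newtab",
};
console.log(JSON.stringify(buildInteractionPing(exampleBase, "click", 10000, 0)));
```

Keeping the interaction type as a computed key mirrors the `"click|block|pocket"` placeholder in the example payload, where exactly one of the three keys is present per ping.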
# Correctness, Monitoring, and Usage
The Activity Stream engineering team is responsible for ensuring the correctness of these metrics, which both the Activity Stream and Pocket teams will use to analyze user engagement and optimize content.
This data collection reuses the data pipeline for Tiles, which is managed and monitored by the Cloud Services team. We will apply the same data retention policy (6 months) as for Tiles.
Comment 1•8 years ago (Reporter)
Per Francois, this is [Category 3](https://wiki.mozilla.org/Firefox/Data_Collection#Data_Collection_Categories) data, which requires a full review. bsmedberg, could you review this please?
Note that the PR is on GitHub (https://github.com/mozilla/activity-stream/pull/2911); we merge code into m-c regularly.
Flags: needinfo?(benjamin)
Updated•8 years ago
Component: Activity Streams: General → Activity Streams: Newtab
Comment 2•8 years ago
I'll send an email about the process here under separate cover. I think I understand the data, but I want to confirm the following points:
* Mozilla can map the article IDs back to particular topsite URLs that we provided (via Pocket)
* We are recording
** each time we show the link
** each time the user clicks the link
** if/when the user blocks that link
* This data collection is tied to a client ID: is this the same telemetry client ID used for other telemetry?
* A user can disable this feature entirely by unchecking top stories in the newtab page
* Unchecking the "Allow Nightly to automatically send technical and interaction data to Mozilla" box in preferences will keep the feature enabled but will disable the data collection.
Assuming these things are true, this is
1) type 3 data
2) probably not ok as-is because we're tying it to the telemetry clientID
3) may be ok with some changes, but we need to talk through the possible changes which would allow you to solve your business problem while providing as much risk reduction as possible
4) ultimately, whether to accept the risk here is a product decision.
5) will almost-certainly require changes in the Firefox privacy notice
Flags: needinfo?(benjamin)
Updated•8 years ago
Flags: needinfo?(najiang)
Comment 3•8 years ago (Reporter)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #2)
> * Mozilla can map the article IDs back to particular topsite URLs that we
> provided (via Pocket)
True.
> * We are recording
> ** each time we show the link
> ** each time the user clicks the link
> ** if/when the user blocks that link
True.
> * This data collection is tied to a client ID: is this the same telemetry
> client ID used for other telemetry?
Yes.
> * A user can disable this feature entirely by unchecking top stories in the
> newtab page
Yes.
> * Unchecking the "Allow Nightly to automatically send technical and
> interaction data to Mozilla" box in preferences will keep the feature
> enabled but will disable the data collection.
Just to be clear, Activity Stream telemetry is controlled by "datareporting.healthreport.uploadEnabled" *AND* "browser.newtabpage.activity-stream.telemetry"; both must be enabled for data to be sent. Beyond Nightly, we eventually plan to enable this telemetry on all channels.
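To make the two-pref condition concrete, here is a minimal sketch of the check. The pref names are the ones quoted above; `isTelemetryEnabled` and the `getBoolPref` lookup are hypothetical stand-ins for whatever preference API the add-on actually uses.

```js
// Pref names as quoted in this comment.
const UPLOAD_PREF = "datareporting.healthreport.uploadEnabled";
const AS_TELEMETRY_PREF = "browser.newtabpage.activity-stream.telemetry";

// Hypothetical helper: `getBoolPref` is an assumed lookup function
// (pref name -> boolean), not a real Firefox API signature.
function isTelemetryEnabled(getBoolPref) {
  // Both prefs must be true; turning off either one disables the pings.
  return Boolean(getBoolPref(UPLOAD_PREF)) && Boolean(getBoolPref(AS_TELEMETRY_PREF));
}

// Example with an in-memory pref store.
const examplePrefs = new Map([
  [UPLOAD_PREF, true],
  [AS_TELEMETRY_PREF, false],
]);
console.log(isTelemetryEnabled(name => examplePrefs.get(name)));
```

In this example the Activity Stream pref is off, so the check reports that telemetry is disabled even though the global upload pref is on.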
Flags: needinfo?(najiang)
Comment 4•8 years ago (Reporter)
Thanks everyone for the feedback!
Here are a few notes from that meeting:
* The impression ping is Category 2.
* The client ID should not be included in the click/block/save-to-pocket ping. Instead, the Activity Stream and Pocket teams will come up with a solution that ensures uniqueness without relying on any client GUID. This also downgrades this ping to Category 2, correct?
:francois, :ellee, does that make sense?
Flags: needinfo?(francois)
Flags: needinfo?(ellee)
Comment 5•8 years ago
(In reply to Nan Jiang [:nanj] from comment #4)
> * The client ID should not be included in the click/block/save-to-pocket
> ping. Instead, the Activity Stream and Pocket teams will come up with a
> solution that ensures uniqueness without relying on any client GUID. This
> also downgrades this ping to Category 2, correct?
In my own notes from that meeting I have Elvin saying that removing the client_id would not suffice because we're still recording what URLs they are visiting.
My understanding is that removing user identifiers doesn't change the fact that partial browsing history is Category 3 data.
Flags: needinfo?(francois)
Comment 6•8 years ago
Hi, it doesn't cleanly 'downgrade' the data to Category 2 but it makes it much less concerning.
I did say that removing clientid may not be enough, but after discussing it further at the meeting I revised my assessment towards the end, and I believe where we left it was that Nate and Tim were going to explore how difficult it would be to create a separate ping, with no identifiers, that just communicated the click/link information.
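For illustration, a separate identifier-free ping along the lines discussed could be derived roughly as follows. This is only a sketch under stated assumptions: the thread does not specify the final design, `stripIdentifiers` is a hypothetical name, and treating `session_id` as an identifier alongside `client_id` is an assumption, not something decided here.

```js
// Assumption: which fields count as identifiers. Only client_id is named
// explicitly in this thread; session_id is included here as a guess.
const IDENTIFIER_FIELDS = ["client_id", "session_id"];

// Hypothetical helper: returns a copy of the ping without identifier fields.
// The input object is left unmodified.
function stripIdentifiers(ping) {
  const copy = {...ping};
  for (const field of IDENTIFIER_FIELDS) {
    delete copy[field];
  }
  return copy;
}

const exampleClickPing = {
  action: "activity_stream_impression_stats",
  client_id: "26288a14-5cc4-d14f-ae0a-bb01ef45be9c",
  session_id: "005deed0-e3e4-4c02-a041-17405fd703f6",
  source: "pocket",
  tiles: [{id: 10000, pos: 0}],
  click: 0,
};
console.log(JSON.stringify(stripIdentifiers(exampleClickPing)));
```

The remaining payload still carries the story ID and position, which is the click/link information the separate ping is meant to communicate.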
Flags: needinfo?(ellee)
Comment 7•8 years ago
Did this review happen as part of https://github.com/mozilla/activity-stream/pull/2911 or elsewhere?
Flags: needinfo?(najiang)
Comment 8•8 years ago (Reporter)
Yes, this is closed by PR 2911.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(najiang)
Resolution: --- → FIXED
Updated•8 years ago
Target Milestone: --- → Firefox 57
Updated•6 years ago (Assignee)
Component: Activity Streams: Newtab → New Tab Page