Closed Bug 1203577 Opened 4 years ago Closed 4 years ago

Remove hardcoded adgroup buckets.

Categories

(Firefox :: New Tab Page, defect)

40 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 43
Tracking Status
firefox41 + fixed
firefox42 --- fixed
firefox43 --- fixed

People

(Reporter: emtwo, Assigned: emtwo)

References

Details

Attachments

(2 files, 1 obsolete file)

No description provided.
There were a lot of meetings/discussions/emails that resulted in bug 1152145. Is there some documented approval to the community that this is being done? I believe we were recommended of getting buy-in from dev-planning for this type of change.
Comment on attachment 8660148 [details] [diff] [review]
Remove hardcoded buckets and require adgroup_name from server

The .rst would need to be updated and get feedback approval with the additional scope of data collection. I believe the main concern was removing this safety, Mozilla could query anybody's history. I don't have the full context of what has changed since we did implement the checks that are being removed.

http://mxr.mozilla.org/mozilla-central/source/browser/docs/DirectoryLinksProvider.rst#195

bsmedberg/merwin, who should we ?feedback for this change?
Flags: needinfo?(merwin)
Flags: needinfo?(benjamin)
Added .rst change.

Note: This bug was initially requested by dougt so I presume he had some discussions already about the implications of removing hardcoded categories and can chime in here.
Attachment #8660148 - Attachment is obsolete: true
Attachment #8660148 - Flags: review?(edilee)
Attachment #8660180 - Flags: review?(edilee)
Regarding comment 2, Ed, I was the point on making this requirement.  Things have changed.  We are in the middle of moving to a faster delivering of the newtabpage under the GoFaster work.  Part of the payload that is possible to remote is not only these buckets, but the test against these buckets.  So, this belt and suspender just isn't going to work anymore.  Therefore, I am advocating that we remove this stuff now.  Given that you put up patches, I assume you agree with my assessment.  Feel free to make whatever dev.planning post you'd like.  Happy to help you draft it.
I'm comfortable with this change, given that Doug established the requirement previously, that it doesn't involve any additional data collection, and that I don't think it materially changes the privacy analysis. There are a number of ongoing Tiles efforts we need to check in on and discuss, but this one is fine from my perspective. I'd defer to bsmedberg regarding whether he want updated documentation or further discussion.
Flags: needinfo?(merwin)
(In reply to merwin from comment #6)
> I don't think it materially changes the privacy analysis
The requirement of hardcoded buckets came about to address various concerned raised by dougt and others, e.g., the potential of sending very sensitive data to Mozilla servers without users even noticing until it's too late. From comment 5, it sounds like we'll be sending more than what we have been in order to do tests without any technical safeguards on what the tests can do.

When the feature was originally going to go to Release, these and related concerns were serious enough to have people want to delay it. I hear that things have changed, but I don't have the context to be able to explain why it's now okay to do exactly the very concerning data collection that would have killed the feature and/or cause irreparable harm to Mozilla's brand.

I could make a post to dev.planning about the concerns raised from before, but I don't have the context to explain why it's okay now other than "Things have changed."
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

I did not think the hardcoding was necessary to begin with, but it was good defense-in-depth while this was being first launched.

My concerns here is around transparency and testing.

Transparency: the lists that the client downloads should be publicly available and there should be a permanent archive of them. Will they be maintained in some sort of source control, or is this in a database? What are your plans to achieve transparency?

List selection and testing: is there a public policy about how lists will be selected to avoid user history phishing? I am not the data steward for that selection process, but I do need to see that there is somebody responsible for it and there is a public policy around that process. If we plan to test groupings on small groups of users, how will that be accomplished? I'd like to focus early testing on the opt-in telemetry audience as much as possible.
Flags: needinfo?(benjamin) → needinfo?(edilee)
Fyi, I'm not working on Tiles, so I don't have the context to answer the questions from comment 8... dougt/kghim?
Flags: needinfo?(edilee) → needinfo?(kghim)
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

r=Mardak

>+++ b/browser/docs/DirectoryLinksProvider.rst
> - ``frecent_sites`` - array of strings of the sites that can trigger showing a
>   Suggested Tile if the user has the site in one of the top 100 most-frecent
>-  pages. Only preapproved array of strings that are hardcoded into the
>-  DirectoryLinksProvider module are allowed.
>+  pages.
The patch is fine for the removal of the hardcoded functionality. This still needs f+ for the data collection change -- probably from bsmedberg once his questions from comment 8 are answered. Depending on the answers, it might be useful to include some of it in the .rst.

>+++ b/browser/modules/DirectoryLinksProvider.jsm
> // Suggested sites should always have an adgroup name.
Probably should say "must" instead of "should always" ;)
Attachment #8660180 - Flags: review?(edilee) → review+
(In reply to Ed Lee :Mardak from comment #10)
> > // Suggested sites should always have an adgroup name.
> Probably should say "must" instead of "should always" ;)
Actually, on second thought, I'm not sure if this is the exact logic we want, but it's fine for now.

We could have a custom server provided description and there would be no need for the adgroup name. So we could check for either... unless the custom description makes use of the adgroup name.... Probably just file a bug for this.
And to add some context I got from IRC: the plan is to get this uplifted so that it'll be available in the next Beta (42) going live next week.

I would assume this is so tests can be run Suggested Tiles with arbitrary lists on the Beta audience.
> the lists that the client downloads should be publicly available and there should be a permanent archive of them. Will they be maintained in some sort of source control, or is this in a database? What are your plans to achieve transparency?

Yes. You can see all of the buckets for each region and for all products here:

    http://rlr.github.io/webtiles-previewer-redux/

We will have date range support eventually in this engineering app so that anyone can see what we ran during a given period.

The current set of add ads and buckets is here:

   https://tiles.services.mozilla.com/v3/links/fetch/en-US/release (you can change the channel and the product)

As for the historic tile data (including the set of urls in the buckets), we don't have anything like that right now.  I'll ask Tim to build something like that into Splice 2 and expose that in the above engineeering tool.


> List selection and testing: is there a public policy about how lists will be selected to avoid user history phishing? I am not the data steward for that selection process, but I do need to see that there is somebody responsible for it and there is a public policy around that process.

Yes, that is Marshall.  He can explain how we protect user's privacy wrt tile selection.  


> If we plan to test groupings on small groups of users, how will that be accomplished? I'd like to focus early testing on the opt-in telemetry audience as much as possible.

Good suggested. However we're not testing on small groups yet. We might use the remote new tab work to do something like this.
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

ok, LGTM
Attachment #8660180 - Flags: feedback+
> 
> > List selection and testing: is there a public policy about how lists will be selected to avoid user history phishing? I am not the data steward for that selection process, but I do need to see that there is somebody responsible for it and there is a public policy around that process.
> 
> Yes, that is Marshall.  He can explain how we protect user's privacy wrt
> tile selection.  

Here is the criteria.  Kevin owns the process for this.
https://docs.google.com/document/d/1pXzKqn6Gwr4WKEHruobL_0UO_9nMDlGixTOCwqonPg4/edit#

Here is where we describe this publicly.
https://blog.mozilla.org/privacy/2015/05/21/putting-our-data-privacy-principles-into-action/
These changes are mostly for the expectation that adgroup_name will always be there and getFrecentSitesName() no longer exists.
Attachment #8660954 - Flags: review?(edilee)
Attachment #8660954 - Flags: review?(edilee) → review+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #8)

> Transparency: the lists that the client downloads should be publicly
> available and there should be a permanent archive of them. Will they be
> maintained in some sort of source control, or is this in a database? What
> are your plans to achieve transparency?

Once we have fully tested out the performance aspects of the audience buckets, I plan to make them public. The vision is to have direct user input influence these buckets and be able to add and correct additional URLs and keywords.
Flags: needinfo?(kghim)
> Once we have fully tested out the performance aspects of the audience buckets, I plan to make them public.

To avoid confusion, these buckets *are* public.  They are available at the url I referenced in comment 13.


>  The vision is to have direct user input influence these buckets and be able to add and correct additional URLs and keywords.

+1 Long term plan. Very consistent with user being in control.
https://hg.mozilla.org/mozilla-central/rev/14ba12f4b715
https://hg.mozilla.org/mozilla-central/rev/08cf2a208a60
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 43
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

Approval Request Comment
[Feature/regressing bug #]: 1152145

[User impact if declined]: No direct user impact but with this change we have more flexibility to provide users with a wider variety of tile suggestions.

[Describe test coverage new/current, TreeHerder]: Updates to several mochitests and xpcshell tests in the attached patches

[Risks and why]: Low risk: Removing the hardcoded buckets does not change our logic in suggesting tiles and there are several tests that cover these changes.

[String/UUID change made/needed]: N/A
Attachment #8660180 - Flags: approval-mozilla-aurora?
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

Let's take it, we have time to fix potential regressions and it will improve the feature maintainability.
Attachment #8660180 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Tracked for 41 to ensure if we take the fix, it lands before RC2.
Marina, could you please nominate this patch for uplift to Beta? I will gtb RC2 in the next hour or so. Thanks!
Flags: needinfo?(msamuel)
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

Approval Request Comment
[Feature/regressing bug #]: bug 1152145

[User impact if declined]: No direct user impact but with this change we have more flexibility to provide users with a wider variety of tile suggestions.

[Describe test coverage new/current, TreeHerder]: Updates to several mochitests and xpcshell tests in the attached patches. Landed in m-c and aurora.

[Risks and why]: Low risk: Removing the hardcoded buckets does not change our logic in suggesting tiles and there are several tests that cover these changes.

[String/UUID change made/needed]: N/A
Flags: needinfo?(msamuel)
Attachment #8660180 - Flags: approval-mozilla-beta?
Comment on attachment 8660180 [details] [diff] [review]
v2: Remove hardcoded buckets and require adgroup_name from server

[Triage Comment] This patch will improve the suggested tiles experience by showing only relevant tiles to end-users and fix a recent glitch. After discussing this issue with DougT, it seems like a must fix for 41. Let's uplift to m-r and include in 41 RC2.
Attachment #8660180 - Flags: approval-mozilla-beta? → approval-mozilla-release+
Marina, can you please confirm that we only need to uplift patch v2 on to 41 (moz-release branch)?
Flags: needinfo?(msamuel)
I'm getting failures like https://treeherder.mozilla.org/logviewer.html#?job_id=99952&repo=mozilla-release with just the one patch uplifted. Looks like the other patch touches that test.
Ritu, both patches will need to be uplifted together.
Flags: needinfo?(msamuel)
Comment on attachment 8660954 [details] [diff] [review]
v1: DirectoryLinksProvider test updates

Sorry for the confusion, I only made the request for the first patch. Adding the request here as well.
Attachment #8660954 - Flags: approval-mozilla-release?
Attachment #8660954 - Flags: approval-mozilla-beta?
Comment on attachment 8660954 [details] [diff] [review]
v1: DirectoryLinksProvider test updates

Approved for uplift to m-r as this is a test only change. This is perhaps the reason why the XPCshell tests were failing on mozilla-release.
Attachment #8660954 - Flags: approval-mozilla-release?
Attachment #8660954 - Flags: approval-mozilla-release+
Attachment #8660954 - Flags: approval-mozilla-beta?
You need to log in before you can comment on or make changes to this bug.