Closed Bug 1522309 Opened 5 years ago Closed 5 years ago

Cookie Restrictions using the Strict list study

Categories

(Data Science :: Experiment Collaboration, task)

x86_64
Other
task
Not set
normal
Points:
3

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: tanvi, Assigned: flawrence)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 10 obsolete files)

19 bytes, text/plain (tdsmith: review+)
106.47 KB, application/zip
110.77 KB, application/x-xpinstall
110.77 KB, application/x-xpinstall

Brief Description of the request:
We would like to run a pref-flip-like study that blocks tracking cookies from the strict list and the basic list, in two separate cohorts. Since there are two different prefs for this, we have written an add-on that can flip multiple prefs at once.

Any timelines for the request or how this fits into roadmaps: Start the study in Q1 and let it run into Q2 for 8 weeks+.

Links to any assets (e.g Start of a PHD, BRD; any document that helps describe the project):
Description of Basic and Strict lists:
https://docs.google.com/document/d/1yvLLVUS5Bs29a20m2INz2Wt47YB8lO6UvR-jSyslFl4/edit

This will be similar to a current pref-flip study that turns on Cookie Restrictions for the Basic list: https://docs.google.com/document/d/1Lx688hjg7XOU0mhSXLMl2sFK5xsJIVx5406IpaqvvBo/edit?ts=5be0603e

Name of Data Scientist (If Applicable): flawrence

More details:
3-5 Cohorts, including the Control.
Cohort 1 - control.
Cohort 2 - block tracking cookies with basic list.
Cohort 3 - block tracking cookies with strict list.
[optional] Cohort 4 - block all third party cookies.
[optional] Cohort 5 - turn on tracking protection with the basic list, but don't block any cookies.

Note that this is very similar to the work that Felix is doing for an ongoing pref-flip study and an upcoming breakage study.

I spoke to Felix about this to get his opinion. I think we can stick to just 2 branches. The percentages are not set in stone, but an example:

branch 1 - control - 0.4% of the release population.

  • no pref changes

branch 2 - block tracking cookies with strict list - 0.4% of the release population.

  • network.cookie.cookieBehavior = 4 (set this pref back to default on uninstall)
  • urlclassifier.trackingAnnotationTable = test-track-simple,base-track-digest256,content-track-digest256 (set this pref back to default on uninstall)
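As a rough illustration of the two branches above, here is a minimal Python sketch of how the branch definitions and a stable 0.4% / 0.4% split could be modelled. The dictionary layout, the `assign_branch` helper, and the salt string are all assumptions made for illustration; they are not the add-on's actual `variations.json` schema or Normandy's sampling code.

```python
import hashlib

# Branch definitions as described in the comment above. The layout is an
# illustrative assumption, not the add-on's real variations.json schema.
BRANCHES = {
    "control": {"share": 0.004, "prefs": {}},
    "strict-list": {
        "share": 0.004,
        "prefs": {
            "network.cookie.cookieBehavior": 4,
            "urlclassifier.trackingAnnotationTable":
                "test-track-simple,base-track-digest256,content-track-digest256",
        },
    },
}


def assign_branch(client_id, salt="strict-list-study"):
    """Map a client to a branch name, or None, from a stable hash of its ID."""
    digest = hashlib.sha256(f"{salt}:{client_id}".encode()).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    position = int(digest[:8], 16) / 16**8
    upper = 0.0
    for name, branch in BRANCHES.items():
        upper += branch["share"]
        if position < upper:
            return name
    return None  # Client is not enrolled.
```

Because the assignment depends only on the client ID and the salt, the same client always lands in the same branch, which is the property a real sampler would also need.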

Nihanth, I think this is the info you need to finish the manifest file for your add-on and have it ready for QA testing. Please enter these prefs/branches and let me know if you have any questions. Once it's ready, let Tony know. Thanks!

Yes, this is extremely similar to bug 1506908. In fact we can simply reuse its power analysis and dedicate 0.4% to each branch. And I should be able to re-use my analysis script too. (So I'm grabbing this bug :P)

I would recommend not including branches 4 and 5: we learned in #1506908 that blocking all third party cookies does increase churn; we don't need to re-learn that at the expense of our users (we now know that our metrics are indeed sensitive enough to detect some negative effects related to cookie blocking).

I would also recommend leaving out branch 2; we are running three separate experiments comparing branch 2 to branch 1 (bug 1506908, bug 1518855, bug 1513103), and we have good reason to believe that branch 2's effect is going to be somewhere between branch 1 and branch 3; let's compare the strict list to no blocking because that should be the largest effect, and we want to avoid boiling frogs.

Implementation gotchas to avoid:

  • We want the control branch not to block tracking cookies even if a new version of Firefox rolls out that blocks tracking cookies. These users should be reset to the new default when unenrolled.
  • This conflicts with the four experiments detailed in the above three bugs; we need the enrollments to be mutually exclusive.
Assignee: nobody → flawrence
Status: NEW → ASSIGNED
Points: --- → 3

One more gotcha to avoid:

  • Make sure we exclude users with non-default settings and ad blockers installed during enrollment.

(In reply to Tanvi Vyas[:tanvi] from comment #4)

Do you know if this is something that can be done from Normandy? Or is this enrollment condition something I need to implement?

Flags: needinfo?(tanvi)

(In reply to Nihanth Subramanya [:nhnt11] from comment #5)

I believe this is all done with Normandy.

Flags: needinfo?(tanvi)

Yes, looks like we can filter out active addons with the addons object, and preferences with preferenceIsUserSet(prefkey).
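For readers unfamiliar with the enrollment condition, the logic amounts to something like the following Python sketch. It only mirrors the intent (no user-set target prefs, no ad blocker among active add-ons); the function name, the inputs, and the add-on IDs are hypothetical, and the real filter would be written as a Normandy filter expression rather than Python.

```python
# Prefs the study flips; a user value on any of them disqualifies the client.
TARGET_PREFS = (
    "network.cookie.cookieBehavior",
    "urlclassifier.trackingAnnotationTable",
)

# Hypothetical exclusion list of ad-blocker add-on IDs. The real list would
# have to be agreed on for the study.
AD_BLOCKER_IDS = {"adblocker-one@example.com", "adblocker-two@example.com"}


def is_eligible(user_set_prefs, active_addon_ids):
    """Return True if the client has default target prefs and no ad blocker."""
    has_user_pref = any(pref in user_set_prefs for pref in TARGET_PREFS)
    has_ad_blocker = any(addon in AD_BLOCKER_IDS for addon in active_addon_ids)
    return not has_user_pref and not has_ad_blocker
```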

Hi Felix, I had a couple of questions that I think you're the right person to ask:

  1. We seem to have decided that if any of the target prefs were modified by the user during the study, then at the end of the study, while cleaning up, we simply don't clean up those user-touched prefs. Shouldn't we end the study instead since the user at that point is a compromised data point for the study?

We can detect if the user modifies any of the target prefs at runtime (using a pref listener) and immediately take action, or we can check at uninstall-time and take action then. The former is probably the better option for data integrity (e.g. if the user changes the pref for a couple of days and then changes it back, we'll miss that at uninstall-time); the latter is the easier of the two to implement.

  2. If, at install-time, any of the target prefs already have user values, we don't touch any prefs - i.e. abort mission. In this case as well, should we end the study and uninstall the add-on? Probably not too important since I believe we already filter these users with non-default prefs via Normandy, but the add-on does a second check as well.
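To make the trade-off in the first question concrete, here is a small Python sketch contrasting the two detection strategies. It models a pref's value over time as a plain list; the function names and the sample history are illustrative assumptions, not code from the add-on.

```python
def changed_at_uninstall(study_value, history):
    """Uninstall-time check: only the final observed value is visible."""
    return history[-1] != study_value


def changed_at_any_time(study_value, history):
    """Listener-style check: every observed value is compared."""
    return any(value != study_value for value in history)


# The user flips the pref away from the study value and later restores it.
history = [4, 0, 0, 4]
missed = not changed_at_uninstall(4, history)   # the final value looks untouched
caught = changed_at_any_time(4, history)        # the listener saw the change
```

A change-and-revert like this is exactly the case the uninstall-time check cannot see, at the cost of the listener being more work to implement.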

Let me know if there's any context that I failed to mention for these questions to make full sense. Thanks!

Flags: needinfo?(flawrence)

Oh, and one more thing: is there any special telemetry that you'd like this addon to send? For example, I think in the past, we've sent pref values every day via Telemetry pings which enable us to check if users have been messing with them - I'm not 100% sure though, I wasn't involved in any such previous studies.

  1. Once someone goes into the experiment, they must not come out of the experiment until it ends for everyone - otherwise you get bias. If we unenroll them from the experiment then my naive queries would count them as having churned, which is not true. So let's not do that.

If we believed that large numbers (e.g. 0.5% of experiment subjects) were going to change their prefs during the experiment, then yes it would be worth monitoring the preferences to see if they changed them - so that we could use this as a direct signal that the user was unhappy with their branch.

Remember that we're testing the effect of a change to the defaults - if someone overrides the defaults we set, then that doesn't mean we should ignore them as an irrelevant data point. As in, we're testing "what would happen if we defaulted to blocking strict list cookies", not "what would happen if we always blocked strict list cookies".

  2. The ideal for me is that if there are user values then the user never gets enrolled and never sends in any data saying they were enrolled. The next most ideal case is that the analytics treats them as regular test subjects and they add a small amount of unbiased white noise to the experiment. I don't take "uninstall" events at face value because I expect them to be subtly biased in experiment-ruining ways :)
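The approach described in these two answers is to analyse users by the branch they were assigned to, regardless of whether they later overrode the prefs. A minimal Python sketch of that idea follows; the record fields (`branch`, `retained`, `overrode_prefs`) are hypothetical and do not correspond to any real telemetry schema.

```python
from collections import defaultdict


def retention_by_assigned_branch(clients):
    """Compute the retained fraction per assigned branch.

    clients: iterable of dicts with 'branch', 'retained' and 'overrode_prefs'.
    """
    totals = defaultdict(int)
    retained = defaultdict(int)
    for client in clients:
        # Group by the branch assigned at enrollment. 'overrode_prefs' is
        # deliberately ignored so that overriding users are not dropped.
        totals[client["branch"]] += 1
        retained[client["branch"]] += int(client["retained"])
    return {branch: retained[branch] / totals[branch] for branch in totals}
```

Dropping or unenrolling the users who changed their prefs would remove a non-random subset of one branch, which is the bias the comment warns about.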
Flags: needinfo?(flawrence)

(In reply to Felix Lawrence from comment #10)

I think that gives me enough to proceed, thanks. I will not make the addon uninstall itself. Thanks! Could you comment on this follow-up from comment 9 please?

(In reply to Nihanth Subramanya [:nhnt11] from comment #9)

Oh, and one more thing: is there any special telemetry that you'd like this addon to send? For example, I think in the past, we've sent pref values every day via Telemetry pings which enable us to check if users have been messing with them - I'm not 100% sure though, I wasn't involved in any such previous studies.

Flags: needinfo?(flawrence)

What's the duration of enrollment for this study? The basic list study ran for 8 weeks with 1 week of enrollment - should it be the same for this study? I.e. end the study, cleanup, and uninstall after 7 days of enrollment?

Flags: needinfo?(tanvi)

Oh, and one more thing: is there any special telemetry that you'd like this addon to send? For example, I think in the past, we've sent pref values every day via Telemetry pings which enable us to check if users have been messing with them - I'm not 100% sure though, I wasn't involved in any such previous studies.

No, I don't think this is worthwhile: if an appreciable number of users were going to change these preferences then this would give us useful information (i.e. a clear signal that users were unhappy with the default we thrust upon them). But in the absence of evidence otherwise, I think it's highly unlikely that a detectable % of users will do this, so IMO it's not worth the engineering effort unless it's very easy to do, and not worth the analysis effort unless it's automated.

What's the duration of enrollment for this study? The basic list study ran for 8 weeks with 1 week of enrollment - should it be the same for this study? I.e. end the study, cleanup, and uninstall after 7 days of enrollment?
Can all this be handled by Normandy? If so, should we leave it there as a single source of truth, so that we can more easily change our minds later (ending it earlier or later)? Maybe ask someone who's more familiar with building add-on experiments?

Thanks for clarifying the "end the study, cleanup and uninstall after 7 days of enrollment" point - this is the opposite of the intended functionality! The intention is to enroll one week's worth of users and keep them in the experiment for 8 weeks. One of the goals of the experiment is to look for long term effects on users.
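To spell out the intended timing, here is a short Python sketch of the date arithmetic. It assumes each user is observed for 8 weeks counted from their own enrollment, which is one reading of the comment above; the function name and any start date passed in are hypothetical.

```python
from datetime import date, timedelta

ENROLLMENT_WINDOW = timedelta(days=7)    # one week's worth of new users
OBSERVATION_PERIOD = timedelta(weeks=8)  # each user stays enrolled this long


def study_timeline(start):
    """Return (enrollment_closes, last_user_finishes) for a given start date."""
    enrollment_closes = start + ENROLLMENT_WINDOW
    # A user enrolled on the last day is still observed for the full period.
    last_user_finishes = enrollment_closes + OBSERVATION_PERIOD
    return enrollment_closes, last_user_finishes
```

Under this reading the study as a whole spans about nine weeks, not seven days: enrollment closes after a week, but nobody is unenrolled at that point.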

Flags: needinfo?(flawrence)

Here is the initial version of the add-on: tested, reviewed, signed, and ready for QA.

Please see https://github.com/nhnt11/multipreffer/blob/master/README.md for details on the working of the pref-flipping mechanism.

Please see https://github.com/mozilla/cookie-restrictions-strict-list-study/blob/master/src/variations.json for the cohort definitions.

Assignee: flawrence → nhnt11
Flags: needinfo?(tanvi)

(In reply to Felix Lawrence from comment #13)

Thanks for the clarifications Felix! I've set the study to expire after 56 days - this needs to be baked into the addon to make it uninstall itself.

Accidentally assigned this to myself. Reassigning to Felix.

Assignee: nhnt11 → flawrence

(In reply to Tanvi Vyas[:tanvi] from comment #2)

I just want to note that for the Control cohort, I interpreted "no pref changes" as:

  • network.cookie.cookieBehavior = 0
  • urlclassifier.trackingAnnotationTable = test-track-simple,base-track-digest256

I think this is an important subtlety: if we run the study on 66 and make no pref changes on the Control, the defaults are identical to the Experiment cohort's prefs, resulting in duplicate cohorts.
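The concern can be illustrated with a small Python sketch that computes each branch's effective prefs on top of a set of browser defaults. The `defaults_if_blocking_ships` dictionary is a hypothetical stand-in for a future release whose defaults match the test branch; it is not a statement about what any Firefox version actually ships.

```python
STRICT = "test-track-simple,base-track-digest256,content-track-digest256"
BASIC = "test-track-simple,base-track-digest256"

TEST = {
    "network.cookie.cookieBehavior": 4,
    "urlclassifier.trackingAnnotationTable": STRICT,
}
# Control spelled out explicitly, as interpreted in the comment above.
EXPLICIT_CONTROL = {
    "network.cookie.cookieBehavior": 0,
    "urlclassifier.trackingAnnotationTable": BASIC,
}
NO_CHANGE_CONTROL = {}


def effective_prefs(defaults, overrides):
    """Branch overrides win over the browser defaults."""
    return {**defaults, **overrides}


def branches_differ(defaults, control, test):
    return effective_prefs(defaults, control) != effective_prefs(defaults, test)


# Hypothetical defaults for a release that already blocks with the strict list.
defaults_if_blocking_ships = dict(TEST)
```

With these hypothetical defaults, `branches_differ(defaults_if_blocking_ships, NO_CHANGE_CONTROL, TEST)` is false while the explicit control still differs from the test branch, which is why pinning the control prefs is the safer choice.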

Experimenter link

Yes, "no pref changes" is ambiguous. The control branch should have no third party cookies blocked - even if we default to blocking basic list cookies by the time we start this experiment.

the defaults are identical to the Experiment cohort
If this is the case, then where is the switch that determines whether we use the basic list or the strict list?

(In reply to Nihanth Subramanya [:nhnt11] from comment #15)

Thanks for the clarifications Felix! I've set the study to expire after 56 days - this needs to be baked into the addon to make it uninstall itself.

What if we want the experiment to run for longer? Can Normandy handle this and make sure the add-on is uninstalled after X days, where we decide what X is later?

Comment on attachment 9041171 [details]
cookie-restrictions-strict-list-study@shield.mozilla.org-1.0-signed.xpi

(In reply to Tanvi Vyas[:tanvi] from comment #19)

Missed this, sorry; for posterity, we clarified this and Normandy can indeed handle uninstalling the addon.

I'm going to upload a new version of the add-on with indefinite expiry.

Attachment #9041171 - Attachment is obsolete: true

This has no changes except for expiry set to 365 days. It's a mandatory parameter so I just made it a super large value.

Let me know if you think we should make it something more reasonable like 16 or 24 weeks. I'd prefer just leaving it at 365 days so it's obvious that we don't intend to use that value (now that I'm saying this, maybe I should have made it 9999 or something...).

Johann, I believe this add-on had your peer review. Is that correct?

Flags: needinfo?(jhofmann)

Yup, r=me

Flags: needinfo?(jhofmann)
Attached file Needs Design Review
Attachment #9045016 - Flags: review?(ethompson)

Here is a new build (signed) with updated code to handle the browser.contentblocking.category pref correctly.

Johann r+'d this here: https://github.com/mozilla/cookie-restrictions-strict-list-study/pull/6#pullrequestreview-208145689

Attachment #9042408 - Attachment is obsolete: true

Johann, Tanvi mentioned that you're changing the behavior of the browser.contentblocking.category pref in 67.

Could you please comment on how the behavior is changing? Considering this add-on relies on BrowserGlue's mechanism to recompute the category pref, it's important to make sure that either the changes you're making don't affect this add-on or we cover the new behavior - since this study might run in 67 as well.

Thanks!

Flags: needinfo?(jhofmann)
Attachment #9045016 - Flags: review?(ethompson)

(In reply to Nihanth Subramanya [:nhnt11] from comment #26)

Hey, this is happening in bug 1529517: we're adding new prefs that control the features that are shown in the Content Blocking UI. I hope the bug description makes things clear. The idea is that in 67 or early 68, depending on when this lands, you won't directly control network.cookie.cookieBehavior and friends anymore, but instead change the feature set in e.g. browser.contentblocking.features.standard. This allows our UI to react to the pref settings appropriately.

Flags: needinfo?(jhofmann)

signed xpi for testing purposes

Comment on attachment 9045016 [details]
Needs Design Review

data science r+ on the basis that this is the same design as an experiment that's already worked :)
Attachment #9045016 - Flags: review+

Final version for signing please.

Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)

Here's a signed build of the new version of the add-on which sets prefs on the default branch.

Attachment #9046800 - Attachment is obsolete: true
Attachment #9050328 - Attachment is obsolete: true
Attachment #9052921 - Attachment is obsolete: true
Attachment #9053344 - Attachment is obsolete: true

Final add-on for signing and deploying.

Cosmetic changes compared to 2.0.

  • Version bumped to 2.1
  • branch names changed to "control" and "test" per Felix's request.
Attachment #9056931 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Attachment #9058248 - Attachment is obsolete: true
Flags: needinfo?(mcooper)

We're re-running the study because users were uninstalled by armagaddon:
https://experimenter.services.mozilla.com/experiments/strict-list-cookie-restrictions-3/

The slug and version have been changed for this iteration.
The study was previously disabled due to armagaddon.
Please sign this so QA can test V3.

Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper) → needinfo?(rdalal)
Flags: needinfo?(rdalal)

please sign for testing

Attachment #9070672 - Attachment is obsolete: true
Attachment #9070700 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)

This study is running for two months. If possible, we would like to request a preliminary analysis after the first month to get an early indication if there is any substantial churn or other problems.

Hi Felix, would it be possible to do a one month analysis on the strict list experiment in early August?

https://experimenter.services.mozilla.com/experiments/strict-list-cookie-restrictions-3/

Flags: needinfo?(flawrence)
Flags: needinfo?(flawrence)
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED