Closed Bug 1428308 Opened 8 years ago Closed 7 years ago

[Shield] Opt-out Study: TAAR Experiment v2

Categories

(Shield :: Shield Study, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: fwollsen, Assigned: mlopatka)

References

Details

Attachments

(2 files, 28 obsolete files)

117.87 KB, image/png
emanuela
: feedback-
Details
475.24 KB, image/jpeg
Details
This experiment will evaluate the efficacy of the second version of Telemetry-Aware Add-on Recommender (TAAR v2.0) - https://github.com/mozilla/taar PHD: https://docs.google.com/document/d/1ZrfxNfBiEiAkqz4ZW9wmWfJF5sdfQg-Xq6_2mY1EXtI/edit?pli=1 Add-on: https://github.com/motin/taar-experiment-v2-shield-study (Bugzilla bug for v1 of this study: https://bugzilla.mozilla.org/show_bug.cgi?id=1400900)
Attached: 0.2.0 for signing
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Attached: 1.0.0 for signing
Flags: needinfo?(mcooper)
Data review request for rweiss and/or sguha. Details are up to date in the PHD document, including raw JSON as an appendix.
Flags: needinfo?(sguha)
Flags: needinfo?(rweiss)
Flags: needinfo?(mcooper)
Can we get QA on this SHIELD study please?
Flags: needinfo?(kraj)
Also requesting Firefox peer-review as per new SHIELD guidelines.
Flags: needinfo?(rhelmer)
(In reply to mlopatka from comment #6) > Can we get QA on this SHIELD study please? when are we planning to ship this?
(In reply to krupa raj [:kraj] from comment #8) > (In reply to mlopatka from comment #6) > > Can we get QA on this SHIELD study please? > > when are we planning to ship this? Aiming for January 24th, but ASAP based on QA and other review axes.
(In reply to mlopatka from comment #7) > Also requesting Firefox peer-review as per new SHIELD guidelines. It looks like this is based on an upstream shield template / utils / etc, could you point me at that? I think my feedback would probably make more sense there. This is what I see in the legacy parts of this: 1) I think more of the Cu.imports in bootstrap.js could be lazy 2) `${__SCRIPT_URI_SPEC__}` seems a bit too clever and makes it less readable - these two-line template string building + import could just be a one-line import w/ a static string, for instance 3) I'd move everything out of bootstrap.js into separate JSMs 4) I don't see the browser.runtime.onMessage.addListener added in bootstrap.js being removed Since StudyUtils.jsm is webpack'd I can't really review that, but I'd like to (per my first sentence :) ) Overall I don't think any of this necessarily needs to block, except we should see if lazy-importing the Console.jsm and Log.jsm works (#1 above) and also ensure all listeners are being removed (#4). The bulk of the code here is from that upstream project and parts of it are quite dense so I'd need more time to really go over all of this. Is this based on code we've been using for studies for a while? If so I am not too concerned overall, and the work going on to move away from the legacy bits will make a lot of this much easier/safer. Oh and also I'd much prefer to use either github or mozreview for reviews :) If part of the shipping process here is to package XPIs and attach them to bugs then I *do* think we need a way to ensure that it matches the repo that it came from (especially if the packaging is not happening on mozilla infra!), but that should be some kind of mechanical process.
Flags: needinfo?(rhelmer) → needinfo?(fwollsen)
Just chiming in on the part about the shipping process. I would love to see this process move to Github as well. At this moment the protocol for shipping an addon (SHIELD) study in Firefox specifically requires the use of Bugzilla for the review request procedure as per current documentation: https://docs.google.com/document/d/1hOMjZ7l1K0HL8DUp7HCr8BeRO7NGw0SwYjK2nfwtRJQ/edit# I know that this process is currently in discussion, so it is a great opportunity to propose an improved "mechanical process" for better security and comfortable review process.
(In reply to mlopatka from comment #11) > Just chiming in on the part about the shipping process. I would love to see > this process move to Github as well. FWIW I am open to mozreview or phabricator (which Firefox is moving to) as well - more thinking that we need a simpler way to ensure that the XPI we actually ship to users matches the expected repository, where the review happened. This is the part I am thinking should be automated. > At this moment the protocol for > shipping an addon (SHIELD) study in Firefox specifically requires the use of > Bugzilla for the review request procedure as per current documentation: > https://docs.google.com/document/d/ > 1hOMjZ7l1K0HL8DUp7HCr8BeRO7NGw0SwYjK2nfwtRJQ/edit# > > I know that this process is currently in discussion, so it is a great > opportunity to propose an improved "mechanical process" for better security > and comfortable review process. Thanks for the link!
(In reply to mlopatka from comment #11) > At this moment the protocol for > shipping an addon (SHIELD) study in Firefox specifically requires the use of > Bugzilla for the review request procedure as per current documentation: > https://docs.google.com/document/d/ > 1hOMjZ7l1K0HL8DUp7HCr8BeRO7NGw0SwYjK2nfwtRJQ/edit# Also FWIW - bugzilla has a really simple github PR integration (can't seem to find more official docs right now): https://blog.glob.com.au/2013/10/21/github-pull-requests-and-bugzilla/ If the goal is to record in Bugzilla how the review happened and also as a place to attach unsigned+signed XPIs to, then that might be something to consider in the meantime.
@matt_g thought you might be interested in the discussion starting from comment#10 onward wrt The Shield shipping and review process.
Flags: needinfo?(mgrimes)
@rhelmer, your feedback is gold. Let's conduct the Firefox peer-review over Github via this tracking issue: https://github.com/motin/taar-experiment-v2-shield-study/issues/7 I have responded there, and you may close the issue when you are satisfied with the study-specific parts of the code. :)
Flags: needinfo?(fwollsen)
(The SHIELD guidelines only requires the peer reviewer to explicitly sign off in the bug, but we can keep all discussions and commentary in Github)
Depends on: 1432157
Thanks for the feedback on the review process. We're still ironing out the new steps, so this is extremely valuable. Expect to hear from Gregg or Bianca as a followup.
Flags: needinfo?(mgrimes) → needinfo?(glind)
Depends on: 1432211
Depends on: 1432424
Depends on: 1432466
Depends on: 1432476
Depends on: 1432499
Depends on: 1432504
Attached 1.0.1 for signing
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
(In reply to fwollsen from comment #15) > @rhelmer, your feedback is gold. Let's conduct the Firefox peer-review over > Github via this tracking issue: > https://github.com/motin/taar-experiment-v2-shield-study/issues/7 > I have responded there, and you may close the issue when you are satisfied > with the study-specific parts of the code. :) Oh sorry if I wasn't clear earlier - I think the study is fine to go as-is, r+ from me on moving forward with it now... we're just figuring out the process, we can leave things like my suggestion about using github for next time. r=me
Science review: R+
Depends on: 1435233
Depends on: 1435237
Depends on: 1435252
Here is a patch to ensure the user never sees the popup in Private Browsing mode. The XPI needs to be signed.
Flags: needinfo?(mcooper)
Attachment #8948060 - Attachment is patch: false
Here's a signed version of 1.0.2. Bianca, FYI the attachment you posted is not a patch, and calling it such confused Bugzilla a bit. A patch is a textual format that shows the diff between two versions of code. For example, the output of `git diff` is pretty close to a patch. I unchecked the "is patch" button for your attachment.
Attachment #8940188 - Attachment is obsolete: true
Attachment #8940233 - Attachment is obsolete: true
Attachment #8943329 - Attachment is obsolete: true
Attachment #8943394 - Attachment is obsolete: true
Attachment #8944952 - Attachment is obsolete: true
Attachment #8944958 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
1. Yes, the PHD is linked and the code is public. 2. Yes, standard SHIELD participation preferences are exposed to the user. 3. Not permanent. 4. All data appears to be Type 2 or less. 5. Default on for study enrollees 6. No new identifiers 7. Yes. 8. No.
Flags: needinfo?(rweiss)
Simple fix that removes logging after each web-navigation to address QA concerns about private browsing. Needs to be signed.
Attachment #8948060 - Attachment is obsolete: true
Attachment #8948722 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
redundant NI requests cleaned up. New version should address Bug 1435233.
Flags: needinfo?(sguha)
Flags: needinfo?(glind)
(In reply to Rebecca Weiss from comment #24) > 1. Yes, the PHD is linked and the code is public. > 2. Yes, standard SHIELD participation preferences are exposed to the user. > 3. Not permanent. > 4. All data appears to be Type 2 or less. > 5. Default on for study enrollees > 6. No new identifiers > 7. Yes. > 8. No. Can you link to the Data Review Template in the bug so that all those cc'ed can see what questions these answers correspond to?
Flags: needinfo?(rweiss)
Data Review Form (questions corresponding to the answers provided in Comment#24) https://github.com/mozilla/data-review/blob/master/review.md
Flags: needinfo?(rweiss)
Here is an update to this experiment that should both prevent the user from seeing the panel in Private Browsing mode and any related logging messages. It needs to be signed.
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Depends on: 1437023
This new version is to address the issue in #1437013 (https://bugzilla.mozilla.org/show_bug.cgi?id=1437013). The addon needs signing.
Flags: needinfo?(mcooper)
Didn't realize there's another bug. Cancelling needinfo for signing.
Flags: needinfo?(mcooper)
Restored NI for signing. None of the other bugs were specified as potential blockers in the QA report, so lets get 1.0.5 landed and then see about the minor fixes.
Flags: needinfo?(mcooper)
Attached 1.0.6 for signing
Attachment #8948765 - Attachment is obsolete: true
Attachment #8948766 - Attachment is obsolete: true
Attachment #8949450 - Attachment is obsolete: true
Attachment #8949451 - Attachment is obsolete: true
Attachment #8949858 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Attached file taarexpv2@shield.mozilla.org-1.0.7.xpi (obsolete) —
Attached 1.0.7 for signing (no new fixes except changed guid to conform to Shield standards)
Attachment #8950195 - Attachment is obsolete: true
Attachment #8950316 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
new version available for testing and fully signed.
Flags: needinfo?(madalin.cotetiu)
Attached file taarexpv2@shield.mozilla.org-1.0.8.xpi (obsolete) —
Attached 1.0.8 for signing (Fixes #1438213)
Attachment #8950499 - Attachment is obsolete: true
Attachment #8950638 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
We have finished with the testing for 1.0.7 version and I have sent a complete report. A short summary: No possible blockers have been logged. One new issue found (Bug 1438213). Build looks good, we will perform another round of testing when 1.0.8 will be signed and sent another report.
Flags: needinfo?(madalin.cotetiu)
Flags: needinfo?(mcooper)
We have completed the testing for the 1.0.8 version and here are all the details and conclusions: Add-on version used in tests: taarexpv2@shield-study.mozilla.com-1.0.8-signed.xpi Covered Platforms: - Windows 10x64 - Mac OS 10.13 - Ubuntu 16.04x64 - Windows 7x64 - local server set Firefox Version used in tests: Latest FF Release No new issues found All the blockers are not reproducing and are marked as verified fixed All the issues have been triaged (there is only one issue with “NEW” status - Bug 1435238 but definitely not a blocker) We think that the current build is in a good shape and that it is safe to proceed to ship this study.
Flags: needinfo?(kraj)
We need to delay the launch to wait for the DynamoDB back-fill job to complete. matt_g was informed (off-bug) and we will sync up on Monday, Feb 26th with an updated eta for launch.
Attached file taarexpv2@shield.mozilla.org-1.0.9.xpi (obsolete) —
Attached 1.0.9 for signing. Identical package contents, but package.json is changed to allow the add-on to install in 59.
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Depends on: 1442631
Attaching 1.0.10 for signing, addressing #1442596 and #1442655
Attachment #8951257 - Attachment is obsolete: true
Attachment #8951312 - Attachment is obsolete: true
Attachment #8955106 - Attachment is obsolete: true
Attachment #8955215 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Flags: needinfo?(madalin.cotetiu)
Flags: needinfo?(marius.coman)
Depends on: 1443186
Flags: needinfo?(marius.coman)
Flags: needinfo?(madalin.cotetiu)
Attached 1.0.11 for signing
Attachment #8955586 - Attachment is obsolete: true
Attachment #8955587 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Dashboard for monitoring the enrollment in this study: https://sql.telemetry.mozilla.org/dashboard/taar-experiment-v2
r=me on the latest xpi
This is live. @mlopatka - let me know if you want to review the actual shield recipe. Things of note: * Used the latest signed addon in c50 * Filters: Versions 58 and 59, release channel, not firstRun, profiles 2-9 days old (inclusive), enrolling for 10 days * Users who met the profile time-window will not be unenrolled as they continue to "grow older". e.g. if you were opted in at day 8, you will continue to be in the experiment. * Martin needs to let us know in <10 days if he'd like to continue new enrollment or else it will auto-freeze. Thanks
Depends on: 1443487
Attached 1.0.12 for signing
Flags: needinfo?(mcooper)
Changes exclusively related to UX elements. No other code has been touched. UX has requested that the popup doorhanger aligns with visual elements of the overall Firefox UX. Diff showing changes contained in pop-up code only: https://github.com/motin/taar-experiment-v2-shield-study/compare/1.0.11...1.0.12
Attached 1.0.13 for signing
Attachment #8956218 - Attachment is obsolete: true
Attachment #8956219 - Attachment is obsolete: true
Attachment #8957132 - Attachment is obsolete: true
Please terminate this study prematurely due to UX concerns over the incongruity between pop-up visual characteristics and the Firefox UX. This study will ship to a new population as soon as aesthetic elements are resolved.
Done. Looking to ship v2 on March 12th
Flags: needinfo?(mcooper)
Screenshot of the logo and font, on macOS.
r=me on the contents of the XPI, matches the github repo tagged with 1.0.13 (already reviewed)
Comment on attachment 8957375 [details] Screen Shot 2018-03-08 at 3.50.41 PM.png Hi! Would you mind signing off on the logo and font changes for this study? In addition to this screenshot, you should be able to install the add-on from attachment 8957370 [details] if you'd like.
Attachment #8957375 - Flags: feedback?(emanuela)
Attached image doorhanger.jpg
The visual looks ok on macOS. When I tried on a Windows machine, I noticed one pixel on the right side of the button - see attachment. Unfortunately, I'm not able to inspect the element so I cannot help with the fixing. But I also think it's not too critical, the panel now looks way closer to what we have in product than before.
Attachment #8957375 - Flags: feedback?(emanuela) → feedback-
I approve the new version.
New targeting criteria for Normandy: 2 <= profile_age_in_days <= 21 in one of the eligible locales not enrolled in the previous launch of this study (launched 2018-03-05, terminated 2018-03-08)
We have completed the testing for the 1.0.13 version and here are all the details and conclusions: Add-on version used in tests: taarexpv2@shield-study.mozilla.com-1.0.13-signed.xpi Covered Platforms: Windows 10x64 Mac OS 10.13 Ubuntu 16.04x64 Windows 7x64 - local server set Firefox Version used in tests: FF59 Release Candidates New issues found: Bug 1444366 Bug 1444386 Both Bug 1444366 and Bug 1444386 are marked as "WONTFIX". Ship it!
We are live with fresh sample using build 1.0.13. We do not have access to any method to (not) target users who were previously in this study. Therefore the only safe solution is to start our profile creation date window 4 days ago. So this is going to be more like: (<today's date [3/12]> - 4) <= profile_creation_date <= (<current date of recipe eval> - 2). We are targeting new profiles for 10 days. Please let us know before then whether we need to continue to recruit new users, and we'll modify the recipe.
Is there any way to get a sample from the range of profile creation dates before the previous experiment launch? (<today's date [3/12]> - 4) <= profile_creation_date <= (<current date of recipe eval > - 2) is a *significantly* different sample than the (<today's date [3/12]> - 10 <= profile_creation_date <= (<current date of recipe eval> - 2) Originally requested and even more so than the specified in the PHD for this experiment. (<today's date [3/12]> - 21) <= profile_creation_date <= (<current date of recipe eval> - 2) Cutting the profile age down to such a narrow and new demographic subset means that the collaborative filtering module of TAAR will almost never be leveraged and we wont be able to validate its utility or contribution to the ensemble module with out some older "new" clients. I'm not sure what the flexibility of the Normandy;s targetting is like but something more like: (<today's date [3/12]> - 4) <= profile_creation_date <= (<current date of recipe eval > - 2) || (<today's date [3/12]> - 9) <= profile_creation_date In order to also get a sample that can meaningfully evaluate the Collaborative Recommender?
Flags: needinfo?(rrayborn)
So per a conversation with Martin we've modified the recipe to the following: * 7 days for profile creation dates before the dates used last time * 14 days for profile creation dates after the dates used last time (I think this is the request, we can make it shorter/longer as needed, but we want full weeks) So basically we want two date segments missing the middle period of what we sampled last time. Here's the recipe targeting (profileCreationDate is short for normandy.telemetry.main.environment.profile.creationDate): ( ( ( profileCreationDate > 17595 - 16 # 7 days before the floor of the last version && profileCreationDate <= 17595 - 9 # Before the floor of the last version ) || ( profileCreationDate >= 17597 # After the ceiling of the last version (17595 + 3 - 1) && profileCreationDate < 17597 + 14 # 2 weeks after the previous line && profileCreationDate < (normandy.request_time / (24*60*60*1000)) - 1 # Let users be around for at least 1-2 days ) ) # Same as previous recipes && !normandy.isFirstrun && normandy.version >= '58' && normandy.version < '60' && normandy.channel == 'release' )
Flags: needinfo?(rrayborn)
Enrollment is looking good, I'm seeing some older profile creation dates and some brand new users in the current pings. I'd like to leave enrollment open through the weekend since the profile creation dates will be severely bimodal in their age (which may means that we need to treat them as two separate experimental groups). Early next week we can pause/halt enrollment and keep the clients enrolled at that time in the study to observe addon retention behaviour.
That sounds good. Thanks for the update Martin!
Enrollment has been paused, but the study will remain live.
In the interest of transparency and openness, this add-on ought to have a proper description of it and its purpose in the Firefox add-on window. (I just found it and wondered "what the heck"?) This is especially important since it's an "opt-out study" that the end-user need to inform themselves of to be able to make an informed decision.
TAAR v2 study ran successfully evaluating the effects of exposing the Telemetry Aware Addon Recommendations (TAAR) service in the about:addons client view. The experiment launched on March 12th, 2018 and ended on April 23th, 2018. Top line results observed: Probability of Install Ensemble +0.8% over Control overall Ensemble +1.2% over Control for NON en-US locales Add-on Diversity (Unique installs / total installs) Ensemble +0.2% over Control overall Ensemble +0.6% over Control for en-US locales Full detailed analyis here: https://metrics.mozilla.com/protected/taar/taar-v2-preliminary-results.html
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Attachment #8957223 - Attachment is obsolete: true
Attachment #8957370 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: