[Shield] Add-On Study: Federated Learning v2 relaunch
Categories
(Shield :: Shield Study, enhancement)
Tracking
(geckoview66 ?, firefox66- affected)
People
(Reporter: experimenter, Assigned: isegall)
Details
Add-On Study: Federated Learning v2 relaunch
We seek to replicate the federated learning study performed last year (PHD: https://docs.google.com/document/d/1DuZ1nQ2ve7k-98BKUKCZegtVzAxnJBEgYJYv6IUqZck/edit?ts=5b0885a3) with an updated architecture and additional probes.
This study is a relaunch of the previous Federated Learning v2 study (https://experimenter.services.mozilla.com/experiments/federated-learning-v2/), which was launched with errors.
Public facing description: This study uses federated learning, a privacy-preserving learning methodology, to gain insights into how our users utilize the history and bookmark features in the awesome bar. We'll use these findings both to optimize the experience in the awesome bar and explore our ability to use federated learning in the future.
More information: https://experimenter.services.mozilla.com/experiments/add-on-study-federated-learning-v2-relaunch/
Note: signed xpi available at https://bugzilla.mozilla.org/show_bug.cgi?id=1532217
Comment 2•5 years ago
Add-On Study: Federated Learning v2 - 2.2.0 build
Targeted: Firefox Release 66.x
We have finished testing the Add-On Study: Federated Learning v2 - 2.2.0 build experiment.
QA’s recommendation: GREEN - SHIP IT
Reasoning:
- No new issues were found while testing the 2.2.0 build.
Testing Summary:
- Verified that each of the branches can be accessed using their changed names (values).
- Verified that the "control" and "not-submitting" types of branches generate the "shield-study-addon" telemetry probes.
- Verified that the "control" and "not-submitting" types of branches don't generate the "frecency-update" telemetry probes.
- Verified that the rest of the branches (model1, model2, model3-submitting and model4-submitting) generate both "shield-study-addon" and "frecency-update" telemetry probes.
- Performed regression testing to ensure that the changes made to the add-on’s architecture don’t affect the user-facing behavior or the telemetry probes.
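The probe expectations in the testing summary can be written down as a small lookup. This is only a sketch for illustration: `expected_probes` is a hypothetical helper, not part of the add-on or its test plan, and it assumes the branch names used elsewhere in this bug (control, model1, model2, model3-submitting, model3-not-submitting, model4-submitting, model4-not-submitting).

```python
def expected_probes(branch):
    """Return the telemetry probes QA expects a branch to generate.

    Hypothetical helper: every branch sends "shield-study-addon";
    only branches other than control and the "not-submitting" ones
    also send "frecency-update".
    """
    name = branch.lower()
    probes = {"shield-study-addon"}
    if name != "control" and not name.endswith("not-submitting"):
        probes.add("frecency-update")
    return probes


if __name__ == "__main__":
    for branch in ("control", "model1", "model3-submitting", "model3-not-submitting"):
        print(branch, sorted(expected_probes(branch)))
```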
Tested Platforms:
- Windows 10 x64
- macOS 10.14
Tested Firefox versions:
- Firefox Release 66.0.1 (en-US)
- Firefox Release 66.0.2 (en-US)
Updated•5 years ago
[Tracking Requested - why for this release]: This experiment is targeting the Firefox 66 release population.
Comment 4•5 years ago
Experiment Type: Opt-Out Study
What are the branches of the study:
- Treatment model4-not-submitting 10%: Initial coefficients are parameters geometrically far from existing ones (same as model2, all 1s), testing set
- Treatment model4-submitting 10%: Initial coefficients are parameters geometrically far from existing ones (same as model2, all 1s), training set
- Treatment model3-not-submitting 10%: Initial coefficients are existing parameters (same as model1), testing set
- Treatment model3-submitting 10%: Initial coefficients are existing parameters (same as model1), training set
- Treatment model2 20%: Initial coefficients are parameters geometrically far from existing ones (use all 1s), same users train and receive updated model
- Treatment model1 20%: Initial coefficients are existing parameters, same users train and receive updated model
- Treatment Control 20%: control
What version and channel do you intend to ship to?
1% of Release Firefox 66.0
Are there specific criteria for participants?
Version: Please include 66.0+ and not just a 66.0 exact match. Since the study will be enrolling during the Fx66 launch, we expect enrollment to ramp up as clients update from 65 to 66 on March 19th.
Locales: en-all
Geographic regions: all
Prefs:
- browser.urlbar.suggest.searches = True
- browser.urlbar.suggest.history = True
- browser.urlbar.suggest.bookmark = True
- browser.privatebrowsing.autostart = False # remove auto private browsing users
- privacy.sanitize.sanitizeOnShutdown = False # remove clear history on shutdown users
- browser.urlbar.matchBuckets does not exist as a pref! # restricts to search first, which is the default on all clients 57 and newer
Studies: NA
Any additional filters:
What is your intended go live date and how long will the study run?
Apr 01, 2019 - May 13, 2019 (42 days)
What is the main effect you are looking for and what data will you use to make these decisions?
In order to evaluate the winning branch for the search parameters, we will be using both interaction data and survey results. The data we will examine is:
- Number of characters typed before choosing result (used in v1)
- Rank of result chosen (used in v1)
- How soon the chosen result entered the result set
- Total time elapsed between beginning query and choosing result
- Number of abandoned searches
- Total amount of usage of awesomebar
- Search satisfaction score from survey
Additionally, we will examine the standard engagement metrics to ensure that no branch sees a significant decline (not anticipated).
Search satisfaction score will be the gold standard for determining the winning branch. In the case that there is no clear winner using a 1% margin, we will evaluate which of the above metrics are most strongly correlated with positive search experience and see if a clear winner emerges from there. In the case that there is conflicting output, as in the previous study, we will consult with the search team to decide if there is enough evidence to support replacing the current parameters.
If we see the training-test branches clearly outperform the dogfooding branches in both the well-seeded and badly-seeded cases, we will make note of that for upcoming federated learning studies. This result would be counter to the way that federated learning has been used in the past, and would warrant further investigation.
We will also be examining how the convergence of the badly-seeded branches compares to their well-seeded counterparts and whether they arrive at the same model. This outcome will affect how careful we need to be with start values for similar analyses (adding additional branches, running simulations, etc.).
We have already developed our survey (with Rosanne Scholl of S&I), which is available here: https://qsurvey.mozilla.com/s3/URL-bar-satisfaction-survey. Screenshot to be added as indicated in the text.
Based on Shell's comments in the previous experimenter link, we should launch the survey April 22 and April 29. If those don't launch for any reason, it'll launch upon add-on expiration on May 6th. May 13th is the new ending date, as a buffer to allow users to respond and to let us correlate their responses with telemetry data.
Who is the owner of the data analysis for this study?
Ilana Segall
Will this experiment require uplift?
False
QA Status of your code:
https://github.com/mozilla/federated-learning-v2-study-addon/blob/master/docs/TESTPLAN.md
Link to more information about this study:
https://experimenter.services.mozilla.com/experiments/add-on-study-federated-learning-v2-relaunch/
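For readers who want to sanity-check the targeting criteria and the branch split in this comment, here is a minimal sketch. It is not the Normandy or Experimenter filter implementation: `is_eligible`, `REQUIRED_PREFS`, `UNSET_PREFS` and `BRANCH_WEIGHTS` are hypothetical names. It also assumes that "en-all" means any locale whose language subtag is "en", that version strings start with a numeric major version, and that `prefs` holds the client's effective pref values (defaults included), with user-unset prefs such as matchBuckets absent from the mapping.

```python
# Hypothetical names throughout; pref names and values are taken from the comment above.
REQUIRED_PREFS = {
    "browser.urlbar.suggest.searches": True,
    "browser.urlbar.suggest.history": True,
    "browser.urlbar.suggest.bookmark": True,
    "browser.privatebrowsing.autostart": False,    # remove auto private browsing users
    "privacy.sanitize.sanitizeOnShutdown": False,  # remove clear history on shutdown users
}

# Prefs that must not exist on the client at all.
UNSET_PREFS = ("browser.urlbar.matchBuckets",)

# Branch split in percent, as listed in the comment.
BRANCH_WEIGHTS = {
    "control": 20,
    "model1": 20,
    "model2": 20,
    "model3-submitting": 10,
    "model3-not-submitting": 10,
    "model4-submitting": 10,
    "model4-not-submitting": 10,
}


def is_eligible(version, locale, prefs):
    """Return True if a client matches the stated criteria (sketch only)."""
    # "66.0+": any release with major version 66 or newer, not an exact match.
    if int(version.split(".")[0]) < 66:
        return False
    # Assumption: "en-all" covers en-US, en-GB, and so on.
    if locale.split("-")[0] != "en":
        return False
    # matchBuckets must not be set.
    if any(name in prefs for name in UNSET_PREFS):
        return False
    return all(prefs.get(name) == value for name, value in REQUIRED_PREFS.items())


if __name__ == "__main__":
    client_prefs = dict(REQUIRED_PREFS)
    print(sum(BRANCH_WEIGHTS.values()))               # the branch shares cover all enrolled clients
    print(is_eligible("66.0.2", "en-US", client_prefs))
```

The version check deliberately looks only at the major version, which matches the request to include 66.0+ rather than an exact 66.0 match.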
Comment 5•5 years ago
Untracking, since this information is now easy to find in Experimenter.
Comment 6•5 years ago
Experiment complete.