Closed Bug 1477077 Opened 6 years ago Closed 6 years ago

[Shield] Opt-out Study: Awesome Bar Federated Learning

Categories

(Shield :: Shield Study, defect)

defect
Not set
normal

Tracking

(firefox61 unaffected, firefox62+ fixed)

RESOLVED FIXED

People

(Reporter: fhartmann, Assigned: fhartmann)

Attachments

(2 files, 2 obsolete files)

Basic description of experiment:
The current algorithm for ranking history search suggestions in the awesome bar uses manually selected numbers to weight frequency and recency (“frecency”) when determining the relevance of a potential search result. These numbers were not selected in a data-driven way, and a different configuration might lead to much more relevant search results. Our experiment involves an optimization process that selects a configuration based on which search results users actually select. This is done using Federated Learning, a distributed form of machine learning in which the data stays entirely local. Instead of being shared with Mozilla, the data is only used locally to improve the machine learning model. Periodically, users send updates of the model back to the server to improve the model for everyone. These updates are very high-level (e.g. "weight frequency more, weight recency less") and do not contain personal information.
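
To make the update flow above concrete, here is a minimal sketch of one server-side iteration. It assumes a hypothetical two-weight model (frequency and recency) and a plain averaged gradient step; the function names, the learning rate, and the example numbers are illustrative assumptions, not the optimizer the study actually used.

```python
from typing import List


def average_updates(updates: List[List[float]]) -> List[float]:
    """Element-wise mean of the update vectors reported by clients."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def apply_update(weights: List[float], update: List[float],
                 learning_rate: float) -> List[float]:
    """Move each weight against the averaged update (a gradient-style step)."""
    return [w - learning_rate * u for w, u in zip(weights, update)]


# Hypothetical model: [frequency_weight, recency_weight].
weights = [1.0, 1.0]

# Each client computes its update locally from its own selections and sends
# only this small vector to the server, never the browsing history itself.
client_updates = [[-0.2, 0.1], [-0.4, 0.3], [0.0, 0.2]]

mean_update = average_updates(client_updates)
new_weights = apply_update(weights, mean_update, learning_rate=0.5)
print(new_weights)
```

With these made-up numbers the frequency weight goes up and the recency weight goes down, which corresponds to the "weight frequency more, weight recency less" kind of update described above.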

What are the branches of the study?
Treatment, control, control without decay
- treatment: The full optimization process is performed. Weights change after every iteration and the ranking is recomputed
- control: Search works exactly the same way it currently does in Firefox. We only collect additional statistics
- control-no-decay: In the current algorithm, frecency scores are decayed over time. The treatment branch loses this effect because scores are recomputed all the time. To see whether the decay is useful and to make a fairer comparison, this variation only removes the decay effect

What percentage of users do you want in each branch?
15% in treatment, 5% in control, 5% in control no decay. We decided on these numbers using a power analysis [1]
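
As an illustration of what a 15% / 5% / 5% split means in practice, the sketch below assigns a client to a branch by hashing a client identifier into one of 100 buckets. This is an assumed scheme for illustration only; the `assign_branch` helper and the use of SHA-256 are hypothetical and are not a description of how Shield enrolls users.

```python
import hashlib
from typing import Optional

# Branch names and percentages as requested in this study.
BRANCHES = [("treatment", 15), ("control", 5), ("control-no-decay", 5)]


def assign_branch(client_id: str) -> Optional[str]:
    """Map a client ID to a branch, or None if the client is not enrolled."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a stable bucket in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 100
    upper = 0
    for name, percent in BRANCHES:
        upper += percent
        if bucket < upper:
            return name
    return None  # The remaining 75% of clients are not part of the study.


print(assign_branch("example-client-id"))
```

Because the bucket depends only on the identifier, the same client always lands in the same branch across restarts.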

What Channels and locales do you intend to ship to?
Beta channel, all locales

What is your intended go live date and how long will the study run?
Launch on 7/20, run for 1 week

Are there specific criteria for participants?
Must be on Firefox 62+

What is the main effect you are looking for and what data will you use to make these decisions?
All powers were chosen using a power analysis [1].
- Baseline awesomebar product metrics do not decrease
   - Frequency of awesomebar usage does not decrease by more than a margin of 1%
   - Number of letters typed before choosing result does not increase much. There is no current Telemetry data for this, so we cannot decide on an effect size now. Our study collects this information though, so we will be able to compare treatment and control afterwards
   - % of places (history, bookmarks, tabs, etc) results selected vs other types does not decrease by more than a margin of 1%
   - Overall rate of page loads does not decrease: The number of visited pages per ping does not decrease by more than 2%
- Federated learning system is successfully exercised
   - 4000 clients send back model updates in 80%+ of iterations
   - Loss of the model decreases over duration of experiment in treatment branch
   - Each iteration should take less than one hour
- Stretch: resultant model improves history search results (any of these three should improve, not necessarily all)
   - Average rank of selected history suggestion should improve (event telemetry) by 5%
   - Number of letters typed before choosing result decreases. There is no current Telemetry data for this, so we cannot decide on an effect size now. Our study collects this information though, so we will be able to compare treatment and control afterwards
   - % of history results selected vs other types increases by 1%
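
The "does not decrease by more than a margin" criteria above can be read as a comparison of relative change against a threshold. The sketch below shows only that arithmetic on point estimates; the helper names and the example values are made up, and a real analysis would also need the statistical testing implied by the power analysis.

```python
def relative_change(control: float, treatment: float) -> float:
    """Relative difference of the treatment metric versus control."""
    return (treatment - control) / control


def within_margin(control: float, treatment: float, margin: float) -> bool:
    """True if treatment has not dropped below control by more than margin."""
    return relative_change(control, treatment) >= -margin


# Made-up example: awesomebar uses per client, 1% margin.
print(within_margin(control=100.0, treatment=99.5, margin=0.01))
# Made-up example: visited pages per ping, 2% margin.
print(within_margin(control=50.0, treatment=48.0, margin=0.02))
```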

Who is the owner of the data analysis for this study?
Florian Hartmann (fhartmann@)

Who will have access to the data?
Florian Hartmann (fhartmann@), Sunah Suh (ssuh@), Context Graph team

Do you plan on surveying users at the end of the study?
No

User facing title of the experiment:
Awesome Bar improved history search

User facing description of the experiment:
The current Awesome Bar suggests history entries based on a number of factors. This experiment tries to weight the factors in a way that works better for the searches that users actually perform

Code Review performed by:
Drew Willcoxon (Client-side), Sunah Suh (Server-side)

QA Status of your add-on:
Under review

Link to any relevant google docs / Drive files that describe the project. Links to prior art if it exists:
Engineering Design: https://docs.google.com/document/d/1DuZ1nQ2ve7k-98BKUKCZegtVzAxnJBEgYJYv6IUqZck/edit
Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=1462102
Data review: https://bugzilla.mozilla.org/show_bug.cgi?id=1462109#c4
GitHub: https://github.com/florian/federated-learning
Blog post: https://florian.github.io/federated-learning/

[1] https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/20985/command/21000
Blocks: 1462102
Flags: needinfo?(carmen.fat)
The data review was already performed [1] by chutten@, so I won't add another needinfo here.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1462109#c4
The addon was already reviewed by Drew [1].
@Drew: Can you still sign off on this bug? ("Needinfo the Firefox peer that has reviewed your study. The reviewer must explicitly sign off in the bug.")

[1] https://github.com/florian/federated-learning-addon/pull/1
Flags: needinfo?(adw)
I reviewed Florian's webextension and sign off on it.
Flags: needinfo?(adw)
@Ilana: We updated our PHD and completed our power analysis. I'm adding a needinfo for you for the science sign-off.

("Needinfo a Shield Study owner (mgrimes@, isegall@, glind@, kardekani@, rrayborn@, jgaunt@) from the cc list. The Shield Study owner is responsible for signing off on the science review and the risk matrix.")
Flags: needinfo?(isegall)
Sign off for Federated Learning for the Awesome Bar - (GREEN)

Federated Learning for the Awesome Bar
Targeted: Firefox Beta 62.0bx

We have finished testing the Federated Learning for the Awesome Bar experiment.

QA’s recommendation: GREEN - SHIP IT

Reasoning:
- Since the team decided to drop the survey for this version and the issues we previously considered blockers were addressed/mitigated, we consider that the experiment is ready to be launched.

Testing Summary:
- Full Functional test suite: TestRail (https://goo.gl/fJqLnK)
- Verified that the Telemetry pings are correctly sent
- Verified that the prefs modified by the add-on are reset after uninstalling it

Tested Platforms:
- Windows 10 x64
- Ubuntu 16.04 x64
- macOS 10.13

Tested Firefox versions:
- Firefox Beta 62.0b6
- Firefox Beta 62.0b7
- Firefox Beta 62.0b8
- Firefox Beta 62.0b9
Flags: needinfo?(carmen.fat)
Science Review: R+
Flags: needinfo?(isegall)
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Attachment #8993818 - Attachment is obsolete: true
Attachment #8993819 - Attachment is obsolete: true
Flags: needinfo?(rdalal)
@Liz, this one is coming in hot. Beta 62, running for 2 weeks.
Flags: needinfo?(lhenry)
Flags: needinfo?(rdalal)
This sounds great! Please take this as my signoff and let me know if I need to sign off anywhere else.
Flags: needinfo?(lhenry)
This study is now live on Beta 62. According to the calendar we should end the study Aug 6th.
This study was ended based on my 8/5 conversation with Sunah; she will provide details on the outcome of the study before we close this bug.
Flags: needinfo?(ssuh)
Clearing the needinfo for now -- Florian is hard at work on the analysis (but in the context of finishing his thesis next week, so a final report for Mozilla might take a bit longer).

Preliminary results are looking very good though -- the mechanics of building the model using an addon, telemetry pings and Spark streaming worked without a hitch, and the model's actual performance appears to have improved by some metrics.
Flags: needinfo?(ssuh)
Hi Sunah, any updates on the analysis here?
Flags: needinfo?(ssuh)
Florian's planning to publish his blog post tomorrow, which will serve as the analysis that'll close out this bug (keeping the needinfo active to remind me to add a link to the post!)
The blog post is published! https://florian.github.io/federated-learning-firefox/

And the full notebook: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/25009/command/25716

I think there's more that'd be great to analyze, but that should suffice for the PHD criteria.
Flags: needinfo?(ssuh)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED