Closed Bug 1219505 Opened 9 years ago Closed 8 years ago

Telemetry experiment for unified urlbar-searchbar with one-off search buttons

Categories

(Firefox :: Address Bar, defect, P1)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: adw, Assigned: mak)

References

()

Details

(Whiteboard: [consistency][fxsearch])

Attachments

(3 files, 7 obsolete files)

Attached patch initial patch (obsolete) — Splinter Review
I'm not sure what the best way to review this would be, but here's a patch that contains all the add-on code.  It's feature-complete, but it doesn't add telemetry yet, and technically it's still a normal bootstrapped add-on, not a telemetry experiment.  I'll post follow-up patches that do those things.

Let me know if I can do anything to make review easier.

Here's an explanation of the code:

bootstrap.js - Calls browser.listen() to start watching for browser windows
Browser.jsm - Watches for browser windows, modifies each browser urlbar
Panel.jsm - Modifies each urlbar panel, has all the one-off code
style.css - All style rules
gear.svg - The settings button icon, taken from newtab's controls.svg

Some helpful links:

https://wiki.mozilla.org/Telemetry/Experiments
http://hg.mozilla.org/webtools/telemetry-experiment-server/file/tip
Attachment #8680341 - Flags: review?(mak77)
Attached patch bootstrap.js (obsolete) — Splinter Review
Benjamin, would you mind giving feedback?  This is my first experiment.  Now that we have opt-in search suggestions in the urlbar, we want to test removing the searchbar and adding the one-off search buttons from the searchbar's panel to the urlbar's panel.  These are the questions we want to answer:

* How is search volume/engagement affected by using a unified url/searchbar?
* Is there any benefit to the unified bar?
* How many times do people use the unified bar vs. control?
* Is there a correlation between people searching more with the unified bar vs. control?
* Are people opting in to search suggestions in the urlbar?  Because that might influence how people view the utility of a unified bar.

I'm planning on using telemetry to gather data.  Felipe tells me that we can correlate telemetry data with the experiment.

This patch is the bootstrap.js from the add-on.  I'm thinking we need three experiment branches:

* control: No UI changes, but telemeter urlbar and searchbar usage.
* nonstandard searchbar: For users who've moved their searchbars out of the navbar.  Again, no UI changes, but telemeter urlbar and searchbar usage.
* unified bar: For users who have their searchbars in the navbar.  Change the UI by removing the searchbar and modifying the urlbar panel to add the one-off search buttons.  Also telemeter urlbar usage.

For all three branches, we also telemeter their urlbar search suggestions opt-in choice.

Am I doing this right?
Attachment #8680993 - Flags: feedback?(benjamin)
My feedback will come in several flavors.

First: the problem statement and your measurement plan

* Do you have a specific plan for what metrics you're going to monitor to measure your questions? In particular you mention "engagement" but I'm not sure what that means in specific.
* What is "any benefit"?
* What does "people use" mean? Just opening the bar, or actually searching in it? (Or both)
* Do you or a product manager have criteria set up in advance for what the data will tell you? This is generally the recommended practice, because it's easy to reinterpret data

Second: You can correlate telemetry data with the experiment, but not in the prebuilt dashboards: you'll need to run queries directly. It's not super-hard, but you might need some training. Mark Reid and Roberto Vitillo can help train.

Third: I'm going to delegate the technical review of bootstrap.js to Felipe, who has all the expertise.
Attachment #8680993 - Attachment mime type: application/x-javascript → text/plain
Comment on attachment 8680993 [details] [diff] [review]
bootstrap.js

I'll also note that this should have extensive testing not only of the functionality, but that the telemetry data you need is being collected properly. That each interaction (search, whatever) is attributed to the correct location.
Attachment #8680993 - Flags: feedback?(benjamin) → feedback?(felipc)
Priority: -- → P1
Drew is on PTO, so I'll continue his work here.
I'll make sure we get answers for most questions in previous comments.
Assignee: adw → mak77
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> I'll also note that this should have extensive testing not only of the
> functionality, but that the telemetry data you need is being collected
> properly. That each interaction (search, whatever) is attributed to the
> correct location.

Which kind of testing do we usually execute on these experiments? do we have some specific automated tests harness, or do we usually proceed manually?

I've read QA sets special testing sessions before experiments are shipped, at https://wiki.mozilla.org/Telemetry/Experiments. Does this still happen? Should I contact Kamil Jozwiak (assuming he is still the lead QA on experiments) once we are ready?
Flags: needinfo?(benjamin)
The testing typically makes sure that the experiment is activated correctly, has the correct effects, and that any data collection is working correctly.

Kamil was in charge of that previously, but since he's now security QA our new Firefox QA team will be taking that over. Please coordinate with RyanVM.
Flags: needinfo?(benjamin)
Comment on attachment 8680993 [details] [diff] [review]
bootstrap.js

There's a known bug about startup() being called twice during experiment installation, so just guard against it like here: http://hg.mozilla.org/webtools/telemetry-experiment-server/file/0032fbee999a/experiments/e10s-enabled-aurora/code/bootstrap.js#l22
Attachment #8680993 - Attachment is patch: true
Attachment #8680993 - Flags: feedback?(felipc) → feedback+
Relating to QA.. Marco, would you be able to create a short test plan that includes all the different scenarios that you would like tested? Something similar to the ones that have been created for past experiments [1][2]. This way, the person that tests this doesn't have to comb through the bug(s) and and guess what exactly needs to be tested. I've also created a QA template [3] that can be used by RyanVM's team to get started whenever they're ready. Added two links of past examples of the testing that I've done and documented [4][5].

[1] https://public.etherpad-mozilla.org/p/e10s_experiment
[2] https://public.etherpad-mozilla.org/p/bug_1174937_search_experiment_qa_plan
[3] https://public.etherpad-mozilla.org/p/TelemetryExperimentQATemplate
[4] https://public.etherpad-mozilla.org/p/e10sExperimentTesting
[5] https://public.etherpad-mozilla.org/p/ShareSearchesTelemetryExperimentQA
Rank: 10
Depends on: 1226192
Comment on attachment 8680341 [details] [diff] [review]
initial patch

My review is coming in form of an updated "patch".

Most of the code was quite good and solid, I fixed some keyboard interaction issues and did some code cleanups to simplify some code paths.
Fixed the double startup issue (thanks Felipe).
Added a filter and a manifest too.
I'm currently implementing the telemetry part, I'm reusing the existing telemetry probes so we can do direct comparisons with the control group.
Attachment #8680341 - Flags: review?(mak77) → review+
(In reply to Kamil Jozwiak [:kjozwiak] from comment #8)
> Relating to QA.. Marco, would you be able to create a short test plan that
> includes all the different scenarios that you would like tested? Something
> similar to the ones that have been created for past experiments [1][2]. 

Yes, as soon as I have a testable add-on (likely in the next days) I will start working on such a test plan, thank you for the references.
Depends on: 1226629
Attached file manifest.json (obsolete) —
initial version, some values may change, asking for sanity feedback.
Attachment #8690372 - Flags: feedback?(felipc)
Attachment #8690372 - Attachment is patch: true
Attachment #8690372 - Attachment is patch: false
Attached patch manifest.json (obsolete) — Splinter Review
Attachment #8690372 - Attachment is obsolete: true
Attachment #8690372 - Flags: feedback?(felipc)
Attachment #8690373 - Flags: feedback?(felipc)
Attached patch filter.js (obsolete) — Splinter Review
Attachment #8690374 - Flags: review?(felipc)
Attachment #8690373 - Attachment is obsolete: true
Attachment #8690373 - Flags: feedback?(felipc)
Attached patch addon code (obsolete) — Splinter Review
Please review bootstrap.js and Telemetry.jsm, the other files are r=me
Attachment #8680341 - Attachment is obsolete: true
Attachment #8680993 - Attachment is obsolete: true
Attachment #8690385 - Flags: review?(felipc)
Attached patch manifest.diff (obsolete) — Splinter Review
sorry for the attachments noise.

Data here is not finalized, asking for general feedback on the format and current values.
Attachment #8690386 - Flags: feedback?(felipc)
Comment on attachment 8690385 [details] [diff] [review]
addon code

Review of attachment 8690385 [details] [diff] [review]:
-----------------------------------------------------------------

feedback because I haven't finished reviewing it yet, but wanted to post these comments.


Is there chance of the style.css changes badly conflicting with popular add-ons and breaking things?  Maybe during QA we should test the most popular URLbar-customizations add-ons

::: root/bootstrap.js
@@ +37,5 @@
> + * Ensures that the experiment branch is set and returns it.
> + *
> + * @return Promise<String> Resolved with the branch.
> + */
> +function ensureExperimentBranch() {

After the branch is defined in the first startup, it is never changed later.  But the startup procedure performs some UI customizations based on the branch as it was defined.  Will there be any problems if the user performs some UI customization while the experiment is running?

An example: branch is defined as "unified" on first run. That will remove the search-box from the UI.  Will that put it in the customization panel allowing the user to re-add it?  Should the experiment for that user be halted, or at least be marked to a different group?

::: root/install.rdf
@@ +3,5 @@
> +     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
> +  <Description about="urn:mozilla:install-manifest">
> +    <em:id>unified-urlbar@experiments.mozilla.org</em:id>
> +    <em:version>1.0.0</em:version>
> +    <em:type>128</em:type>

just as a tip, in searchtest-turkey-release35/code/Makefile I had a Makefile that helps with packaging the xpi with either type=2 (`make addon`) or type=128 (`make experiment`). You may find it useful while testing
Attachment #8690385 - Flags: review?(felipc) → feedback+
Attachment #8690374 - Flags: review?(felipc) → review+
Comment on attachment 8690386 [details] [diff] [review]
manifest.diff

Review of attachment 8690386 [details] [diff] [review]:
-----------------------------------------------------------------

::: manifest.json
@@ +9,5 @@
> +    "startTime" : 1421276400,
> +    "endTime" : 1425164400,
> +    "maxActiveSeconds" : 864000,
> +    "appName" : ["Firefox"],
> +    "channel" : ["nightly", "aurora", "beta", "release"],

Can we wait to run this experiment on release until we release FF 43? We need to wait for users to receive bug 1220198

If we start on beta sooner than that, we can use the minBuildID field to restrict it to 42b2 or later: http://hg.mozilla.org/webtools/telemetry-experiment-server/file/ce604a104dc2/experiments/serp-fraction-counts/manifest.json#l14
Attachment #8690386 - Flags: feedback?(felipc) → feedback+
(In reply to :Felipe Gomes from comment #16)
> Is there chance of the style.css changes badly conflicting with popular
> add-ons and breaking things?  Maybe during QA we should test the most
> popular URLbar-customizations add-ons

I think most of the rules are well namespaced and thus unlikely to clash, but I'll have a second look for that. I'll also add your suggestion to the QA test plan.

> An example: branch is defined as "unified" on first run. That will remove
> the search-box from the UI.  Will that put it in the customization panel
> allowing the user to re-add it?  Should the experiment for that user be
> halted, or at least be marked to a different group?

We discussed this and I think we came to the conclusion we want to know if that happens.
As you said, a user could move back the search bar to the navbar after we remove it. By not changing the branch, we could create a query to check how many users in the "unified" branch have the search bar visible (they put it back), while if we'd move the user to another branch the information would be lost, unless we have a way to count how many users switched branch, that seems less probable.
The main scope should globally be unaffected regardless, we just care about the global search volume, it's mostly for UX interest.
(In reply to :Felipe Gomes from comment #17)
> Can we wait to run this experiment on release until we release FF 43? We
> need to wait for users to receive bug 1220198

That's the plan, versions is set to 43 to 46... is not that enough?
(In reply to Marco Bonardo [::mak] from comment #19)
> (In reply to :Felipe Gomes from comment #17)
> > Can we wait to run this experiment on release until we release FF 43? We
> > need to wait for users to receive bug 1220198
> 
> That's the plan, versions is set to 43 to 46... is not that enough?

I mean:

"minVersion" : "43.0",
"maxVersion" : "46.*",

I thought these were driving, along with the dates and channels.
Yep, that should be enough.

Since the startDate/endDate in the manifest are not defined yet, I wasn't sure if this minVersion/maxVersion were the intention or just placeholders.
start/end date should be 15th of January to 1st of March (unless I did something wrong), that should be the right window for 43 to 46.
Component: Client: Desktop → Location Bar
Product: Firefox Health Report → Firefox
Depends on: 1228359
No longer depends on: 1228359
Depends on: 1228359
This is directly a diff against the Telemetry server repo (http://hg.mozilla.org/webtools/telemetry-experiment-server/) that can also be used to deploy to the staging server for QA and testing purposes.
I tested it with the staging server to ensure the experiment can be properly installed.
It is updated with the latest values we discussed, but some of them still need a final signoff.
Attachment #8690374 - Attachment is obsolete: true
Attachment #8690385 - Attachment is obsolete: true
Attachment #8690386 - Attachment is obsolete: true
Ryan, I think it's a good time to start syncing up with QA.
The url document has a Test Plan, I'd like someone to take a look at it and express any doubts or misses. Plus, if you don't have anyone with experience testing experiments, there's some stuff to setup that may take time to get right.

Before starting QA though, we still need to fix the dependencies bugs here, and signoff the experiment manifest values. I'll ping you or the person in charge if you can tell me who will be, once those pieces are ready.
Feel free to ni? me for anything related.
Flags: needinfo?(ryanvm)
Depends on: 1225540
Comment on attachment 8692588 [details] [diff] [review]
telemetry server deployment patch v1

ok, let's get final review on this, so we can then ask for xpi signing.
In the meanwhile we'll get final values for the manifest.
Attachment #8692588 - Flags: review?(felipc)
Attachment #8692588 - Flags: review?(felipc) → review+
Comment on attachment 8695268 [details]
unsigned xpi

Ok, the only missing thing for QA is uplift of bug 1226629 to Aurora, but I expect that to happen very soon.
QA can either use this xpi or the previous diff against the telemetry-server repository (I'd suggest that).

I'll wait for QA before asking for add-on signing, in case we need to make changes to the add-on.
Ryan, do you have an ETA for QA work? Considered the summit and the holidays, can we fit the expected timeline?
Flags: needinfo?(ryanvm)
Attachment #8695268 - Flags: feedback?
Attachment #8695268 - Flags: feedback? → feedback?(ryanvm)
I'll work on getting things ready so that our team can dive into this immediately after the work week.
Two quick points relating to QAing this experiment:

Point #1: We're currently running a A/B e10s experiment on BETA that started yesterday and runs up until Dec. 23rd. If I remember correctly, you can't run multiple experiments at the same time. That means if you're going to test this experiment before the 23rd, the e10s experiment will be the first one to get installed. You can do the following:

- remove the A/B e10s experiment from about:addons once it's installed, restart fx and re-trigger the experiment server to pull this experiment
- before staging your experiment locally, remove the A/B e10s experiment folder (/telemetry-experiment-server/experiments/e10s-enabled-beta) so you'll only receive this experiment once you start pinging your server rather than both

Point #2: If you're planning on testing this experiment before Jan. 7th, you'll need to change the "startTime" from the manifest before staging your experiment. Before running the python script to build your staging server, change startTime to something earlier in "/telemetry-experiment-server/experiments/unified-urlbar/manifest.json"
An individual user can't have multiple experiments at the same time, but since experiments all have samples, it's quite possible to run multiple experiments at the same time across a population.
> An individual user can't have multiple experiments at the same time, but
> since experiments all have samples, it's quite possible to run multiple
> experiments at the same time across a population.

Yup, I assumed the QA team will be changing the experiments.force-sample-value to make it easier to pull the experiment which will increase the chance of the first experiment being pulled as well which will make testing a bit more difficult.
Comment on attachment 8695268 [details]
unsigned xpi

QA has sent the formal sign-off email for this experiment. Detailed results can be seen in the doc below:
https://docs.google.com/spreadsheets/d/1YW9LA5NdZDZZkiXePz5foksaNhrVaRZmuQM4u-nFYrw/edit#gid=1541734641
Attachment #8695268 - Flags: feedback?(ryanvm) → feedback+
Comment on attachment 8695268 [details]
unsigned xpi

Hi Jason, could you please attach a Mozilla signed xpi that we can use to ship the experiment?
Attachment #8695268 - Flags: feedback?(jthomas)
Once the signed xpi is attached, Ryan's team should quickly double check and make sure that the experiment is being installed without having to use the xpinstall.signatures.required pref.
Attached file signed xpi
Please see attached.
Attachment #8695268 - Flags: feedback?(jthomas)
(In reply to Kamil Jozwiak [:kjozwiak] from comment #34)
> Once the signed xpi is attached, Ryan's team should quickly double check and
> make sure that the experiment is being installed without having to use the
> xpinstall.signatures.required pref.

Michelle, can your team please try out the signed XPI today to make sure it installs properly on Beta44/Aurora45? Thanks!
Flags: needinfo?(mfunches)
Yes, Done
44.0b4 - Install successful
45.0a2 - Install successful

1) Unified Urlbar Experiment has been installed successfully.
2) about:addons > Experiments accurately displays Unified Urlbar Experiment with Remove button
3) about:support Experimental Features displays correct information
Name Unified Urlbar Experiment 
ID unified-urlbar@experiments.mozilla.org 
Description An experimental add-on testing the unified urlbar experience.
Active false
End Date 1451950705295
Homepage
Branch unified
Test worksheet updated with this signed XPI results.
Flags: needinfo?(mfunches)
I will try to land to the telemetry server today, but then we still have to deploy a server update (I guess Felipe can do that)

I think we should delay the beginning of the experiment by a few days cause:
1. We may need some more days for the deployment
2. The week ends in general have a skewed distribution
3. Most are still "recovering" from vacation

I'd suggest to begin on 11th instead of 7th. Nothing changes from QA point of view, since Firefox 46 will hit Beta on 8th of March and the experiment will end on 25 of February.
Felipe, I pushed to the experiment server repo, could you please check everything is as expected and let me know what is needed to complete the deployment to the server?
https://hg.mozilla.org/webtools/telemetry-experiment-server/rev/c6764956fa70

Note: I moved the dates by 4 days per comment 39, is it better to notify release drivers about this small change?
Flags: needinfo?(felipc)
Mak, it looks good in the staging server. To deploy to the server we file a new bug. See 1232544 as the latest example.
Flags: needinfo?(felipc)
(In reply to Marco Bonardo [::mak] from comment #40)
> Note: I moved the dates by 4 days per comment 39, is it better to notify
> release drivers about this small change?

I think it's not necessary
Depends on: 1237468
Thanks Felipe, I filed bug 1237468 based on the bug you cited.
Marco/Drew, is there anything left to do in this bug or is it now fixed?
I don't have anything left in the TODO list... good job everyone!
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: