Last Comment Bug 1052714 - Run a dummy experiment in nightly to verify bug 1038174
: Run a dummy experiment in nightly to verify bug 1038174
Status: VERIFIED FIXED
:
Product: Firefox Health Report
Classification: Client Software
Component: Client: Desktop (show other bugs)
: 31 Branch
: All All
-- normal
: ---
Assigned To: Benjamin Smedberg
: Kamil Jozwiak [:kjozwiak]
:
Mentors:
Depends on: 1057464
Blocks: 1038174
  Show dependency treegraph
 
Reported: 2014-08-12 12:29 PDT by Benjamin Smedberg
Modified: 2014-08-22 10:46 PDT (History)
5 users (show)
benjamin: firefox‑backlog+
mmucci: qe‑verify+
See Also:
QA Whiteboard:
Iteration: 34.3
Points: 5


Attachments
bug1052714 (5.83 KB, patch)
2014-08-20 08:38 PDT, Benjamin Smedberg
no flags Details | Diff | Splinter Review
bug1052714, rev2 (5.71 KB, patch)
2014-08-20 09:56 PDT, Benjamin Smedberg
felipc: review+
Details | Diff | Splinter Review

Description User image Benjamin Smedberg 2014-08-12 12:29:04 PDT
In bug 1038174 we fixed a race condition where we were changing experiment branches. There is no easy automated or manual test for this because it's a race, but it's still quite important to have some confidence in the fix before we deploy more experiments.

I propose that we run a dummy experiment in the nightly channel which does nothing but set an experiment branch using the same code we used in the search experiment. Then we can verify using FHR queries whether the fix was effective.
Comment 1 User image Benjamin Smedberg 2014-08-14 08:47:14 PDT
Turns out I needed to code this to test bug 1052545.
Comment 2 User image Benjamin Smedberg 2014-08-20 08:38:13 PDT
Created attachment 8475967 [details] [diff] [review]
bug1052714
Comment 3 User image :Felipe Gomes (needinfo me!) 2014-08-20 09:45:38 PDT
How are you gonna evaluate if the fix worked? By looking at the FHR data from users to see if they aren't changing branches across days? Sounds fair, but I wonder if this is enough to be sure, since it's going to the Nightly population (which might restart more often in a day and thus only carry a possible second branch in the daily payload).

Maybe it would be good to store the information somewhere else (like in a pref), and compare: if the pref is set but the branch is not, you could record a TelemetryLog event. If none of these show up in the logs then I think we can confidently say it's fixed. I'll leave it up to you.

And why is the delayedstartup necessary? I expect other experiments will want to set/get branch directly on startup, and it would be nice to be testing that here.

As for the patch, it looks good. I'll wait for the above replies before r+. The left-over REASON array can be removed, and you might want to define the missing install/uninstall/shutdown functions to avoid logging a warning http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/internal/XPIProvider.jsm#4283
Comment 4 User image Benjamin Smedberg 2014-08-20 09:55:52 PDT
> How are you gonna evaluate if the fix worked? By looking at the FHR data
> from users to see if they aren't changing branches across days?

That was my plan, yes.

> but I wonder if this is enough to be sure, since it's going to the Nightly
> population (which might restart more often in a day and thus only carry a
> possible second branch in the daily payload).

The nightly population doesn't actually restart that much more than other populations; I think there will be enough coverage to get a good answer.

> Maybe it would be good to store the information somewhere else (like in a
> pref), and compare: if the pref is set but the branch is not, you could
> record a TelemetryLog event. If none of these show up in the logs then I
> think we can confidently say it's fixed. I'll leave it up to you.

This is a neat idea, but we know that pref writes aren't always reliable, so I think we'd be introducing a different kind of complexity, so I'd like to avoid it for now.

> And why is the delayedstartup necessary? I expect other experiments will
> want to set/get branch directly on startup, and it would be nice to be
> testing that here.

There is no race condition in that case. If the experiment branch is set immediately from startup(), it is set here: http://hg.mozilla.org/mozilla-central/annotate/cbbc380f1e1c/browser/experiments/Experiments.jsm#l1938 which is before the cache is saved.

The problem with the search experiment is that since we were using the addon SDK, the bootstrap.js startup() method asynchronously loads main.js, so we don't set the experiment branch until the next tick of the event loop, which can be during this yield:

http://hg.mozilla.org/mozilla-central/annotate/cbbc380f1e1c/browser/experiments/Experiments.jsm#l1004

I'll put up a new patch with the shutdown() install() uninstall() methods.
Comment 5 User image Benjamin Smedberg 2014-08-20 09:56:51 PDT
Created attachment 8476023 [details] [diff] [review]
bug1052714, rev2
Comment 6 User image Benjamin Smedberg 2014-08-20 10:36:33 PDT
http://hg.mozilla.org/webtools/telemetry-experiment-server/rev/eedfeaeb2fd8

Kamil, this should ready for basic verification; basically just looking for that the experiment is installed into nightly and shows up with a branch in the FHR payload.
Comment 7 User image Kamil Jozwiak [:kjozwiak] 2014-08-20 11:53:19 PDT
I think there's an issue with the current manifest:

--> minVersion": "34.0" should be minVersion": "34.0a"

You'll get the following error message in the browser console when attempting to install the experiment with the current manifest: (changing the value to 34.0a fixes the problem)

1408560684347 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","minVersion"]
Comment 8 User image Benjamin Smedberg 2014-08-20 12:30:09 PDT
You're right, I was testing with an older tree. http://hg.mozilla.org/webtools/telemetry-experiment-server/rev/c6f8f535e02e pushed and should be on staging within 15 minutes.
Comment 9 User image Kamil Jozwiak [:kjozwiak] 2014-08-21 12:19:20 PDT
Went through verification using the following build:
- http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014-08-21-03-02-01-mozilla-central/

- ensured when the current installed build < minBuildID, the following error appears in the browser console:

1408635829690 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","minBuildID"]

- ensured when the incorrect channel is being used, the following error appears in the browser console:
-> m-c [Passed]
-> aurora [Passed]
-> beta [Passed]

1408636243377 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","channel"]

- ensured that the sample rate of 50% was working correctly and made use the error was being displayed in the browser console:

1408641686988 Browser.Experiments.Experiments TRACEExperiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","sample"]

- ensured that once the experiment has been installed, the following information appears under about:healthreport. Also ensured that the same data is appearing in the payload when extracting the data via the browser console:

"org.mozilla.experiments.info": { 
  "_v": 2,
  "lastActive": "experiment-branch-test-nightly@experiments.mozilla.org",
  "lastActiveBranch": "0"
},

Ensured that several branches are being installed:
- No pre-existing branch, setting to: 0 (Found pre-existing branch: 0)
- No pre-existing branch, setting to: 2 (Found pre-existing branch: 2)
- No pre-existing branch, setting to: 4 (Found pre-existing branch: 4)

General test cases:

- ensured that the branch stays the same after restarting fx several times (browser console: Found pre-existing branch: 0)
- ensured that the branch isn't being changed when moving the computers date ahead by several days (browser console: Found pre-existing branch: 0)
- ensured that the correct information is displayed in about:addons, about:config and about:support relating to the current installed experiment
- ensured that the experiment expired correctly once endTime has been reached (ensured it appeared expired in about:addons, about:config and about:support)
- ensured that the experiment is correctly removed/disabled when a user removes it manually using about:addons
- ensured that the experiment is correctly removed when a user disables "Telemetry" under preferences
- tried reproducing the issue from bug # 1052545
- ensured the experiment stays removed/expired when closing/re-opening fx several times

Possible issues:

** Issue #1: **

The branch information is being saved for every day when visiting the raw data page under about:healthreport. However, it doesn't appear in the payload that is extracted through the browser console, example:

(used the following command to extract the telemetry payload via the browser console: http://pastebin.mozilla.org/6085730)

1) install nightly/experiment
2) once installed, visit about:healthreport -> Raw Data
3) move the computers date ahead by a single day
4) re-launch fx and visit about:healthreport -> Raw Data (you'll see another entry from the day before, these don't appear in the payload that I extract from the browser console)

Once you remove/disable the experiment, the "branch" information is removed from the payload but not from about:healthreport -> Raw Data

** Issue #2 **

Once the experiment is installed, restarting fx will create the following pref: (just making sure as I've never seen this appear with the other experiments)

extensions.installCache;[{"name":"app-global","addons":{"{972ce4c6-7e08-4474-a285-3208198ce6fd}":{"descriptor":"C:\\Program Files (x86)\\Nightly\\browser\\extensions\\{972ce4c6-7e08-4474-a285-3208198ce6fd}","mtime":1408646189405,"rdfTime":1408619692000}}},{"name":"app-profile","addons":{"experiment-branchtest-nightly@experiments.mozilla.org":{"descriptor":"C:\\Users\\kjozQA\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\3jpfq99t.default\\extensions\\experiment-branchtest-nightly@experiments.mozilla.org.xpi","mtime":1408646248128}}}]

Other than those two possible issues listed above, everything else looks pretty good. Let me know if I missed something Benjamin.
Comment 10 User image Benjamin Smedberg 2014-08-21 12:24:42 PDT
> The branch information is being saved for every day when visiting the raw
> data page under about:healthreport. However, it doesn't appear in the
> payload that is extracted through the browser console, example:
> 
> (used the following command to extract the telemetry payload via the browser
> console: http://pastebin.mozilla.org/6085730)
> 
> 1) install nightly/experiment
> 2) once installed, visit about:healthreport -> Raw Data
> 3) move the computers date ahead by a single day
> 4) re-launch fx and visit about:healthreport -> Raw Data (you'll see another
> entry from the day before, these don't appear in the payload that I extract
> from the browser console)

The pastebin here is for getting telemetry data, not FHR data. So whenever the experiment is active, you should see "activeExperiment": "..." and "activeExperimentBranch": "N" in the telemetry payload. After the experiment stops, you should see nothing in the telemetry payload. Does this match what you're seeing?

> Once the experiment is installed, restarting fx will create the following
> pref: (just making sure as I've never seen this appear with the other
> experiments)
> 
> extensions.installCache;[{"name":"app-global","addons":{"{972ce4c6-7e08-4474-

I don't know what this is but I believe it's harmless.
Comment 11 User image Kamil Jozwiak [:kjozwiak] 2014-08-21 12:33:28 PDT
Yup(In reply to Benjamin Smedberg  [:bsmedberg] from comment #10)
> The pastebin here is for getting telemetry data, not FHR data. So whenever
> the experiment is active, you should see "activeExperiment": "..." and
> "activeExperimentBranch": "N" in the telemetry payload. After the experiment
> stops, you should see nothing in the telemetry payload. Does this match what
> you're seeing?

Yup, that's correct, the FHR data is still present and the branch data from the telemetry payload is removed once the experiment is either disabled or expired. Benjamin mentioned via IRC that FHR data is saved for 180 days which clears everything up. Marking this as verified.

Note You need to log in before you can comment on or make changes to this bug.