Run a dummy experiment in nightly to verify bug 1038174

VERIFIED FIXED

Status

Firefox Health Report
Client: Desktop
VERIFIED FIXED
3 years ago
3 years ago

People

(Reporter: bsmedberg, Assigned: bsmedberg)

Tracking

31 Branch
Points:
5
Dependency tree / graph
Bug Flags:
firefox-backlog +
qe-verify +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

3 years ago
In bug 1038174 we fixed a race condition where we were changing experiment branches. There is no easy automated or manual test for this because it's a race, but it's still quite important to have some confidence in the fix before we deploy more experiments.

I propose that we run a dummy experiment in the nightly channel which does nothing but set an experiment branch using the same code we used in the search experiment. Then we can verify using FHR queries whether the fix was effective.
(Assignee)

Updated

3 years ago
Flags: firefox-backlog+
(Assignee)

Comment 1

3 years ago
Turns out I needed to code this to test bug 1052545.
Assignee: nobody → benjamin

Updated

3 years ago
QA Whiteboard: [qa?]
(Assignee)

Updated

3 years ago
QA Whiteboard: [qa?] → [qa+]

Updated

3 years ago
Status: NEW → ASSIGNED
Iteration: --- → 34.3
QA Whiteboard: [qa+]
Flags: qe-verify+
QA Contact: kamiljoz
(Assignee)

Comment 2

3 years ago
Created attachment 8475967 [details] [diff] [review]
bug1052714
Attachment #8475967 - Flags: review?(felipc)
How are you gonna evaluate if the fix worked? By looking at the FHR data from users to see if they aren't changing branches across days? Sounds fair, but I wonder if this is enough to be sure, since it's going to the Nightly population (which might restart more often in a day and thus only carry a possible second branch in the daily payload).

Maybe it would be good to store the information somewhere else (like in a pref), and compare: if the pref is set but the branch is not, you could record a TelemetryLog event. If none of these show up in the logs then I think we can confidently say it's fixed. I'll leave it up to you.

And why is the delayedstartup necessary? I expect other experiments will want to set/get branch directly on startup, and it would be nice to be testing that here.

As for the patch, it looks good. I'll wait for the above replies before r+. The left-over REASON array can be removed, and you might want to define the missing install/uninstall/shutdown functions to avoid logging a warning http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/internal/XPIProvider.jsm#4283
(Assignee)

Comment 4

3 years ago
> How are you gonna evaluate if the fix worked? By looking at the FHR data
> from users to see if they aren't changing branches across days?

That was my plan, yes.

> but I wonder if this is enough to be sure, since it's going to the Nightly
> population (which might restart more often in a day and thus only carry a
> possible second branch in the daily payload).

The nightly population doesn't actually restart that much more than other populations; I think there will be enough coverage to get a good answer.

> Maybe it would be good to store the information somewhere else (like in a
> pref), and compare: if the pref is set but the branch is not, you could
> record a TelemetryLog event. If none of these show up in the logs then I
> think we can confidently say it's fixed. I'll leave it up to you.

This is a neat idea, but we know that pref writes aren't always reliable, so I think we'd be introducing a different kind of complexity, so I'd like to avoid it for now.

> And why is the delayedstartup necessary? I expect other experiments will
> want to set/get branch directly on startup, and it would be nice to be
> testing that here.

There is no race condition in that case. If the experiment branch is set immediately from startup(), it is set here: http://hg.mozilla.org/mozilla-central/annotate/cbbc380f1e1c/browser/experiments/Experiments.jsm#l1938 which is before the cache is saved.

The problem with the search experiment is that since we were using the addon SDK, the bootstrap.js startup() method asynchronously loads main.js, so we don't set the experiment branch until the next tick of the event loop, which can be during this yield:

http://hg.mozilla.org/mozilla-central/annotate/cbbc380f1e1c/browser/experiments/Experiments.jsm#l1004

I'll put up a new patch with the shutdown() install() uninstall() methods.
(Assignee)

Comment 5

3 years ago
Created attachment 8476023 [details] [diff] [review]
bug1052714, rev2
Attachment #8475967 - Attachment is obsolete: true
Attachment #8475967 - Flags: review?(felipc)
Attachment #8476023 - Flags: review?(felipc)
Attachment #8476023 - Flags: review?(felipc) → review+
(Assignee)

Comment 6

3 years ago
http://hg.mozilla.org/webtools/telemetry-experiment-server/rev/eedfeaeb2fd8

Kamil, this should ready for basic verification; basically just looking for that the experiment is installed into nightly and shows up with a branch in the FHR payload.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
I think there's an issue with the current manifest:

--> minVersion": "34.0" should be minVersion": "34.0a"

You'll get the following error message in the browser console when attempting to install the experiment with the current manifest: (changing the value to 34.0a fixes the problem)

1408560684347 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","minVersion"]
Flags: needinfo?(benjamin)
(Assignee)

Comment 8

3 years ago
You're right, I was testing with an older tree. http://hg.mozilla.org/webtools/telemetry-experiment-server/rev/c6f8f535e02e pushed and should be on staging within 15 minutes.
Flags: needinfo?(benjamin)
Went through verification using the following build:
- http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014-08-21-03-02-01-mozilla-central/

- ensured when the current installed build < minBuildID, the following error appears in the browser console:

1408635829690 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","minBuildID"]

- ensured when the incorrect channel is being used, the following error appears in the browser console:
-> m-c [Passed]
-> aurora [Passed]
-> beta [Passed]

1408636243377 Browser.Experiments.Experiments TRACE Experiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","channel"]

- ensured that the sample rate of 50% was working correctly and made use the error was being displayed in the browser console:

1408641686988 Browser.Experiments.Experiments TRACEExperiments #0::evaluateExperiments() - added EXPERIMENT_ACTIVATION to TelemetryLog: ["REJECTED","experiment-branch-test-nightly@experiments.mozilla.org","sample"]

- ensured that once the experiment has been installed, the following information appears under about:healthreport. Also ensured that the same data is appearing in the payload when extracting the data via the browser console:

"org.mozilla.experiments.info": { 
  "_v": 2,
  "lastActive": "experiment-branch-test-nightly@experiments.mozilla.org",
  "lastActiveBranch": "0"
},

Ensured that several branches are being installed:
- No pre-existing branch, setting to: 0 (Found pre-existing branch: 0)
- No pre-existing branch, setting to: 2 (Found pre-existing branch: 2)
- No pre-existing branch, setting to: 4 (Found pre-existing branch: 4)

General test cases:

- ensured that the branch stays the same after restarting fx several times (browser console: Found pre-existing branch: 0)
- ensured that the branch isn't being changed when moving the computers date ahead by several days (browser console: Found pre-existing branch: 0)
- ensured that the correct information is displayed in about:addons, about:config and about:support relating to the current installed experiment
- ensured that the experiment expired correctly once endTime has been reached (ensured it appeared expired in about:addons, about:config and about:support)
- ensured that the experiment is correctly removed/disabled when a user removes it manually using about:addons
- ensured that the experiment is correctly removed when a user disables "Telemetry" under preferences
- tried reproducing the issue from bug # 1052545
- ensured the experiment stays removed/expired when closing/re-opening fx several times

Possible issues:

** Issue #1: **

The branch information is being saved for every day when visiting the raw data page under about:healthreport. However, it doesn't appear in the payload that is extracted through the browser console, example:

(used the following command to extract the telemetry payload via the browser console: http://pastebin.mozilla.org/6085730)

1) install nightly/experiment
2) once installed, visit about:healthreport -> Raw Data
3) move the computers date ahead by a single day
4) re-launch fx and visit about:healthreport -> Raw Data (you'll see another entry from the day before, these don't appear in the payload that I extract from the browser console)

Once you remove/disable the experiment, the "branch" information is removed from the payload but not from about:healthreport -> Raw Data

** Issue #2 **

Once the experiment is installed, restarting fx will create the following pref: (just making sure as I've never seen this appear with the other experiments)

extensions.installCache;[{"name":"app-global","addons":{"{972ce4c6-7e08-4474-a285-3208198ce6fd}":{"descriptor":"C:\\Program Files (x86)\\Nightly\\browser\\extensions\\{972ce4c6-7e08-4474-a285-3208198ce6fd}","mtime":1408646189405,"rdfTime":1408619692000}}},{"name":"app-profile","addons":{"experiment-branchtest-nightly@experiments.mozilla.org":{"descriptor":"C:\\Users\\kjozQA\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\3jpfq99t.default\\extensions\\experiment-branchtest-nightly@experiments.mozilla.org.xpi","mtime":1408646248128}}}]

Other than those two possible issues listed above, everything else looks pretty good. Let me know if I missed something Benjamin.
Flags: needinfo?(benjamin)
(Assignee)

Comment 10

3 years ago
> The branch information is being saved for every day when visiting the raw
> data page under about:healthreport. However, it doesn't appear in the
> payload that is extracted through the browser console, example:
> 
> (used the following command to extract the telemetry payload via the browser
> console: http://pastebin.mozilla.org/6085730)
> 
> 1) install nightly/experiment
> 2) once installed, visit about:healthreport -> Raw Data
> 3) move the computers date ahead by a single day
> 4) re-launch fx and visit about:healthreport -> Raw Data (you'll see another
> entry from the day before, these don't appear in the payload that I extract
> from the browser console)

The pastebin here is for getting telemetry data, not FHR data. So whenever the experiment is active, you should see "activeExperiment": "..." and "activeExperimentBranch": "N" in the telemetry payload. After the experiment stops, you should see nothing in the telemetry payload. Does this match what you're seeing?

> Once the experiment is installed, restarting fx will create the following
> pref: (just making sure as I've never seen this appear with the other
> experiments)
> 
> extensions.installCache;[{"name":"app-global","addons":{"{972ce4c6-7e08-4474-

I don't know what this is but I believe it's harmless.
Flags: needinfo?(benjamin) → needinfo?(kamiljoz)
Yup(In reply to Benjamin Smedberg  [:bsmedberg] from comment #10)
> The pastebin here is for getting telemetry data, not FHR data. So whenever
> the experiment is active, you should see "activeExperiment": "..." and
> "activeExperimentBranch": "N" in the telemetry payload. After the experiment
> stops, you should see nothing in the telemetry payload. Does this match what
> you're seeing?

Yup, that's correct, the FHR data is still present and the branch data from the telemetry payload is removed once the experiment is either disabled or expired. Benjamin mentioned via IRC that FHR data is saved for 180 days which clears everything up. Marking this as verified.
Status: RESOLVED → VERIFIED
Flags: needinfo?(kamiljoz)
(Assignee)

Updated

3 years ago
Depends on: 1057464
You need to log in before you can comment on or make changes to this bug.