Closed Bug 1246130 Opened 8 years ago Closed 8 years ago

Gather onboarding telemetry experiments separately from other active experiments

Categories

(Firefox for Android Graveyard :: First Run, defect)

defect
Not set
normal

Tracking

(firefox47 fixed, fennec46+)

RESOLVED FIXED
Firefox 47
Tracking Status
firefox47 --- fixed
fennec 46+ ---

People

(Reporter: Margaret, Assigned: liuche)

References

Details

Attachments

(1 file, 1 obsolete file)

Onboarding experiments are a special snowflake. Because we need to know whether a user is in an experiment at startup time, we have local logic to determine the user's switchboard bucket:
http://hg.mozilla.org/mozilla-central/annotate/5f9ba76eb3b1/mobile/android/base/java/org/mozilla/gecko/firstrun/FirstrunPagerConfig.java#l59

However, we use Switchboard.getActiveExperiments to decide which experiments to include in our telemetry pings:
https://hg.mozilla.org/mozilla-central/annotate/77c38dd031d4/mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java#l98

To make sure we're sending pings with the correct experiment names, we've been mirroring the onboarding experiment logic in the switchboard-experiments config. This means that we need to be really careful to keep the client aligned with the server, and I want to argue that this is way too fragile and an impossible task to get right, given that all that matters is what state the client was in on first run.

I think we should remove all onboarding experiments from the switchboard-experiments config, and instead expose some helper method to let us include the user's local onboarding experiment name in our telemetry ping. This will ensure that we're always tracking the onboarding experiment that the client actually saw, and avoid footguns and confusion.

Mike, Chenxia, what do you think?
Flags: needinfo?(michael.l.comella)
Flags: needinfo?(liuche)
If you do this, which I am still not sure is a good idea, you should probably add a switchboard-config.json file into our build. Next, add support to SwitchBoard.java to look for the file (maybe pass the path into the SB initialization). Then, handle merging the local config with the remote config. Finally, make sure the SwitchBoard code (like getActiveExperiments) just work.

This would allow us to centralize the local SB config to a single JSON file instead of spreading it around Java files.
Yes, we need to move this logic out of FirstRunPagerConfig to a place that's more obviously associated with SB experiments.
I can't speak to the SwitchBoard side, but we have a TelemetryPingGenerator.getActiveExperiments method [1] where we can either merge the Switchboard list with the first run list or, if the configs are merged into the Switchboard config, just keep doing what we're doing and call Switchboard.getActiveExperiments.

So pretty easy on the telemetry side.

[1]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java?rev=3e53d5cf6400#94
Flags: needinfo?(michael.l.comella)
Assignee: nobody → liuche
Flags: needinfo?(liuche)
I wonder how many local experiments we'd be running. If it's *just* onboarding (because of the startup latency), we could add something in code, just for onboarding, because onboarding is actually its own extra-special snowflake, beyond just being a local experiment.

As rnewman pointed out, what we want to include in the telemetry ping isn't the current version of the experiment/config shipped in the code but rather the version of onboarding that the user actually saw. This patch writes the onboarding version shown into a pref and include that in the telemetry ping bundle (which looks to be as easy as adding the experiment to a list).

I could also move isInExperimentLocal to the Experiments.java file, if we want, it's easy enough, but I left it where it is because this is an onboarding-specific thing.

I think that we could add local experiments through a local file config that Switchboard reads, but that doesn't work for this case of onboarding.
What about if we consolidate this logic in Experiments.java? I still don't think we should have experiment-related logic in the first run code, and I also don't think it makes sense to add more experiment logic in the telemetry code.
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34643/diff/1-2/
Attachment #8718623 - Attachment description: MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=mcomella → MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret
Attachment #8718623 - Flags: review?(michael.l.comella) → review?(margaret.leibovic)
The first patch moves the Switchboard-like-API calls to Experiments.java.

The second patch takes that refactor further and moves all the panel-selection logic as well. This is in a separate patch because it's not what we'd actually do if we were do this as a Switchboard experiment - for Switchboard experiments, the only calls we make to Switchboard are to check if the current build is in a particular experiment, and the logic for what to do for different experiments is still in the local code.

Perhaps in the long run, we'd want to move all the logic code to Experiments, but at the moment, that's in each component. I can see benefits for each approach, but I think in the end, I think it'd be too complicated to try to isolate all the experiment logic in Experiments.java.
(In reply to Chenxia Liu [:liuche] from comment #5)
> I wonder how many local experiments we'd be running. If it's *just*
> onboarding (because of the startup latency), we could add something in code,
> just for onboarding, because onboarding is actually its own extra-special
> snowflake, beyond just being a local experiment.

Let's look for potential small changes that keep us from locking too much on onboarding as a snowflake.

> As rnewman pointed out, what we want to include in the telemetry ping isn't
> the current version of the experiment/config shipped in the code but rather
> the version of onboarding that the user actually saw. This patch writes the
> onboarding version shown into a pref and include that in the telemetry ping
> bundle (which looks to be as easy as adding the experiment to a list).

This might be an ideal way to track any experiment (only track experiments we _know_ the user was exposed to), but it also leads to complicated, messy code which is harder to maintain and edge cases will blur the definition of "actually saw". Instead, we can already use "profileDate" and the known start/end dates for an onboarding experiment to handle answering "what users saw a particular onboarding experiment".

> I could also move isInExperimentLocal to the Experiments.java file, if we
> want, it's easy enough, but I left it where it is because this is an
> onboarding-specific thing.

For now, but onboarding is not the only potential use case for local config.

> I think that we could add local experiments through a local file config that
> Switchboard reads, but that doesn't work for this case of onboarding.

A bundled JSON file in the application's RAW folder should work fine, but I think we could delay adding that code for now. Let's just clean this up a little more.

Based on your ideas, I started considering a different approach. In the first part, you add Experiments.isInExperimentLocal, which cleanly moved the local config into Experiments. Great, but I think you also showed a nice direction to make things even better.

The part I didn't like about the first patch was the shared pref and adding onboarding dependencies into the telemetry code. Why not create Experiments.getActiveExperiments which grabs the list from SwitchBoard and then adds the active onboarding? That centralizes server and local experiments into Experiments.

We could even quickly change Experiments.isInExperimentLocal to Experiments.isInExperiment, using the same idea of passing to SwitchBoard.isInExperiment and doing the local checks too.

That means we could use Experiments to handle all application code checks.

I would also suggest undoing the Experiments.getPanels stuff you did in the second part. As you mention, that seems to go too far.
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

https://reviewboard.mozilla.org/r/34643/#review31755

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:102
(Diff revision 2)
> +            }

This seems a bit out of scope, but you do raise a good point that the local experiments could change on browser update, but that we really only want to know which onboarding the user saw.

However, I don't think mfinkle is a fan of this approach, since it puts experiment-specific logic in the telemetry code.

Could we create a new method `Experimetns.getActiveExperiments` that returns this joined list of experiment names?

::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:39
(Diff revision 2)
> +            }

This method is designed to only work with the onbaording experiment names... maybe we should rename it to `isInOnboardingExperiment` to make it really explict. That's the only time I imagine us wanting to use a local experiment anyway.
Attachment #8718623 - Flags: review?(margaret.leibovic)
Comment on attachment 8719068 [details]
MozReview Request: Move experiment logic to Experiments.java. r=margaret

https://reviewboard.mozilla.org/r/34863/#review31759

::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:97
(Diff revision 1)
> +    }

I don't think we should do this. As you said in your comment, I don't think it's scalabale to keep all experiment-realted logic in here, so let's just not put any in here.

Similarly to how we use release branch and feature flags throughout the product code, we should use experiment checks where the real logic lives.
Attachment #8719068 - Flags: review?(margaret.leibovic)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34643/diff/2-3/
Attachment #8718623 - Flags: review?(margaret.leibovic)
Attachment #8719068 - Attachment is obsolete: true
Good point, both of you - I removed the experiment details from TelemetryPingGenerator and instead combined them into a single call in Experiments that handles local and remote Switchboard config experiments.

I kept the SharedPreferences code where I saved the onboarding experiment. My reasoning is this - we really just want to know which onboarding the user saw (which only happens once!). Let's not complicate this by having to calculate when the profile was created and triangulating what onboarding the user actually saw from which bucket of the current experiments the user is in - that's so much more complexity than just saving the one piece of data the one time onboarding is triggered. Unlike other Switchboard experiments which are ongoing, onboarding is only shown *once*, and if the user updates to a version of Firefox where we show a different set of experiments, we don't want to mix up our data by sending the wrong experiment.

If we want to add local long-running experiments later, we shouldn't use SharedPrefs, but for this case of a single experiment shown once, it makes sense.
(In reply to Chenxia Liu [:liuche] from comment #14)

> I kept the SharedPreferences code where I saved the onboarding experiment.
> My reasoning is this - we really just want to know which onboarding the user
> saw (which only happens once!). Let's not complicate this by having to
> calculate when the profile was created and triangulating what onboarding the
> user actually saw from which bucket of the current experiments the user is
> in - that's so much more complexity than just saving the one piece of data
> the one time onboarding is triggered. Unlike other Switchboard experiments
> which are ongoing, onboarding is only shown *once*, and if the user updates
> to a version of Firefox where we show a different set of experiments, we
> don't want to mix up our data by sending the wrong experiment.

I agree with this, let's keep it simple right now. If we find we need to do other "which experiment did the user actually see" calculations, we can figure that out later. But right now we don't need that, and shared preferences does the job well.
Blocks: 1238780
tracking-fennec: --- → 46+
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Whoops, need to do a little more work than just moving the code from TelemetryPingGenerator straight to Experiments. Unflagging for now, I'll update this request.
Attachment #8718623 - Flags: review?(margaret.leibovic)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Hm, that's weird - I saw a bunch of errors building from mach but this seems to work fine when built from IntelliJ. Reflagging for review.
Attachment #8718623 - Flags: review?(margaret.leibovic)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34643/diff/3-4/
https://reviewboard.mozilla.org/r/34643/#review32109

::: mobile/android/base/java/org/mozilla/gecko/firstrun/FirstrunPagerConfig.java:33
(Diff revision 4)
> -        } else if (isInExperimentLocal(context, Experiments.ONBOARDING2_B)) {
> +            GeckoSharedPrefs.forProfile(context).edit().putString(Experiments.PREF_ONBOARDING_VERSION, Experiments.ONBOARDING2_A);

Don't you need to call .apply() on the shared pref editor?

I'm still not a fan of putting these in a shared pref, but I guess we'll see if it makes analysis easier.
(In reply to :Margaret Leibovic from comment #15)

> I agree with this, let's keep it simple right now. If we find we need to do
> other "which experiment did the user actually see" calculations, we can
> figure that out later. But right now we don't need that, and shared
> preferences does the job well.

We can already figure out "which experiment did the user actually see" using the Telemetry sessions and the event stream data. That's what I'm doing in the current onboarding exploration analyses.

It's possible we may be able to use the Telemetry sessions to build the "which experiment did the user actually see" in all cases. We'll learn more as we continue to explore the analyses.
(In reply to Mark Finkle (:mfinkle) from comment #20)
> (In reply to :Margaret Leibovic from comment #15)
> 
> > I agree with this, let's keep it simple right now. If we find we need to do
> > other "which experiment did the user actually see" calculations, we can
> > figure that out later. But right now we don't need that, and shared
> > preferences does the job well.
> 
> We can already figure out "which experiment did the user actually see" using
> the Telemetry sessions and the event stream data. That's what I'm doing in
> the current onboarding exploration analyses.
> 
> It's possible we may be able to use the Telemetry sessions to build the
> "which experiment did the user actually see" in all cases. We'll learn more
> as we continue to explore the analyses.

But can we track retention with that? Isn't the point of including these experiments in the core ping that we don't need to rely on opt-in UI telemetry to figure these things out?
Flags: needinfo?(mark.finkle)
(In reply to :Margaret Leibovic from comment #21)

> > It's possible we may be able to use the Telemetry sessions to build the
> > "which experiment did the user actually see" in all cases. We'll learn more
> > as we continue to explore the analyses.
> 
> But can we track retention with that? Isn't the point of including these
> experiments in the core ping that we don't need to rely on opt-in UI
> telemetry to figure these things out?

Good point. The core ping is opt-out, telemetry is opt-in. We can make parts of telemetry opt-out (unified telemetry) but that is on it's own timeline.
Flags: needinfo?(mark.finkle)
Attachment #8718623 - Flags: review?(margaret.leibovic)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

https://reviewboard.mozilla.org/r/34643/#review32149

::: mobile/android/base/java/org/mozilla/gecko/firstrun/FirstrunPagerConfig.java:33
(Diff revision 4)
> -        } else if (isInExperimentLocal(context, Experiments.ONBOARDING2_B)) {
> +            GeckoSharedPrefs.forProfile(context).edit().putString(Experiments.PREF_ONBOARDING_VERSION, Experiments.ONBOARDING2_A);

I agree with mfinkle that we need an apply() here. Good catch.

::: mobile/android/base/java/org/mozilla/gecko/telemetry/TelemetryPingGenerator.java:91
(Diff revision 4)
>          }

This isn't related to this patch, but I'm curious - if switchboard is disabled, will it cause problems that this field is missing from the ping? Or should we just have an empty experiments array in that case?

If we do want the empty array, we could avoid this MOZ_SWITCHBOARD check altogether, since `getActiveExperiments` should fall back to returning an empty list if switchboard is disabled.

::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:54
(Diff revision 4)
> +        if (AppConstants.MOZ_SWITCHBOARD) {

Similar to my comment above, does it make sense to have a MOZ_SWITCHBOARD check here? If switchboard is disabled, we shouldn't ever even set the onboarding version pref, so this wouldn't be an issue.
https://reviewboard.mozilla.org/r/34643/#review32149

> Similar to my comment above, does it make sense to have a MOZ_SWITCHBOARD check here? If switchboard is disabled, we shouldn't ever even set the onboarding version pref, so this wouldn't be an issue.

Fair enough - it looks like Switchboard would return an empty list if MOZ_SWITCHBOARD is not turned on.

Looking at some of our other telemetry (like UITelemetry), it seems like we're fine returning empty pings, so I could remove that check too.
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34643/diff/4-5/
Attachment #8718623 - Flags: review?(margaret.leibovic)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

https://reviewboard.mozilla.org/r/34643/#review32297

::: mobile/android/base/java/org/mozilla/gecko/firstrun/FirstrunPagerConfig.java:33
(Diff revision 5)
> -        } else if (isInExperimentLocal(context, Experiments.ONBOARDING2_B)) {
> +            GeckoSharedPrefs.forProfile(context).edit().putString(Experiments.PREF_ONBOARDING_VERSION, Experiments.ONBOARDING2_A).apply();

One thing I just realized with this approach... is it a problem that we're writing these prefs on the main thread? Should we be doing this on a background thread instead? This is also a concern because this is happening during startup.

::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:76
(Diff revision 5)
> +            return Experiments.ONBOARDING2_A.equals(experiment);

You can drop the reference to `Experiments` here (and elsewhere in this patch), now that these methods are inside the `Experiments` class.

::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:84
(Diff revision 5)
> +    }

I don't want to spend more time debating this patch, but taking a step back to look at what we have here, wouldn't it be simpler to have a method that just returns the onboarding experiment for the user, then control the logic in FirstrunPagerConfig with that value (and just write that value to the pref directly)? We could name it something like `getOnboardingExperiment` and it would return `ONBOARDING2_A`, etc.

I understand we created this `isInExperimentLocal` method modeled after the Switchboard `isInExperiment` method, but since this method is only designed to handle the onboarding case, I feel like we might as well just create a specific method for that.

No need to address this now, just something to think about.
Attachment #8718623 - Flags: review?(margaret.leibovic) → review+
> One thing I just realized with this approach... is it a problem that we're
> writing these prefs on the main thread? Should we be doing this on a
> background thread instead? This is also a concern because this is happening
> during startup.
> 

Nope, apply() makes this an async call anyways, and it's probably as much work to make a runnable and assign it to a thread.

> ::: mobile/android/base/java/org/mozilla/gecko/util/Experiments.java:84
> (Diff revision 5)
> > +    }
> 
> I don't want to spend more time debating this patch, but taking a step back
> to look at what we have here, wouldn't it be simpler to have a method that
> just returns the onboarding experiment for the user, then control the logic
> in FirstrunPagerConfig with that value (and just write that value to the
> pref directly)? We could name it something like `getOnboardingExperiment`
> and it would return `ONBOARDING2_A`, etc.
> 
> I understand we created this `isInExperimentLocal` method modeled after the
> Switchboard `isInExperiment` method, but since this method is only designed
> to handle the onboarding case, I feel like we might as well just create a
> specific method for that.
> 
> No need to address this now, just something to think about.

I agree that it makes the logic a little more complicated, and I'm ambivalent about doing it either way but I'll keep it in mind.
Hm, backed this out - apparently this builds in IntelliJ, but breaks when built by the buildbots. I'll have to investigate next week.

https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=a47d5f97fca8

backout: https://hg.mozilla.org/integration/fx-team/rev/ba617d51a0fd
Nick, NI-ing you on this "builds with gradle but not otherwise" - any thoughts?
Flags: needinfo?(nalexander)
Talked to Nick about this - I also needed to add some dependency changes to moz.build because util/Experiments.java depends on both thirdparty and GeckoSharedPrefs.
Flags: needinfo?(nalexander)
Comment on attachment 8718623 [details]
MozReview Request: Bug 1246130 - Gather onboarding telemetry experiments separately from other active experiments. r=margaret

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/34643/diff/5-6/
https://hg.mozilla.org/mozilla-central/rev/84a0355fb1c8
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: