Closed Bug 976689 Opened 8 years ago Closed 8 years ago

Scope serving funnelcake builds via updates

Categories

(Release Engineering :: Release Automation: Other, defect)

x86_64
Windows 8.1
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: jjensen, Unassigned)

Details

Product and feature owners and marketing team members need the ability to configure AUS to direct random subsets of Firefox Nightly/Aurora/Beta Desktop installations (eg 10% of installations in a particular locale or geographic region) to non-default "funnelcake" builds of Firefox in order to test the quality and effectiveness of new features or campaigns.
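
A minimal sketch of what "direct a random 10% of installations in a locale" could look like, assuming a stable, non-identifying per-installation token is available. That token, and the names `in_study` and `pick_build`, are hypothetical; this bug does not say the update request carries such a value, and this is not how AUS works today.

```python
import hashlib


def in_study(install_token: str, study: str, fraction: float) -> bool:
    """Deterministically place an installation in or out of a study.

    install_token is a hypothetical stable, non-identifying value.
    Hashing it together with the study name gives each study an
    independent, repeatable sample.
    """
    digest = hashlib.sha256(f"{study}:{install_token}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < fraction


def pick_build(install_token, locale, study, target_locale, fraction):
    """Return "funnelcake" for the sampled subset of one locale, else "default"."""
    if locale == target_locale and in_study(install_token, study, fraction):
        return "funnelcake"
    return "default"
```

Hashing rather than rolling a die on each request keeps an installation in the same group across update checks, which matters if usage is to be followed over time.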

Funnelcakes have proven to be quite valuable when used with new downloads, but we need to be able to do the same with *updates*.
This would be great since all of the funnelcake testing we have been doing is for people who download manually and those are only new users or existing users who are "paving over" their install. 

Is there any reason why we wouldn't want to have this functionality for the release channel too? We have done most of our funnelcake tests on the release channel since the sample sizes are much bigger.
Moving this to the Release Engineering::Release Automation because this would be a larger project than the update server.

jjensen, cmore, bhearsum, and I met today to discuss background and scope of this work, which was very useful. bhearsum+nthomas are on the hook to provide a rough description of changes needed by Fri 14 March, to inform discussions about this being the right way to go or not.
Before that, let's make sure we have understood some of the possible studies; please correct/widen/narrow as needed. The overall goal is to measure how effective our efforts to retain and engage users are.

1, Website based
eg Australis - The ability to show both updating and installing users a tour (specific to their existing or new user status). We need to direct a proportion of people to a particular URL, and tag them (in a non-identifying way) with a modified update channel to follow usage over time.

2, In app changes
eg - does a green button work better than a red one? Is this 'new-feature' advantageous for this locale? This implies that we package different files (eg inside omni.ja), possibly for some subset of locales. Possibly some feedback via FHR or improved Telemetry, in addition to the update channel technique described above.

In both cases we would want to limit a user to a single study at a time, and repatriate to vanilla afterwards, to keep things manageable for analysis (ie only one customization at a time). There's an issue for someone to investigate about the channel not changing if the distribution does, possibly due to that moving into the profile (cmore/anurag spoke about this recently). Pave-over installs are an edge case: ideally we don't wipe out an existing customization, but there is no known mechanism to help www.mozilla.org know what to serve.
Component: General → Release Automation
Product: AUS → Release Engineering
QA Contact: bhearsum
Summary: Allow AUS to direct updates to funnelcake builds → Scope serving funnelcake builds via updates
Version: 3.0 → unspecified
We've talked about what would be needed to implement this request, detail is included below. The overall response is that this adds a lot of complexity to a system that needs to run smoothly, often on a very short lead time. By adding new update paths, it will be a lot harder to keep all the permutations in your head, and for QA to test them all. In particular, the QA load per release goes up for every study added. Anyway, here goes ...

1, Website based
1.1 Platform/Firefox 
* Duplicate distribution logic to support studies in a different directory eg, distribution-mozilla/
** This lets us safely clean up studies without risk of deleting partner customizations
** leave out of update/blocklist URLs, just use channel for metrics, eg release-cck-studyN, release-cck-studyN-cck-bing for partner build
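
To illustrate the channel naming above, here is a rough sketch that builds and splits names such as release-cck-study1 and release-cck-study1-cck-bing. The helper names are hypothetical, and the study-before-partner ordering is only inferred from the two examples given here.

```python
SEPARATOR = "-cck-"


def build_channel(base, study=None, partner=None):
    """Compose a metrics channel name, eg release-cck-study1-cck-bing."""
    parts = [base]
    if study:
        parts.append(study)
    if partner:
        parts.append(partner)
    return SEPARATOR.join(parts)


def base_channel(channel):
    """Recover the underlying channel (eg "release") from a tagged name."""
    return channel.split(SEPARATOR)[0]
```

For example, build_channel("release", "study1", "bing") gives "release-cck-study1-cck-bing", and base_channel() of that string gives back "release".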

1.2 MARs
* Need to repack MARs once for each study as part of the release process.
** Adds hours (1-3) to release process, for each study that will be active (new or persisting)
** Additional time to run update verification
** Impacts on turnaround for chemspills, unless we elect to wipe all studies
** If we want this for Beta, it would hinder our move to rapid-betas (currently 2/week)
* Modify update generation to always wipe distribution-mozilla (lets us remove old study and start new one in same update)
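
As a back-of-envelope check on the repack cost, assuming repacks run one after another and each active study costs the 1-3 hours quoted above (update verification time not included):

```python
def added_release_hours(active_studies, hours_low=1, hours_high=3):
    """Return (best case, worst case) extra hours for serial MAR repacks."""
    if active_studies < 0:
        raise ValueError("active_studies must be non-negative")
    return active_studies * hours_low, active_studies * hours_high
```

With three active studies this comes to between 3 and 9 extra hours per release, which is the turnaround concern for chemspills.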

1.3 Update server
* Need to be able to serve users on the same channel different contents  (one-to-many mapping instead of one-to-one)
** Big change to current assumptions of update server
** Can't be done until new update server is fully deployed to beta/release users (rough ETA: Q3)
* Geo-ip support
** Probably not hugely complex if we can find a geo-ip library for Python
* Add wildcard matching for channel to rules (required to persist studies over updates, eg chemspill)
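
A sketch of the wildcard channel matching, using Python's fnmatch module. The rule fields, priorities, and mapping names are made up for illustration and are not the update server's actual schema.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: the more specific study rule has the higher priority.
RULES = [
    {"priority": 100, "channel": "release-cck-study1*", "mapping": "study1-release"},
    {"priority": 50, "channel": "release*", "mapping": "vanilla-release"},
]


def select_mapping(channel, rules=RULES):
    """Return the mapping of the highest-priority rule whose pattern matches."""
    matches = [rule for rule in rules if fnmatchcase(channel, rule["channel"])]
    if not matches:
        return None
    return max(matches, key=lambda rule: rule["priority"])["mapping"]
```

Because the pattern ends in a wildcard, a partner variant such as release-cck-study1-cck-bing still lands on the study rule, while plain release falls through to the vanilla one. That is what would let a study persist across a chemspill without a new rule per channel name.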

2. Bigger changes
If we limit to either:
* case 1 plus an extension
or 
* packaging all changes for all users (eg in omni.ja)
* runtime preference controls Firefox behaviour
* distribution-mozilla sets the preference
then we can do this with case1. No additional complexity from RelEng.
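
The preference-controlled option could be pictured roughly as follows. This only models the layering in Python; the pref name is hypothetical and no real Firefox preference API is used.

```python
# Built-in default shipped to every user: feature present but switched off.
DEFAULT_PREFS = {"browser.example.newFeature.enabled": False}


def effective_prefs(defaults, distribution_prefs=None):
    """Layer study prefs (eg from distribution-mozilla) over the defaults."""
    merged = dict(defaults)
    merged.update(distribution_prefs or {})
    return merged


def feature_enabled(prefs):
    """Runtime check that decides which behaviour the user sees."""
    return bool(prefs.get("browser.example.newFeature.enabled", False))
```

Every user gets the same bits; only users whose distribution-mozilla directory sets the pref see the new behaviour, and wiping that directory repatriates them to vanilla.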

If we want to ship different bits to different users then it's a redo of much of the release automation: produce multiple builds/updates (particularly partials), update rules, bouncer. This would be a massive change, at least a quarter's worth of work. We'd have to scope it properly.
John, I believe that it would be a mistake to do these kinds of experiments using the update system. Can we WONTFIX this?
Flags: needinfo?(jjensen)
Hi Benjamin, Nick,

This does indeed seem somewhat ... daunting. The Telemetry/FHR work that Benjamin is driving will help with a lot of test use cases in the Beta channel, but AFAICT it would not allow us to, for example, do the following for significant changes:

In the Beta channel, have X% of Beta users update to a release with Australis, and Y% get one without.

Or could it? Am I wrong?
Flags: needinfo?(jjensen)
> In the Beta channel, have X% of Beta users update to a release with
> Australis, and Y% get one without.
> 
> Or could it? Am I wrong?

This would indeed be very difficult: you'd either have to make all of Australis controllable via pref, or you'd have to package two entire Firefox frontends into the build and somehow switch between them.

But also, we clearly aren't going to do this with Australis now, and I'm skeptical that we should do it ever. Can we maybe just mark this INCOMPLETE until we have this kind of problem that we can't solve any other way?
(In reply to Nick Thomas [:nthomas] from comment #3)
> 2. Bigger changes
> If we limit to either:
> * case 1 plus an extension
> or 
> * packaging all changes for all users (eg in omni.ja)
> * runtime preference controls Firefox behaviour
> * distribution-mozilla sets the preference
> then we can do this with case1. No additional complexity from RelEng.
> 
> If we want to ship different bits to different users then it's a redo of
> much of the release automation - produce multiple builds/updates
> (particularly partials), update rules, bouncer. This would be a massive
> change, at least a quarters worth of work. We'd have to scope it properly.

While rereading this bug I realized another big thing we'd have to do here: we'd need continuous integration (at least on mozilla-beta, but probably on aurora and maybe central as well) and probably beta builds for all the different configs we intend to ship. This wouldn't add a huge amount of technical difficulty (though it would be painful), but it would require a significant amount of extra machine time.
jjensen and I discussed this a while back; with Telemetry Experiments landed and with no clear use-case for this, we're going to mark this INCOMPLETE. If we end up with a clear need, we'll reopen for estimation, but as this is a big project with many complications we're going to try to avoid it if at all possible.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE