Closed Bug 1222258 Opened 9 years ago Closed 8 years ago

[breakdown] Allow for attribution of acquired Firefox Desktop users

Categories

(Firefox :: General, defect)

defect
Not set
normal

RESOLVED FIXED

People

(Reporter: cmore, Unassigned)

References

Details

(Whiteboard: [fxgrowth])

User Story

User stories: https://docs.google.com/a/mozilla.com/document/d/102y4dThQJ2DPAxI6H7M8gyT8FoR8s8ao9XdGkvV_9sU/edit?usp=sharing
A continuous challenge that we have with Firefox desktop is that it is nearly impossible to understand how we acquire active users. I am constantly asked "Hey, we hear downloads are up, why aren't MAUs?". While we have Google Analytics on our website experiences, that only tells us how users land on www.mozilla.org and the rate at which they click on the download button. What we don't know is what happens further down the funnel and how those users become active users.

This is how we are thinking about the acquisition funnel:

1) Non-Firefox user floating through the web
2) Person searches for Firefox or sees a Firefox advertisement
3) Clicks on result or ad
4) Lands on a www.mozilla.org page
5) Clicks download button
6) Downloads Firefox
7) Installs Firefox
8) Runs Firefox
9) Uses Firefox

Currently, we can understand steps 1-6 via Google Analytics and we have done a lot of optimization to ensure that the top of the funnel is efficient and converts well to download. The issue is that downloads are a bit of a vanity metric and high download rates don't always translate to high product usage.

The only way that we can understand the entire funnel is by using a funnelcake. A funnelcake is simply a version of Firefox that has the funnel tagged with a unique cohort number that allows for additional analysis.

For example, this is how we currently use funnelcake builds to measure the efficiency of the complete funnel. This will be funnelcake cohort ID 1234.

1) Non-Firefox user floating through the web
2) Searches for Firefox or sees a Firefox advertisement
3) Clicks on result or ad
4) Lands on a www.mozilla.org page (Link to download is dynamically changed to funnelcake 1234 build for a specific audience)
5) Clicks download button
6) Downloads Firefox (Funnelcake downloaded)
7) Installs Firefox
8) Runs Firefox (The /firstrun/ URL includes a ?f=1234 parameter referring to the funnelcake ID)
9) Uses Firefox (The funnelcake cohort ID is also included in the channel, as in release-cck-mozilla1234)
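
As a rough sketch of how an analysis script might recover the funnelcake ID from the two places named in steps 8 and 9, assuming the `?f=` parameter and the `release-cck-mozilla1234` channel shape shown above. The function names and the example firstrun URL are hypothetical, not taken from any existing Mozilla tooling:

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed channel shape, based only on the example "release-cck-mozilla1234".
_CHANNEL_RE = re.compile(r"^(?P<channel>[a-z]+)-cck-mozilla(?P<cohort>\d+)$")


def funnelcake_id_from_firstrun(url):
    """Return the value of the f= query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("f")
    return values[0] if values else None


def funnelcake_id_from_channel(channel):
    """Return the numeric cohort suffix of a funnelcake channel, or None."""
    match = _CHANNEL_RE.match(channel)
    return match.group("cohort") if match else None


if __name__ == "__main__":
    # Hypothetical firstrun URL; only the f= parameter comes from the bug text.
    print(funnelcake_id_from_firstrun(
        "https://www.mozilla.org/en-US/firefox/42.0/firstrun/?f=1234"))
    print(funnelcake_id_from_channel("release-cck-mozilla1234"))
```

A plain channel such as `release` yields no cohort, which is how non-funnelcake users would be told apart in this sketch.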

While funnelcakes are useful in being able to attribute product usage and retention to specific audiences, they do not scale. For all of the potential sources, mediums, and campaigns through which users are acquired, we would have to create a funnelcake build for every combination. This is not realistic and it would mean that we would have to be automatically creating hundreds of separate builds of Firefox for every language and platform. This is a bit crazy and should be avoided.

We do have an idea on how to do attribution without having to create a funnelcake for every potential way a new Firefox user is acquired.

Here's the idea:

* Dynamically change the downloaded file name to include a cohort ID in the exe, dmg, or bz2. For example, Firefox-42.0-1234.exe. This can be for either the stub or full installer.

* When Firefox is installing, parse the file name and extract the cohort ID if it exists.

* Installer stores the cohort ID in the base set payload, in a specific key/value pair within Unified Telemetry (FHR v4)

* When Unified Telemetry sends the daily payload of browser meta data, the cohort ID is passed along.
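
To make the file-name parsing step concrete, here is a minimal sketch in Python. It assumes the naming pattern from the example above (`Firefox-<version>-<cohort>.<ext>`); the real installer would not be written in Python, and the function name and regular expression are illustrative assumptions only:

```python
import re
from pathlib import PureWindowsPath

# Assumed pattern: Firefox-<version>[-<cohort>].<exe|dmg|tar.bz2>
_INSTALLER_RE = re.compile(
    r"^Firefox-(?P<version>\d+(?:\.\d+)*)"
    r"(?:-(?P<cohort>[A-Za-z0-9]+))?"
    r"\.(?:exe|dmg|tar\.bz2)$"
)


def cohort_id_from_filename(path):
    """Return the cohort ID embedded in an installer file name, or None."""
    # PureWindowsPath accepts both "\\" and "/" as separators.
    name = PureWindowsPath(path).name
    match = _INSTALLER_RE.match(name)
    return match.group("cohort") if match else None


if __name__ == "__main__":
    print(cohort_id_from_filename(r"C:\Users\me\Downloads\Firefox-42.0-1234.exe"))
    print(cohort_id_from_filename("Firefox-42.0.exe"))
```

One limitation of this approach: if the browser renames a repeated download (for example to `Firefox-42.0-1234 (1).exe`), this pattern no longer matches and the cohort ID is lost, so the fallback has to be "no cohort" rather than an error.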

As for specifically how to generate the cohort ID value, we could use the same source/medium/campaign attributes that we use when looking at web metrics. For example:

source = google.com
medium = organic
campaign = none

or

source = facebook.com
medium = display
campaign = fx-spring-16

We could automatically take the source+medium+campaign and hash the values and come up with a short string that would uniquely describe how a visitor landed on our website. If the URL has the Google Analytics utm_(source|medium|campaign) parameters, we can manually set these three values instead of automatically trying to figure it out from the referring URL.
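
A minimal sketch of that hashing idea, under stated assumptions: SHA-256 truncated to 8 hex characters is an arbitrary choice, the fallback labels (`(direct)`, `referral`, `(not set)`) are placeholders, and inferring the medium from the referrer is simplified to a single case. None of the names below come from an existing implementation:

```python
import hashlib
from urllib.parse import urlparse, parse_qs


def attribution_from_request(landing_url, referrer):
    """Derive (source, medium, campaign); utm_* parameters win over the referrer."""
    params = parse_qs(urlparse(landing_url).query)

    def utm(name):
        values = params.get("utm_" + name)
        return values[0] if values else None

    ref_host = urlparse(referrer).hostname if referrer else None
    source = utm("source") or ref_host or "(direct)"
    # Simplification: any referrer counts as "referral" unless utm_medium is set.
    medium = utm("medium") or ("referral" if ref_host else "(none)")
    campaign = utm("campaign") or "(not set)"
    return source, medium, campaign


def cohort_id(source, medium, campaign, length=8):
    """Hash the normalized triple into a short hex string."""
    # A separator that cannot appear in the values avoids ambiguous joins.
    key = "\x1f".join(p.strip().lower() for p in (source, medium, campaign))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]


if __name__ == "__main__":
    triple = attribution_from_request(
        "https://www.mozilla.org/firefox/new/"
        "?utm_source=facebook.com&utm_medium=display&utm_campaign=fx-spring-16",
        "https://www.facebook.com/",
    )
    print(triple, cohort_id(*triple))
```

Two caveats with this design: a truncated hash can collide, so "uniquely" only holds approximately, and because the hash is one-way, a lookup table from cohort ID back to source/medium/campaign would have to be recorded at download time.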

As for privacy, we are now wrapping Google Analytics in a DNT condition via bug 1217896. We could also do the same thing with attribution and make sure we don't do attribution if we have a valid DNT signal.
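
As an illustration of gating attribution on DNT, here is a small server-side sketch. It assumes the signal arrives as the `DNT: 1` HTTP request header; on the website itself the equivalent check would more likely be done in JavaScript. The function names are hypothetical:

```python
def dnt_enabled(headers):
    """Return True when the request carries a DNT header with the value "1"."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "dnt":
            return value.strip() == "1"
    return False


def cohort_for_download(headers, cohort):
    """Drop the cohort ID entirely when the visitor has asked not to be tracked."""
    return None if dnt_enabled(headers) else cohort


if __name__ == "__main__":
    print(cohort_for_download({"DNT": "1"}, "1234"))
    print(cohort_for_download({"User-Agent": "example"}, "1234"))
```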

The most challenging part of this bug will be to do dynamic installer file names on our CDN hosted installers. 

Article here related to it: https://www.quora.com/Content-Delivery-Networks/What-CDN-supports-dynamic-file-naming

There is also a method on how to do it via HTML, but it is not completely cross-browser: http://www.w3schools.com/tags/att_a_download.asp (Thanks adavis for this find)
It's not realistic to create a complete separate build for each situation, but it might be realistic to dynamically generate a stub installer with some embedded data at the end. Altering the user-downloaded filename is unlikely to be a good idea because we know that affects what the user sees and user-friendly filenames improve install rates significantly.

Anyway, we can't prioritize this for 2015, but will revisit for 2016Q1.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> It's not realistic to create a complete separate build for each situation,
> but it might be realistic to dynamically generate a stub installer with some
> embedded data at the end.

Yes, that is what I was thinking. Dynamic, as generating a build for each variation is not realistic.

> Altering the user-downloaded filename is unlikely
> to be a good idea because we know that affects what the user sees and
> user-friendly filenames improve install rates significantly.

Yeah, we could test whether this is the case by doing a funnelcake in which one variation has a number at the end of the installer file name, and seeing if it impacts install rates. It's hard to say if normal users would even notice, but it surely should be tested from both the qualitative and quantitative side.
 
> Anyway, we can't prioritize this for 2015, but will revisit for 2016Q1.

Agree. Q4 2015 will be over very soon.
Another reason why this bug would be important is that it would help "unpack" our MAU numbers. When we see changes in our MAU, we could go backwards and understand if there were any changes upstream in our funnel that contributed to them. Currently, it feels like we are flying blind when MAU goes up and down, without knowing which main segments are driving the trend. Attribution won't give us 100% of the understanding of MAU, but it will be one important aspect given so much focus on acquiring new users.
Heather: let's use this as an example of the new BI process.
Flags: needinfo?(hcrince)
Flags: needinfo?(hcrince)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> It's not realistic to create a complete separate build for each situation,
> but it might be realistic to dynamically generate a stub installer with some
> embedded data at the end.

We've talked about wanting to do this kind of thing for other reasons as well... I remember talking about it not long ago in the context of automatically setting the default browser (inspired by how Chrome asks the user _before_ downloading). Also came up recently as a possible way to show a data-choices privacy opt-out before first-run. So being able to stash some extra data into a dynamic installer could be useful in a number of ways.

http://blogs.msdn.com/b/ieinternals/archive/2014/09/04/personalizing-installers-using-unauthenticated-data-inside-authenticode-signed-binaries.aspx describes a few details of a technique for modifying a signed binary without breaking the signature. The post mentions that this has been used by Opera, Dropbox, and others.
This is long overdue. It's how Chrome does it, by including a temporary tag along with the stub installer, which seemingly runs almost entirely via Javascript in the browser. See https://www.google.com/chrome/assets/common/js/chrome-installer.min.js
Being able to do automatic and ad hoc attribution from the top of the funnel (download) to the bottom of the funnel (product usage) will help us better understand two areas:

1) Product retention and usage by acquisition channel (e.g. organic, paid search, advertising, referral, direct, etc.)

2) To more easily create ad hoc cohorts to test variations of the funnel. (e.g. trying different variations of the onboarding web flows/in-product comms to see when it is best to talk about the features of Firefox and/or Mozilla's mission)

Understanding these areas would help:

a) Create a more delightful onboarding for our users. We are defining onboarding as when there is a gap between the capabilities of the product and what a user takes advantage of.

b) Understand the quality (retention and usage) of the new users by channel to inform where Mozilla should be making future acquisition investments.

If we could do this, we wouldn't need funnelcake builds of Firefox for all cohort tests. We could reserve funnelcakes for when we actually need to test something new within the product.
(In reply to John Jensen from comment #6)
> This is long overdue. It's how Chrome does it, by including a temporary tag
> along with the stub installer, which seemingly runs almost entirely via
> Javascript in the browser. See
> https://www.google.com/chrome/assets/common/js/chrome-installer.min.js

Still trying to figure out how Chrome is doing it, and how their .exe or .dmg is ingesting the cohort ID/temp tag into a binary on the OS. I downloaded Chrome with both Firefox and Chrome and did an md5 on the installer and it was exactly the same file. So, the temp ID is not within the installer, and it must somehow be consumed later.

Unless.... they set a cookie with the cohort ID in it and then when Chrome imports the settings, they re-read that cookie and set the cohort ID within Chrome. We could do that within Firefox, but then the only cohorts that we would see would be people who import the settings from their old browser and the cookies are still readable.
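
For anyone who wants to repeat the installer comparison described in this comment, a minimal sketch follows. It only checks whether two downloaded files are byte-identical by MD5; the function names and file paths are placeholders:

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_installer(path_a, path_b):
    """True when both files have the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

Identical digests mean any per-download tag is not stored in the file contents, although this check says nothing about the file name or other metadata.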
Just a quick note that this attribution idea will only work with Windows users, given that Mac and Linux don't have real installers, as we just have compressed disk images that contain the product. That should be fine given the large Windows population, but I just wanted to be sure we mention it. We could reserve funnelcakes for Mac/Linux funnel/retention testing too.
In an upcoming sprint, the desktop retention durable team will be writing user stories for how we would use attribution and we will include the people in this thread and Heather Crince as reviewers of the stories.
User stories doc linked in the user story section up top.
User Story: (updated)
(In reply to Chris More [:cmore] from comment #11)
> User stories doc linked in the user story section up top.

Thanks Chris for posting user stories, will review and refine the data collection reqs and gaps.  Once completed, I'll set up review.
Gregg: check out the user stories at the top of the convo and the technical cohort ID idea in comment 0.
Flags: needinfo?(glind)
I talked to :ckprice about when the Firefox Measurement team could deliver this technology, and we'll have an engineering estimate around March 14th. Right now, the closest estimate for delivery is the end of Q2 2016.
Flags: needinfo?(glind)
Heather/Benjamin/Cory:

I wrote some pseudocode, with Gareth's help, on how the cohortID would need to be calculated:

https://docs.google.com/a/mozilla.com/document/d/1DzIg19kAdtYEzS_waQNCQBfi8CSGj4cl9N8T24WSiyc/edit?usp=sharing

It is important to note that these are *cohortIDs* and not *clientIDs*. We currently don't have a use for a clientID and don't need to uniquely identify any one client, as it is more useful to just bucket people anonymously into cohorts based on their top-of-funnel attributes. Also, for privacy reasons, an anonymous cohortID seems to provide enough granularity.
Flags: needinfo?(hcrince)
Flags: needinfo?(cprice)
Flags: needinfo?(benjamin)
We're discussing the attribution project next week. Clearing NI's.
Flags: needinfo?(hcrince)
Flags: needinfo?(cprice)
Flags: needinfo?(benjamin)
Renaming this as the initial breakdown for this work, and closing bug. Meta tracker is bug 1259607.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Summary: [tracking] Allow for attribution of acquired Firefox Desktop users → [breakdown] Allow for attribution of acquired Firefox Desktop users