accept non-utm params in attributions
Categories
(Firefox :: General, enhancement, P1)
Tracking
Release | Tracking | Status
---|---|---
firefox70 | --- | fixed
People
(Reporter: hoosteeno, Assigned: mixedpuppy)
Attachments
(3 files)
- 3.22 KB, text/plain (chutten: data-review+)
- 47 bytes, text/x-phabricator-request
- 46.73 KB, image/png
Reporter
Comment 1•6 years ago
:mixedpuppy, are you still up for helping to make this happen?
A lot of teams hope to run experiments that affect install rates. It's impossible to measure install rates without stub attribution, but stub attribution depends on overloading utm parameters, which are there to attribute some marketing further up the funnel (such as an ad). In other words, we can't run website experiments on install rates without clobbering attribution for someone else.
If we had the option to add an additional parameter to stub attribution, this problem would go away.
Comment 2•6 years ago
+1 We really need an experiment identifier and a variation identifier passed through stub attribution, for us to optimize the funnel in a meaningful way. Let me know how I can help.
Assignee
Comment 3•6 years ago
I'm actually just looking at the code for other reasons, but really anyone can add the additional fields[1] easily enough. Also add to the tests (also under browser/components/attribution/test).
Reporter
Comment 4•6 years ago
Attaching a data steward review form.
This is a small extension to a feature that has had extensive review: https://bugzilla.mozilla.org/show_bug.cgi?id=1259607
Comment 5•6 years ago
Reporter
Comment 6•6 years ago
- By what means will this be reported?
This will be reported using the stub installer, using Firefox Telemetry.
- The addition of identifiers requires a little extra scrutiny. Is there a
minimum cohort size? Which datasets will have these identifiers/Will this
introduce any new linkages between data sources?
To be clear, there is nothing preventing anyone on the internet from putting any valid string[1] into a link to our download page, and thereby injecting a valid string into our attribution data in telemetry. We do not seek to change the validation criteria currently applied to non-source parameters; we'd want the same validation applied to these strings, too.
We have not designed a minimum cohort size, but generally achieving any kind of statistical confidence in an experiment requires thousands or tens of thousands of participants per variation.
An example of both of the above is found in a currently running experiment, "How To Install"[2], where we are adding utm parameters (only to sessions that lack them) to track experimental branches. Each branch must get more than 16,000 visits to achieve the confidence level we seek.
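For readers wondering where a per-branch figure of that size comes from, here is a rough sketch of the usual two-proportion sample-size approximation. The install rates, significance level, and power used in the usage note are hypothetical; the actual inputs behind the 16,000 figure are not stated in this bug.

```python
from math import ceil
from statistics import NormalDist


def visits_per_branch(p_control: float, p_variant: float,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate visits needed per branch for a two-sided two-proportion z-test."""
    if p_control == p_variant:
        raise ValueError("the two rates must differ")
    # Critical values of the standard normal for the chosen alpha and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Sum of the per-branch Bernoulli variances.
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_control - p_variant
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)
```

Calling `visits_per_branch(0.10, 0.11)` (a hypothetical 10% baseline install rate and a one-point lift) gives a number in the low tens of thousands, and halving the lift roughly quadruples it, which is why small cohorts are of little use here.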
The data go into 'environment.settings', I believe. They expire after 13 weeks, I believe. More info in https://bugzilla.mozilla.org/show_bug.cgi?id=1292360#c5.
These parameters can be cross-referenced in Google Analytics, just like the current attribution data. That is by design; the data are intended to pass through a limited amount of information about the pre-download journey (on www.mozilla.org) to the post-download instance of Firefox.
- The first question I need to answer in the Data Review Response is where
this will be documented. (this also ties into #1) Can you point me at
documentation that users could use to find out about this collection?
Attribution is documented in environment docs: https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/data/environment.html?highlight=attribution
[1] https://github.com/mozilla-services/stubattribution/blob/master/attributioncode/validator.go#L73
[2] https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=90734715
Comment 7•6 years ago
Great. Thank you for your answers.
(( Near as I can tell from docs and code, environment.settings.attribution
does not expire. Which is fine, it's serving a useful purpose across the profile lifetime. ))
So this will be adding fields to the Telemetry Environment to collect the data. This will require some schema changes in addition to updating the Environment documentation for the new fields. We also recommend automated tests for permanent data collections to ensure nothing breaks it while we're not looking.
I'll proceed with the review now.
Comment 8•6 years ago
Reporter
Comment 9•6 years ago
:RT, is this something you can help prioritize?
Comment 10•6 years ago
Moving NI to Jim since we agreed he will own prioritization for growth tactics
Comment 11•6 years ago
Anything we can do to help with this or additional information we can provide on why this is important? Would love to understand when/if we might see this land. Thanks!
Assignee
Comment 12•6 years ago
Landing the change in Firefox is the easy part; we just need agreement (probably from those on this bug) and a spec of which params should be added. When that is ready, ping me if you need someone to do it.
There have been plenty of issues around this over time, so making sure the end-to-end flow is solid takes a bit of hand-holding. Note this will only work on Windows stub installers.
Comment 13•6 years ago
As RT mentioned, I'm looking at this and several other related items as part of a larger attribution / top-of-funnel growth strategy. This seems like it may be an easy win, but the biggest question in my mind is how much we need to future-proof this as an important piece of our growth/experimentation infrastructure. I'm currently working to clarify that narrative so everyone can be on the same page. That would include a spec of these params and any necessary signoffs (although from what I understand most of that has already been settled).
Shane, when you mention making sure it's "end-to-end" solid, is there someone I need to connect with to ensure that's the case? Data Engineering maybe?
Reporter
Comment 14•6 years ago
I'm linking to this GitHub issue because it offers context on a remarkably similar effort underway for FxA. The website needs to handle/preserve upstream referrers as it communicates with downstream channels about both upstream referrers and website conversion points. FxA will support this by adding 2 parameters to its inputs, so we don't have to overload/override utm parameters.
Assignee
Comment 15•6 years ago
(In reply to Jim T from comment #13)
Shane, when you mention making sure it's "end-to-end" solid, is there someone I need to connect with to ensure that's the case? Data Engineering maybe?
Make sure you know the entire flow of the attributions, each system and who's responsible in that area, so that there is some assurance that everyone knows what's going on.
Our experience with RTAMO has had a number of hiccups, the most recent being a server change that essentially prevented utm params from getting into the stub installer. We scrambled to figure out why and, once we found out, realized that the team responsible had no knowledge of our project and no tests for the attributions getting into the stub installer. The server change in this case was for another project altering some use of the utm params.
Comment 16•6 years ago
(In reply to Shane Caraveo (:mixedpuppy) from comment #15)
Make sure you know the entire flow of the attributions, each system and who's responsible in that area, so that there is some assurance that everyone knows what's going on.
Thanks Shane! I'm coordinating these types of funnel telemetry efforts in an internal doc:
https://docs.google.com/document/d/1xScPda69dRBTSGbQhiVcPwLPDbQo2ZhvatGkSfoO_f0/edit#
Feedback is appreciated!
Comment 17•6 years ago
Hi Shane,
Wanted to update you on this. We're reaching agreement on the spec in the document I mentioned above and I've confirmed this will actually help address the issues seen in the RTAMO project. I have a meeting with Data Science tomorrow to review, but that's mostly to understand the implications for future analysis.
What would you need to see in terms of a go/no-go decision on this work?
Jim
Comment 18•6 years ago
Hey all, this looks good as far as I can tell from a Data Science perspective. One question I'd have is regarding tracking installs through the experiment parameters.
Am I correct that we can only track installs made through stub installers? Would the stub installer data include these new experiment parameters? Do we expect that most download experiments that we would run would be for download of stub installers only?
I realize that having the parameters in the telemetry attribution data is probably adequate, but having it in the installation data as well would be helpful.
Assignee
Comment 19•6 years ago
Hi Jim,
All I need are the final params that need to be accepted into firefox, and that there is a go on this project overall. This looks like funnel_experiment and funnel_variation. I vaguely recall a length limit on the url parameters somewhere, can we shorten the parameter names? Maybe f_exp and f_var?
Shane
Comment 20•6 years ago
(In reply to Jesse McCrosky from comment #18)
Hey all, this looks good as far as I can tell from a Data Science perspective. One question I'd have is regarding tracking installs through the experiment parameters.
Am I correct that we can only track installs made through stub installers? Would the stub installer data include these new experiment parameters? Do we expect that most download experiments that we would run would be for download of stub installers only?
Matt, can you help answer Jesse's question about the stub installer data?
(In reply to Shane Caraveo (:mixedpuppy) from comment #19)
Hi Jim,
All I need are the final params that need to be accepted into firefox, and that there is a go on this project overall. This looks like funnel_experiment and funnel_variation. I vaguely recall a length limit on the url parameters somewhere, can we shorten the parameter names? Maybe f_exp and f_var?
Shane
We're a go on the overall project. In fact, it turns out it may end up being helpful for measuring some upcoming initiatives.
In terms of naming, I'd prefer something a little more descriptive to avoid confusion. Any idea what the length limit is? What about funnel_exp and funnel_var?
Comment 21•6 years ago
(In reply to Jim T from comment #20)
Matt, can you help answer Jesse's question about the stub installer data?
(In reply to Jesse McCrosky from comment #18)
Am I correct that we can only track installs made through stub installers?
Right, we don't currently have attribution for full installers. We do have it for Mac now though, so it's not just the Windows stub installer.
Would the stub installer data include these new experiment parameters?
Yes. They would be right next to the current utm parameters everywhere those appear, including in the install ping.
(In reply to Jim T from comment #20)
In terms of naming, I'd prefer something a little more descriptive to avoid confusion. Any idea what the length limit is? What about funnel_exp and funnel_var?
There aren't any specific limits on the names. We have a total of roughly 1000 bytes that the whole string with all the parameter names and values has to fit into, but we aren't currently anywhere near that limit and it could be extended anyway if needed, so using a few bytes to make these names more descriptive is a good idea.
On the telemetry side, we do also have a limit of 200 characters per value, but again no limit on names.
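As a rough illustration of the two limits just described, the sketch below checks a set of parameters against them. The constants are the approximate figures quoted in this comment, the function name is hypothetical, and the assumption that the string is a plain URL-encoded query string may not match the real encoding.

```python
from urllib.parse import urlencode

# Both limits are the approximate figures quoted in this thread, not values
# read from the Firefox or stub attribution source.
MAX_TOTAL_BYTES = 1000  # whole string, parameter names and values together
MAX_VALUE_CHARS = 200   # Telemetry limit on a single value


def check_attribution_limits(params: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    for name, value in params.items():
        if len(value) > MAX_VALUE_CHARS:
            problems.append(
                f"{name}: value has {len(value)} characters (limit {MAX_VALUE_CHARS})"
            )
    # Assumption: the string is a plain URL-encoded query string.
    total = len(urlencode(params).encode("utf-8"))
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size is {total} bytes (limit {MAX_TOTAL_BYTES})")
    return problems
```

With short values such as an experiment id and a variation id, the result is an empty list, which matches the point that longer, more descriptive names cost only a few bytes.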
Assignee
Comment 22•6 years ago
(In reply to Matt Howell (he/him) [:mhowell] from comment #21)
(In reply to Jim T from comment #20)
Matt, can you help answer Jesse's question about the stub installer data?
(In reply to Jesse McCrosky from comment #18)
Am I correct that we can only track installs made through stub installers?
Right, we don't currently have attribution for full installers. We do have it for Mac now though, so it's not just the Windows stub installer.
Actually we do not have it for mac. While the code is there, and the data is in the mac quarantine database, there is a quirk that doesn't allow us to get the data under normal install circumstances.
Specifically the issue is Bug 1525076 comment 2
Comment 23•6 years ago
Thanks for that info Matt!
Shane, given that there's no limit on names, let's stick with the original parameters:
funnel_experiment: name/id of the enrolled experiment
funnel_variation: name/id of the variation cohort used in the enrolled experiment
Reporter
Comment 24•6 years ago
:merwin, would you mind considering the scope of this bug and sharing any feedback or concerns you have with the proposal (comment 0) and the general specs that have emerged here (comment 16)?
Thanks!
Comment 25•6 years ago
Hi Justin,
Clearing NI on Marshall's behalf. Based on the experimental nature of this work and the cohort sizes needed to get the required metrics, Trust is OK with the addition of the experiment and variation identifiers. No additional risk to users identified.
Please let us know if you need anything else and thanks,
Alicia
Comment 26•6 years ago
Hi Shane, do you need anything else to work on this? From my end we're good to go.
(In reply to Jim T from comment #23)
Thanks for that info Matt!
Shane, given that there's no limit on names, let's stick with the original parameters:
funnel_experiment: name/id of the enrolled experiment
funnel_variation: name/id of the variation cohort used in the enrolled experiment
(In reply to Shane Caraveo (:mixedpuppy) from comment #19)
Hi Jim,
All I need are the final params that need to be accepted into firefox, and that there is a go on this project overall. This looks like funnel_experiment and funnel_variation. I vaguely recall a length limit on the url parameters somewhere, can we shorten the parameter names? Maybe f_exp and f_var?
Shane
Assignee
Comment 27•6 years ago
You'll need to modify the attribution code[1] to allow other attrs. Then add tests for it[2]. If someone isn't available to do this I'll try to get to it, just let me know. And what is the target/deadline?
Comment 28•6 years ago
Hey Shane,
I'm hoping to get this into Fx70 at this point - in fact we just had another discussion today where it came up as a blocker for some desired testing. I don't have anyone else identified to do this work, so if you can help out that would be great! If not, let me know and I'll have to try to see who else I can find.
Thanks again for your help!
Assignee
Comment 29•6 years ago
I'll try to get to it. IIUC you need funnel_experiment and funnel_variation to be allowed through attributions. Let me know if I missed anything.
Assignee
Comment 30•6 years ago
I'd also like to know for sure that funnel_experiment and funnel_variation are actually what will end up in the stub installer. Currently, "utm_" is stripped and the stub installer gets "source" rather than "utm_source".
Comment 31•6 years ago
:jcrawford are you aware of any other parameter manipulation? Do we need to coordinate with Romain or Matt on that? I'll cc them just in case.
Assignee
Comment 32•6 years ago
Assignee
Comment 33•6 years ago
Assignee
Comment 34•6 years ago
Assuming that the attributions will have the full name (e.g. funnel_experiment) the patch should be sufficient. If "funnel_" is stripped, it will just be a minor change to the patch.
Assignee
Comment 35•6 years ago
I need an answer on the param handling in order to land this patch.
Reporter
Comment 36•6 years ago
This bug involves 4 major components:
- Bedrock sends parameters to
- the stub attribution service, which puts data into the installer for
- Firefox, which pings data to
- Telemetry
This bug relates to #3, Firefox. The blocking bugs just filed cover the other three items on the list. All of the pieces need to be in place before this work is done.
I'd also like to know for sure that funnel_experiment and funnel_variation are actually what will end up in the stub installer. Currently, "utm_" is stripped and the stub installer gets "source" rather than "utm_source".
I believe the Bedrock work and the StubAttribution work can be specified to send the entire parameter name.
Assignee
Comment 37•6 years ago
Given the comment at https://github.com/mozilla/bedrock/issues/7474#issuecomment-513291225 I suggest we shorten these to experiment and variation.
The URL parameters may be funnel_experiment, but "funnel_" would be stripped in the same manner that "utm_" is stripped for the utm parameters.
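To make the proposed naming concrete, here is a minimal sketch of that prefix stripping. It is not the Bedrock or stub attribution service code; the function name, the prefix list, and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical prefix list: "utm_" is stripped today according to this thread,
# and "funnel_" is the proposed addition.
STRIPPED_PREFIXES = ("utm_", "funnel_")


def attribution_keys_from_url(url: str) -> dict[str, str]:
    """Map prefixed URL parameters to the short keys the installer would see."""
    result = {}
    for name, value in parse_qsl(urlsplit(url).query):
        for prefix in STRIPPED_PREFIXES:
            if name.startswith(prefix):
                # Keep only the part after the prefix, e.g. utm_source -> source.
                result[name[len(prefix):]] = value
                break
    return result
```

Under these assumptions, a download link carrying utm_source, funnel_experiment and funnel_variation would yield the keys source, experiment and variation, and parameters without a known prefix would be ignored.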
Comment 38•6 years ago
Comment 39•6 years ago
Backed out changeset cee7b065455b (bug 1515172) for xpcshell failures at browser/components/attribution/test/xpcshell/test_AttributionCode.js
Backout: https://hg.mozilla.org/integration/autoland/rev/29458adca9045857ded140b356d5701d8f2bc559
Failure push: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=cee7b065455bb93c4c395fb9468c5e20d5197b38
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=257430043&repo=autoland&lineNumber=5104
18:24:11 INFO - TEST-START | browser/components/attribution/test/xpcshell/test_AttributionCode.js
18:24:12 WARNING - TEST-UNEXPECTED-FAIL | browser/components/attribution/test/xpcshell/test_AttributionCode.js | xpcshell return code: 0
18:24:12 INFO - TEST-INFO took 383ms
18:24:12 INFO - >>>>>>>
18:24:12 INFO - PID 11244 | Unable to load \untrusted-startup-test-dll.dll; LoadLibraryW failed: 126[11244, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 82
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 82
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 82
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 82
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 82
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file z:/build/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2634
18:24:12 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
18:24:12 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
18:24:12 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
18:24:12 INFO - running event loop
18:24:12 INFO - browser/components/attribution/test/xpcshell/test_AttributionCode.js | Starting testValidAttrCodes
18:24:12 INFO - (xpcshell/head.js) | test testValidAttrCodes pending (2)
18:24:12 INFO - (xpcshell/head.js) | test run_next_test 0 finished (2)
18:24:12 INFO - PID 11244 | [11244, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, kKnownEsrVersion) failed with result 0x80004002: file z:/build/build/src/toolkit/components/resistfingerprinting/nsRFPService.cpp, line 662
Comment 40•6 years ago
Assignee
Comment 41•6 years ago
Assignee
Comment 42•6 years ago
Comment 43•6 years ago
Comment 44•6 years ago
bugherder
Comment 45•6 years ago
(In reply to Justin Crawford [:hoosteeno] [:jcrawford] from comment #36)
- Bedrock sends parameters to
- the stub attribution service, which puts data into the installer for
- Firefox, which pings data to
- Telemetry
This bug relates to #3, Firefox. The blocking bugs just filed cover the other three items on the list. All of the pieces need to be in place before this work is done.
Looks like this bug also covered #4. Here's a screenshot of about:telemetry after faking attribution (on mac with a referrer url with various url params).
(Also tested setting other params on the url, and they did not appear in the telemetry environment data.)
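The behaviour observed above, where unknown parameters never reach the telemetry environment, can be sketched as an allow-list filter. This is an illustration only and not the Firefox implementation; the key list is an assumption based on the names settled on in this bug, and the function name is hypothetical.

```python
from urllib.parse import unquote

# Assumed allow-list: the four existing utm-derived keys plus the two new ones.
ALLOWED_KEYS = ("source", "medium", "campaign", "content", "experiment", "variation")


def parse_attribution_code(code: str) -> dict[str, str]:
    """Decode a URL-encoded attribution string and keep only allowed keys."""
    parsed = {}
    for pair in unquote(code).split("&"):
        key, separator, value = pair.partition("=")
        # Skip malformed pairs and any key that is not on the allow-list.
        if separator and key in ALLOWED_KEYS:
            parsed[key] = value
    return parsed
```

In this sketch an extra key such as foo is dropped silently; whether the real code drops the key or rejects the whole string is not something this bug states.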
Comment 46•6 years ago
How soon is this attribution data needed? This is currently supported on Nightly 70 (release October 22nd). Does it need to go to Beta 69 (release September 3rd), or even Release 68? The server pieces haven't landed yet, but they don't ride the trains.
Reporter
Comment 47•6 years ago
:mardak, thanks for asking. We can wait for 70.