Closed Bug 1453613 Opened 3 years ago Closed 3 years ago

Implement full installer telemetry

Categories

(Firefox :: Installer, enhancement, P1)

Tracking

()

RESOLVED FIXED
Firefox 65
Tracking Status
firefox65 --- fixed

People

(Reporter: RT, Assigned: mhowell)

References

(Blocks 1 open bug)

Details

User Story

As a product manager I want to understand the share of dark funnel installs attributed to the Mozilla full installer so I can understand the real size of the dark funnel.
As a product manager I want to understand the full installer install volume, install success rate and install error distribution, so I can make sure the install success rate is optimized.

Acceptance criteria:
- The full and stub installers implement a new field clarifying the installer type. 0: neither Mozilla stub nor Mozilla full installer, 1: Mozilla stub installer, 2: Mozilla full installer
- If a field is not applicable to a specific installer it is set to ‘null’
- Proposal for all fields per installer type: https://docs.google.com/spreadsheets/d/1z9IOWZrjI6fZ_vfMqErh-dzJBAz6YNz3D-oDI4vfsPE/edit#gid=0
- Some areas of the stub which are now deprecated (default_path, set_default, new_default, intro_time, disk_space_error, no_write_access) would still apply to the full installer
- Telemetry pipeline and DSMO-RS updates to expose the data on Redash
- No opt-out available in the wizard per stub implementation
- Ability to identify MSI installs (the MSI installer wraps the full installers)
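
A minimal sketch of how the installer type field and the 'null' convention from the criteria above might look in a ping payload. The field name `installer_type` and the helper `build_ping` are assumptions for illustration only; the actual field list is in the linked spreadsheet.

```python
import json
from enum import IntEnum
from typing import Any, Dict, Optional


class InstallerType(IntEnum):
    """Values as listed in the acceptance criteria."""
    OTHER = 0  # neither Mozilla stub nor Mozilla full installer
    STUB = 1   # Mozilla stub installer
    FULL = 2   # Mozilla full installer


def build_ping(installer_type: InstallerType,
               fields: Dict[str, Optional[Any]]) -> str:
    """Serialize a ping; fields passed as None are emitted as JSON null."""
    payload: Dict[str, Optional[Any]] = {"installer_type": int(installer_type)}
    payload.update(fields)
    return json.dumps(payload, sort_keys=True)


# intro_time is one of the fields deprecated in the stub, so a stub ping
# would report it as null under the proposed convention.
stub_ping = build_ping(InstallerType.STUB, {"intro_time": None})
```

Using an integer enum keeps the wire value compact while leaving the meaning of 0, 1 and 2 readable in code.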
 

Out of scope:
- full installer silent mode
- No need to implement telemetry on the full installer for the installation type (standard vs. custom) or whether the user left the shortcuts selected

Constraints and assumptions: 
- Do not increase download/installation time
- Do not increase failure rate
- Full installers can be used offline, we assume telemetry from these installations won’t be collected

Attachments

(2 files)

No description provided.
User Story: (updated)
Blocks: 1429889
Blocks: 1316136
I've started working on this based on the list of fields linked from the user story.

We should discuss how the opt-out for this is going to work. There definitely needs to be an option for silent installs (so command line, INI, MSI, all those). But I'm not sure where to put the option in the GUI for interactive installs. All I can easily add is checkboxes, so I think it has to be one of those; I just don't know where best to put it. The three basic options are the intro page, the summary page (the last page before the actual installation runs), or the finish page. One advantage of having it on the intro page would be that we can send a ping when the user cancels the install, because they've already had the chance to opt out. Otherwise we can't really send anything in that case. But other than that I'm not sure what the meaningful differences are. Romain, what do you think?
Assignee: nobody → mhowell
Status: NEW → ASSIGNED
Flags: needinfo?(rtestard)
Priority: -- → P1
Apologies for the delay there Matt.
Having install cancellation data sounds like it could be valuable and I see no issues with not having a checkbox on the intro page.
Michael, do you have any strong feelings on this?
Flags: needinfo?(rtestard) → needinfo?(mverdi)
Hi, I just talked with verdi to understand this better.  

@mhowell: is the proposal to collect the same telemetry data for the Full Installer as we currently do now for the stub installer? 

If the data is all category 1 and 2, then we already explain this in the Firefox Privacy Notice (which applies to both the stub and full installer).  No further notice or opt-out is required for additional category 1 and 2 pings in the full installer (especially where we already collect this information in the stub installer).
(In reply to Romain Testard [:RT] from comment #2)
> Michael, do you have any strong feelings on this?
I don't think we should add an opt-out if we're only collecting type 1 & 2 data.
Flags: needinfo?(mverdi)
Yes, this would be the same data the stub currently collects; it is mostly category 1 with a handful of category 2 probes.

In that case I'll leave out the opt-out from the GUI. I'd really prefer to keep the option for silent installs though, both because I want to keep it available for enterprise use but also because it's the easiest way to prevent the full installer from sending a ping when it's being executed by the stub (which will still be sending its own ping).
(In reply to Matt Howell [:mhowell] from comment #5)
> Yes, this would be the same data the stub currently collects; it is mostly
> category 1 with a handful of category 2 probes.
> 
> In that case I'll leave out the opt-out from the GUI. I'd really prefer to
> keep the option for silent installs though, both because I want to keep it
> available for enterprise use 

I don't think we should do something different because the UI is different. I don't see why we should add an option here for enterprise. I would think the expectation is that we already collect this information in order to make a quality product.
Well. Here's the thing from the purely engineering side. The full installer is going to need some way to tell that it was launched by the stub so that we don't send two pings (the full install ping won't have any interesting data that's not in the stub ping). The only other way I can think of for it to do that is unnecessarily complicated and brittle; it involves walking the process tree to see whether an instance of the stub is one of its ancestors. So that leaves me with having to add a command-line flag, which will necessarily look exactly like the ones that we document and expose to users. It would be called something like "LaunchedFromStub", not something that makes it obvious that it's going to disable the full install ping. So, all I can really do is just not document this flag. That would make it kind of like a pref that can only be changed from about:config; I can't actually stop anyone from using it, but they'd have to go looking to be able to know that it exists. Does that make sense?
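
For illustration, the command-line approach described above could be sketched roughly as follows. This is not the installer's actual code (the installer is an NSIS script); the flag spelling `/LaunchedFromStub`, the case-insensitive match, and both function names are assumptions made for this sketch.

```python
from typing import Sequence

# Assumed spelling: the comment only says the flag would be called
# "something like" LaunchedFromStub.
STUB_FLAG = "/LaunchedFromStub"


def launched_from_stub(argv: Sequence[str]) -> bool:
    """Return True when the hypothetical stub flag is on the command line."""
    # argv[0] is the program name, so only the remaining arguments are checked.
    return any(arg.lower() == STUB_FLAG.lower() for arg in argv[1:])


def should_send_full_ping(argv: Sequence[str]) -> bool:
    """Skip the full installer ping when the stub will send its own."""
    return not launched_from_stub(argv)
```

Checking a flag is a single pass over the arguments, whereas the process-tree alternative would have to enumerate processes and match parent IDs, which is where the brittleness comes from.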
Romain, would you be able to fill out the data review request form [https://github.com/mozilla/data-review/blob/master/request.md] for this bug? I thought I could do it myself, but now that I look at the actual questions they're a lot more product-oriented than technical. I can fill in any required details.
Flags: needinfo?(rtestard)
(In reply to Matt Howell [:mhowell] from comment #7)
> Does that make sense?

So what you're saying is that you'd make the opt-out a hidden, undocumented (except for here), pref? I guess that would be the better of two less than ideal options. Could there be other solutions like filtering the pings out of the data afterwards?
(In reply to Verdi [:verdi] from comment #9)
> So what you're saying is that you'd make the opt-out a hidden, undocumented
> (except for here), pref?

The pref thing was just an analogy. I was having a hard time thinking of how to explain how this would work and I didn't get all the way there; sorry about that. It would be a hidden and undocumented command-line parameter. Also, it would not be specifically an opt-out for this ping, we could end up using it for other things. It would only be clear that it had that purpose from actually reading the code.

> Could there be other solutions like filtering the pings out
> of the data afterwards?

There isn't really enough info in these pings to be able to deduplicate them that way; we couldn't be sure just from looking at them whether a stub and a full ping received at about the same time came from the same machine or not. The other options I can think of all end up being equivalent to what I've proposed.
User Story: (updated)
Flags: needinfo?(rtestard)
(In reply to Matt Howell [:mhowell] from comment #8)
> Romain, would you be able to fill out the data review request form
> [https://github.com/mozilla/data-review/blob/master/request.md] for this
> bug? I thought I could do it myself, but now that I look at the actual
> questions they're a lot more product-oriented than technical. I can fill in
> any required details.

Sure, here it is: 

What questions will you answer with this data?
- How many full installer installs happen daily?
- What is the share of dark funnel installs attributed to the Mozilla full installer?
- Are there technical issues preventing successful installations in certain environments?

Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?
- The dark funnel team is tasked with identifying dark funnel sources. We believe the full installer drives a significant share of dark funnel new profiles.
- We are blind regarding full installer install success rate. We may have critical issues on the full installer we’re not aware of. This will help quantify install errors by type to then prioritize fixing.

What alternative methods did you consider to answer these questions? Why were they not sufficient?
- We looked at downloads from mozilla.org, but the full installer is often used in corporate environments where 1 download equals thousands of installs, so download counts do not reflect install volume

Can current instrumentation answer these questions?
- No

List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the Mozilla wiki.
- List: https://docs.google.com/spreadsheets/d/1z9IOWZrjI6fZ_vfMqErh-dzJBAz6YNz3D-oDI4vfsPE/edit#gid=0

How long will this data be collected? Choose one of the following:
- I want to permanently monitor this data. (Romain Testard)

What populations will you measure?
All channels, all countries, all locales

If this data collection is default on, what is the opt-out mechanism for users?
- There will be a hidden and undocumented command-line parameter.

Please provide a general description of how you will analyze this data.
- Dashboard similar to the one that currently exists for the stub installer:
https://sql.telemetry.mozilla.org/dashboard/stub-installer---key-indicators

Where do you intend to share the results of your analysis?
- Through a Redash dashboard similar to the stub dashboard: https://sql.telemetry.mozilla.org/dashboard/stub-installer---key-indicators

Hi Chutten, can you please either review or help nominate someone who could review this for us?
Flags: needinfo?(chutten)
DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? 

Yes, this spreadsheet is world-readable: https://docs.google.com/spreadsheets/d/1z9IOWZrjI6fZ_vfMqErh-dzJBAz6YNz3D-oDI4vfsPE/edit#gid=0

    Is there a control mechanism that allows the user to turn the data collection on and off? 

Yes, an undocumented command-line parameter.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

Romain Testard.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction.

    Is the data collection request for default-on or default-off?

Default-on.

    Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No, though it does include attribution information.

    Is the data collection covered by the existing Firefox privacy notice? 

I am unsure, given the lack of documentation of the parameter.

    Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)

N/A, permanent collection.

---
Result: datareview- since I'm unsure that we're meeting all the conditions. Forwarding up to Marshall.

:merwin, I'm requesting confirmation that a "hidden and undocumented command-line parameter" is adequate opt-out for installer Telemetry. Everything else meets requirements for Firefox Data Collection.
Flags: needinfo?(chutten) → needinfo?(merwin)
Just as a clarifying comment - we're already doing this on the stub installer and we're just looking at extending the stub installer capability to the full installer.
Please see Comment 3: "If the data is all category 1 and 2, then we already explain this in the Firefox Privacy Notice (which applies to both the stub and full installer).  No further notice or opt-out is required for additional category 1 and 2 pings in the full installer (especially where we already collect this information in the stub installer)."
Mika, can I take your Comment 3 as explicit confirmation that the collections as implemented adhere to the privacy notice? If so, this will be datareview+ and we can clear ni?merwin
Flags: needinfo?(udevi)
Hi chutten: yes, the Firefox Privacy Notice covers the default collection of Category 1 and 2 data (which is what is being proposed here).  I'm clearing the ni for merwin.
Flags: needinfo?(udevi)
Flags: needinfo?(merwin)
Okay, I'm interpreting comment 14 and comment 15 to mean that we have datareview+ with the undocumented command-line parameter. I'm following the documentation at https://docs.telemetry.mozilla.org/cookbooks/new_ping.html which says the next step is to submit a PR to mozilla-pipeline-schemas, so I'll be doing that shortly.
We need to be able to send POST data that's encoded as UTF-8, but the older
version of nsJSON we currently have only supports UTF-16.
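
To make the encoding difference concrete, here is a small sketch that serializes one JSON document and encodes it both ways. It uses Python's standard library rather than nsJSON, and the field name `example_field` and its value are made up for the illustration.

```python
import json

# Hypothetical payload with one non-ASCII character.
text = json.dumps({"example_field": "Firefox café"}, ensure_ascii=False)

# The same JSON text as raw bytes in each encoding.
utf8_body = text.encode("utf-8")
utf16_body = text.encode("utf-16-le")

# A receiver expecting UTF-8 gets the original text back only from utf8_body.
round_trip = utf8_body.decode("utf-8")
```

The two byte strings are not interchangeable: UTF-16 spends two bytes on every character in this payload, so a body sent as UTF-16 does not decode to the same text on a server that expects UTF-8.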
Pushed by mhowell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1df447ff4c77
Part 1 - Update nsJSON to 1.1.1.0. r=agashlin
Pushed by mhowell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/77fabbff45e0
Part 2 - Add a full installer telemetry ping. r=agashlin
Backed out 2 changesets (Bug 1453613) for Windows MinGW build bustages.
https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=win64-mingwclang&fromchange=1df447ff4c775115522c4c70366db19e59b0e12c&tochange=1944104ccd9dfe6f8db7a698a5847240274611ee

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=77fabbff45e0d2ede394a205b7e787360c7ad9fc

Backout link: https://hg.mozilla.org/integration/autoland/rev/1944104ccd9dfe6f8db7a698a5847240274611ee

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=207277588&repo=autoland&lineNumber=49934


[task 2018-10-23T15:47:48.171Z] 15:47:48     INFO -  package> Processing config: /builds/worker/workspace/build/src/mingw32//etc/nsisconf.nsh
[task 2018-10-23T15:47:48.171Z] 15:47:48     INFO -  package> Processing script file: "installer.nsi" (UTF8)
[task 2018-10-23T15:47:48.172Z] 15:47:48     INFO -  package> Plugin not found, cannot call nsJSON::Set
[task 2018-10-23T15:47:48.172Z] 15:47:48     INFO -  package> Error in script "installer.nsi" on line 953 -- aborting creation process
[task 2018-10-23T15:47:48.172Z] 15:47:48     INFO -  package> /builds/worker/workspace/build/src/toolkit/mozapps/installer/windows/nsis/makensis.mk:46: recipe for target 'instgen/setup.exe' failed
[task 2018-10-23T15:47:48.172Z] 15:47:48     INFO -  package> make[5]: *** [instgen/setup.exe] Error 1
[task 2018-10-23T15:47:48.172Z] 15:47:48     INFO -  package> make[5]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/browser/installer/windows'
[task 2018-10-23T15:47:48.173Z] 15:47:48     INFO -  package> /builds/worker/workspace/build/src/toolkit/mozapps/installer/packager.mk:90: recipe for target 'make-package' failed
[task 2018-10-23T15:47:48.173Z] 15:47:48     INFO -  package> make[4]: *** [make-package] Error 2
[task 2018-10-23T15:47:48.173Z] 15:47:48     INFO -  package> /builds/worker/workspace/build/src/config/rules.mk:431: recipe for target 'default' failed
[task 2018-10-23T15:47:48.173Z] 15:47:48     INFO -  package> make[3]: *** [default] Error 2
[task 2018-10-23T15:47:48.173Z] 15:47:48     INFO -  package> /builds/worker/workspace/build/src/browser/build.mk:6: recipe for target 'package' failed
[task 2018-10-23T15:47:48.174Z] 15:47:48     INFO -  package> make[2]: *** [package] Error 2
[task 2018-10-23T15:47:48.174Z] 15:47:48     INFO -  /builds/worker/workspace/build/src/build/moz-automation.mk:84: recipe for target 'automation/package' failed
[task 2018-10-23T15:47:48.174Z] 15:47:48     INFO -  make[1]: *** [automation/package] Error 2
[task 2018-10-23T15:47:48.174Z] 15:47:48     INFO -  client.mk:129: recipe for target 'build' failed
[task 2018-10-23T15:47:48.174Z] 15:47:48     INFO -  make: *** [build] Error 2
[task 2018-10-23T15:47:48.194Z] 15:47:48     INFO -  1040 compiler warnings present.
[task 2018-10-23T15:47:48.271Z] 15:47:48     INFO -  Notification center failed: Install notify-send (usually part of the libnotify package) to get a notification when the build finishes.
[task 2018-10-23T15:47:48.317Z] 15:47:48    ERROR - Return code: 2
[task 2018-10-23T15:47:48.318Z] 15:47:48  WARNING - setting return code to 2
[task 2018-10-23T15:47:48.318Z] 15:47:48    FATAL - 'mach build -v' did not run successfully. Please check log for errors.
[task 2018-10-23T15:47:48.318Z] 15:47:48    FATAL - Running post_fatal callback...
[task 2018-10-23T15:47:48.318Z] 15:47:48    FATAL - Exiting -1
[task 2018-10-23T15:47:48.319Z] 15:47:48     INFO - [mozharness: 2018-10-23 15:47:48.319002Z] Finished build step (failed)
[task 2018-10-23T15:47:48.319Z] 15:47:48     INFO - Running post-run listener: _shutdown_sccache
Flags: needinfo?(mhowell)
Since it's only the MinGW build that's failing, I suspect a case-sensitivity issue causing the NSIS compiler to be unable to find nsJSON.dll. We were only using nsJSON in the stub installer before, and the MinGW job doesn't appear to build the stub, so this would be the first time it's seen this file. I suppose I'll try renaming the DLL to all lower case; not sure what else to do.
Flags: needinfo?(mhowell)
Okay, I don't have the slightest idea what's happening. I think I was wrong about case-sensitivity being an issue because renaming the file didn't help and in fact broke the native Windows builds somehow (see my try push [0]). The file that makensis complains about being missing is not missing, it is exactly where it is supposed to be.

Tom, is there any way at all that I can set up a local Linux system to run the MinGW build? I am tearing my hair out trying to debug this in the Taskcluster shell, plus I think I'm going to need something like strace to make any more progress (to see where makensis is looking for nsJSON.dll that apparently doesn't match where it's supposed to look) and that doesn't seem to be available in that environment.

[0] https://treeherder.mozilla.org/#/jobs?repo=try&revision=153136a30d19710acfa305dbbc51442e38728d85&selectedJob=207330616
Flags: needinfo?(tom)
Tom was extremely helpful and got me exactly what I needed over Slack.

The problem is that somehow I managed to upload the diff for part 2 in such a way that bug 1486026 was triggered on the binary file from part 1, which erased its contents. I separated the binary file into its own patch specifically in order to avoid this. Somehow it happened anyway when I ran arc diff to reupload part 2, which I only needed to do because Lando wasn't able to figure out who its author was for some reason. I'll not be using Phabricator again for anything in the near future, since apparently I am not capable of interacting with it without causing catastrophic but invisible damage. I'll likely even push this patch to inbound myself instead of attempting to fix the version that's in Phabricator, because available evidence shows that I can't do that reliably.
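
A file that is present but has had its contents erased, as described above, can be caught with a size check before the build runs. A minimal sketch; the helper name `find_empty_files` and the idea of running it as a pre-build step are assumptions, not part of the patches in this bug.

```python
from pathlib import Path
from typing import Iterable, List


def find_empty_files(paths: Iterable[Path]) -> List[Path]:
    """Return the paths that exist as regular files but contain zero bytes."""
    return [p for p in paths if p.is_file() and p.stat().st_size == 0]
```

For example, passing the path of the checked-in nsJSON.dll would return it in the list if the binary had been emptied, while a file that is missing altogether is not reported by this check.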
Flags: needinfo?(tom)
Pushed by mhowell@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/279926d68bfc
Part 1 - Update nsJSON to 1.1.1.0. r=agashlin
https://hg.mozilla.org/integration/mozilla-inbound/rev/32d8d8c0d44e
Part 2 - Add a full installer telemetry ping. r=agashlin
https://hg.mozilla.org/mozilla-central/rev/279926d68bfc
https://hg.mozilla.org/mozilla-central/rev/32d8d8c0d44e
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 65
FYI for anyone watching this bug, this data is now being collected and is available to be queried on sql.telemetry.mozilla.org in the Athena table called telemetry.firefox_installer_install_parquet.
See Also: → 1598111