Closed Bug 1526267 Opened 7 years ago Closed 6 years ago

Quantify the number of full installer downloads happening on our servers

Categories

(Data Science :: Investigation, task)

x86_64
Unspecified
task
Not set
normal
Points:
3

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RT, Assigned: ccd)

References

Details

Brief description of the request:
We recently shipped full installer telemetry (see https://sql.telemetry.mozilla.org/dashboard/full-installer-telemetry) and we were surprised to see that full install numbers are equivalent to stub install numbers.
We assume this may relate to either (1) third party websites pointing downloads to our servers or (2) third party websites copying and serving the full installer from their own CDN (the build_id of the installs is right, and the fact that we get full installer pings tells us that they must be using our installer directly).
What is needed: provide the number of daily downloads of the full installer from our servers, including downloads from https://www.mozilla.org/en-US/firefox/all/ , https://www.mozilla.org/en-US/firefox/organizations/all/ and any other known location where third parties can access our full installers (currently assuming the only such location is archive.mozilla.org).
After discussion with oremj we have access logs with the fields listed in this format: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html, which we can query against using a few different tools.
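For illustration only, here is a minimal Python sketch of how daily full installer downloads could be counted from raw CloudFront standard access logs. It assumes the logs are tab-separated with a `#Fields:` header line, as described in the AWS documentation linked above. The function name, the `.exe` suffix filter and the choice to count only `GET` requests with status `200` are assumptions made for this sketch, not the query that was actually used for this bug.

```python
from collections import Counter


def daily_full_installer_downloads(lines, suffix=".exe"):
    """Count successful GET requests per day for URIs ending in `suffix`.

    `lines` is any iterable of raw CloudFront standard log lines.
    The suffix filter is a hypothetical stand-in for "full installer".
    """
    fields = None
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names come from the log's own header line.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or fields is None:
            continue
        row = dict(zip(fields, line.split("\t")))
        if (
            row.get("cs-method") == "GET"
            and row.get("sc-status") == "200"
            and row.get("cs-uri-stem", "").lower().endswith(suffix)
        ):
            counts[row["date"]] += 1
    return dict(counts)
```

Counting only status `200` ignores partial (`206`) responses, so resumed or ranged downloads would need separate handling. The same filter could equally be expressed as a SQL query in Athena.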

Link to any assets:
Full installer telemetry dashboard: https://sql.telemetry.mozilla.org/dashboard/full-installer-telemetry

Is there a specific data scientist you would like or someone who has helped to triage this request:
No

Hi Romain, Could you give us your desired timeline for this request?

Flags: needinfo?(rtestard)

This is not critical - 6 weeks timeline would be OK

FYI oremj confirmed that the logs are sitting in S3 in their raw format and can be queried with Athena.
These logs only cover the CloudFront part of downloads. Akamai download volumes will be provided separately through the Akamai interface.

Flags: needinfo?(rtestard)
Component: General → Investigation
Points: --- → 3
Assignee: nobody → cdowhygelund
Status: NEW → ASSIGNED

Hi, I'm doing some background work on new-user measures and would like to understand this better. Can someone provide more context?

Why do we need to look at CloudFront and Akamai? Is the full installer ping coming through CDNs?

And can you give some context on why your proposed reasons explain why stub numbers would match full install numbers? "(1) third party websites pointing downloads to our servers or (2) third party websites copying and serving the full installer from their own CDN"

Thanks!

Depends on: 1530772

Waiting for access to Cloudfront and Akamai installer logs.

Depends on: 1531418

As per Jeremy Orem, the Akamai full-installer downloads are initiated from the stub installer. The initial full installer requests go through Cloudfront. Awaiting access to Cloudfront logs from Athena.

Initial dashboard implemented from Cloudfront Installer logs: https://sql.telemetry.mozilla.org/dashboard/cloundfront-downloads-full-installers

It appears that the full installer downloads that can be accounted for on Cloudfront are ~20% of what is observed in the full installer ping.
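As a rough sketch of how such a comparison could be computed, the Python snippet below divides daily download counts by daily ping counts. The function name and the input dictionaries are hypothetical, and the numbers in the usage note are made up for illustration rather than taken from the dashboard.

```python
def download_to_ping_ratio(downloads_by_day, pings_by_day):
    """Return, per day, downloads divided by pings.

    Days with no pings, or with no download count, are skipped.
    """
    ratios = {}
    for day, pings in pings_by_day.items():
        if pings > 0 and day in downloads_by_day:
            ratios[day] = downloads_by_day[day] / pings
    return ratios
```

For example, with made-up inputs `{"2019-03-01": 200}` for downloads and `{"2019-03-01": 1000}` for pings, the ratio for that day is `0.2`.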

The report regarding the rationale behind the queries used in the dashboard: https://docs.google.com/document/d/1pjFqAA_NkbkWlGBgRZpdFxYvJHmeD-X8vLnzaWsMsHg/edit?usp=sharing

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Work for this request is complete. An additional bug has been filed for the next steps in determining the source of the excess full installs.

Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED