Open Bug 1210841 Opened 9 years ago Updated 6 years ago

Create regular report of Addons distribution amongst Firefox clients using Unified Telemetry

Categories

(Cloud Services :: Metrics: Product Metrics, defect, P2)

Tracking

(Not tracked)

People

(Reporter: rweiss, Assigned: rweiss, NeedInfo)

References

Details

(Whiteboard: [dataset])

Understanding the distribution of addons by release channel is a high priority. Currently, we have dzeber's table here: https://docs.google.com/spreadsheets/d/1hY7qY2FZQ3sY1J0rfnTFsFcPVwO0WrkO_jdDya9kk48/edit#gid=1883596368

This table must be maintained by hand by dzeber. Instead, we should aim for a dashboard populated by a canonical dataset that should be created via heka. We could update this dataset according to the MAU timeframe.

Additionally, the introduction of addon signing will need to be observed in the wild. The introduction of signedState into the environment/addons field of v4 allows for this calculation.
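For illustration only, a minimal sketch of how signedState could be tallied per release channel from already-parsed v4 pings. The field paths used here (environment.settings.update.channel and environment.addons.activeAddons) and the function name signed_state_by_channel are assumptions made for this example, not the method used for the table above. The raw signedState values are counted as-is and not interpreted.

```python
from collections import Counter, defaultdict


def signed_state_by_channel(pings):
    """Count active addons by raw signedState value, grouped by channel.

    `pings` is an iterable of dicts shaped like v4 pings (assumed layout).
    Addons without a signedState field are counted under the key None.
    """
    table = defaultdict(Counter)
    for ping in pings:
        env = ping.get("environment") or {}
        update = (env.get("settings") or {}).get("update") or {}
        channel = update.get("channel", "unknown")
        active = (env.get("addons") or {}).get("activeAddons") or {}
        for addon in active.values():
            table[channel][addon.get("signedState")] += 1
    return table
```

The result maps each channel to a Counter of signedState values, which could feed a per-channel breakdown in a dashboard.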
Priority: -- → P2
Changed title of bug to be more generic to the request.

:dzeber, can you make the code that produced the raw data for the table available somewhere?

:dzeber and :kparlante, please advise as to what approach would be the most efficient towards creating a regularly-updated version of the addon table above:
1) Heka-based filtered dataset (:kparlante, you would likely be the owner of this)
2) Shiny app that is updated monthly (:dzeber, you would likely own this)
Summary: Create heka filter for Addons canonical dataset → Create regular report of Addons distribution amongst Firefox clients using Unified Telemetry
Assignee: nobody → rweiss
Priority: P2 → P1
New process is that P1s are for the current sprint. Will prioritize when we pack into a sprint. Needs to be sized as well.
Priority: P1 → P3
Whiteboard: [dataset]
oops, should not have scrubbed this - reverting to its previous state.
Priority: P3 → P1
dzeber has pushed this code here: https://github.com/mozilla/addons-dashboard-v2 There is growing need to understand the state of top addons across the user population on an ongoing basis. Many initiatives (e.g. understanding e10s and interaction with addons) require the ability to identify top addons and estimate their client-share so that we can assess e10s impact on our user base. We also need to identify those addons that we can safely turn off if they are not e10s-compatible, and understanding which addons are relatively unpopular (and thus safe to "turn off") is high priority. :canuckistani and :kev, can I get an assessment of your level of priority on this table?
Flags: needinfo?(kev)
Flags: needinfo?(jgriffiths)
IMO understanding the add-on user population is a high priority that blocks our ability to make decisions shipping e10s to add-ons users. P1.
Flags: needinfo?(jgriffiths)
Component: Metrics: Pipeline → Metrics: Product Metrics
I did a one-off version of this: https://gist.github.com/bsmedberg/5a2c23cb2d5f2cb143af Jeff and Kev, could you review and let me know what else you might need?
Flags: needinfo?(jgriffiths)
This looks great to me. cc'ing Shell to see if she needs anything that isn't here. Shell: open the above gist and scroll WAAAAY down ( you might need to be signed into github to view, I had to log in )
Flags: needinfo?(jgriffiths) → needinfo?(sescalante)
Looking at the results, the numbers seem off when comparing them to the ADU numbers we're seeing on AMO (measured by update pings for individual addons). As examples, ABP, Ghostery, uBlock Origin, and Cliqz all seem under-reported (by 50-100%) in comparison to AMO, which is surprising as we've always felt AMO also under-reports. The Mozilla Online addons, which are only available in Mozilla China editions, have very high install rates (they're possible, given the marketing around them, but it'd mean that the China edition accounts for a very high percentage of CN ADUs). How confident are we that we're seeing a representative sample globally? These numbers don't look right. I'll comment separately on format and info reported.
There are some small differences between bsmedberg's report and my table (linked in the description) that are expected wrt the change in dataset but may be adding some bias to the results. The report in Comment 6 is computed from a 1% sample of release users that are submitting UT (v43+) and were active in a month-long period mid-Jan to mid-Feb. By comparison, the table was based on a 10% sample of legacy FHR profiles that were active in a 6-month period.

Also, some things are not clear to me around addon data in UT which may affect the results:

- If addons are listed in environment.addons, are they specifically *enabled* or just *installed*? The doc suggests they are enabled. This would explain some differences between the UT report and the FHRv2 table, which counts *installed* addons and lists the enabled proportion. I noticed that some of the top-ranking addons in the table which are widely disabled are farther down the UT top 100 (eg. Microsoft .NET Framework Assistant, Norton Toolbar), while the highest-ranked addons in the top 100 are generally enabled when installed, according to the table.
- If a profile has no installed/enabled addons, what does the activeAddons field look like - is it null, or non-null but empty? The denominator for computing %s in the UT report is "# active profiles with a non-null but possibly empty addons field", and 48% of those had an empty field. In the table, the denominator was "# profiles active in the period", and IIRC some 40% of those had no addon info. This would affect the size of the %s but not the ordering of the top 100.
> - If addons are listed in environment.addons, are they specifically
> *enabled* or just *installed*?

It's environment.activeAddons and it is explicitly those which are enabled.

> - If a profile has no installed/enabled addons, what does the activeAddons
> field look like - is it null, or non-null but empty?

It is an empty map ({}). If the field is missing or null, that means that we haven't yet recorded that data, which can happen early in startup or if the addon manager throws an exception and is broken. So I have pretty good confidence in the data: very few people have null.
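To make the null-versus-empty distinction concrete, here is a minimal sketch of how per-addon shares could be computed with the denominator described above (profiles whose activeAddons is non-null but possibly empty). It assumes one already-parsed ping per profile and the field path environment.addons.activeAddons; the helper names active_addons and addon_shares are hypothetical and this is not the code from the gist.

```python
from collections import Counter


def active_addons(ping):
    """Return the activeAddons map, or None when it is missing or null."""
    env = ping.get("environment") or {}
    addons = env.get("addons") or {}
    return addons.get("activeAddons")


def addon_shares(pings):
    """Compute the share of profiles that have each addon enabled.

    Profiles with a null/missing activeAddons field are skipped (data not
    recorded); profiles with an empty map count in the denominator.
    Returns (shares, denominator, skipped).
    """
    counts = Counter()
    denominator = 0
    skipped = 0
    for ping in pings:
        active = active_addons(ping)
        if active is None:
            skipped += 1
            continue
        denominator += 1
        counts.update(active.keys())
    shares = {}
    if denominator:
        shares = {addon_id: n / denominator for addon_id, n in counts.items()}
    return shares, denominator, skipped
```

Reporting the skipped count next to the shares makes it easy to check how many profiles had no addon data recorded.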
Removing me, as Kev will be the approver for this. I believe we were looking at add-ons with over a million users rather than the top 100. Not sure how large a distinction that is, but the list was under 70 rather than 100, so I think the tail drops off quickly. Not sure if there is a way to sort the list by actively used add-ons (enabled, not just installed and disabled), from largest to smallest, so we can prioritize as well. Aside from that, I'd ask Kev.
Flags: needinfo?(sescalante)
Picking this back up in Q2.
Priority: P1 → P2
See Also: → 1266898
Putting this back on the radar for Q2, and should probably incorporate bug 1177960 here as well. rweiss, do we want to use the method in comment #6 for now, and get the requirements fleshed out in the next couple of weeks?
Flags: needinfo?(kev) → needinfo?(rweiss)
:kev, we should probably get the requirements fleshed out first, as that will determine if the method in the notebook created by :bsmedberg above suits the requirements we need. If we are thinking about ">1 MM" user addons rather than top 100, the script would need changes.
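As a rough illustration of the difference between the two selection rules, the sketch below contrasts a "top N" cut with a ">1 MM estimated users" cut over per-addon profile counts from a sample. The function names and the profiles_per_sample scaling factor (100 for a 1% sample) are assumptions for this example; simple scaling of sample counts is only one possible way to estimate population size.

```python
def top_n(sample_counts, n=100):
    """Return the n addons with the highest sample counts, largest first."""
    ranked = sorted(sample_counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


def above_threshold(sample_counts, profiles_per_sample=100, min_users=1_000_000):
    """Return (addon_id, estimated_users) pairs at or above min_users.

    Each sampled profile is assumed to represent `profiles_per_sample`
    real profiles (100 for a 1% sample). Results are sorted largest first.
    """
    estimated = {
        addon_id: count * profiles_per_sample
        for addon_id, count in sample_counts.items()
    }
    selected = [(a, users) for a, users in estimated.items() if users >= min_users]
    return sorted(selected, key=lambda kv: kv[1], reverse=True)
```

With a threshold rule the length of the list is driven by the data, so it can be shorter or longer than 100 depending on how quickly the tail drops off.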
Flags: needinfo?(rweiss) → needinfo?(kev)