Closed
Bug 698848
Opened 13 years ago
Closed 12 years ago
Determine how representative telemetry pings are of general population based on startup data in amo ping
Categories
(Mozilla Metrics :: Data/Backend Reports, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Backlogged - BZ
People
(Reporter: taras.mozilla, Unassigned)
Details
(Whiteboard: Telemetry -- needs PM project priority )
Attachments
(1 file)
949.26 KB, application/pdf
No description provided.
Comment 1•13 years ago
Since this seems to be based on my idea, I'll relate the comment I made to Gilbert: There is an AMO service ping that consists of a comma-separated list of installed add-ons and three startup time metrics. This ping is not opt-in. I believe there is a config parameter that can turn it off, but we can expect nearly 100% of users of later Firefox versions to be represented in it. Those same three startup metrics are also collected in the opt-in Telemetry program. My thought was that we could estimate the bias of the Telemetry opt-in by comparing those measures across the two data sources. Anurag can provide details and access to the AMO service ping data.
Comment 2•13 years ago
hostname: hm-admin03.scl2.mozilla.com
table: addons_pings
    time              string
    guid              string
    src               string
    appos             string
    appversion        string
    tmain             string
    tfirstpaint       string
    tsessionrestored  string
    user_agent        string
    ds                string
    domain            string
Please file a bug for SSH access to hm-admin03.scl2.mozilla.com and you should be able to access the Hive table to run queries.
Comment 4•13 years ago
N
Comment 5•13 years ago
Grr.. CC fail. :) Not yet. This is something that I hope Chris Jung or Saptarshi might be able to analyze before next Wednesday. CCing Gilbert to figure out scheduling.
Comment 6•13 years ago
Yeah, working on it. Huge data, but should get results by EOD.
Comment 7•13 years ago
It may be of interest that we are having a discussion about enabling Telemetry by default on Nightly and Aurora in bug 699806.
Comment 8•13 years ago
Taras, we are now comparing startup times between Telemetry and the general population. Does Telemetry not collect the number of extensions? That would be helpful for this comparison.
Reporter
Comment 9•13 years ago
(In reply to Christopher Jung from comment #8)
> Taras, we are now comparing startup times between Telemetry and the general
> population. Does Telemetry not collect the number of extensions? That would be
> helpful for this comparison.
Yes, we report extensions/persona in Firefox 9+. It would be good to be able to rerun this test with Firefox 7, 8, 9, and 10 (once they hit the stable channel). In Firefox 11, we'll take the startup info out of the AMO ping.
Comment 10•13 years ago
We have the comparative distributions of startup time (tmain) for:
1. WINNT, Firefox 7.0, Telemetry vs. All (as obtained from the s.a.m.o ping), for all data points in Sep '11.
2. WINNT, Firefox 8.0{a1,a2}, Telemetry vs. All (as obtained from the s.a.m.o ping), for all data points in Sep '11.
There are some significant differences, and the pattern differs between (1) and (2). I'll upload PDFs soon.
Joy
Comment 11•13 years ago
Comment 12•13 years ago
Hello,
I have attached a PDF with 3 pages describing some differences between Telemetry samples and the population (Firefox users) with regard to SessionRestored startup time. The Telemetry data packets contain a sessionrestored time, and the AMO services ping also contains a sessionrestored time. We assume that the AMO services ping is representative of the general Firefox installation.
We computed the quantiles for all of the Telemetry sessionrestored values for Firefox 7, 8 and 9 for the month 2011-09. The sessionrestored times for the population were pulled from the s.a.m.o logs for Sep 2011.
Page 1: Proportion of observations vs. log_e of SessionRestored, conditioned on version. Small vertical differences are quite large (since the x-axis is logged). The pattern for version 7 is different from 8 and 9.
Page 2: Relative Difference := (Startup time for Telemetry - Startup time for All) / (Startup time for All) * 100. 8 and 9 are similar, but different from 7. 7 is definitely centered around 30% less for Telemetry, though 8 and 9 seem to have a center of 0.
Page 3: Log of absolute difference. Backs up Page 2. Essentially, the difference increases for larger startup times. Here, large means ~13 seconds (roughly the 60th percentile for 7 and 8), where the difference is 1.3 seconds.
See http://sguha.pastebin.mozilla.org/1382401 for the quantiles for Telemetry vs All | version.
See http://sguha.pastebin.mozilla.org/1382404 for the quantiles of absolute differences between Telemetry and All startup times | version.
In essence, 8 is similar to 9 in that Telemetry and All are similar except at the extremes. For 7, Telemetry and All are different. Moreover, Page 4 indicates longer startup times are getting shorter with increasing version.
Saptarshi
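For readers who want to reproduce the Page 2 metric on their own data, here is a minimal Python sketch. It is not the code used to produce the attached PDF. The function name, the use of deciles, and the sample values are all assumptions made for illustration; the made-up Telemetry sample is simply the "All" sample scaled by 0.7.

```python
from statistics import quantiles


def relative_difference_by_quantile(telemetry, everyone, n=10):
    """Return (Telemetry - All) / All * 100 at each of the n-1 quantile cut points.

    Hypothetical helper: both arguments are sequences of positive
    sessionrestored times, one per ping.
    """
    tel_q = quantiles(telemetry, n=n, method="inclusive")
    all_q = quantiles(everyone, n=n, method="inclusive")
    return [(t - a) / a * 100 for t, a in zip(tel_q, all_q)]


# Made-up samples, for illustration only (not real Telemetry or s.a.m.o data).
all_sample = [float(v) for v in range(1000, 21000, 1000)]
telemetry_sample = [v * 0.7 for v in all_sample]  # 30% lower by construction

diffs = relative_difference_by_quantile(telemetry_sample, all_sample)
for decile, diff in enumerate(diffs, start=1):
    print(f"decile {decile}: {diff:+.1f}%")
```

A negative value means the Telemetry quantile is lower (faster startup) than the corresponding quantile for All. The sketch assumes all quantiles of the "All" sample are non-zero.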
Comment 13•13 years ago
This looks great Saptarshi. Could you write up some layman summaries related to this analysis so anyone who sees this bug can learn a bit about what we found? Something that explains in what ways we have discovered the population of Telemetry users differs from the general population by version and the change trend we are seeing.
Comment 14•12 years ago
Saptarshi - Did you write up a layman summary as Daniel suggested?
Comment 15•12 years ago
Lawrence - thanks so much for following up on this. I'll update this bug with an easy-to-read summary of the blog post.
Comment 16•12 years ago
Hello,
The blog entry http://blog.mozilla.com/metrics/2011/12/13/comparing-the-bias-in-telemetry-data-vs-the-typical-firefox-user/ has a description of what we did.
In summary: We collected startup times for Firefox 7, 8 and 9 for November 2011 from the log files of services.addons.mozilla.org (SAMO). We also took the same information for the same period from the Telemetry data contained in HBase.
1. Visual inspection (barplots, quantiles of the difference in deciles, and confidence intervals for the median startup times) indicated *no difference*.
2. We ran ANOVA tests (ANOVA attempts to attribute the variance in startup times to the version and the source, i.e. Telemetry or All). These tests indicated that neither version nor source made a difference in startup times.
Hence, insofar as startup times are concerned, there is no difference between Telemetry and the general population.
-- Saptarshi
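As a rough illustration of the idea behind step 2, the sketch below computes a one-way ANOVA F statistic by hand in Python. This is a simplification and not the analysis from the blog post: the tests described above used both version and source as factors, whereas this sketch uses a single factor (source), stops at the F statistic without a p-value, and runs on made-up numbers. The function name `one_way_f` is hypothetical.

```python
from statistics import fmean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples.

    F = (between-group mean square) / (within-group mean square).
    Values near or below 1 suggest the group means do not differ by more
    than the within-group noise would explain.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - fmean(g)) ** 2 for x in g) for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up startup times grouped by source, for illustration only.
telemetry_times = [4.1, 5.3, 6.0, 7.2, 9.8]
all_times = [4.0, 5.5, 6.1, 7.0, 10.1]
print(one_way_f([telemetry_times, all_times]))
```

To turn the statistic into a test, it would be compared against an F distribution with (k - 1, n_total - k) degrees of freedom, which the Python standard library does not provide.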
Comment 17•12 years ago
grouping for triage
Status: NEW → ASSIGNED
Whiteboard: Telemetry -- needs PM project priority
Comment 19•12 years ago
This analysis was completed several months ago. Resolving.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED