Closed Bug 1132683 Opened 10 years ago Closed 10 years ago

Compare ComScore and site impression data


(Content Services Graveyard :: Tiles: Data Processing, defect)

Not set


(Not tracked)

38.3 - 23 Feb


(Reporter: mzhilyaev, Assigned: mzhilyaev)



(Whiteboard: .003)

Compare comScore data with site impression data collected via the telemetry experiment.
To give some more context, we want to see how similar or different our ranking of top sites/eTLD+2s is compared to the comScore top sites (or Alexa top sites), with the caveat that our data comes from beta users.
Whiteboard: .? → .003
The results of the study here:

It appears that comScore data and newtab data are linearly correlated for a limited number of highly ranked sites.

,1,171692125,13881430292,49652118190,1,628325,1
,2,125237635,5732688362,22355545954,3,292863,5
,3,120146016,18525487778,88929943137,2,372984,2
,4,118471644,3296905907,26591404465,4,231808,3
,5,82526038,658666828,4415708476,5,162044,4
,6,62115338,388430018,978910136,23,23429,6
,7,59762401,1051968905,2684591826,29,18533,19
,8,52081255,855487945,4900159299,7,102241,8
,9,46629165,394548733,1325507133,9,60323,7
,10,46256922,959666316,1742366673,10,51997,17

It is expected that the number of users having these sites in their newtab will be proportional to the number of unique visitors to these sites, which gives a direct relationship between newtab impressions and unique visitors.

As site rank drops (falling below 50), this relationship disappears. Utility sites like craigslist, etc. gradually disappear from newtab: users do go there routinely, but not frequently enough to push them into the newtab visibility area.
It’s entirely possible that when the 100 most frecent URLs are collected, the correlation with comScore data will become more pronounced.
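One way to check the claim that the linear relationship weakens as lower-ranked sites are included is to compute the Pearson correlation over the top-k sites for several cutoffs. The following is a minimal sketch only, not the analysis that was run for this bug: the function names, the row layout (unique visitors, newtab impressions) and the toy numbers are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def correlation_by_cutoff(rows, cutoffs):
    """Correlation over the first k rows, for each k in cutoffs.

    rows is assumed to be a list of (unique_visitors, newtab_impressions)
    pairs, already sorted from the highest-ranked site to the lowest.
    """
    result = {}
    for k in cutoffs:
        top = rows[:k]
        result[k] = pearson([r[0] for r in top], [r[1] for r in top])
    return result


# Hypothetical toy values, NOT taken from the study data.
toy_rows = [(100, 1000), (80, 800), (60, 600), (40, 50), (20, 300)]
by_cutoff = correlation_by_cutoff(toy_rows, [3, 5])
```

If the observation above holds, the coefficient for a small cutoff should be close to 1 and should drop as the cutoff grows past the highly ranked sites.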
Closed: 10 years ago
Resolution: --- → FIXED
Thanks for checking the newtab impressions vs comScore and Alexa ranks. Do you have the scripts you used to analyze this somewhere? We'll probably want to rerun them when we have single impressions from unique users and see if the site rank relations change.
No scripts this time. The data was joined into one file and loaded into Octave, which has the functionality to do stats, graphs and the like. Perhaps a README to document what was done?
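For a rerun, the join-then-compare step could be captured in a small script. The sketch below is not a reconstruction of the Octave session: it assumes, purely for illustration, that each data set is available as a mapping from site to a single number, and the names `join_by_site`, `average_ranks` and `spearman` are hypothetical. It joins the two sets on site and compares the resulting rankings with a Spearman rank correlation.

```python
import math


def join_by_site(comscore, newtab):
    """Inner join of two site -> value mappings, sorted by site name."""
    common = sorted(comscore.keys() & newtab.keys())
    return [(site, comscore[site], newtab[site]) for site in common]


def average_ranks(values):
    """Rank values so the largest gets rank 1; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    if n != len(ry) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when all ranks are tied")
    return cov / math.sqrt(vx * vy)
```

A rank correlation is used here because the question in this bug is about how the site rankings compare; only sites present in both data sets survive the join.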
Depends on: 1062708