Closed Bug 733616 Opened 12 years ago Closed 7 years ago

Request for add-on data from crash reports

Categories

(Socorro :: Data request, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: justin.lebar+bug, Unassigned)

You may have seen the massive dev-planning thread about add-ons.  We've reached the point where we'd like to analyze which add-ons are popular, but the only data we have at the moment lumps together enabled and disabled add-ons.

AFAICT, the list of active add-ons is sent in each crash report.  I'd like to access this data from three separate date ranges:

 1) The past week, and
 2) The week immediately before Firefox 8 was released
 3) The week immediately after Firefox 8 was released

In particular,

  for each date range R,
   for each of the top few hundred add-ons A in R,
    I'd like to know what fraction of crashes in range R occurred with add-on A enabled.
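
A minimal sketch of the per-range computation described above, assuming each crash report has already been reduced to the set of add-on IDs it lists. The function name and input shape are hypothetical and not part of Socorro:

```python
from collections import Counter

def addon_crash_fractions(crashes, top_n=300):
    """For one date range, return [(addon_id, fraction_of_crashes), ...].

    `crashes` is an iterable with one collection of add-on IDs per crash
    report (an assumed, pre-extracted shape). Only the `top_n` most
    frequently seen add-ons are returned.
    """
    crashes = list(crashes)
    total = len(crashes)
    if total == 0:
        return []
    counts = Counter()
    for addons in crashes:
        # Count each add-on at most once per crash report.
        counts.update(set(addons))
    return [(addon, n / total) for addon, n in counts.most_common(top_n)]
```

Running it once per date range (and once per platform or build id) would give the separate and combined reports.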

For (2) and (3), I only need data from Firefox 7 and 8 crashes, respectively.  For (1), I'd like data from all builds.  Separating each of these reports by platform (and by build id, for (1)) would be interesting, but I'd also like the combined reports.
These reports will likely underestimate the scope of the problem, because the list of add-ons does not always seem to be populated before Firefox crashes when the crash happens during startup.

However, we can adjust our numbers upwards by using known data.  We know, for example, how many active daily users Adblock Plus has, and we'll know what fraction of crashes that corresponds to.  So we can compute a multiplier to increase all our install proportions.
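
As a rough sketch of that adjustment: the multiplier is the add-on's known share of active users divided by its share of crashes. The total active-user count is an assumed extra input, and both function names are hypothetical:

```python
def adjustment_multiplier(known_addon_adu, total_adu, crash_fraction):
    """Multiplier that maps a crash-report fraction onto a known install share.

    known_addon_adu: active daily users of the reference add-on
    total_adu:       active daily users of Firefox overall (assumed known)
    crash_fraction:  fraction of crashes that list the reference add-on
    """
    if total_adu <= 0 or crash_fraction <= 0:
        raise ValueError("total_adu and crash_fraction must be positive")
    return (known_addon_adu / total_adu) / crash_fraction

def adjust_fractions(fractions, multiplier):
    """Scale every add-on's crash fraction, capping the result at 1.0."""
    return {addon: min(1.0, f * multiplier) for addon, f in fractions.items()}
```

This treats the reference add-on as no more and no less crash-prone than average, which is itself an assumption.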

If necessary, we can also look at crashes which happen after some amount of uptime -- these should have a fully-populated list of add-ons.
I'm guessing this kind of request will need to come out of hbase.

We probably also want to know how many crashes happen with no addons enabled, and whether addon compatibility checking was enabled or disabled in those older releases.

Here is a possible format for the report,

for the given range of dates for the release being studied:

  .XXX  no addons enabled

  .YYY  addon A was present  XX% compat checking enabled, 
                             YY% compat checking disabled or not set

  .ZZZ  addon B was present  XX% compat checking enabled, 
                             YY% compat checking disabled or not set

As jlebar mentioned, "no addons enabled" will probably make up a high percentage of reports; after that I'd expect the ranking to basically follow the popularity of the addons. Where we see deviations from the popularity ranking, that addon would be a candidate for more investigation as a contributor to crashes. The contribution to crashes might be direct (crashy code in the addon), or it might be indirect, where the addon just contributes to extra memory use. I'm not sure there is a way to tell direct crashes from indirect ones, but something in the signatures, or how concentrated the crashes are across signatures, might give us some clues. Another interesting piece of data might be the total number of signatures where the addon is present.
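
One way the "deviation from the popularity ranking" idea could be sketched, assuming we have a crash count and some popularity figure (for example active users) per addon. Both inputs and the function names are hypothetical:

```python
def rank_of(scores):
    """Map each key to its 1-based rank, highest score first (ties by key)."""
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return {k: i + 1 for i, k in enumerate(ordered)}

def rank_deviations(crash_counts, popularity):
    """Return [(addon, popularity_rank - crash_rank), ...], largest first.

    A positive value means the addon ranks higher in crashes than its
    popularity alone would suggest, so it may deserve a closer look.
    Only addons present in both inputs are compared.
    """
    common = crash_counts.keys() & popularity.keys()
    crash_rank = rank_of({k: crash_counts[k] for k in common})
    pop_rank = rank_of({k: popularity[k] for k in common})
    deviations = [(k, pop_rank[k] - crash_rank[k]) for k in common]
    return sorted(deviations, key=lambda kv: (-kv[1], kv[0]))
```

A rank difference says nothing about whether the contribution is direct or indirect; it only points at addons worth investigating.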
Also note that what you get here is also heavily skewed by the fact that crashes might not be evenly spread throughout our audience but are biased due to some add-ons possibly *causing* crashes and therefore being over-represented in this sample.

Still, another possibility to remove the "startup and therefore no add-ons loaded yet" bias is to look for the UUID of the "default" theme add-on, which is always installed.
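
A small sketch of that filter, assuming each crash is represented as a collection of add-on IDs. The GUID below is the one listed as "Default theme" in the counts later in this bug; the function name is hypothetical:

```python
# GUID reported as "Default theme" in the add-on counts in this bug.
DEFAULT_THEME_ID = "{972ce4c6-7e08-4474-a285-3208198ce6fd}"

def crashes_with_populated_addon_list(crashes):
    """Keep only crashes whose add-on list includes the default theme.

    The default theme is always installed, so its absence suggests the
    add-on list had not been populated yet when the crash happened.
    """
    return [addons for addons in crashes if DEFAULT_THEME_ID in addons]
```

Fractions computed over the filtered crashes should then be less affected by the startup bias.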
dbaron's reports have quite a bit of the data suggested for the report format in comment 2.

Reformatted, the start of the report might look like these reports, which I'll let run each day if they turn out to be useful.

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crashes-20120306_Firefox_9.0.1-interesting-addons

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crashes-20120306_Firefox_10.0.2-interesting-addons.txt

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crashes-20120306_Firefox_11.0-interesting-addons.txt

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crashes-20120306_Firefox_12.0a2-interesting-addons

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crashes-20120306_Firefox_13.0a1-interesting-addons

kiaro, it might be interesting to hook up your "explosiveness" algorithm to these kinds of ranked lists to help flag addons that seem to be rising rapidly in crash volume. Let's watch the reports for a bit to see how much volatility there is in the ranking and volume.

It was interesting to note that the default addon is only present around 59% of the time on 9.0.2, whereas it was around 84-95% in the other releases.
I also wonder why the langpacks show up so frequently in 10 and 11, but not so much in 9, 12, and 13.
It looks like someone is still playing with these files, because they keep changing on me.  :)

I looked at the top add-ons on Windows (there's a long tail).  I've marked the add-ons I either know aren't on AMO or don't recognize with a *.

This is better than I'd expected.  The likely drive-by installed stuff is about equal with the user-installed stuff.  But I wouldn't call this good...

  66283	{972ce4c6-7e08-4474-a285-3208198ce6fd} Default theme
  7438	jqs@sun.com Java Quick Start
  7312	{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d} Adblock Plus
* 5645	yasearch@yandex.ru Yandex.Bar
  4612	{37964A3C-4EE8-47b1-8321-34DE2C39BA4D} Download Statusbar
* 4220	{635abd67-4fe9-1b23-4f01-e679fa7484c1} Yahoo! Toolbar
* 3888	ffxtlbr@babylon.com
  3728	{b9db16a4-6edc-47ec-a1f4-b86292ed211d} Video DownloadHelper
* 3587	wrc@avast.com
* 3233	{20a82645-c095-46ed-80e3-08825760534b} Microsoft .NET
* 2937	{CAFEEFAC-0016-0000-0031-ABCDEFFEDCBA} Java QuickStart
* 2264	toolbar@ask.com
  2081	{e4a8a97b-f2ed-450b-b12d-ee082ba24781} Greasemonkey
* 2081	{1E73965B-8B48-48be-9C8D-68B920ABC1C4} AVG Safe Search
* 1999	avg@toolbar
  1933	firebug@software.joehewitt.com Firebug
  1794	{D4DD63FA-01E4-46a7-B6B1-EDAB7D6AD389} Download Statusbar (again)
  1525	personas@christopher.beard Personas
  1479	{19503e42-ca3c-4c27-b1e2-9cdb2170ee34} FlashGot, https:
* 1416	wtxpcom@mybrowserbar.com  
  1282	{1018e4d6-728f-4b20-ad56-37578a4de76b} Flagfox
* 1253	{82AF8DCA-6DE9-405D-BD5E-43525BDAD38A} Skype
* 1231	plugin@yontoo.com
  1205	{a0d7ccb3-214d-498b-b4aa-0e8fda9a7bf7} WOT
* 1187	{EB9394A3-4AD6-4918-9537-31A1FD8E8EDF} DealPly (malware?)
  1171	{dc572301-7619-498c-a57d-39143191b318} Tab Mix
  1139	{DDC359D1-844A-42a7-9AA1-88A850A938A8} DownThemAll!
  1132	elemhidehelper@adblockplus.org Adblock Plus element hiding helper
* 1128	bbrs_002@blabbers.com
* 1065	{EEE6C361-6118-11DC-9C72-001320C79847} SweetIM toolbar
* 1029	mozilla_cc@internetdownloadmanager.com IDM CC
* 1015	{23fcfd51-4958-4f00-80a3-ae97e717ed8b} DivX Plus Web Player
* 1005	{bf7380fa-e3b4-4db2-af3e-9d8783a45bfc} uTorrentBar
>  I've marked the add-ons I either know aren't on AMO or don't recognize with a *.

I don't mean to imply that all of these add-ons are bad.  But they're at least not getting any kind of QA from Mozilla.

My key for deciphering GUIDs is at https://etherpad.mozilla.org/BdtQX2tCKS  The files currently have names for many add-ons, but not all...
> It looks like someone is still playing with these files, because they keep changing on me.  :)

I added a bit more name info where it was available. I should be done playing for a while. Let's watch and see how this runs for a few days.
I added another report that is a bit harder to look at, but collects together all the signatures for any given addon like adblock plus or other...

These "details" reports, like

https://crash-analysis.mozilla.com/chofmann/20120306/topaddon-crash-details-20120306_Firefox_11.0-interesting-addons.txt

should also show up every day.
(In reply to chris hofmann from comment #5)
> I also wonder why the langpacks show up so frequently in 10 and 11, but not
> so much in 9, 12, and 13.

That's probably because Linux distros might install langpacks by default, and 10/11 might be the most common installed versions there.
we have better data sources now
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED