Closed Bug 519039 Opened 15 years ago Closed 15 years ago

CoolIris Top Crasher [@ cooliris19.dll@0x351f2 ] and [@ cooliris19.dll@0x351a2 ] and [@ libcooliris19.dylib@0x31ea2 ]

Categories

(Firefox :: Extension Compatibility, defect, P1)

3.5 Branch
defect

Tracking

()

VERIFIED WORKSFORME

People

(Reporter: damons, Unassigned)

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(5 files)

Assignee: nobody → jst
blocking1.9.1: --- → ?
Keywords: crash, qawanted, topcrash
First crash was 200909131722
Bob, can you generate a list of the top URLs that users are hitting this on? Or does someone else have better tools for that? The CoolIris guys are already looking into this, but they're having a hard time reproducing this crash. We should do what we can to get them as much help on this one as we can.
I have a list now, but many of them appear to be on myspace.com which isn't a supported cooliris site for some reason. Facebook is, but I've been unable to reproduce a crash at the moment. I can share the list of urls privately, but we can't share them with cooliris unless we scrub them of personally identifiable information. I've looked at the user comments and they were unhelpful other than to tell me people are frustrated.

I'm doing a local scan using 3.5/winxp and the most recent cooliris xpi at the moment. I did see a crash during restart right after installing but for some reason the report isn't showing up. More news as I have it.
We are currently unable to reproduce the issue in-house on any of our machines, but are definitely continuing to seek and create a repro case. 

On Thursday and Friday we pushed back-end code which affects the way that Cooliris is called when entering a Cooliris enabled site. As of ~4:30 PST we rolled back those changes. Within 24 hours, the code should be rolled out to all Cooliris users, and should take effect immediately for all new Cooliris installs.

We will be monitoring the rate of crash reports to ensure that our changes are reducing the number of crash reports being submitted. Going through the reports, it appears that the crash is occurring on Cooliris enabled sites:
google, google images, facebook, youtube, myspace, picasa, craigslist, deviantart, and flickr.

Thank you for the information regarding myspace, as any information is incredibly useful at this point.
chofmann, the counts I get for grepping for cooliris19.dll@0x351f2 in the crash logs are much much lower (2 orders of magnitude) than what is showing up in the crash-stats query. I don't even come close when I just grep for cooliris19.dll in the logs. Do you know why the logs would have such different counts than crash-stats? Is this bump in crashes only really picked up today and will show up in tomorrow's logs?

fyi, from 7/16 through 9/25 I show 622 unique cooliris19.dll signatures out of 35237 crashes.
> chofmann, the counts I get for grepping for cooliris19.dll@0x351f2 in the crash
> logs are much much lower (2 orders of magnitude) than what is showing up in the
> crash-stats query.

the only thing I can think of is a bug in the way one or both of the reports are generated.   we should probably open another investigation on that.
breakdown of cooliris19.dll@0x351f2 crashes by file name (dates of file name lag actual crash dates by 1 or 2 days)

20090915-crashdata.csv.gz:243
20090916-crashdata.csv.gz:372
20090917-crashdata.csv.gz:179
20090918-crashdata.csv.gz:156
20090919-crashdata.csv.gz:138
20090920-crashdata.csv.gz:136
20090921-crashdata.csv.gz:127
20090922-crashdata.csv.gz:114
20090923-crashdata.csv.gz:104
20090924-crashdata.csv.gz:103
20090925-crashdata.csv.gz:422
20090926-crashdata.csv.gz:14935
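
A per-file count like the one above can be reproduced with a few lines of Python. This is only a sketch: it assumes the daily dumps are gzip-compressed text with one crash per line, and it does a plain substring match on the signature, the same way grep would. The function name is made up for illustration and this is not the tooling actually used for these numbers.

```python
import gzip


def count_signature_lines(path, signature):
    """Count lines in a gzip-compressed text file that contain `signature`.

    Similar in spirit to `zgrep -c`. It does not parse CSV columns, because
    the column layout of the crashdata files is not described in this bug.
    """
    count = 0
    # errors="replace" keeps the scan going if a line is not valid UTF-8.
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if signature in line:
                count += 1
    return count
```

Running it once per `YYYYMMDD-crashdata.csv.gz` file with the signature `cooliris19.dll@0x351f2` would give one count per day, keeping in mind that file names lag the actual crash dates by a day or two.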
ok, that's what I see too.  bit of an explosion to 422 on sept 24th (recorded in the report for 20090925) and a hellish explosion to 14,935 on sept 25 (recorded in the report for 20090926).

it's also interesting that from sept 1-14 there were zero reports for this stack signature.

AMO reports the last version for cooliris ( 1.11.1 ) was released July 6, 2009. https://addons.mozilla.org/en-US/firefox/addon/5579 

Any chance there have been silent updates of cooliris or any of the components around the 15th?

I only looked at one report from the 15th and see
cooliris19.dll  	1.11.1.26925 
npcoolirisplugin.dll 	1.11.0.0

involved in that crash.

In that report I also see some suspects that have been known to cause problems on their update.   
avglvex.dll 	8.5.0.401 	
DealioToolbarFF.dll 	1.0.2.1 		
		
there might be some three-way or multiway interaction where a bunch of addons/plugins/addins try to get their hands on content/URI and cooliris is left holding the bag when someone made a mess earlier.  dbaron's addon distribution script might tell us more if this is the case.
the high pct. of start up crashes and the low number of minutes since last crash indicates that a lot of people might be hitting the condition then crashing again on restart.

14935 total crashes for cooliris19.dll@0x351f2 on 20090926-crashdata.csv
8728 start up crashes inside 3 minutes 
34.4105 (minutes) avg time since last crash 
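
To put the startup number in proportion, here is the arithmetic as a small sketch. The two counts are the ones quoted above; the helper name is just for illustration.

```python
def share(part, total):
    """Return part/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * part / total, 1)


total_crashes = 14935    # cooliris19.dll@0x351f2 in 20090926-crashdata.csv
startup_crashes = 8728   # crashes inside 3 minutes of startup

startup_share = share(startup_crashes, total_crashes)
print(f"{startup_share}% of these crashes were inside 3 minutes of startup")
```

So more than half of the reports for this signature are startup crashes, which fits the crash-then-crash-again-on-restart reading.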


the crash appears on a wide variety of windows versions, so nothing stands out there

5647 cooliris19.dll@0x351f2 Windows NT 5.1.2600 Service Pack 3
3708 cooliris19.dll@0x351f2 Windows NT 6.0.6001 Service Pack 1
3010 cooliris19.dll@0x351f2 Windows NT 5.1.2600 Service Pack 2
1229 cooliris19.dll@0x351f2 Windows NT 6.0.6002 Service Pack 2
 799 cooliris19.dll@0x351f2 Windows NT 6.0.6000
 101 cooliris19.dll@0x351f2 Windows NT 6.1.7100
  74 cooliris19.dll@0x351f2 Windows NT 5.1.2600 Dodatek Service Pack 3
  61 cooliris19.dll@0x351f2 Windows NT 5.1.2600 Szervizcsomag 3
  52 cooliris19.dll@0x351f2 Windows NT 5.1.2600 Dodatek Service Pack 2
[and about 40 more various windows distros]

users seem slightly more likely to hit it on firefox 3.0.x than 3.5.x, but not by a significant margin

7880 Firefox 3.0.14
5102 Firefox 3.5.3
 351 Firefox 3.0.11
 321 Firefox 3.0.10
 311 Firefox 3.0.13
 176 Firefox 3.5.2
 103 Firefox 3.0.8
 101 Firefox 3.5
  93 Firefox 3.0.1
  83 Firefox 3.0.5
  70 Firefox 3.0.12
  67 Firefox 3.0.7
[and 14 more firefox versions]


types of sites
11701 http:
1062 \N
 871 [no url provided]
 737 about:blank
 245 about:sessionrestore
 189 https:
  79 chrome:
  38 wyciwyg:
   9 file:
   2 about:privatebrowsing
   1 about:newtab
   1 about:crashes
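
A breakdown like this can be produced by bucketing each URL field by its scheme. The sketch below is an assumption about how one might do it, not the script that generated the list: it keeps `about:` pages whole, reduces everything else to the scheme, and passes placeholder values such as `\N` and `[no url provided]` through unchanged. The function names are hypothetical.

```python
from collections import Counter


def url_bucket(url):
    """Map a raw URL field to a coarse bucket such as 'http:' or 'about:blank'."""
    # Placeholders like "\N" or "[no url provided]" have no scheme; keep them as-is.
    if ":" not in url:
        return url
    scheme, _, rest = url.partition(":")
    scheme = scheme.lower()
    # about: pages are few and distinct, so keep the page name (minus any query).
    if scheme == "about":
        return "about:" + rest.split("?", 1)[0]
    return scheme + ":"


def tally_buckets(urls):
    """Count how many URL fields fall into each bucket."""
    return Counter(url_bucket(u) for u in urls)
```

`tally_buckets(urls).most_common()` then gives the buckets sorted by count, largest first.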
Attached file 20090926 site list
looks like we crash across about 2000 sites on the 25th (report from the 26th),  but the url data we get does not always map correctly to the one involved in the crash.  see https://bugzilla.mozilla.org/show_bug.cgi?id=411930

the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=517497
  will also make bugs like this easier to talk about since the report names will match the dates of the crash.
there are some other signatures in cooliris19.dll that might also be connected to this problem, but also some  pre-existing signatures that don't seem to be connected.

cooliris19.dll@ signature breakdown
signature distribution
      44
signature list
14935 cooliris19.dll@0x351f2
2703 cooliris19.dll@0x351a2

that second one on the list was also not present before sept 14th, then shows a significant ramp in the last few days.

0   total crashes for cooliris19.dll@0x351a2 on 20090914-crashdata.csv
40   total crashes for cooliris19.dll@0x351a2 on 20090915-crashdata.csv
54   total crashes for cooliris19.dll@0x351a2 on 20090916-crashdata.csv
26   total crashes for cooliris19.dll@0x351a2 on 20090917-crashdata.csv
27   total crashes for cooliris19.dll@0x351a2 on 20090918-crashdata.csv
18   total crashes for cooliris19.dll@0x351a2 on 20090919-crashdata.csv
26   total crashes for cooliris19.dll@0x351a2 on 20090920-crashdata.csv
25   total crashes for cooliris19.dll@0x351a2 on 20090921-crashdata.csv
20   total crashes for cooliris19.dll@0x351a2 on 20090922-crashdata.csv
22   total crashes for cooliris19.dll@0x351a2 on 20090923-crashdata.csv
22   total crashes for cooliris19.dll@0x351a2 on 20090924-crashdata.csv
66   total crashes for cooliris19.dll@0x351a2 on 20090925-crashdata.csv
2703   total crashes for cooliris19.dll@0x351a2 on 20090926-crashdata.csv


most of these other signatures are more of a long-standing problem, with similar numbers of crashes for the past two months.

183 cooliris19.dll@0x1b60ea       
  69 cooliris19.dll@0x448a
  47 cooliris19.dll@0x1b5f6a
  23 cooliris19.dll@0x17d133
  20 cooliris19.dll@0x2bfcd0
  12 cooliris19.dll@0x1c5635
   9 cooliris19.dll@0x445a
   9 cooliris19.dll@0x181650
   9 cooliris19.dll@0x17d263
[and 34 more signatures unchecked]
Summary: CoolIris Top Crasher [@cooliris19.dll@0x351f2 ] → CoolIris Top Crasher [@ cooliris19.dll@0x351f2 ] and [@ cooliris19.dll@0x351a2 ]
looks like we hit a peak of about 28 and 26 crashes per minute around these times, 

2009 09 25 1908	28
2009 09 25 1935	26

and then have been in decline.
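
For reference, per-minute peaks like these can be found by truncating each crash timestamp to the minute and counting. This is a sketch only: it assumes timestamps are digit strings that start with YYYYMMDDHHMM, which is a guess at the format and not something stated in this bug, and the function name is made up.

```python
from collections import Counter


def busiest_minutes(timestamps, top=2):
    """Return the `top` busiest minutes as (YYYYMMDDHHMM, crash_count) pairs.

    Assumes each timestamp is a digit string that starts with YYYYMMDDHHMM;
    anything after the first 12 characters (seconds, for example) is ignored.
    """
    per_minute = Counter(ts[:12] for ts in timestamps)
    return per_minute.most_common(top)
```

Plotting the full `per_minute` counts over time, rather than just the top entries, is what makes the decline after the peak visible.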
checking just now it looks like 138 crashes for cooliris19.dll@0x351f2 in the last hour, or around two crashes per minute.  So the rollback mentioned in comment 4 might be working.

http://crash-stats.mozilla.com/query/query?version=ALL%3AALL&date=&range_value=1&range_unit=hours&query_search=signature&query_type=exact&query=cooliris19.dll%400x351f2&do_query=1
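
The per-minute figure is just the count divided by the window length. A tiny sketch, with a made-up helper name, using the 138 crashes in the last hour from the query above:

```python
def crashes_per_minute(crash_count, window_minutes):
    """Average crash rate over a window, in crashes per minute."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return crash_count / window_minutes


last_hour_rate = crashes_per_minute(138, 60)
print(f"{last_hour_rate:.1f} crashes per minute over the last hour")
```

That works out to about 2.3 per minute, well below the peaks of 26 and 28 per minute seen on the 25th.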
hi Chris, thank you for those numbers. There have been no updates to the Windows Cooliris plugin since July 6th. We have, however, made changes to client configuration files since then, and we have tracked down changes made on the 14th and the 25th. 

We have since rolled back those changes and it appears to be making a significant difference. The crash report query shows the rate of crashes to be decreasing for cooliris19.dll@0x351f2 to ~0.64 crashes per minute over the past 4 hours, which is down from 28 crashes per minute on 9/25 per Chris's report, and down from ~2 crashes per minute over the morning hours today (9/27). We expect the rate to slow down even further as the configuration files are picked up by all Cooliris installs, as this rollback seems to be the likely fix.

We have also noticed that this crash is also appearing on Mac FF, and appears to be tracked in this report: http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.3&platform=mac&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=libcooliris19.dylib%400x31ea2. The roll back should also address this crash as the crash report data looks to have the same pattern at a lower volume (a large spike on the 25th).  If your team can verify that crash reports are also slowing down for this bug, that would be very helpful.

We will also be investigating whether the Dealio Toolbar could have been a factor in the crashes and will update this bug with our findings as we move forward, as we have yet to get an in-house repro in testing today and yesterday.

As soon as we are 100% confident that the fix completely solves the issue, we'll be launching a technical post-mortem on our side to better understand what happened to prevent issues like this from happening in the future.
yes, it looks like that mac crash libcooliris19.dylib@0x31ea2 follows the same pattern. see the attachment.  hit peaks of about 7-9 crashes a minute on the 25th

200909251513	7
200909251842	7
200909252311	7
200909251749	8
200909251825	8
200909251939	9

but I just checked and no crashes for that signature in the last hour.
Summary: CoolIris Top Crasher [@ cooliris19.dll@0x351f2 ] and [@ cooliris19.dll@0x351a2 ] → CoolIris Top Crasher [@ cooliris19.dll@0x351f2 ] and [@ cooliris19.dll@0x351a2 ] and [@ libcooliris19.dylib@0x31ea2 ]
0   total crashes for libcooliris19.dylib@0x31ea2 on 20090913-crashdata.csv
0   total crashes for libcooliris19.dylib@0x31ea2 on 20090914-crashdata.csv
34   total crashes for libcooliris19.dylib@0x31ea2 on 20090915-crashdata.csv
46   total crashes for libcooliris19.dylib@0x31ea2 on 20090916-crashdata.csv
26   total crashes for libcooliris19.dylib@0x31ea2 on 20090917-crashdata.csv
26   total crashes for libcooliris19.dylib@0x31ea2 on 20090918-crashdata.csv
16   total crashes for libcooliris19.dylib@0x31ea2 on 20090919-crashdata.csv
21   total crashes for libcooliris19.dylib@0x31ea2 on 20090920-crashdata.csv
18   total crashes for libcooliris19.dylib@0x31ea2 on 20090921-crashdata.csv
24   total crashes for libcooliris19.dylib@0x31ea2 on 20090922-crashdata.csv
18   total crashes for libcooliris19.dylib@0x31ea2 on 20090923-crashdata.csv
12   total crashes for libcooliris19.dylib@0x31ea2 on 20090924-crashdata.csv
102   total crashes for libcooliris19.dylib@0x31ea2 on 20090925-crashdata.csv
2412   total crashes for libcooliris19.dylib@0x31ea2 on 20090926-crashdata.csv
Chris: when can we expect to see an update on how many crashes we're seeing on our side?
I put an annotated chart together showing the timeline of events.  We actually hit 41 crashes a minute at the peak, but things are doing much better through Sunday night.   More data from Monday soon.
don't see any reports following that.
Yup, still nothing since the 27th, looks like this one's dealt with! Thanks a ton to Cooliris for jumping on this!
Yep, all good.  Thanks!
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Fixed externally by the add-on vendor. So it's more invalid on our side and an extension compatibility issue.
Assignee: jst → nobody
Severity: normal → critical
Component: Plug-ins → Extension Compatibility
Flags: blocking1.9.2+
OS: Windows Vista → All
Product: Core → Firefox
QA Contact: plugins → extension.compatibility
Hardware: x86 → All
Resolution: FIXED → INVALID
Version: 1.9.1 Branch → 3.5 Branch
invalid doesn't seem right to me.

our mission is to make the web better.  it's to make firefox crash less.

those things have been achieved here due to some good detection, analysis, and quick moves;  not due to discarding the bug as invalid.

worksforme, verified by the chart in comment 20, sounds a lot more like how we should treat bugs like this.

I also opened [Bug 519423] add tracking and alerts for "explosive" crash signatures, in an attempt to try and get a lot better and faster at detecting and fixing important crashes like this one that reached the top of the top crash list in just a few hours.
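
As a rough illustration of what an "explosive" signature check could look like, the sketch below compares today's count for a signature against the median of its recent daily counts. The thresholds (`factor=3.0`, `min_count=100`) and the function name are assumptions picked for the example, not anything specified in Bug 519423; the daily counts are the cooliris19.dll@0x351f2 numbers quoted earlier in this bug.

```python
from statistics import median


def is_explosive(history, today, factor=3.0, min_count=100):
    """Flag a signature whose count today far exceeds its recent baseline.

    history   -- daily crash counts for the preceding days (may be empty)
    today     -- crash count for the day being checked
    factor    -- how many times the baseline counts as "explosive" (assumed)
    min_count -- ignore signatures with too few crashes to matter (assumed)
    """
    if today < min_count:
        return False
    baseline = median(history) if history else 0
    # A brand-new signature has a baseline of 0; treat it as 1 to avoid
    # flagging on a zero threshold while still catching sudden arrivals.
    return today >= factor * max(baseline, 1)


# Daily counts from the 20090915 through 20090924 crashdata files.
recent = [243, 372, 179, 156, 138, 136, 127, 114, 104, 103]

print(is_explosive(recent, 422))     # the count in the 20090925 file
print(is_explosive(recent, 14935))   # the count in the 20090926 file
```

With these particular thresholds both the 422 day and the 14,935 day would be flagged, so an alert on the first one might have bought roughly a day of lead time. A real implementation would need to tune the thresholds against normal day-to-day noise.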
Status: RESOLVED → VERIFIED
Resolution: INVALID → WORKSFORME
INVALID is just a bad resolution all the way around, nearly everyone has a negative response to it even when it is accurate. We invented INCOMPLETE to soften the blow for inexperienced people who are trying to help us by filing bugs, the remaining uses should be renamed too. ASDESIGNED covers part of it, but we need something that can mean "someone else's problem/you're in the wrong place" without sounding like passing the buck.  "NOTABUG" is sometimes proposed, but that fails when it's clearly buggy behavior like a crash, just not ours (but it would work for, and be shorter than, the ASDESIGNED case).
blocking1.9.1: ? → ---
Crash Signature: [@ cooliris19.dll@0x351f2 ] [@ cooliris19.dll@0x351a2 ] [@ libcooliris19.dylib@0x31ea2 ]