Closed Bug 724129 Opened 13 years ago Closed 13 years ago

crash in nsXBLDocumentInfo::cycleCollection::Traverse (caused by addons?)

Categories

(Core :: XBL, defect)

10 Branch
defect
Not set
critical

RESOLVED FIXED
mozilla10
Tracking Status
firefox10 + ---

People

(Reporter: mccr8, Unassigned)

References

Details

(Keywords: crash, reproducible, topcrash)

Crash Data

Attachments

(1 file)

NoteXPCOMChild is showing up as the top crash in 10, after the giant pile of empty stacks. If I am reading this right, it is about 18.7% of crashes in 10. It was about 2% of crashes in 10b3, 5% in 10 beta5 and 12% in beta 6. All the ones I've looked at look like this:
    1 xul.dll GCGraphBuilder::NoteXPCOMChild xpcom/base/nsCycleCollector.cpp:1710
    2 xul.dll TraverseBinding content/xbl/src/nsXBLPrototypeBinding.cpp:389
    3 xul.dll hashEnumerate xpcom/ds/nsHashtable.cpp:130
    4 xul.dll PL_DHashTableEnumerate obj-firefox/xpcom/build/pldhash.cpp:755
    5 xul.dll TraverseProtos content/xbl/src/nsXBLDocumentInfo.cpp:435
    6 xul.dll hashEnumerate xpcom/ds/nsHashtable.cpp:130
    7 xul.dll PL_DHashTableEnumerate obj-firefox/xpcom/build/pldhash.cpp:755
    8 xul.dll nsXBLDocumentInfo::cycleCollection::Traverse content/xbl/src/nsXBLDocumentInfo.cpp:473
Correlations suggest it may be addon related.
Crash Signature: [@ GCGraphBuilder::NoteXPCOMChild(nsISupports*) ]
NoteXPCOMChild is only about 2% of crashes in 11, which kind of suggests it isn't a code change. But who knows.
Looks like they are mostly happening within 15 seconds, so probably the first or second CC after startup.
Neal and bz: JST and I were looking over the XBL code, and it looks like Read/WritePrototypeBindings is new in 10 (bug 94199). Is it possible that this could cause some kind of startup crash after upgrading from 9?
Possible, sure... None of the crashes I see have the stack from comment 0, though. I do see some crashing from the NoteXPCOMChild(mBinding) call in nsXBLPrototypeBinding::Traverse. Andrew, were the crashes you were looking at with the stack from comment 0 null-derefs, or something else?
Also, is NoteXPCOMChild null-safe?
Most of the ones I see are like that, so maybe we're looking at different lists or something. Here are 3 out of 5 reports I looked at:
    https://crash-stats.mozilla.com/report/index/b25cc986-0967-4617-9579-5029b2120202
    https://crash-stats.mozilla.com/report/index/0fa5099e-9e67-4625-a7ad-e23182120202
    https://crash-stats.mozilla.com/report/index/73494659-9095-48fb-9cba-a14d62120202
The most common thing seems to be EXCEPTION_ACCESS_VIOLATION_EXEC on addresses that look kind of like 0x4246c83. Hmm. In fact, at least judging from the first page of crash reports, the bulk of the crashes are READs or EXECs of that exact address, 0x4246c83. Mostly EXECs. That seems... suspicious. NoteXPCOMChild should return right off the bat if it is passed null.
gamesbar@oberon-media.com is also a very common extension for these crashes.
I was just looking at the list a search for NoteXPCOMChild gave me on crash-stats, and looking at the crashes for 10. The crash is on this line:
    if (!child || !(child = canonicalize(child)))
Does this involve a virtual function call, perhaps? If we always pass in things with the same busted vtable that would explain EXEC on the same address...
Ah ok. I clicked on the link for NoteXPCOMChild that you get when you go to top crashes for 10. Canonicalize is basically just a wrapper around a QI:
    canonicalize(nsISupports *in)
    {
        nsISupports* child;
        in->QueryInterface(NS_GET_IID(nsCycleCollectionISupports),
                           reinterpret_cast<void**>(&child));
        return child;
    }
It probably gets inlined. Does a QI involve a virtual call? I'm not really sure how that all works.
Ah, right, the in->QI is a virtual method invocation...
That's my best guess so far, then, though having the same exact value there is still pretty odd....
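The reasoning in the last few comments can be illustrated with a small standalone sketch. This is not Mozilla code: `FakeSupports`, `Node`, `kParticipantIID` and `canonicalizeChecked` are hypothetical names, and `QueryInterface` here is a simplified stand-in for the XPCOM method. It only shows that the QI call dispatches through the object's vtable, so a null check protects against null pointers but not against a pointer to freed or overwritten memory.

```cpp
// Simplified stand-in for nsISupports (hypothetical, not the XPCOM interface).
struct FakeSupports {
    virtual ~FakeSupports() = default;
    // Virtual: the call site loads the function address from the object's vtable.
    virtual bool QueryInterface(int iid, FakeSupports** out) = 0;
};

constexpr int kParticipantIID = 1;  // hypothetical interface id

struct Node : FakeSupports {
    bool QueryInterface(int iid, FakeSupports** out) override {
        if (iid == kParticipantIID) {
            *out = this;
            return true;
        }
        *out = nullptr;
        return false;
    }
};

// Mirrors the shape of canonicalize(): a thin wrapper around a QI call,
// with the null check folded in for illustration.
FakeSupports* canonicalizeChecked(FakeSupports* in) {
    if (!in) {
        return nullptr;  // null is rejected before any virtual call happens
    }
    FakeSupports* child = nullptr;
    // Virtual dispatch: if `in` pointed at freed or overwritten memory, the
    // vtable pointer read here would be garbage and the CPU could jump to an
    // arbitrary address. The null check above cannot detect that case.
    in->QueryInterface(kParticipantIID, &child);
    return child;
}
```

With a valid `Node`, `canonicalizeChecked(&node)` returns the same object and `canonicalizeChecked(nullptr)` returns null. A stale non-null pointer would pass the check and fail inside the virtual call, which would be consistent with the EXEC violations described above.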
Tomer Cohen posted the following over in bug 724267:
This bug was filed from the Socorro interface and is report bp-186ed3b4-f1df-49a3-a471-b99d02120204 .
=============================================================
Since the last upgrade (9.0.1→10.0), Windows Firefox users are reporting on our community forum about a startup crash every time the browser starts. While we could not easily reproduce it on our own machines, we've found that we could fix it by giving the following instructions to users:
    a. Start Firefox in Safe-Mode (Please note that the usual routine of restarting in safe mode from the Help menu won't help because the users can't access Firefox UI)
    b. Tools→Addons
    c. Uninstall Greasemonkey
My tests show that after re-installing Greasemonkey on the users' machines, nothing wrong happened, so it is safe to remove and reinstall. See URL below for our forum thread and lists of crash ids. http://www.mozilla.org.il/board/viewtopic.php?t=10916
Summary: crash in nsXBLDocumentInfo::cycleCollection::Traverse → crash in nsXBLDocumentInfo::cycleCollection::Traverse (caused by addons?)
Juan and I both tested the Oberon toolbar you get from their site by downloading a game, but I noted in my test results that it didn't seem to be the same version that was in some individual reports. Another thing I noted while combing through individual reports is that a number of people had more than one toolbar installed in their extension list (Oberon and Yahoo, Oberon and MSN, etc).
(In reply to Andrew McCreight [:mccr8] from comment #7)
> gamesbar@oberon-media.com is also a very common extension for these crashes.
Thanks, Tomer Cohen, that's very interesting! So it sounds like it is not the addon per se that is causing the problem, but the browser has some information associated with it that is causing problems in 10. Sounds like it could be related to bug 94199, but I don't know anything about how caching for addon-related information works. What kinds of things does the browser save along with an addon that would be deleted when the addon is uninstalled? Marcia, it sounds like it might be worth trying starting up the browser in 9 with some of these addons installed, using it for a little bit, then upgrading to 10.
Attaching some of the toolbar version correlations to help in the hunt. Juan, Anthony and I were posting our testing results here: https://etherpad.mozilla.org/Bug-724129-Testing So far we tried a number of combinations and have not been able to reproduce the crash. As I noted, I don't believe I had the version of the Oberon toolbar that seems to be highly correlated in the attached report. Will keep trying some combinations. In all cases I started with all the extensions in FF 9.0.1 and then moved to 10 via update.
Installing/uninstalling an addon will invalidate the startup cache. If this is caused by 94199, you won't see the bug on the next startup (since there isn't anything cached any more) but you might see a crash on later startups.
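The timing described in this comment can be sketched with a toy model. It is only an illustration under stated assumptions, not the actual startup cache implementation: `StartupCacheModel` and `LoadPath` are hypothetical names, and the model assumes the first startup after an invalidation rebuilds the cache while later startups read from it.

```cpp
// Which path a startup takes to load XBL bindings in this toy model.
enum class LoadPath { ParsedFromSource, ReadFromCache };

// Hypothetical model of the startup cache: a single "populated" flag.
struct StartupCacheModel {
    bool populated = false;

    // Installing or uninstalling an addon empties the cache.
    void invalidate() { populated = false; }

    // The first startup after invalidation parses from source and fills the
    // cache. Later startups read the cached (serialized) data instead, which
    // is the path a deserialization bug would affect.
    LoadPath startup() {
        if (populated) {
            return LoadPath::ReadFromCache;
        }
        populated = true;
        return LoadPath::ParsedFromSource;
    }
};
```

In this model, a crash tied to reading cached data cannot occur on the startup right after an addon change, only on the ones that follow.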
Went back to my VM and I am now able to reproduce the issue consistently in a Win XP VM with the configuration I have - Neil was absolutely correct in that it did not show until later startups. Here are the addons I have installed:
    Add-on Compatibility Reporter 1.0.3 true compatibility@addons.mozilla.org
    Ant Video Downloader 2.4.5 true anttoolbar@ant.com
    ant.com Community Toolbar 3.9.0.3 true {60190dac-b475-4be9-a099-4ca691de0d4f}
    freeride games Community Toolbar 3.10.0.1 true {6c94176c-d88a-4a15-b840-703b4237f992}
    Music Player Minion 2 2.2.0 true Music_Player_Minion@code.google.com
    Yahoo! Toolbar 2.4.6.20120119024823 true {635abd67-4fe9-1b23-4f01-e679fa7484c1}
    DataMngr 1.0 false {1FD91A9C-410C-4090-BBCC-55D3450EF433}
    Microsoft .NET Framework Assistant 0.0.0 false {20a82645-c095-46ed-80e3-08825760534b}
    ZoneAlarm Security Engine 1.5.350.0 false {FFB96CC1-7EB3-449D-B827-DB661701C6BB}
(In reply to Neil Deakin from comment #17)
> Installing/uninstalling an addon will invalidate the startup cache. If this
> is caused by 94199, you won't see the bug on the next startup (since there
> isn't anything cached any more) but you might see a crash on later startups.
I backed up the profile directory of an affected computer, then went ahead and uninstalled Greasemonkey. After seeing that everything went smoothly, I reverted to the old profile directory, and I am now unable to reproduce the issue. (I was unable to reproduce it on my own computer(s), so I borrowed a computer with a crashing browser)
I will try to narrow down and see which addon is the truly problematic one. https://crash-stats.mozilla.com/report/index/8ab9ca1d-04e6-4764-bf03-b5ed92120204 was one of my crash reports. After the crash you can relaunch, but just trying to open a new tab or do something in the URL bar seems to generate a crash quite easily.
Keywords: reproducible
Adding Juan and Anthony so they can track what the status is.
I have done some additional testing with the set of addons in the attachment. So far I tried doing the following:
    1. Disabled Yahoo Toolbar - still crashed
    2. Disabled Ant Community toolbar - still crashed
    3. Disabled Music Player Minion - no crash yet
I will keep trying to see if getting to Step 3 really prevents the crash and will play around with some other combinations. One additional note: with ZoneAlarm installed, it sometimes detects the browser as "unstable" and restarts it when you hit OK. So people who have that program installed may see fewer instances of the crash, since it sometimes restarts the browser before the crash happens.
Please note that the issue I was facing did not involve these toolbars. It might be caused by a common addon component, though.
Severity: normal → critical
Keywords: crash
I am still hearing about people facing this issue, including one who says that it appeared on 20 computers he is responsible for. Others say it is caused by Greasemonkey or Video Downloader. Should I post more user reports here, or do we have enough of them?
As a workaround, deleting the startupcache (wherever it is on Windows) should help.
> As a workaround deleting the startupcache (wherever it is on Windows)
On Windows 2000 and Windows XP:
    %USERPROFILE%\Local Settings\Application Data\Mozilla\Firefox\Profiles\<ZZZZZZ>.default\startupCache\
On Windows Vista and later:
    %USERPROFILE%\AppData\Local\Mozilla\Firefox\Profiles\<ZZZZZZ>.default\startupCache\
Note: Local profile folder rather than the main roaming one.
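For anyone who needs to apply this workaround on many machines, the following is a minimal sketch in C++17. It is not an official tool: `clearStartupCaches` is a hypothetical helper, and it assumes the caller passes in the local `Profiles` directory from the paths above (it does not expand `%USERPROFILE%` itself). Firefox should be closed before running anything like this.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Removes the "startupCache" directory inside every profile directory found
// directly under profilesRoot. Returns the cache directories that were removed.
// Hypothetical helper; profilesRoot is assumed to be the local Profiles folder.
std::vector<fs::path> clearStartupCaches(const fs::path& profilesRoot) {
    std::vector<fs::path> removed;
    std::error_code ec;
    if (!fs::is_directory(profilesRoot, ec)) {
        return removed;  // nothing to do if the Profiles folder is missing
    }
    for (const auto& entry : fs::directory_iterator(profilesRoot, ec)) {
        std::error_code entryEc;
        if (!entry.is_directory(entryEc)) {
            continue;  // skip stray files next to the profile directories
        }
        const fs::path cache = entry.path() / "startupCache";
        if (fs::is_directory(cache, entryEc)) {
            fs::remove_all(cache, entryEc);
            if (!entryEc) {
                removed.push_back(cache);
            }
        }
    }
    return removed;
}
```

The profile directories themselves are left untouched; only the `startupCache` subdirectories are deleted, and in line with the comments above the browser is expected to rebuild the cache on a later startup.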
People are reporting on our forum (mozilla-il) that this issue disappeared for them after updating to 10.0.1, and that their workaround on 10.0 was to disable the Greasemonkey addon. I'm not sure the update is really what fixed it, since they had to work around the problem in order to update, and if they did, the workaround is probably still in effect. We have a lot of crash-ids there for further investigation.
10.0.1 included a fix for this issue. Thanks for the update! Good to know that it helped.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10