Closed Bug 687001 Opened 13 years ago Closed 11 years ago

Firefox Crash @ JSLinearString::mark

Categories

(Core :: JavaScript Engine, defect)

8 Branch
x86
macOS
defect
Not set
critical

RESOLVED WORKSFORME

People

(Reporter: marcia, Unassigned)

Details

(Keywords: crash)

Crash Data

Seen while looking at crash stats. This one only seems to happen with Aurora and FF 7, and so far only on Mac and Linux, but there is another bug which could be the Windows equivalent of the stack. https://crash-stats.mozilla.com/report/list?signature=JSLinearString::mark

Bug 686441 seems to be the Windows equivalent of the stack, and is heavily correlated with the Better Facebook add-on: 82% (305/371) vs. 1% (729/110286) betterfacebook@mattkruse.com
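The correlation figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the first pair counts crashes with this signature that had the add-on installed, and the second pair counts the same thing across the whole comparison set. The variable names are made up for illustration.

```javascript
// Counts copied from the correlation line above.
const withSignature = { hits: 305, total: 371 };
const comparisonSet = { hits: 729, total: 110286 };

// Fraction of reports in a group that list the add-on.
function share(counts) {
  return counts.hits / counts.total;
}

const signatureShare = share(withSignature);
const baselineShare = share(comparisonSet);

// How many times more common the add-on is among these crashes
// than in the comparison set.
const lift = signatureShare / baselineShare;

console.log("signature share: " + (signatureShare * 100).toFixed(1) + "%");
console.log("baseline share:  " + (baselineShare * 100).toFixed(2) + "%");
console.log("lift:            " + lift.toFixed(0) + "x");
```

The reported "1%" is a rounded value, so the gap between the two groups is somewhat larger than the rounded percentages suggest.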

Adding Luke for ideas.

https://crash-stats.mozilla.com/report/index/1a601606-c9df-4da2-a1a5-19ce32110914

Frame 	Module 	Signature 	Source
0 	XUL 	JSLinearString::mark 	js/src/vm/String.cpp:78
1 	XUL 	js::GCMarker::drainMarkStack 	js/src/jsgcmark.cpp:548
2 	XUL 	XPCJSRuntime::TraceJS 	js/src/jsgc.h:1299
3 	XUL 	js::MarkRuntime 	js/src/jsgc.cpp:1859
4 	XUL 	GCCycle 	js/src/jsgc.cpp:2291
5 	XUL 	js_GC 	js/src/jsgc.cpp:2710
6 	XUL 	nsXPConnect::Collect 	js/src/xpconnect/src/nsXPConnect.cpp:412
7 	XUL 	nsXPConnect::GarbageCollect 	js/src/xpconnect/src/nsXPConnect.cpp:420
8 	XUL 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:424
9 	XUL 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:520
10 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:617
11 	XUL 	NS_ProcessPendingEvents_P 	obj-firefox/x86_64/xpcom/build/nsThreadUtils.cpp:195
12 	XUL 	nsBaseAppShell::NativeEventCallback 	widget/src/xpwidgets/nsBaseAppShell.cpp:130
13 	XUL 	nsAppShell::ProcessGeckoEvents 	widget/src/cocoa/nsAppShell.mm:422
14 	CoreFoundation 	CoreFoundation@0x11c50 	
15 	CoreFoundation 	CoreFoundation@0x114bc 	
16 	CoreFoundation 	CoreFoundation@0x382d8 	
17 	libsystem_c.dylib 	libsystem_c.dylib@0x4d15f 	
18 	libsystem_c.dylib 	libsystem_c.dylib@0x77ffe 	
19 	libsystem_c.dylib 	libsystem_c.dylib@0x6b061 	
20 	CoreFoundation 	CoreFoundation@0x8313 	
21 	CoreFoundation 	CoreFoundation@0x2164 	
22 	libobjc.A.dylib 	objc_memmove_collectable 	
23 	CoreFoundation 	CoreFoundation@0xe82f 	
24 	libnspr4.dylib 	PR_GetCurrentThread 	nsprpub/pr/src/pthreads/ptthread.c:614
25 	XUL 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:585
26 	XUL 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_darwin.cpp:153
27 		@0x106126fef 	
28 	libsystem_c.dylib 	libsystem_c.dylib@0x4d6aa 	
29 	AppKit 	AppKit@0x989e87 	
30 	Foundation 	Foundation@0x9cb2 	
31 	Foundation 	Foundation@0x99ee 	
32 	AppKit 	AppKit@0x3de73 	
33 	XUL 	nsLayoutUtils::GetCrossDocParentFrame 	layout/base/nsLayoutUtils.cpp:479
34 	CoreGraphics 	CoreGraphics@0x36ac3 	
35 	libsystem_c.dylib 	libsystem_c.dylib@0x6b02e 	
36 	XUL 	nsCocoaUtils::MenuBarScreenHeight 	widget/src/cocoa/nsCocoaUtils.mm:60
37 		@0x183de04bf 	
38 	AppKit 	AppKit@0x87bae7 	
39 	HIToolbox 	HIToolbox@0x2630f1 	
40 	AppKit 	AppKit@0x87bae7 	
41 	HIToolbox 	HIToolbox@0x2630f1 	
42 	XUL 	nsGenericElement::AddRef 	nsISupportsImpl.h:161
43 		@0x183de04f7 	
44 	XUL 	nsDOMEvent::DuplicatePrivateData 	nsAutoPtr.h:957
45 	XUL 	nsDOMUIEvent::DuplicatePrivateData 	content/events/src/nsDOMUIEvent.cpp:391
I looked at the dump.  It seems a dependent string's base is somehow getting set to null.  I audited the places where dependent strings are initialized and they definitely aren't producing null bases, so it would seem the string is getting clobbered somehow.

The correlation with Better Facebook seems strong.  Is that a binary addon?  Do you know if anyone has tried to mess around with this addon and repro a crash?
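To illustrate why a null base would bring down the marker, here is a toy model in JavaScript. It is not the engine's code: the class names, fields, and the `tryMark` helper are all hypothetical, and the real implementation is C++ with a different layout. The only point is that marking a dependent string has to follow its base pointer, so a base that was overwritten after construction fails at mark time, long after the damage was done.

```javascript
// Toy model only: these classes are hypothetical and do not mirror the
// engine's real string layout.
class FlatString {
  constructor(chars) {
    this.chars = chars;
    this.marked = false;
  }
  mark() {
    this.marked = true;
  }
}

class DependentString {
  // A dependent string owns no characters; it refers to a range of its base.
  constructor(base, start, length) {
    if (base === null || base === undefined) {
      throw new TypeError("a dependent string needs a non-null base");
    }
    this.base = base;
    this.start = start;
    this.length = length;
    this.marked = false;
  }
  mark() {
    this.marked = true;
    // Marking must keep the base alive, so it dereferences the base here.
    this.base.mark();
  }
}

// Marks a string and reports whether the marker survived.
function tryMark(str) {
  try {
    str.mark();
    return "ok";
  } catch (e) {
    return "crash: " + e.name;
  }
}

const baseString = new FlatString("hello world");
const healthy = new DependentString(baseString, 0, 5);
const healthyResult = tryMark(healthy);

// Simulate memory corruption: the base is overwritten after construction,
// so the constructor check never sees it.
const clobbered = new DependentString(baseString, 6, 5);
clobbered.base = null;
const clobberedResult = tryMark(clobbered);

console.log(healthyResult, clobberedResult);
```

In the model, the constructor rejects a null base, which matches the audit result that initialization sites are not the source; the failure only appears when something else writes over the field later.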
Luke: I will try to repro the crash today with that extension.
I have been playing around with this today on both Mac and Windows using the latest Aurora but no crash yet. I assume GC has to kick in at some point so I will see what happens after some uptime.
So there is an older version of the Better Facebook addon on AMO compared to what the developer has on his site:

http://betterfacebook.net/ shows Version 5.931 while the version I installed in AMO is 5.911.

Adding Jorge to see if there is a request in the queue to have that addon updated - ID=betterfacebook@mattkruse.com
The new version has just been approved, but I didn't see any big changes in it. I doubt this bug was fixed with this update.
I am the author of the Better Facebook add-on.

I have been receiving reports of crashes from users, and it seems to be mostly focused on FF 6.0.2 from what I have seen. I cannot reproduce it, nor have I been able to correlate it with specific options or conflicts with other add-ons, or OS, etc. As of right now, I really have no clue what is causing it. I would love to figure it out!

Since previous versions of the add-on and FF all worked fine together, I suspect that BFB is triggering some bug in FF. But I'm not sure of that, of course. It could also be that Facebook has some odd async code that is causing a conflict or something. My add-on is cross-browser Greasemonkey code, wrapped for specific browsers, and runs fine in most other browsers.

I originally used this to wrap my greasemonkey code into a FF add-on:
http://arantius.com/misc/greasemonkey/script-compiler

I'm not familiar with the guts of it, but I did make a few tweaks to handle UTF-8 correctly. It's been working fine ever since.

If you have any clue about what the root cause may be, or anything you would like me to follow up on, I would love to work with you to solve this. I have many users eager for an update to stop their crashes.
Matt: From what I can see in our crash stats, it may be hitting people on Firefox 7 as well - https://crash-stats.mozilla.com/report/list?signature=JSLinearString::mark%28JSTracer*%29 shows many Windows users crashing with Firefox 6.0 and 6.0.2 but also 7.0 betas.

https://crash-stats.mozilla.com/report/list?signature=JSLinearString::mark shows the Mac and Linux crashes which are smaller volume. 

The Windows stack has some references to garbage collection so I am wondering if that might have something to do with it.
Users do often say that the crash only happens after leaving their page open for a while, so a garbage collection issue might make sense.

Facebook's internal code is quite complex and it does a lot of crazy things. I hook into that, which makes it worse.

One thing in particular that BFB does is to watch the DOMNodeInserted event. I catch inserted content and manipulate the objects, including the source. Could this be causing a problem somehow, with new FF6 code?

The frustrating part is that it's not consistent. Only some users have it happen, which leads me to believe that there is a specific feature of FB that is only visible on some accounts. That, combined with BFB, is triggering some FF problem. I just don't know how to dig into it any further without being able to reproduce it or even find out what line of code is causing the problem.
Is there anything I can do or any information I can provide to help move this along? I know nothing about the mozilla development process or code, so the stack trace is meaningless to me, other than looking like it's garbage-collection related.

Since this problem didn't exist in any previous versions of FF that I know, I suspect that my add-on is triggering some new bug or quirk in FF. But if there is something I can do to avoid that, I would love to find it. Thanks!
We'd like to fix this also but GC-related crashes are difficult to debug without any reproducible test-case.  The best way to move this forward is to find a way to make the test-case reproducible (e.g., by making your code run 100x more often).

Another avenue is to run a debug nightly and see if it hits any assertions with Better Facebook installed: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-09-26-mozilla-central-debug.
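The "run it 100x more often" idea could be sketched roughly as below. Everything here is an assumption for illustration: `stress` and its `handler` argument are hypothetical names, and the handler stands in for whatever the add-on does with each inserted node. The loop builds many short-lived strings and substrings so that garbage collection has plenty of work between handler calls. Whether a given engine represents those substrings as views onto the parent string is an engine-internal detail that this sketch does not control.

```javascript
// Hypothetical stress harness: calls `handler` many times with freshly
// built substrings and keeps only a few of them alive.
function stress(handler, iterations) {
  let calls = 0;
  const survivors = [];
  for (let i = 0; i < iterations; i++) {
    // Fresh long string per iteration; most become garbage immediately.
    const source = "node-" + i + "-" + "x".repeat(1024);
    // Take a slice of the long string and hand it to the code under test.
    const slice = source.substring(5, 64);
    handler(slice);
    calls++;
    // Retain one slice in every hundred so some outlive their parents' scope.
    if (i % 100 === 0) {
      survivors.push(slice);
    }
  }
  return { calls: calls, survivors: survivors.length };
}

// Example run with a handler that only touches the string.
const result = stress(function (text) {
  return text.length;
}, 1000);
console.log(result.calls, result.survivors);
```

In a real attempt the handler would be the add-on's own node-processing function, driven from a page that inserts nodes in a tight loop.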
Matt: We discussed this during the Beta triage meeting today - given the fact that this crash is showing pretty high in the crash stats for 7.0b6 (currently #18 top crash), can we bump the compatibility down ASAP so it does not show as compatible for Firefox 7 (as well as Firefox 6)? We are about to ship Firefox 7 and I am concerned that the crash volume may rise dramatically when we ship.

Also we haven't been able to get a reproducible scenario (as Luke notes in Comment 10), so we cannot really debug the issue until we are able to do that reliably.

Let us know if you need help with the compat portion.  Thanks very much.
Well, I don't want to bump down the compatibility, because many (most?) FF6/7 users don't have a problem. I cannot reproduce the problem either, and I've tried many scenarios.
Also - many users were using FF6/7 for quite some time with no crashes. It seems like it was only with 6.0.2 that the crashes began happening, so it feels like some type of bug that was introduced in FF rather than my add-on. Many of my users have said they have similar crashing problems even without the add-on enabled.

Just in case this problem has anything to do with how I am setting/getting preferences, I've pasted some of the underlying code below. This was mostly generated for me from the add-on-builder, but has been tweaked a bit.

	this.getValue=function(prefName, defaultValue) {
		try {
			return pref.getComplexValue(prefName,Components.interfaces.nsISupportsString).data;
		} catch(e) { return ""; }
	}

	this.setValue=function(prefName, value) {
		this.remove(prefName);
		var str = Components.classes["@mozilla.org/supports-string;1"].createInstance(Components.interfaces.nsISupportsString);
		str.data = value;
		pref.setComplexValue(prefName, Components.interfaces.nsISupportsString, str);
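		return;
		// NOTE: the early return above makes everything below unreachable,
		// so the type checks and the typed setters never run.
		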
		var prefType=typeof(value);

		switch (prefType) {
			case "string":
			case "boolean":
				break;
			case "number":
				if (value % 1 != 0) {
					throw new Error("Cannot set preference to non integral number");
				}
				break;
			default:
				throw new Error("Cannot set preference with datatype: " + prefType);
		}

		// underlying preferences object throws an exception if new pref has a
		// different type than old one. i think we should not do this, so delete
		// old pref first if this is the case.
		if (this.exists(prefName) && prefType != typeof(this.getValue(prefName))) {
			this.remove(prefName);
		}

		// set new value using correct method
		switch (prefType) {
			case "string": pref.setCharPref(prefName, value); break;
			case "boolean": pref.setBoolPref(prefName, value); break;
			case "number": pref.setIntPref(prefName, Math.floor(value)); break;
		}
	}
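Two details in the pasted code stand out, though neither is offered as the cause of the crash: `getValue` accepts a `defaultValue` but always returns an empty string on failure, and `setValue` returns before its type-handling code. A minimal sketch of a getter that honors the default is shown below. It runs against a fake in-memory branch, since the real `pref` object is an XPCOM preference branch that is not available outside the browser. `makeFakeBranch` and `makePrefs` are hypothetical helpers, not part of the add-on.

```javascript
// Stand-in for the preference branch. The real add-on talks to an XPCOM
// object; this fake only mimics the two calls used in the pasted code.
function makeFakeBranch() {
  const store = new Map();
  return {
    getComplexValue: function (name) {
      if (!store.has(name)) {
        throw new Error("no such pref: " + name);
      }
      return { data: store.get(name) };
    },
    setComplexValue: function (name, iface, str) {
      store.set(name, str.data);
    }
  };
}

// Hypothetical wrapper with the same shape as the pasted getValue/setValue.
function makePrefs(pref) {
  return {
    getValue: function (prefName, defaultValue) {
      try {
        return pref.getComplexValue(prefName).data;
      } catch (e) {
        // Fall back to the caller's default; keep "" when none was given
        // so existing callers see the same result as before.
        return defaultValue !== undefined ? defaultValue : "";
      }
    },
    setValue: function (prefName, value) {
      // Everything is stored as a string, as in the reachable part of
      // the original setValue.
      pref.setComplexValue(prefName, null, { data: String(value) });
    }
  };
}

const prefs = makePrefs(makeFakeBranch());
prefs.setValue("answer", 5);
console.log(prefs.getValue("answer"), prefs.getValue("missing", "fallback"));
```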
(In reply to Matt Kruse from comment #13)
> Also - many users were using FF6/7 for quite some time with no crashes. It
> seems like it was only with 6.0.2 that the crashes began happening, so it
> feels like some type of bug that was introduced in FF rather than my add-on.

6.0.2 had no code changes that could trigger this; it only changed accepted certificates. I suspect that a Facebook change has caused this to be triggered. Unfortunately, we have no idea what actually causes the crash. We only know that garbage collection triggers the crash, but the faulty code behind it ran some time earlier, anywhere from milliseconds to hours before.

> Many of my users have said they have similar crashing problems even without
> the add-on enabled.

Did you inspect their crash reports or can you point us to those, so we can verify if it's the same thing or something else they've seen there?
Update: I'm still trying to find the cause of this, with no luck yet.

Question: Is this line in the stack the highest up it goes?

45 	XUL 	nsDOMUIEvent::DuplicatePrivateData 	content/events/src/nsDOMUIEvent.cpp:391

I'd like to try to figure out some context of when this happens. For example, when a DOM node is inserted, or if something on the screen is updated, or if the screen is resized, or any kind of event like that.

Is the "DuplicatePrivateData" method called when a node is cloned or something? Perhaps a node has a reference to something that can't be cloned or cleaned?

My questions may not make any sense. I'm just throwing out some thoughts in the hopes that something will stick and I might have a clue about where to look further.
I see no crashes for either signature after 10.0.12esr
Flags: needinfo?(mozillamarcia.knous)
I think we can close this one out - there are still 12 crashes, but as Wayne notes all of the users are using outdated versions of Firefox.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(mozillamarcia.knous)
Resolution: --- → WORKSFORME