Closed Bug 434673 Opened 16 years ago Closed 16 years ago

Using a webmail client with LinkedIn's toolbar extension causes RC1 to crash [@ xpc_CloneJSFunction - XPCWrapper::GetOrSetNativeProperty]

Categories

(Core :: XPConnect, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: mr_wgee, Unassigned)

References

()

Details

(Keywords: crash, Whiteboard: [RC2?])

Crash Data

Attachments

(6 files, 1 obsolete file)

User-Agent:       Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; MSN Optimized;US)
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9) Gecko/2008051202 Firefox/3.0

RC1 crashes when using the AJAX version of Yahoo webmail with the LinkedIn toolbar installed.

Reproducible: Always

Steps to Reproduce:
1. Install the pre-release version of the LinkedIn toolbar (I will try to attach it to this bug report)
2. Open up a Yahoo Inbox using the AJAX (non-classic) version

Actual Results:  
RC1 crashes

Expected Results:  
Yahoo (AJAX) Inbox shows up

Also crashes WinXP version of RC1, but not always. Crash IDs:

a07afd0c-261c-11dd-a477-0013211cbf8a (Mac)
3ded0db3-261d-11dd-b6fa-001cc45a2ce4 (WinXP)

This bug is similar to bug #428418 reported earlier this year except that bug crashed beta5 when using any webmail.
Signature	@0x1557bff5
UUID	a07afd0c-261c-11dd-a477-0013211cbf8a
Time	2008-05-19 20:26:11-07:00
Uptime	13950
Product	Firefox
Version	3.0
Build ID	2008051202
OS	Mac OS X
OS Version	10.5.2 9C7010
CPU	x86
CPU Info	GenuineIntel family 6 model 7 stepping 6
Crash Reason	EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash Address	0x1557bff5
Comments	
Crashing Thread
Frame 	Module 	Signature 	Source
0 		@0x1557bff5 	
1 	XUL 	xpc_CloneJSFunction 	mozilla/js/src/xpconnect/src/xpcwrappednativeinfo.cpp:76
2 	XUL 	XPCWrapper::GetOrSetNativeProperty 	mozilla/js/src/xpconnect/src/XPCWrapper.cpp:1399
3 	XUL 	XPC_NW_GetOrSetProperty 	mozilla/js/src/xpconnect/src/XPCNativeWrapper.cpp:528
4 	libmozjs.dylib 	js_NativeGet 	mozilla/js/src/jsobj.c:3561
5 	libmozjs.dylib 	js_Interpret 	mozilla/js/src/jsinterp.c:4160
6 	libmozjs.dylib 	js_Invoke 	mozilla/js/src/jsinvoke.c:1312
7 	XUL 	nsXPCWrappedJSClass::CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappedjsclass.cpp:1523
8 	XUL 	nsXPCWrappedJS::CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:559
9 	XUL 	PrepareAndDispatch 	mozilla/xpcom/reflect/xptcall/src/md/unix/xptcstubs_unixish_x86.cpp:93
10 	XUL 	nsXPTCStubBase::Stub3 	mozilla/xpcom/reflect/xptcall/src/md/unix/xptcstubs_unixish_x86.cpp:1
11 	XUL 	nsEventListenerManager::HandleEventSubType 	mozilla/content/events/src/nsEventListenerManager.cpp:1080
Component: Extension Compatibility → XPConnect
Keywords: crash
Product: Firefox → Core
QA Contact: extension.compatibility → xpconnect
Summary: Using a webmail client with LinkedIn's toolbar extension causes RC1 to crash → Using a webmail client with LinkedIn's toolbar extension causes RC1 to crash [@ xpc_CloneJSFunction - XPCWrapper::GetOrSetNativeProperty]
Whiteboard: DUPEME
Version: unspecified → Trunk
This bug also crashes RC1 with the AOL webmail client. The attached XPI does not crash FF2.0.0.14. Screen shots attached to show what pages I was trying to reach when the crashes occur.
Warren, a reduced testcase (e.g. a block of JavaScript/C++ code), a set of crash reports, and the nightly build at which this stopped working would be most helpful in understanding the issue.
Note that I'm not able to install this pre-release for testing because: Signing could not be verified.
-260

Signature Verification Error: the signature on this .jar archive is invalid because the certificate used to sign this file has an unrecognized issuer.
Yeah, I forgot that the XPI was signed with developer certs. Sorry. I will put together a reduced test case, create some crash reports and check the nightlies.
I've been testing with the AOL webmail on Windows because it seems to be less stable than Yahoo webmail. The first nightly that failed is 2008-01-21-04. I've narrowed it down to a block of code, not sure if this is the direct or indirect cause of the crash:

var urltype = "";
for (var i = 0; i < searchREs.length; i += 2)
{
    var regexp = searchREs[i];
    if (regexp.test(url))
    {
         urltype = "site=" + searchREs[i + 1] + "&page=search";
         break;
    }
}

searchREs contains alternating sequences of regexp patterns and strings. If this code block executes then FF will crash eventually. FF seems to run fine if you insert a 'return' at the top of the block. I'll do some more digging tomorrow.
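
For readers without the extension source, here is a minimal, self-contained sketch of the data layout described above. The contents of searchREs are not shown in this bug, so the patterns and site names below are hypothetical placeholders, and classifyUrl is an illustrative wrapper name, not a function from the extension.

```javascript
// Hypothetical stand-in for the extension's table: even slots hold regexp
// patterns, odd slots hold the site string that belongs to the pattern
// immediately before it.
var searchREs = [
    /^https?:\/\/mail\.example\.com\//, "examplemail",
    /^https?:\/\/jobs\.example\.org\//, "examplejobs"
];

// Same loop as in the comment above, wrapped in a function (the wrapper
// name is an assumption) so that it can be called on its own.
function classifyUrl(url) {
    var urltype = "";
    for (var i = 0; i < searchREs.length; i += 2) {
        var regexp = searchREs[i];
        if (regexp.test(url)) {
            // The matching pattern's partner string sits in the next slot.
            urltype = "site=" + searchREs[i + 1] + "&page=search";
            break;
        }
    }
    return urltype;
}
```

Calling classifyUrl on a URL that matches none of the patterns returns the empty string, which matches the initial value of urltype in the original block.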
Attached file Reduced test case
Unzip and install into your extension folder under the name "{e2337727-f9c9-411b-929e-287584341d1a}"
Attachment #321713 - Attachment is obsolete: true
We noticed that there's an easier way to reproduce this bug by visiting the home page of the Canadian Monster site.

Reproducible: about 50% of the time on WinXP

Steps to Reproduce:
  1. Install FF3RC1
  2. Install the attachment {e2337727-f9c9-411b-929e-287584341d1a}.zip by unzipping it into your extensions folder
  3. Start FF
  4. Visit "monster.ca"

Actual Results:  
  RC1 crashes

Expected Results:  
  Home page of Canadian Monster comes up.

Crash IDs:
  c82cebc0-289d-11dd-8fe3-0013211cbf8a
  a5d8b0cc-289d-11dd-bae7-0013211cbf8a
  750b8487-289d-11dd-a6df-0013211cbf8a

Crashes Nightly Build: 2008-01-21-04-trunk, crash IDs: 
  abecef91-28a1-11dd-bfb5-001cc45a2ce4 
  85b0a423-28a1-11dd-a325-001321b13766

Does NOT crash: 2008-01-20-04-trunk (I tried more than 10 times)
Is there more to reproducing this than installing the extension? I did that, and it did install, but no crash. Tested on both Vista and Linux :(

I also loaded monster.ca on both systems and saw no crash :(
Mike, thanks for the info. I'll take a peek if I have the time, but I'm not used to digging into 2 million+ lines of code.

Johnny, having the extension installed alone does not cause a crash, it's having it installed AND visiting "monster.ca" that will trigger the crash.

I had one of our QA guys try the extension on his machine running VMware Fusion (Mac). It crashed RC1 on Vista, XP, and Mac. We don't have Linux onsite so I don't know what would happen there.

Vista Ultimate SP1:
  023ab730-290b-11dd-9262-0013211cbf8a
  4e9c05e3-290a-11dd-96b3-001cc45a2c28

Win XP Pro SP2:
  683e6707-290f-11dd-81d5-001cc45a2c28
  02dc7526-290f-11dd-aaea-0013211cbf8a

Mac OS X (10.5.2)
  91868cd7-2911-11dd-bc12-001a4bd46e84
  6b459c80-2911-11dd-b113-001cc45a2ce4

Please try: clearing your cache; make sure you have no 3rd party extensions; there's no leftover baggage from an existing profile like bookmarks (clean install of RC1); etc.

You might also try setting your home page to the blank page. Restart your browser and enter "monster.ca" into the address bar. If that doesn't work try restarting FF instead of reloading the page. In general, I'd say it crashes about 50% of the time on Windows. Our Mac tests crashed most frequently--7 out of 8 tries.
I tried that too, several times, even created an account on monster.ca and logged in, still no luck :( Ben said he'd give it a try too...
You just need to land on the home page to trigger the crash. That page has about 18 frames and will cause FF to execute the JS extension code that will directly or indirectly cause the crash.
Attached file Stack trace
The optimizer or breakpad really messed up the stack trace that I got from RC1 so I got this trace from my debug build.

Of note 'funobj' in frame 2 has been deleted (map has amazingly large refcount, dslots is 0xdadadada).
(In reply to comment #17)
> Of note 'funobj' in frame 2 

I meant frame 1 :(
Status: UNCONFIRMED → NEW
Ever confirmed: true
I heard a rumor that FF3 might be officially launched on June 3rd. Since this bug will affect all LinkedIn toolbar users, can anybody give me an idea of what is going to happen to this bug (when will it be fixed) and when the actual release of FF3 is? We need to know to plan accordingly. Thanks.
We do not have a release date for FF3.  It is unlikely that this will be fixed in the 3.0 release unless we get a very small patch, very very soon.

Ben: can you get the JS stack at that point, so that LinkedIn can investigate workarounds?

Based on the stack, my suspicion is that bug 408301 is involved.  jst?
Attached file JS Stack trace
Here's the JS that's dying, in case anyone can work around this bug.
The stacks make me think that this can't be worked around.
Attached patch Fix.
There's a GC hazard hiding in XPCNativeMember::NewFunctionObject() (which is inline and thus doesn't tend to show up in stacks from optimized builds). Normally when that function is called the member is one that's already resolved and defined on a wrapper or its prototype, and thus won't be collected, but in this case we're manually resolving the member in the XPCWrapper code (XOW/XPCNW/SJOW), and thus it's never seen by the GC, so the GC collects it. In this case we come in to xpc_CloneJSFunction() (from NewFunctionObject()) with a live function object to clone, we clone it but then JS_SetPrototype() GCs and the funobj we cloned can be dead.
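
The hazard described above can be illustrated with a toy model. This is a sketch only: it is not SpiderMonkey or XPConnect code, the patch itself is not reproduced here, and every name below (ToyHeap, cloneUnrooted, cloneRooted) is hypothetical. It shows a collector that runs inside an allocation and frees a value that nothing has told it about, and how keeping that value rooted across the allocation avoids the problem.

```javascript
// Toy mark-and-sweep heap. A collection runs inside alloc() once `limit`
// allocations have happened since the last collection.
function ToyHeap(limit) {
    this.limit = limit;
    this.objects = [];   // every cell that has not been swept
    this.roots = [];     // cells the collector treats as reachable
    this.sinceGC = 0;
}

ToyHeap.prototype.alloc = function (name) {
    if (this.sinceGC >= this.limit) {
        this.collect();  // the allocator itself triggers the GC
    }
    this.sinceGC++;
    var cell = { name: name, proto: null, marked: false, dead: false };
    this.objects.push(cell);
    return cell;
};

ToyHeap.prototype.collect = function () {
    var i, cell;
    for (i = 0; i < this.objects.length; i++) {
        this.objects[i].marked = false;
    }
    // Mark: walk proto links starting from each root.
    for (i = 0; i < this.roots.length; i++) {
        for (cell = this.roots[i]; cell && !cell.marked; cell = cell.proto) {
            cell.marked = true;
        }
    }
    // Sweep: anything unmarked is freed (flagged dead in this model).
    var survivors = [];
    for (i = 0; i < this.objects.length; i++) {
        cell = this.objects[i];
        if (cell.marked) {
            survivors.push(cell);
        } else {
            cell.dead = true;
        }
    }
    this.objects = survivors;
    this.sinceGC = 0;
};

// Hazard: funobj is not rooted, so the allocation of the clone may free it.
function cloneUnrooted(heap, funobj) {
    var clone = heap.alloc("clone");
    clone.proto = funobj;   // uses funobj after a possible collection
    return funobj.dead;
}

// Same work, but funobj is rooted for the duration of the allocation.
function cloneRooted(heap, funobj) {
    heap.roots.push(funobj);
    try {
        var clone = heap.alloc("clone");
        clone.proto = funobj;
    } finally {
        heap.roots.pop();
    }
    return funobj.dead;
}

var heapA = new ToyHeap(1);
var unrootedDied = cloneUnrooted(heapA, heapA.alloc("member value"));

var heapB = new ToyHeap(1);
var rootedDied = cloneRooted(heapB, heapB.alloc("member value"));
```

In this model unrootedDied ends up true and rootedDied ends up false. In the real engine the equivalent of touching a dead cell is a read of freed memory, which matches the crashes reported here.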
Attachment #322704 - Flags: superreview?(brendan)
Attachment #322704 - Flags: review?(brendan)
Flags: blocking1.9?
Whiteboard: DUPEME → [RC2?]
Brendan, even with the debug code in JS_SetPrototype() that does a GC commented
out, I still see GC happening inside of xpc_CloneJSFunction(). The GC is
coming from JS_CloneFunctionObject(), with this stack:

        js_GC()
        js_NewGCThing()
        js_NewObjectWithGivenProto()
        js_NewObject()
        js_CloneFunctionObject()
        JS_CloneFunctionObject()
        xpc_CloneJSFunction()
        XPCNativeMember::NewFunctionObject()

I still stand by this patch being correct. The XPCNativeMember itself never
actually dies here (until later); it's held alive by the interface it came from
(XPCNativeInterface), which comes from the wrapper in question here, and that's
all guarded against GC. It's just the value (mValue, and possibly its name) in
the member that's dying; everything else seems to be properly guarded against
GC here.
Comment on attachment 322704
Fix.

Oh, sure -- the allocator itself hitting last ditch. It still seems like xpconnect should protect mVal in a consolidated fashion, but this will do for now.

Followup bug on mVal rooting/tracing?

/be
Attachment #322704 - Flags: superreview?(brendan)
Attachment #322704 - Flags: superreview+
Attachment #322704 - Flags: review?(brendan)
Attachment #322704 - Flags: review+
It looks like Johnny is confident that he's checked in a fix for this bug. I'll
test the nightly tomorrow morning (Wed) and see if it's stable with our
extension. Since the release builds are pulled off the nightlies is it safe to
assume that the fix will go out when the next release of FF is pulled?

My workaround for now is to lose some functionality (in exchange for a stable
browser) by commenting out the offending code in our extension. That doesn't
seem to be enough. Our tests with the workaround indicate that FF is a lot more
stable but still crashes non-reproducibly in Gmail.

I am beginning to suspect that the bug involves doing RegExp() calls. Does this
sound reasonable? I'm having trouble following this thread because I'm not
heavy into the FF source code. At this point in time I'm trying to determine if
there's a core issue (no workaround) or a second bug that affects only Gmail.

Let me know if there's anything else I can do to help.
This patch hasn't been committed yet, but it should be shortly I hope.  It also fixes vlad's mystery crashes from bug 435502.
Comment on attachment 322704
Fix.

a=shaver, please land on CVS trunk with all due haste.  Thanks a ton!
Attachment #322704 - Flags: approval1.9+
Fix checked in. This *should* be included in Wednesday's nightly builds now.

What's likely to trigger this is if a page does a lot of JS allocation heavy work (I'd imagine heavy regexp usage could be enough) *and* there's code involved that makes *function calls* from chrome into content (property access shouldn't be enough), or calls through a cross origin wrapper (i.e. cross window/frame communication). The key here is that a JS garbage collection has to happen while the XPCWrapper code is cloning a function object.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
is bug 401097 another instance or is this bug limited to trunk?
OS: Mac OS X → All
Hardware: Macintosh → All
Warren, this fix will be included in Firefox 3 RC2 and thus FF3 final.  Thank you very much for your help here - much appreciated.
Flags: blocking1.9? → blocking1.9+
Mike, no problem. Thanks to you and your team for staying on top of this and getting it resolved. I was sweating bullets all this past week--cutting it close there. I found out about today's code freeze of RC2 just last night. We'll QA the nightly later today. Thanks again.
We only found out about RC2's code freeze Tuesday afternoon, so don't feel too bad. :)
I'd blog this but I need a quick answer.

Our QA team has tested the nightly build dated 05-28-08. They've concluded that FF no longer crashes on our troubled sites: AOL mail, Yahoo AJAX mail, Gmail, monster.ca, etc. Now, however, several key features of our extension no longer work. They were working with the RC1 release.

My understanding is that the release builds are pulled from the nightlies, so are there any reasons besides new bugs getting into RC2 that would cause us to lose functionality? I'll troubleshoot in the meantime to figure out what's wrong. Hopefully it's minor.
That's correct, nightlies are the release stream.  The set of things in RC2 is quite small, and we had hoped very safe, fwiw.
Warren, can you please let us know what worked before and no longer works? It's crucial that we find out quickly (i.e. now :) for there to be any hope of us fixing it for RC2. Please file new bugs on new issues and cc me etc.
Sorry, didn't mean to leave you guys in suspense. :) Everything is fine. Our extension is not Minefield-compatible. We rely on finding "Firefox" in the user-agent but found "Minefield" and choked. Some of our JS code has to be IE-compatible. Our extension is now Minefield-friendly, all functionality is available and we had no crashes all day with 2 QA engineers testing.
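
For anyone hitting the same problem, the sketch below shows why a check for the product name "Firefox" fails on nightly builds. How the extension was actually changed is not described in this bug. The function names are hypothetical, and the Minefield user-agent string is an illustrative example, not one copied from this report; the other two strings appear in this report.

```javascript
// Name-based check: breaks on nightlies, which identify as "Minefield".
function isFirefoxByName(ua) {
    return ua.indexOf("Firefox") !== -1;
}

// Engine-based check: looks for the "Gecko/<build date>" token instead.
// WebKit-style strings only say "like Gecko", so they do not match.
function isGecko(ua) {
    return /Gecko\/\d+/.test(ua);
}

// Release build string taken from the verification comment in this bug.
var releaseUA = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052912 Firefox/3.0";
// Illustrative nightly string (assumed format, not taken from this bug).
var minefieldUA = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9pre) Gecko/2008052806 Minefield/3.0pre";
// IE string taken from the User-Agent field of this report.
var ieUA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; MSN Optimized;US)";
```

With these inputs the name-based check rejects the nightly string while the engine-based check accepts both Gecko builds and still rejects the IE string.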
Awesome, thanks for the update! I'm glad it was nothing more than that :)
I verified this with RC2 as well using the attached extension. No crashing.

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052912 Firefox/3.0
Status: RESOLVED → VERIFIED
Crash Signature: [@ xpc_CloneJSFunction - XPCWrapper::GetOrSetNativeProperty]