Closed Bug 639162 Opened 14 years ago Closed 14 years ago

Startup crash [@ XPCCallContext::NewStringWrapper]

Categories

(Core :: XPConnect, defect)

ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 634594
Tracking Status
fennec 2.0+ ---

People

(Reporter: dougt, Unassigned)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 obsolete file)

This bug was filed from the Socorro interface and is report bp-fe7c9029-f954-408f-b54a-ae1ad2110305 . ============================================================= Startup crash in todays nightly
tracking-fennec: --- → 2.0+
Crashes first appeared in Fennec 4.0b6pre/20110302. Signature XPCCallContext::NewStringWrapper UUID fe7c9029-f954-408f-b54a-ae1ad2110305 Time 2011-03-05 07:26:14.151129 Uptime 0 Last Crash 79 seconds before submission Install Age 811 seconds (13.5 minutes) since version was first installed. Product Fennec Version 4.0b6pre Build ID 20110305042456 Branch 2.0 OS Linux OS Version 0.0.0 Linux 2.6.32.9 #3 Tue Oct 12 21:33:42 KST 2010 armv7l CPU arm CPU Info Crash Reason SIGBUS Crash Address 0x0 App Notes samsung SCH-I800 verizon/SCH-I800/SCH-I800/SCH-I800:2.2/FROYO/DJ11:user/release-keys Processor Notes WARNING: Json file missing Add-ons Frame Module Signature [Expand] Source 0 libxul.so XPCCallContext::NewStringWrapper nsTSubstring.h:593 1 libxul.so XPCConvert::JSData2Native js/src/xpconnect/src/xpcconvert.cpp:762 2 libxul.so XPCWrappedNative::CallMethod js/src/xpconnect/src/xpcwrappednative.cpp:2968 3 libxul.so XPC_WN_CallMethod js/src/xpconnect/src/xpcwrappednativejsops.cpp:1613 4 libxul.so js::Interpret js/src/jscntxtinlines.h:701 5 libxul.so js::Invoke js/src/jsinterp.cpp:653 6 libxul.so js::ExternalInvoke js/src/jsinterp.cpp:863 7 libxul.so JS_CallFunctionValue js/src/jsapi.cpp:5173 8 libxul.so nsXPCWrappedJSClass::CallMethod js/src/xpconnect/src/xpcwrappedjsclass.cpp:1672 9 libxul.so nsXPCWrappedJS::CallMethod js/src/xpconnect/src/xpcwrappedjs.cpp:588 10 libxul.so PrepareAndDispatch xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:132 11 libxul.so libxul.so@0xd0948b 12 libxul.so nsComponentManagerImpl::CreateInstanceByContractID xpcom/components/nsComponentManager.cpp:1315 ... The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ec8a059bea58&tochange=dd542a890eb7 More reports at: https://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=XPCCallContext%3A%3ANewStringWrapper
Component: General → XPConnect
Keywords: regression
OS: Linux → Android
Product: Fennec → Core
QA Contact: general → xpconnect
Hardware: All → ARM
Summary: crash [@ XPCCallContext::NewStringWrapper] → Startup crash [@ XPCCallContext::NewStringWrapper]
Putting back on nom list.
tracking-fennec: 2.0+ → ?
This is ... odd. The crash is claimed to be a null-deref, but the relevant code is: 757 const jschar *strChars = JS_GetStringCharsZAndLength(cx, str, &strLength); 758 if (!strChars) 759 return JS_FALSE; 760 761 XPCReadableJSStringWrapper *wrapper = 762 ccx.NewStringWrapper(strChars, strLength); The NewStringWrapper just looks up _where_ to construct the string; it passes things through directly to the string code.... which is where the crash claims to be? Hmm. I wonder whether the problem is that the XPC cache thing is doing unaligned access?
Almost certainly: char mStringWrapperData[sizeof(StringWrapperEntry) * XPCCCX_STRING_CACHE_SIZE]; And the previous member is a PRUint16 and the one before that is void*, so this will tend to be aligned on a 2-byte boundary. But note that this code has been around forever (well since late 2007), so I don't see how this can be a regression. Unless fixing breakpad for fennec made us finally get useful stacks here?
Blocks: 408139
Assignee: nobody → doug.turner
tracking-fennec: ? → 2.0+
If this was the reason, wouldn't we expect a crash every single time? There is no randomness in what alignment this will get (modulo 4), it should be 2-byte aligned every time.
Assignee: doug.turner → azakai
Not sure what we want to do here. On the one hand, I still don't know what is causing this crash. I would suspect ccx of being null more than I would suspect an unaligned access, since this crash is rare (unaligned access should kill us every time - unless only very few Android kernels are configured that way? Perhaps these are modified kernels?). Also not sure this is a startup crash, as other reports have uptime > 0 (in the range 1-8). No STR so hard to be sure what is going on here. On the other hand, if unaligned access is the problem, I'm not sure how we would solve it. We can juggle the elements in that structure, or add manual padding, but (1) this would increase the size of the structure, which is a shame if this isn't really the issue, and (2) it is hard to see how to do so in a way that is ensured to work on all archs. But, looking through the code, it doesn't look like any significant amount of work is actually done in the constructors which are avoided here - which is what this hack is trying to optimize. And unaligned access, even if nonfailing, tends to be slower. So this optimization may not be that useful anyhow (although it does save some memory). Perhaps the simplest solution here is to just remove the optimization?
If I set everything up properly, k_OFFSETOF_XPCCallContext_mStringWrapperData: .word 102 in a fairly vanilla android opt build.
But should note that the surrounding area is .align 2 .type k_OFFSETOF_XPCCallContext_mStringWrapperData, %object .size k_OFFSETOF_XPCCallContext_mStringWrapperData, 4 k_OFFSETOF_XPCCallContext_mStringWrapperData: .word 102 I wonder if this is a THUMB/no-THUMB thing ...
Alon, string constructors are a pretty noticeable performance sink in general... Note that it's not just the ctors but also the heap-allocation of the string that this thing optimizes away.
Attached patch patch (obsolete) — Splinter Review
If alignment is the issue, this should fix it. I am not happy with how it does it (just shuffle some members - so vulnerable to future edits of the file), but I don't see a better solution. I am also not happy that we are not sure that alignment is the cause, but there seems no harm in trying this. (I was worried this would take more memory, but it won't since the entire structure is padded anyhow to a multiple of the alignment of the largest member, so we are just moving padding from after to before.)
Attachment #518202 - Flags: review?(bzbarsky)
(In reply to comment #9) > Alon, string constructors are a pretty noticeable performance sink in > general... Note that it's not just the ctors but also the heap-allocation of > the string that this thing optimizes away. Yeah, I read more of the code meanwhile, and it is significant. So removing the optimization would have been bad, forget my earlier comment.
(In reply to comment #8) > But should note that the surrounding area is > > .align 2 Just double-checked this, and the .align param on ARM is the number of low-order *bits* to which to align the location, not byte-alignment. So this isn't unexpected.
(In reply to comment #12) > (In reply to comment #8) > > But should note that the surrounding area is > > > > .align 2 > > Just double-checked this, and the .align param on ARM is the number of > low-order *bits* to which to align the location, not byte-alignment. So this > isn't unexpected. Cool, so at least that part makes sense. The remaining mystery is why we only crash some of the time because of this alignment issue. But, I think we can go ahead with the patch without figuring that out.
Comment on attachment 518202 [details] [diff] [review] patch Why don't we use a union to align this properly instead of depending on ordering of previous stuff? See the what nsAutoArrayBase in nsTArray.h does to align its mAutoBuf, for example.
Attachment #518202 - Flags: review?(bzbarsky) → review-
Hmm... apparently all of this has happened before. I noticed now that this is fixed on tracemonkey, bug 634594. Will nom that bug.
Depends on: 634594
Attachment #518202 - Attachment is obsolete: true
There have been no crashes since 4.0b6pre/20110308, I close it as "workforme". The working range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1240004b90e6&tochange=89429f5cbf81
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
This is really just a dupe of bug 634594.
No longer depends on: 634594
Resolution: WORKSFORME → DUPLICATE
> This is really just a dupe of bug 634594. Are you sure? The patch in bug 634594 has not been landed yet in m-c.
(In reply to comment #18) > > This is really just a dupe of bug 634594. > Are you sure? The patch in bug 634594 has not been landed yet in m-c. I'm not sure what the patch landing or not has to do with this being a duplicate of the other bug - maybe I didn't understand what you meant here. In any case, the patch in that bug has landed a few minutes ago :)
Crash Signature: [@ XPCCallContext::NewStringWrapper]
Assignee: azakai → nobody
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: