Closed Bug 890985 Opened 12 years ago Closed 11 years ago

crash in nsStandardURL::SetSpec @ nsStandardURL::BuildNormalizedSpec

Categories

(Core :: Networking, defect)

24 Branch
ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox23 --- unaffected
firefox24 + wontfix
firefox25 + wontfix
firefox26 + fixed
firefox27 --- fixed
firefox28 --- fixed
b2g-v1.2 --- fixed
fennec + ---

People

(Reporter: nhirata, Assigned: blassey)

References

Details

(Keywords: crash, regression, topcrash-android-armv7, Whiteboard: [native-crash])

Crash Data

Attachments

(2 files, 3 obsolete files)

This bug was filed from the Socorro interface and is report bp-e7382553-3587-4290-8c51-cc57a2130707 . ============================================================= Frame Module Signature Source 0 libxul.so nsStandardURL::BuildNormalizedSpec(char const*) netwerk/base/src/nsStandardURL.cpp 1 libxul.so nsStandardURL::SetSpec(nsACString_internal const&) netwerk/base/src/nsStandardURL.cpp 2 libxul.so nsStandardURL::Init(unsigned int, int, nsACString_internal const&, char const*, nsIURI*) netwerk/base/src/nsStandardURL.cpp 3 libxul.so nsJARURI::CreateEntryURL(nsACString_internal const&, char const*, nsIURL**) modules/libjar/nsJARURI.cpp 4 libxul.so nsJARURI::SetSpecWithBase(nsACString_internal const&, nsIURI*) modules/libjar/nsJARURI.cpp 5 libxul.so nsJARProtocolHandler::NewURI(nsACString_internal const&, char const*, nsIURI*, nsIURI**) modules/libjar/nsJARProtocolHandler.cpp 6 libxul.so nsIOService::NewURI(nsACString_internal const&, char const*, nsIURI*, nsIURI**) netwerk/base/src/nsIOService.cpp 7 libxul.so nsJARURI::SetSpecWithBase(nsACString_internal const&, nsIURI*) modules/libjar/nsJARURI.cpp 8 libxul.so nsJARProtocolHandler::NewURI(nsACString_internal const&, char const*, nsIURI*, nsIURI**) modules/libjar/nsJARProtocolHandler.cpp 9 libxul.so nsIOService::NewURI(nsACString_internal const&, char const*, nsIURI*, nsIURI**) netwerk/base/src/nsIOService.cpp 10 libxul.so nsIOService::NewChannel(nsACString_internal const&, char const*, nsIURI*, nsIChannel**) netwerk/base/src/nsIOService.cpp 11 libxul.so nsResProtocolHandler::NewChannel(nsIURI*, nsIChannel**) netwerk/protocol/res/nsResProtocolHandler.cpp 12 libxul.so nsIOService::NewChannelFromURIWithProxyFlags(nsIURI*, nsIURI*, unsigned int, nsIChannel**) netwerk/base/src/nsIOService.cpp 13 libxul.so nsIOService::NewChannelFromURI(nsIURI*, nsIChannel**) netwerk/base/src/nsIOService.cpp 14 libxul.so mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) js/xpconnect/loader/mozJSComponentLoader.cpp 15 libxul.so mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/loader/mozJSComponentLoader.cpp 16 libxul.so nsXPCComponents_Utils::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/src/XPCComponents.cpp 17 libxul.so NS_InvokeByIndex xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp 18 libxul.so XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp 19 libxul.so XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp 20 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 21 libxul.so Interpret js/src/vm/Interpreter.cpp 22 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 23 libxul.so js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) js/src/vm/Interpreter.cpp 24 libxul.so JS_ExecuteScript(JSContext*, JSObject*, JSScript*, JS::Value*) js/src/jsapi.cpp 25 libxul.so JS_ExecuteScriptVersion(JSContext*, JSObject*, JSScript*, JS::Value*, JSVersion) js/src/jsapi.cpp 26 libxul.so mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, char**, bool, JS::MutableHandle<JS::Value>) js/xpconnect/loader/mozJSComponentLoader.cpp 27 libxul.so mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) js/xpconnect/loader/mozJSComponentLoader.cpp 28 libxul.so mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/loader/mozJSComponentLoader.cpp 29 libxul.so nsXPCComponents_Utils::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/src/XPCComponents.cpp 30 libxul.so NS_InvokeByIndex xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp 31 libxul.so XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp 32 libxul.so XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp 33 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 34 libxul.so Interpret js/src/vm/Interpreter.cpp 35 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 36 libxul.so js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) js/src/vm/Interpreter.cpp 37 libxul.so JS_ExecuteScript(JSContext*, JSObject*, JSScript*, JS::Value*) js/src/jsapi.cpp 38 libxul.so JS_ExecuteScriptVersion(JSContext*, JSObject*, JSScript*, JS::Value*, JSVersion) js/src/jsapi.cpp 39 libxul.so mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, char**, bool, JS::MutableHandle<JS::Value>) js/xpconnect/loader/mozJSComponentLoader.cpp 40 libxul.so mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) js/xpconnect/loader/mozJSComponentLoader.cpp 41 libxul.so mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/loader/mozJSComponentLoader.cpp 42 libxul.so nsXPCComponents_Utils::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/src/XPCComponents.cpp 43 libxul.so NS_InvokeByIndex xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp 44 libxul.so XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp 45 libxul.so XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp 46 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 47 libxul.so Interpret js/src/vm/Interpreter.cpp 48 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 49 libxul.so js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) js/src/vm/Interpreter.cpp 50 libxul.so JS_ExecuteScript(JSContext*, JSObject*, JSScript*, JS::Value*) js/src/jsapi.cpp 51 libxul.so JS_ExecuteScriptVersion(JSContext*, JSObject*, JSScript*, JS::Value*, JSVersion) js/src/jsapi.cpp 52 libxul.so mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, char**, bool, JS::MutableHandle<JS::Value>) js/xpconnect/loader/mozJSComponentLoader.cpp 53 libxul.so mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) js/xpconnect/loader/mozJSComponentLoader.cpp 54 libxul.so mozJSComponentLoader::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/loader/mozJSComponentLoader.cpp 55 libxul.so nsXPCComponents_Utils::Import(nsACString_internal const&, JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/src/XPCComponents.cpp 56 libxul.so NS_InvokeByIndex xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp 57 libxul.so XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp 58 libxul.so XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp 59 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 60 libxul.so Interpret js/src/vm/Interpreter.cpp 61 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 62 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/vm/Interpreter.cpp 63 libxul.so js_fun_call(JSContext*, unsigned int, JS::Value*) js/src/jsfun.cpp 64 libxul.so js_fun_apply(JSContext*, unsigned int, JS::Value*) js/src/jsfun.cpp 65 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 66 libxul.so Interpret js/src/vm/Interpreter.cpp 67 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 68 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/vm/Interpreter.cpp 69 libxul.so js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) js/src/vm/Interpreter.cpp 70 libxul.so js::DirectProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/jsproxy.cpp 71 libxul.so js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/jswrapper.cpp 72 libxul.so js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/jsproxy.cpp 73 libxul.so proxy_Call js/src/jsproxy.cpp 74 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 75 libxul.so js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) js/src/vm/Interpreter.cpp 76 libxul.so js::InvokeGetterOrSetter(JSContext*, JSObject*, JS::Value const&, unsigned int, JS::Value*, JS::Value*) js/src/vm/Interpreter.cpp 77 libxul.so js_NativeGet(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<js::Shape*>, unsigned int, JS::MutableHandle<JS::Value>) js/src/vm/Shape-inl.h 78 libxul.so Interpret js/src/vm/Interpreter-inl.h 79 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 80 libxul.so js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) js/src/vm/Interpreter.cpp 81 libxul.so JS_ExecuteScript(JSContext*, JSObject*, JSScript*, JS::Value*) js/src/jsapi.cpp 82 libxul.so JS_ExecuteScriptVersion(JSContext*, JSObject*, JSScript*, JS::Value*, JSVersion) js/src/jsapi.cpp 83 libxul.so mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, char**, bool, JS::MutableHandle<JS::Value>) js/xpconnect/loader/mozJSComponentLoader.cpp 84 libxul.so mozJSComponentLoader::LoadModule(mozilla::FileLocation&) js/xpconnect/loader/mozJSComponentLoader.cpp 85 libxul.so nsComponentManagerImpl::KnownModule::Load() xpcom/components/nsComponentManager.cpp 86 libxul.so nsFactoryEntry::GetFactory() xpcom/components/nsComponentManager.cpp 87 libxul.so nsComponentManagerImpl::CreateInstance(nsID const&, nsISupports*, nsID const&, void**) xpcom/components/nsComponentManager.cpp 88 libxul.so nsComponentManagerImpl::GetService(nsID const&, nsID const&, void**) xpcom/components/nsComponentManager.cpp 89 libxul.so nsJSCID::GetService(JS::Value const&, JSContext*, unsigned char, JS::Value*) js/xpconnect/src/XPCJSID.cpp 90 libxul.so NS_InvokeByIndex xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp 91 libxul.so XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp 92 libxul.so XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp 93 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/jscntxtinlines.h 94 libxul.so Interpret js/src/vm/Interpreter.cpp 95 libxul.so js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp 96 libxul.so js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/vm/Interpreter.cpp 97 libxul.so js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) js/src/vm/Interpreter.cpp 98 libxul.so JS_CallFunctionValue(JSContext*, JSObject*, JS::Value, unsigned int, JS::Value*, JS::Value*) js/src/jsapi.cpp 99 libxul.so nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) js/xpconnect/src/XPCWrappedJSClass.cpp 100 libxul.so nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) js/xpconnect/src/XPCWrappedJS.cpp 162 libdvm.so libdvm.so@0xb9ce6 163 libdvm.so libdvm.so@0x60387 164 libdvm.so libdvm.so@0x6d4c3 165 libdvm.so libdvm.so@0xb9ce6 166 libdvm.so libdvm.so@0x60387 167 libdvm.so libdvm.so@0xb5012 168 libdvm.so libdvm.so@0x60437 169 libdvm.so libdvm.so@0x60387 170 libc.so libc.so@0x1319e 171 libc.so libc.so@0x12cd6 More crashes : https://crash-stats.mozilla.com/report/list?product=FennecAndroid&signature=nsStandardURL%3A%3ABuildNormalizedSpec%28char+const*%29
Component: General → Networking
Product: Firefox for Android → Core
Version: Firefox 23 → 24 Branch
There are four crashes in 24.0a2 and two in 25.0a1.
Keywords: regression
Hardware: All → ARM
Whiteboard: [native-crash]
With the stack trace above, it first showed up in 24.0a1/20130623.
Summary: crash in nsStandardURL::BuildNormalizedSpec(char const*) → crash in nsStandardURL::SetSpec @ nsStandardURL::BuildNormalizedSpec
It's #3 crasher in 24.0b1, #24 in 25.0a2, and #32 in 26.0a1.
tracking-fennec: --- → ?
Keywords: topcrash
tracking-fennec: ? → 24+
:kbrosanan/:kairo, Can we get any device/OS correlations here which may help narrow down any info related to this crash. :mfinkle, Would anyone from the mobile team be the right assignee here or would you recommend reaching out to the networking team given the component and stack trace as a first stab from Engineering ?
Flags: needinfo?(mark.finkle)
Flags: needinfo?(kbrosnan)
I don't think a developer could begin to debug this yet. It's needle-in-a-haystack right now. We need to be able to reproduce the crash. Sometimes we can do some code review to look for suspicious code. In this case, we don't know where to look. The stack itself is too high level and is a very common code path.
Flags: needinfo?(mark.finkle)
Samsung GT-P6200 18 Samsung GT-I9100 9 Samsung GT-P6200L 5 Samsung GT-P6210 4 Samsung GT-P6800 3 Samsung GT-N7000 2 LGE LG-P880 2 HTC One X 2 FUJITSU ISW13F 2 TOSHIBA AT570 1 Samsung SC-02C 1 TOSHIBA AT100 1 Galaxy Tab, Galaxy SII international, Galaxy Note lead the device list. Though the listing makes me suspicious that this is not device specific.
Flags: needinfo?(kbrosnan)
Loaded the URLs on a GT-P7300 while not on the device list it is a similar vintage to the other affected Samsung devices. No crashes resulted from loading the URLs. As I noted in the channel meeting the URLs trend non-English. I strongly suspect that all these user have a non-English Android system language set. Common URLs from the 28 day data set include * https://duckduckgo.com with Spanish word queries * http://www.computerra.ru shows up twice * http://ortlieb.de is one of the German language sites For the search queries I tried opening the first 20 results. I tried * Bookmarking some of the URLs * reading mode for some of the URLs. * Adding to the reading list * Setting as a home screen launcher * Launching from a home screen launcher * Loading from an intent from another program * General browsing based off the crash URLs * Setting the system language to Chinese Traditional I loaded the most recent crashes and checked if they contain the same extension. Most had no extensions installed. The crash comments do not have any useful STR.
Outside of emailing localizers to check their crash IDs, or trying to get in touch with those people submitting their emails, this may have to be a wontfix for 24 :\
Whiteboard: [native-crash] → [native-crash][unactionable]
tracking-fennec: 24+ → +
So looking at crash stats new devices view. 20% of the crashes happen on GT-I9100 will try loading URLs on that.
Flags: needinfo?(kbrosnan)
Note to myself the GT-I9100s that are crashing most frequently are reporting Android API level 10 (Android 2.3.3 - 2.3.7).
Flags: needinfo?(kbrosnan)
Flags: needinfo?(kbrosnan)
Flags: needinfo?(kbrosnan)
Whiteboard: [native-crash][unactionable] → [native-crash]
Randell, this is crashing in code you touched last (bug 125608 & bug 706249), any thoughts?
Flags: needinfo?(rjesup)
Well, it's very odd it's crashing on Android and not desktop - the URLs aren't android-specific, and neither is the code. So either the environment is different somehow, or something external changes, or some ownership assumption is being violated. (Or a compiler bug, or a random stack/etc difference.) Since it's virtually non-repeatable/random, it smells like timing and something external, or maybe stack/initialization garbage - but ASAN should have flagged any generic problem here, and we don't see it on desktop. Looking at the code, but I doubt anything will jump up and bite me in the face. Maybe I can think of some tripwires or double-checks to add
Flags: needinfo?(rjesup)
Thanks!
The crash is claimed to be on line 465 here if the crash-stats links aren't lying: 463 nsStandardURL::AppendToBuf(char *buf, uint32_t i, const char *str, uint32_t len) 464 { 465 memcpy(buf + i, str, len); There are lots of AppendToBuf calls in BuildNormalizedSpec. And this is a SIGBUS with a claimed address of 0x0, again if we can believe the crash-stats output. Here "buf" is whatever mSpec.BeginWriting returned, and we did check that mSpec.SetLength returned true. So I would hope that buf is not null. The "str" argument is either the literal string "://" or portbuf.get(). Could we be in a situation where portbuf.get() is null? Could crash-stats be wrong about the SIGBUS being on 0x0?
Oh, _if_ 0x0 is the right address then my best guess is that something here does a fallible allocation that fails but is then not null-checked.
And portbuf is an autostring, so I wouldn't expect it to need to allocate just to do AppendInt (though AppendInt in fact will silently fail to append anything on OOM if it _does_ need to allocate).
Do we to try to reproduce again (hopefully with new direction for QA), or should we continue with code/stack investigation? Perhaps there is some extra logging we can add to the client and collect?
I'll look to see if I can put up a patch that will kick out distinctive crashes that show what the real cause is (and throw logs if we get make it happen in tbpl/try). Or maybe a patch that complains and hides it without crashing, and see if the crashes disappear (though if there are a couple of cases we may not know which it was, but we may not care).
It's line 471 in 26a2 (which is also the memcpy()). And portbuf has a 64-byte stack buffer, so as bz says, it's hard to believe that could be the source of the null. Both uses of AppendToBuf() are preceded by code that does stuff with what buf points to, so *that* really can't be it unless something is trashing the ptr. So, I don't see how this can be a null there barring trashing. So perhaps the line number being reported is incorrect. (linker weirdness?)
Because of the inlining of AppendToBuf(), in jimdb I'm seeing the *start* of BuildNormalizedSpec() be listed as line 471 in "disass /m" (after b nsStandardURL.cpp:471 in an opt debug build). So I suspect it's not really AppendToBuf() that's derefing null. (jim, does this seem reasonable? Can you look at the generated code for BuildNormalizedURL/AppendToBuf in FF 24 (others that have crashes on this, like 26a2) and see what code the crashstats might identify as 471?) Ted, likewise since you know something about crashstats? If it's all of BuildNormalizedSpec, it's a bigger space; if it's just the start.... In theory we should be able to say "if crashstats says it failed here to here it's line 471". Also, jim - can you dump the actual code at some of the actual crash addresses in the reports (you may need to look at the raw reports) so we can try to figure out which operation really failed? Thanks! Fun fun fun...
Flags: needinfo?(nchen)
(In reply to Boris Zbarsky [:bz] (Vacation Oct 12 - Oct 27) from comment #15) > The crash is claimed to be on line 465 here if the crash-stats links aren't > lying: The top frame of the crash report comes directly from the signal handler context, so it's the most reliable thing we have. (Whether the source line matches up is debatable, certainly.) > Could crash-stats be wrong about the SIGBUS being on 0x0? Breakpad simply uses the si_addr field of the siginfo_t it gets in the signal handler to fill the "crash address" field: http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/linux/minidump_writer/minidump_writer.cc#1759 If this is being set incorrectly then it's either the kernel or libc's fault, not ours. I will take a peek at the symbols in a bit.
Dumping the minidump through minidump_stackwalk shows that the top frame is at libxul.so + 0x35a67e. The matching line in the symbol file does indeed map this to line 465 in that file. Per the symbols, the preceding bytes in this function map to this line: http://hg.mozilla.org/releases/mozilla-release/annotate/394220b26adc/netwerk/base/src/nsStandardURL.cpp#l567 So the crash does appear to be occurring in the inlined AppendToBuf call on the following line. I don't really have the motivation to fetch the matching binaries and disassemble the function in question, but that would probably be the next step.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #23) > Dumping the minidump through minidump_stackwalk shows that the top frame is > at libxul.so + 0x35a67e. The matching line in the symbol file does indeed > map this to line 465 in that file. Per the symbols, the preceding bytes in > this function map to this line: > http://hg.mozilla.org/releases/mozilla-release/annotate/394220b26adc/netwerk/ > base/src/nsStandardURL.cpp#l567 That's AppendToBuf(buf, i, "://", 3). Given that's right after two functions that write to buf, buf really can't/shouldn't be null, nor should a string literal. Seriously weird. And why android only? Compiler bug? > So the crash does appear to be occurring in the inlined AppendToBuf call on > the following line. I don't really have the motivation to fetch the matching > binaries and disassemble the function in question, but that would probably > be the next step. Brad, can you find someone good at ARM asm to look over a production build where we have reports of this bug? (24 and 26a2 at least show it; mostly reports from 24)
Flags: needinfo?(blassey.bugs)
Jim, mind having a look?
Flags: needinfo?(blassey.bugs)
This looks to be an unaligned access fault. GCC 4.7 now defaults to -munaligned-access on ARM. With that option, GCC optimizes the 3-byte memcpy into a 2-byte read/write plus a 1-byte read-write, and the string literal "://" happens to be not 2-byte aligned. Unaligned access on ARMv6/ARMv7 depends on the kernel setting the right CPU flags, and it appears on some oldish kernels it still causes a problem. I think it's best to include -mno-unaligned-access in our flags, but obviously we'd take some performance penalty on platforms that do support unaligned access. Glandium, where would be the best place to include -mno-unaligned-access?
Flags: needinfo?(nchen) → needinfo?(mh+mozilla)
That shouldn't crash the memcpy, though. Are you sure this is the problem?
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #28) > That shouldn't crash the memcpy, though. Are you sure this is the problem? I think so. In this case, with default -munaligned-access, the memcpy call is optimized into > ldrh r0, [r3, #0] < SIGBUS > ldrb r3, [r3, #2] > strh r0, [r5, r2] > adds r1, r5, r2 > strb r3, [r1, #2] r3 is 0x6223A5B3 in one of the crash reports. With -mno-unaligned-access, the memcpy call is kept and not optimized away.
BTW, the call in question is > memcpy(buf + i, "://", 3); where buf is r5, i is r2, and the string literal is r3 in the disassembly above.
Is /proc/cpu/alignment "2" on the devices where the crash happens?
I doubt we have any easy way to know that as we don't have a device with this problem; perhaps we can contact someone who gave a report? blassey: shall we reassign this to someone on your team, or to build eng? I don't want this one to drop on the floor since we're closing on a solution (though I don't think we have 100% confidence yet on the cause).
Flags: needinfo?(blassey.bugs)
Attached patch no_unalign.patch (obsolete) — Splinter Review
Assignee: nobody → blassey.bugs
Attachment #816737 - Flags: review?(mh+mozilla)
Flags: needinfo?(blassey.bugs)
Attached patch nspr_no_unalign.patch (obsolete) — Splinter Review
Ted, feel free to steal Glandium's review if you get here first
Attachment #816742 - Flags: review?(ted)
Comment on attachment 816737 [details] [diff] [review] no_unalign.patch Review of attachment 816737 [details] [diff] [review]: ----------------------------------------------------------------- ::: build/autoconf/arch.m4 @@ +163,5 @@ > > +case "$MOZ_ALIGN" in > +toolchain-default|"") > + align_flag="" > + ;; You can remove this case, since it's the same as * below.
Attachment #816737 - Flags: review?(mh+mozilla) → review+
topcrash is being replaced by more precise keywords per https://bugzilla.mozilla.org/show_bug.cgi?id=927557#c3
(In reply to Mike Hommey [:glandium] from comment #31) > Is /proc/cpu/alignment "2" on the devices where the crash happens? cat /proc/cpu/alignment User faults: 4 (signal)
Attached patch no_unalign.patch (obsolete) — Splinter Review
Attachment #816737 - Attachment is obsolete: true
Attachment #820256 - Flags: review?(mh+mozilla)
Comment on attachment 820256 [details] [diff] [review] no_unalign.patch try run failed on b2g
Attachment #820256 - Flags: review?(mh+mozilla) → review-
Attached patch no_unalign.patchSplinter Review
now actually testing the flag try run: https://tbpl.mozilla.org/?tree=Try&rev=71c83ecea53e
Attachment #820256 - Attachment is obsolete: true
Attachment #820311 - Flags: review?(mh+mozilla)
Comment on attachment 820311 [details] [diff] [review] no_unalign.patch Review of attachment 820311 [details] [diff] [review]: ----------------------------------------------------------------- ::: build/autoconf/arch.m4 @@ +163,5 @@ > > +case "$MOZ_ALIGN" in > +toolchain-default|"") > + align_flag="" > + ;; As per comment 35, please remove this case. @@ +179,5 @@ > +if test -n "$align_flag"; then > + _SAVE_CFLAGS="$CFLAGS" > + CFLAGS="$CFLAGS $align_flag" > + AC_MSG_CHECKING(whether alignment flag ($align_flag) is supported) > + AC_TRY_COMPILE([],[return 0;],,align_flag="") AC_TRY_COMPILE([],[],,align_flag="") should be enough.
Attachment #820311 - Flags: review?(mh+mozilla) → review+
nspr patch with test for compiler support
Attachment #816742 - Attachment is obsolete: true
Attachment #816742 - Flags: review?(ted)
Attachment #820970 - Flags: review?(ted)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Fx26 is affected. The crash is currently #7. Is this safe enough to uplift?
Given that this is one of the major regressions that has made our crash rate worsen between the 23 and 24 releases, and it's now the #7 topcrash on 25 release and 26 beta, and the fix has worked on 27 and later, I'm requesting tracking and ni? on blassey for uplift depending on risk assessment.
Flags: needinfo?(blassey.bugs)
I think this is relatively low risk. I'd also like to get the NSPR patch on trunk though, so adding a needinfo for ted.
Flags: needinfo?(blassey.bugs) → needinfo?(ted)
Attachment #820311 - Flags: approval-mozilla-beta?
Attachment #820311 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 820970 [details] [diff] [review] nspr_no_unalign.patch Review of attachment 820970 [details] [diff] [review]: ----------------------------------------------------------------- r=me with one change. (Also we usually like to have NSPR changes in separate bugs for tracking purposes.) ::: nsprpub/configure.in @@ +1149,5 @@ > ;; > esac > > +case "$MOZ_ALIGN" in > +toolchain-default|"") I don't see anywhere that sets MOZ_ALIGN=toolchain-default. (NSPR doesn't have arch.m4 which is what the top-level configure uses.) Can you fix that to set that as the default?
Attachment #820970 - Flags: review?(ted) → review+
Flags: needinfo?(ted)
Blocks: 941461
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: