Closed Bug 784368 Opened 7 years ago Closed 7 years ago
"make package" broken in ASan builds (GC related ASan failure)
Currently, the ASan builds on mozilla-central are broken because an xpcshell invocation in make package fails. The failing invocation is

dist/bin/xpcshell -g "$PWD" -a "$PWD" -f /srv/repos/browser/mozilla-central/toolkit/mozapps/installer/precompile_cache.js -e "populate_startupcache('GreD', 'omni.ja');"

run from the dist/firefox directory. This is related to JS caching. The ASan trace in a debug-only build looks like this (rev 35b8d6ef5d46):

==51726== ERROR: AddressSanitizer crashed on unknown address 0x7fabc4012850 (pc 0x7fabc4012850 sp 0x7fffd1f00b58 bp 0x7fffd1f00cf0 T0)
AddressSanitizer can not provide additional info. ABORTING
#0 0x7fabc4012850
#1 0x7fabf8debe15 in bool js::gc::Arena::finalize<JSObject>(js::FreeOp*, js::gc::AllocKind, unsigned long) js/src/jsgc.cpp:320
#2 0x7fabf8dca565 in bool js::gc::FinalizeTypedArenas<JSObject>(js::FreeOp*, js::gc::ArenaHeader**, js::gc::ArenaList&, js::gc::AllocKind, js::SliceBudget&) js/src/jsgc.cpp:383
#3 0x7fabf8cf0190 in js::gc::FinalizeArenas(js::FreeOp*, js::gc::ArenaHeader**, js::gc::ArenaList&, js::gc::AllocKind, js::SliceBudget&) js/src/jsgc.cpp:420
#4 0x7fabf8cf1f8e in js::gc::ArenaLists::finalizeNow(js::FreeOp*, js::gc::AllocKind) js/src/jsgc.cpp:1596
#5 0x7fabf8cf17aa in js::gc::ArenaLists::queueObjectsForSweep(js::FreeOp*) js/src/jsgc.cpp:1716
#6 0x7fabf8d9a31a in BeginSweepPhase(JSRuntime*) js/src/jsgc.cpp:3714
#7 0x7fabf8d955b1 in IncrementalCollectSlice(JSRuntime*, long, js::gcreason::Reason, js::JSGCInvocationKind) js/src/jsgc.cpp:4119
#8 0x7fabf8d92587 in GCCycle(JSRuntime*, bool, long, js::JSGCInvocationKind, js::gcreason::Reason) js/src/jsgc.cpp:4288
#9 0x7fabf8d1b994 in Collect(JSRuntime*, bool, long, js::JSGCInvocationKind, js::gcreason::Reason) js/src/jsgc.cpp:4396
#10 0x7fabf8d0a30b in js::GC(JSRuntime*, js::JSGCInvocationKind, js::gcreason::Reason) js/src/jsgc.cpp:4419
#11 0x7fabf8ac0264 in js::DestroyContext(JSContext*, js::DestroyContextMode) js/src/jscntxt.cpp:398
#12 0x7fabf87a1da8 in JS_DestroyContext js/src/jsapi.cpp:1242
#13 0x7fabee12ddda in mozJSComponentLoader::UnloadModules() js/xpconnect/loader/mozJSComponentLoader.cpp:969
#14 0x7fabee152989 in mozJSComponentLoader::Observe(nsISupports*, char const*, unsigned short const*) js/xpconnect/loader/mozJSComponentLoader.cpp:1308
#15 0x7fabee152a8f in non-virtual thunk to mozJSComponentLoader::Observe(nsISupports*, char const*, unsigned short const*) ???:0
#16 0x7fabf1c36961 in mozilla::ShutdownXPCOM(nsIServiceManager*) xpcom/build/nsXPComInit.cpp:664
#17 0x7fabf1c35ae3 in NS_ShutdownXPCOM_P xpcom/build/nsXPComInit.cpp:541
#18 0x7fac00bbba93 in NS_ShutdownXPCOM xpcom/stub/nsXPComStub.cpp:134
#19 0x429ed4 in main js/xpconnect/shell/xpcshell.cpp:1957
#20 0x7fabe25e230d in __libc_start_main /build/buildd/eglibc-2.13/csu/libc-start.c:258

However, it affects all builds, including opt-only.

Until the issue is resolved, I have applied the patch in 742899, which disables the JS caching. This doesn't fix the actual problem, but it allows us to remove the blocking xpcshell invocation. I ran all tests on try and didn't see any unusual failures, so the resulting builds may still be usable for other testing.

Billm has already been looking into the issue on one of my machines, but I don't know how far he has got.
I did a manual bisect and managed to isolate the regressing changeset:

The first bad revision is:
changeset: 102264:11372010763f
user: David Rajchenbach-Teller
date: Mon Aug 13 21:54:42 2012 -0400
summary: Bug 771927 - [OS.File] (de)serialize ArrayBuffer, C arrays, C pointers, Strings - Implementation. r=froydnj

David, is this likely to be causing the actual problem or is it just surfacing another issue?
An analysis from terrence on IRC:

<terrence> decoder: well, the bisected bug you found adds code to pass raw pointers to the insides of js objects out of the js engine without any sort of barrier or api interface
<terrence> decoder: given that the failure appears to happen at free(), i'd immediately suspect heap corruption caused by this lunacy
I believe I know what is happening, because I have been bitten by this already. My code makes use of |nsIOSFileConstantsService|. This component expects callers to follow a reasonable protocol (i.e. if initialization failed, do not execute the methods). Unfortunately, |populate_startupcache| manages both to fail initialization (due to some missing runtime dependencies, don't ask me how) and to misuse the API (by completely ignoring exceptions).

It turns out that I already have a patch on inbound that should solve the issue, as part of bug 775588. Christian, could you recheck once that patch has landed?

By the way, my patch effectively adds a memory-unsafe API (which will disappear once bug 720949 and bug 720083 have landed), but that API is never called by |populate_startupcache|.
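To illustrate the protocol mentioned in this comment (do not call the methods after a failed initialization, and do not swallow the exception), here is a minimal sketch in plain JavaScript. It is not the actual |nsIOSFileConstantsService| interface: ConstantsService, init, getConstants, carelessCaller and carefulCaller are hypothetical names made up for this example, and the returned constant is a placeholder.

```javascript
// Hypothetical stand-in for a component with an init-before-use protocol.
// None of these names come from the Mozilla tree.
class ConstantsService {
  constructor(dependenciesAvailable) {
    this.dependenciesAvailable = dependenciesAvailable;
    this.initialized = false;
  }

  // Throws if a runtime dependency is missing.
  init() {
    if (!this.dependenciesAvailable) {
      throw new Error("initialization failed: missing runtime dependency");
    }
    this.initialized = true;
  }

  // Only valid after a successful init().
  getConstants() {
    if (!this.initialized) {
      throw new Error("getConstants() called on uninitialized service");
    }
    return { O_RDONLY: 0 }; // placeholder value for illustration
  }
}

// Misuse: the init failure is swallowed, then the service is used anyway.
function carelessCaller(service) {
  try {
    service.init();
  } catch (e) {
    // Exception ignored; the caller carries on as if init() had succeeded.
  }
  return service.getConstants();
}

// Respecting the protocol: stop as soon as init() fails.
function carefulCaller(service) {
  try {
    service.init();
  } catch (e) {
    return null; // Do not touch the service after a failed init().
  }
  return service.getConstants();
}
```

In this sketch the second check inside getConstants turns the misuse into a visible exception. A native component without such a guard could presumably fail in much less obvious ways, which would be consistent with the crash surfacing only later, during GC at shutdown.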
Depends on: 775588
I am almost sure this is a dup of 785102.
In that case, a fix may already have landed.
The issue is gone now on tip (tested 1c0ac073dc65). Is this just masked or has the necessary fix landed?
A fix landed in bug 785102. There is still some discussion going on on that bug about a better way to fix it.
Resolving as fixed per comment 8; I haven't hit this problem since then.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED