Created attachment 535858 [details] crash report linux

1. http://nfomation.net/sitemap.xml
2. Crash after timing out

Windows XP and Linux 32bit beta so far. One instance on Linux so far.

Operating system: Linux
                  0.0.0 Linux 188.8.131.52-91.fc14.i686.PAE #1 SMP Tue May 3 13:29:55 UTC 2011 i686
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash reason: SIGSEGV
Crash address: 0x3db5

Thread 0 (crashed)
 0  libxul.so!txExecutionState::end [txExecutionState.cpp : 202 + 0x6]
    eip = 0x016edce2 esp = 0xbf9bc3a0 ebp = 0xbf9bc3c8 ebx = 0x03282054
    esi = 0x0a8198c0 edi = 0x014a0cf6 eax = 0x00003db5 ecx = 0x0a78c220
    edx = 0x00b2a3a0 efl = 0x00010246
    Found by: given as instruction pointer in context
 1  libxul.so!txMozillaXSLTProcessor::TransformToDoc [txMozillaXSLTProcessor.cpp : 718 + 0x14]
    eip = 0x017319ef esp = 0xbf9bc3d0 ebp = 0xbf9bc5a8 ebx = 0x03282054
    esi = 0x0a8198c0 edi = 0x014a0cf6
    Found by: call frame info
 2  libxul.so!nsTransformBlockerEvent::Run [txMozillaXSLTProcessor.cpp : 595 + 0x25]
    eip = 0x01735e82 esp = 0xbf9bc5b0 ebp = 0xbf9bc5c8 ebx = 0x03282054
    esi = 0xbf9bc61c edi = 0x00000000
    Found by: call frame info
 3  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 618 + 0x16]
    eip = 0x0245a965 esp = 0xbf9bc5d0 ebp = 0xbf9bc658 ebx = 0x03282054
    esi = 0xbf9bc61c edi = 0x00000000
    Found by: call frame info
 4  libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 250 + 0x1f]
    eip = 0x023ebabd esp = 0xbf9bc660 ebp = 0xbf9bc698 ebx = 0x03282054
    esi = 0x00000001 edi = 0x0a00e018
    Found by: call frame info
 5  libxul.so!mozilla::ipc::MessagePump::Run [MessagePump.cpp : 110 + 0x15]
    eip = 0x022ca908 esp = 0xbf9bc6a0 ebp = 0xbf9bc6e8 ebx = 0x03282054
    esi = 0x00000001 edi = 0x0a00e018
    Found by: call frame info
Created attachment 535859 [details] crash report xp

damn, I didn't notice the address at first.

Operating system: Windows NT
                  5.1.2600 Service Pack 3
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash reason: EXCEPTION_ACCESS_VIOLATION_READ
Crash address: 0xffffffffddddddf1

Thread 0 (crashed)
 0  xul.dll!txExecutionState::end(unsigned int) [txExecutionState.cpp : 202 + 0x12]
    eip = 0x109e3688 esp = 0x0012d4f4 ebp = 0x0012d4fc ebx = 0x00000001
    esi = 0x01987af8 edi = 0x00000000 eax = 0x0012d540 ecx = 0x036a4e80
    edx = 0xdddddddd efl = 0x00010286
    Found by: given as instruction pointer in context
 1  xul.dll!txMozillaXSLTProcessor::TransformToDoc(nsIDOMDocument *,nsIDOMDocument * *) [txMozillaXSLTProcessor.cpp : 718 + 0x11]
    eip = 0x10a2ff29 esp = 0x0012d504 ebp = 0x0012d6d0
    Found by: call frame info
 2  xul.dll!nsTransformBlockerEvent::Run() [txMozillaXSLTProcessor.cpp : 595 + 0x15]
    eip = 0x10a2f749 esp = 0x0012d6d8 ebp = 0x0012d6e0
    Found by: call frame info
 3  xul.dll!nsThread::ProcessNextEvent(int,int *) [nsThread.cpp : 618 + 0x18]
    eip = 0x118cb174 esp = 0x0012d6e8 ebp = 0x0012d748
    Found by: call frame info
 4  xul.dll!NS_ProcessNextEvent_P(nsIThread *,int) [nsThreadUtils.cpp : 250 + 0x15]
    eip = 0x1184d113 esp = 0x0012d750 ebp = 0x0012d764
    Found by: call frame info
 5  xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [MessagePump.cpp : 110 + 0xd]
    eip = 0x1171161d esp = 0x0012d76c ebp = 0x0012d798
    Found by: call frame info
Couldn't reproduce on a Mac Aurora (release, not debug). I doubt we fixed anything in xslt between Fx5 and Fx6. The document is a huge sitemap and takes forever to process, maybe you've got less memory dedicated to your VM?
I was going to zip up the site .xml and .xsl files and attach them, but even zipped they're 6 MB. Uncompressed, the sitemap is 50 MB.
Yeah, the original vms this ran on are only 1G for xp and 2G for linux. I only reproduced this with beta though, not aurora or nightly. I'll try locally.
I tried to reproduce locally with a 1G vm on WinXP and Linux but have been unable to so far. I can reproduce it in the automation vms though. I'll keep trying. I do have working valgrind on linux now that I've disabled elfhack. I'll try that too.
automation retested with Windows XP, Windows 7, Linux 32bit and 64bit using 2.0.0, beta, aurora, nightly.

signature                                address             os            branch
txExecutionState::end                    0xffffffffddddddf1  Windows XP    beta
                                         0xffffffffddddddf1  Windows 7     beta
                                         0x6cf3              Linux 32bit   beta
                                         0x0                 Linux 64bit   nightly
txMozillaXSLTProcessor::TransformToDoc   0x1                 Windows XP    aurora/nightly
                                         0x1                 Windows 7     aurora/nightly

trying a valgrind run on 64bit linux now.
valgrind on 32bit fedora 14 nightly gave during shutdown

==23727== Mismatched free() / delete / delete []
==23727==    at 0x4006951: operator delete(void*) (vg_replace_malloc.c:387)
==23727==    by 0x5DD4C77: js::RegExp::~RegExp() (jsregexpinlines.h:133)
==23727==    by 0x5DD89EE: void JSContext::delete_<js::RegExp>(js::RegExp*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/library/libxul.so)
==23727==    by 0x5DD5C76: js::RegExp::decref(JSContext*) (jsregexpinlines.h:568)
==23727==    by 0x5EE5F95: regexp_finalize(JSContext*, JSObject*) (jsregexp.cpp:393)
==23727==    by 0x5E498F3: JSObject::finalize(JSContext*) (jsobjinlines.h:141)
==23727==    by 0x5E50B61: bool js::gc::Arena::finalize<JSObject_Slots8>(JSContext*) (jsgc.cpp:224)
==23727==    by 0x5E47FE2: void js::gc::FinalizeArenas<JSObject_Slots8>(JSContext*, js::gc::ArenaHeader**) (jsgc.cpp:271)
==23727==    by 0x5E4DDC6: void js::gc::ArenaList::finalizeNow<JSObject_Slots8>(JSContext*) (jsgc.cpp:1216)
==23727==    by 0x5E45AF5: JSCompartment::finalizeObjectArenaLists(JSContext*) (jsgc.cpp:1966)
==23727==    by 0x5E46B09: MarkAndSweep(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2348)
==23727==    by 0x5E4743F: GCCycle(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2643)
==23727==  Address 0xfa47268 is 0 bytes inside a block of size 104 alloc'd
==23727==    at 0x40078AE: malloc (vg_replace_malloc.c:236)
==23727==    by 0x6047A93: js_malloc (jsutil.h:234)
==23727==    by 0x604FB0A: JSC::Yarr::BytecodePattern* js::OffTheBooks::new_<JSC::Yarr::BytecodePattern, JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*>(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/library/libxul.so)
==23727==    by 0x604D416: JSC::Yarr::ByteCompiler::compile(WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1461)
==23727==    by 0x6047AEE: JSC::Yarr::byteCompile(JSC::Yarr::YarrPattern&, WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1897)
==23727==    by 0x5DD5A80: js::RegExp::compileHelper(JSContext*, JSLinearString&) (jsregexpinlines.h:493)
==23727==    by 0x5DD5B2C: js::RegExp::compile(JSContext*) (jsregexpinlines.h:506)
==23727==    by 0x5DD5777: js::RegExp::create(JSContext*, JSString*, unsigned int) (jsregexpinlines.h:420)
==23727==    by 0x5EE6729: SwapRegExpInternals(JSContext*, JSObject*, js::Value*, JSString*, unsigned int) (jsregexp.cpp:570)
==23727==    by 0x5EE6FF9: CompileRegExpAndSwap(JSContext*, JSObject*, unsigned int, js::Value*, js::Value*) (jsregexp.cpp:758)
==23727==    by 0x5EE7186: regexp_construct(JSContext*, unsigned int, js::Value*) (jsregexp.cpp:797)
==23727==    by 0x5E66063: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, js::Value*), unsigned int, js::Value*) (jscntxtinlines.h:277)
==23727==
Gal, who should look at a mismatched free/delete in js::RegExp::~RegExp() ?
cdleary knows the code well.
Created attachment 538264 [details] [diff] [review] Match new and delete. dmandelin, this was one of the // YYY things from the update YARR patch with the bare delete. What are those // YYYs about?
Comment on attachment 538264 [details] [diff] [review] Match new and delete. The YYY was a tag I was using during development to mark things that needed to be fixed up in some way before landing. There would have been some in the intermediate patches, but there should have been none in the final patch. No idea how that one got through.
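For readers following the valgrind output: the block was obtained from malloc (through js_malloc) but released with operator delete. A minimal sketch of what "matching" the two sides can look like is below. The names Pattern, mallocNew, mallocDelete and roundTrip are hypothetical and are not taken from the patch or from the SpiderMonkey tree; this only illustrates the general pattern of pairing a malloc-backed placement new with an explicit destructor call plus free.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical stand-in for an object that lives in malloc-backed storage.
// It bumps a counter on construction and drops it on destruction so the
// pairing can be observed.
struct Pattern {
    explicit Pattern(int* liveCount) : mLiveCount(liveCount) { ++*mLiveCount; }
    ~Pattern() { --*mLiveCount; }
    int* mLiveCount;
};

// Allocate raw storage with malloc and construct the object in place.
// (Assumes T's constructor does not throw; a throwing constructor would
// need the storage freed before propagating.)
template <typename T, typename... Args>
T* mallocNew(Args&&... args) {
    void* mem = std::malloc(sizeof(T));
    if (!mem) {
        return nullptr;
    }
    return new (mem) T(std::forward<Args>(args)...);
}

// The matching release: run the destructor explicitly, then hand the
// storage back to free(). Calling plain `delete p` here would be the
// mismatch that valgrind reports.
template <typename T>
void mallocDelete(T* p) {
    if (!p) {
        return;
    }
    p->~T();
    std::free(p);
}

// Creates and destroys one Pattern; returns (count while alive) * 10 +
// (count after release), or -1 if allocation failed.
int roundTrip() {
    int live = 0;
    Pattern* p = mallocNew<Pattern>(&live);
    if (!p) {
        return -1;
    }
    assert(live == 1);
    const int during = live;
    mallocDelete(p);
    return during * 10 + live;
}
```

Calling roundTrip() exercises one matched allocate/release pair; running such code under valgrind should not produce a "Mismatched free() / delete" report, because the storage goes back to the allocator it came from.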
Hope bc can retest when this patch lands to make sure it fixes the bug he's seeing.
http://hg.mozilla.org/tracemonkey/rev/e39bcd6cba32 I'm not going to mark as fixed-in-tracemonkey because I'm unsure this valgrind spew reflects the actual cause of the original error.
cdleary-bot mozilla-central merge info: http://hg.mozilla.org/mozilla-central/rev/e39bcd6cba32 Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
I'll test the patch on tracemonkey to see if the valgrind issues were related.
Created attachment 539240 [details] 32 bit valgrind run on tracemonkey I don't crash locally on either 32 bit or 64 bit when just running the browser, but do crash in both (in different places) when running under valgrind. 32 bit crashes just after a use and read of an uninitialized value in nsDOMOfflineResourceList::SwapCache() (nsDOMOfflineResourceList.cpp:543)
Created attachment 539242 [details] 64 bit valgrind run on tracemonkey 64 bit crashes during UnwindBacktrace with js::ExternalGetOrSet(JSContext*, JSObject*, jsid, js::Value const&, JSAccessMode, unsigned int, js::Value*, js::Value*) (jsinterp.cpp:845) on the stack after an invalid read
Jonas, can you look into this? Is there anything XSLT related going on here, or is this something else?
Peter, could you look into this XSLT critical crash bug?
This won't make it for the 6 train.
I can't reproduce this in a trunk build on linux or OS X.
I just retested the url using the automation on Windows XP, Windows 7, Mac OS X 10.5, and Fedora 14 32bit and 64bit, and could only reproduce crashes on Windows XP and Windows 7 for beta, aurora and nightly. I didn't see the deleted-memory pattern in the crash address, but Ted's exploitable tool for analyzing the minidumps showed high for aurora and nightly and none for beta, for some reason. I'm running it locally under msvc with a slightly differently configured xp but haven't crashed yet. More news as I get it.
Ok, things have changed a bit in terms of how to reproduce. I haven't been able to crash just loading the page on either Windows XP or Mac OS X 10.5 on beta, aurora or nightly. I can reproduce on Windows XP and Mac OS X 10.5 on beta, aurora and nightly, at least, if I:
1. start the browser and attach a debugger
2. start loading http://nfomation.net/sitemap.xml
3. exit the browser before the file has completed transferring.
I haven't seen any indication of deleted memory. Not sure if that is meaningful or not. I'll try it out on Linux with and without valgrind to see if there is anything useful there.
If you need the files from when this bug was first reported I think I have a copy of that state -- let me know and I can put it up on people.m.o or something
I was able to reproduce this in a Windows XP VM using the latest trunk nightly. https://crash-stats.mozilla.com/report/index/bp-10f3bb0f-1a61-4bb5-8e90-d4eee2110811 is my crash report.
Created attachment 555522 [details] [diff] [review] v1 I don't really know how we could make an automated testcase for this; we need to either run out of memory or quit while loading the source file.
Peter, do you think we should just go ahead and take this fix w/o tests given the lack of an environment where we can realistically test this? If so, is this patch ready to review, or are you still working on it? We're running very low on time for fixes for 7, and this would be great to get in there...
Comment on attachment 555522 [details] [diff] [review] v1 If we stop before we have source and stylesheet then txExecutionState::init fails, but that was leaving mOutputHandler as a dangling pointer.
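The failure mode described in this comment can be sketched roughly as follows. This is an illustration only, not the contents of the v1 patch: ExecutionState, OutputHandler, and the init/end signatures are simplified, hypothetical stand-ins for txExecutionState and its output handler. The sketch assumes the shape of the fix is to give the pointer a known null value at construction and to check it before use, so that end() is harmless after a failed init().

```cpp
#include <cassert>

// Hypothetical stand-in for the real output handler.
struct OutputHandler {
    bool ended = false;
    void endDocument() { ended = true; }
};

// Simplified, hypothetical stand-in for txExecutionState.
class ExecutionState {
public:
    // Start the pointer at a known value. Without this initializer the
    // member holds garbage whenever init() bails out early.
    ExecutionState() : mOutputHandler(nullptr) {}

    // Fails (returns false) when the source or the stylesheet is missing,
    // for example because loading was interrupted. On failure the handler
    // pointer is left untouched, so it stays null.
    bool init(bool haveSource, bool haveStylesheet, OutputHandler* handler) {
        if (!haveSource || !haveStylesheet) {
            return false;
        }
        mOutputHandler = handler;
        return true;
    }

    // Null check before use: returns false instead of dereferencing a
    // pointer that was never set.
    bool end() {
        if (!mOutputHandler) {
            return false;
        }
        mOutputHandler->endDocument();
        return true;
    }

private:
    OutputHandler* mOutputHandler;
};

// end() after a failed init(): must not touch the handler.
bool endAfterFailedInit() {
    ExecutionState state;
    OutputHandler handler;
    const bool ok = state.init(/* haveSource = */ false,
                               /* haveStylesheet = */ true, &handler);
    assert(!ok);
    return state.end() || handler.ended;
}

// end() after a successful init(): the handler is notified.
bool endAfterGoodInit() {
    ExecutionState state;
    OutputHandler handler;
    const bool ok = state.init(true, true, &handler);
    assert(ok);
    return state.end() && handler.ended;
}
```

In this sketch endAfterFailedInit() models quitting before the source has finished loading: init() fails, and the later end() call becomes a no-op rather than a read through an unset pointer.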
Comment on attachment 555522 [details] [diff] [review] v1 Review of attachment 555522 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Pushed to inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/c3d40a5579b1 And I realized I forgot to add r=sicking in the commit message :(
NO SOUP FOR YOU!
Clearly you must mean no Burrata :)
Comment on attachment 555522 [details] [diff] [review] v1 This is effectively changing us from accessing uninitialized memory to initializing a pointer and adding a null check. Very safe fix, and a sg:critical bug. We should IMO take this on both aurora and beta.
Comment on attachment 555522 [details] [diff] [review] v1 Approved for mozilla-aurora and mozilla-beta
We should take this safe fix in 3.6.x as well.
Changing to qa- as this is an assertion.