Closed Bug 660453 Opened 13 years ago Closed 13 years ago

Crash [@ txExecutionState::end]

Categories

(Core :: XSLT, defect)

5 Branch
x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox5 - wontfix
firefox6 - wontfix
firefox7 + fixed
firefox8 + fixed
blocking2.0 --- -
status2.0 --- wontfix
blocking1.9.2 --- .23+
status1.9.2 --- .23-fixed

People

(Reporter: bc, Assigned: peterv)

References

()

Details

(Keywords: crash, reproducible, Whiteboard: [sg:critical?],[qa-])

Crash Data

Attachments

(6 files)

Attached file crash report linux
1. http://nfomation.net/sitemap.xml
2. Crash after timing out

Windows XP and Linux 32bit beta so far.

One instance on Linux so far.

Operating system: Linux
                  0.0.0 Linux 2.6.35.13-91.fc14.i686.PAE #1 SMP Tue May 3 13:29:55 UTC 2011 i686
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash reason:  SIGSEGV
Crash address: 0x3db5

Thread 0 (crashed)
 0  libxul.so!txExecutionState::end [txExecutionState.cpp : 202 + 0x6]
    eip = 0x016edce2   esp = 0xbf9bc3a0   ebp = 0xbf9bc3c8   ebx = 0x03282054
    esi = 0x0a8198c0   edi = 0x014a0cf6   eax = 0x00003db5   ecx = 0x0a78c220
    edx = 0x00b2a3a0   efl = 0x00010246
    Found by: given as instruction pointer in context
 1  libxul.so!txMozillaXSLTProcessor::TransformToDoc [txMozillaXSLTProcessor.cpp : 718 + 0x14]
    eip = 0x017319ef   esp = 0xbf9bc3d0   ebp = 0xbf9bc5a8   ebx = 0x03282054
    esi = 0x0a8198c0   edi = 0x014a0cf6
    Found by: call frame info
 2  libxul.so!nsTransformBlockerEvent::Run [txMozillaXSLTProcessor.cpp : 595 + 0x25]
    eip = 0x01735e82   esp = 0xbf9bc5b0   ebp = 0xbf9bc5c8   ebx = 0x03282054
    esi = 0xbf9bc61c   edi = 0x00000000
    Found by: call frame info
 3  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 618 + 0x16]
    eip = 0x0245a965   esp = 0xbf9bc5d0   ebp = 0xbf9bc658   ebx = 0x03282054
    esi = 0xbf9bc61c   edi = 0x00000000
    Found by: call frame info
 4  libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 250 + 0x1f]
    eip = 0x023ebabd   esp = 0xbf9bc660   ebp = 0xbf9bc698   ebx = 0x03282054
    esi = 0x00000001   edi = 0x0a00e018
    Found by: call frame info
 5  libxul.so!mozilla::ipc::MessagePump::Run [MessagePump.cpp : 110 + 0x15]
    eip = 0x022ca908   esp = 0xbf9bc6a0   ebp = 0xbf9bc6e8   ebx = 0x03282054
    esi = 0x00000001   edi = 0x0a00e018
    Found by: call frame info
Group: core-security
Attached file crash report xp
damn, I didn't notice the address at first.

Operating system: Windows NT
                  5.1.2600 Service Pack 3
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
Crash address: 0xffffffffddddddf1

Thread 0 (crashed)
 0  xul.dll!txExecutionState::end(unsigned int) [txExecutionState.cpp : 202 + 0x12]
    eip = 0x109e3688   esp = 0x0012d4f4   ebp = 0x0012d4fc   ebx = 0x00000001
    esi = 0x01987af8   edi = 0x00000000   eax = 0x0012d540   ecx = 0x036a4e80
    edx = 0xdddddddd   efl = 0x00010286
    Found by: given as instruction pointer in context
 1  xul.dll!txMozillaXSLTProcessor::TransformToDoc(nsIDOMDocument *,nsIDOMDocument * *) [txMozillaXSLTProcessor.cpp : 718 + 0x11]
    eip = 0x10a2ff29   esp = 0x0012d504   ebp = 0x0012d6d0
    Found by: call frame info
 2  xul.dll!nsTransformBlockerEvent::Run() [txMozillaXSLTProcessor.cpp : 595 + 0x15]
    eip = 0x10a2f749   esp = 0x0012d6d8   ebp = 0x0012d6e0
    Found by: call frame info
 3  xul.dll!nsThread::ProcessNextEvent(int,int *) [nsThread.cpp : 618 + 0x18]
    eip = 0x118cb174   esp = 0x0012d6e8   ebp = 0x0012d748
    Found by: call frame info
 4  xul.dll!NS_ProcessNextEvent_P(nsIThread *,int) [nsThreadUtils.cpp : 250 + 0x15]
    eip = 0x1184d113   esp = 0x0012d750   ebp = 0x0012d764
    Found by: call frame info
 5  xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [MessagePump.cpp : 110 + 0xd]
    eip = 0x1171161d   esp = 0x0012d76c   ebp = 0x0012d798
    Found by: call frame info
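The address in this report stands out because its low 32 bits, 0xddddddf1, sit just past 0xdddddddd, the value held in edx. A minimal sketch of that arithmetic follows, assuming that 0xdd is a fill byte written over freed memory (some debug allocators do this; the report does not say which allocator these builds used). The helper name `offsetFromFillPattern` and the 0x1000 limit are hypothetical and not part of any Mozilla tool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: if `address` lies a small distance past a 32-bit value
// whose four bytes all equal `fill`, store that distance in `*offset` and
// return true. Otherwise return false and leave `*offset` untouched.
bool offsetFromFillPattern(std::uint64_t address, std::uint8_t fill,
                           std::uint32_t* offset,
                           std::uint32_t maxOffset = 0x1000) {
    // The minidump prints 32-bit addresses sign-extended to 64 bits,
    // so keep only the low 32 bits.
    const std::uint32_t addr32 = static_cast<std::uint32_t>(address);
    // Repeat the fill byte four times, e.g. 0xdd -> 0xdddddddd.
    const std::uint32_t pattern = 0x01010101u * fill;
    if (addr32 >= pattern && addr32 - pattern <= maxOffset) {
        *offset = addr32 - pattern;
        return true;
    }
    return false;
}
```

For the Windows crash address the helper reports an offset of 0x14, which would be consistent with reading a member at a small offset from a pointer loaded out of freed memory. The Linux address 0x3db5 does not match the pattern.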
Couldn't reproduce on a Mac Aurora (release, not debug). I doubt we fixed anything in xslt between Fx5 and Fx6. The document is a huge sitemap and takes forever to process; maybe you've got less memory dedicated to your VM?
I was going to zip up the site .xml and .xsl files and attach them, but even zipped they're 6 MB. Uncompressed, the sitemap is 50 MB.
Yeah, the original vms this ran on are only 1G for xp and 2G for linux. I only reproduced this with beta though, not aurora or nightly. I'll try locally.
I tried to reproduce locally with a 1G vm on WinXP and Linux but have been unable to so far. I can reproduce it in the automation vms though. I'll keep trying. I do have working valgrind on linux now that I've disabled elfhack. I'll try that too.
Retested in automation on Windows XP, Windows 7, Linux 32bit and 64bit using 2.0.0, beta, aurora and nightly.

signature                               address             os           branch

txExecutionState::end                   0xffffffffddddddf1  Windows XP   beta
                                        0xffffffffddddddf1  Windows 7    beta
                                        0x6cf3              Linux 32bit  beta
                                        0x0                 Linux 64bit  nightly

txMozillaXSLTProcessor::TransformToDoc  0x1                 Windows XP   aurora/nightly
                                        0x1                 Windows 7    aurora/nightly

trying a valgrind run on 64bit linux now.
Assignee: nobody → jonas
Whiteboard: [sg:critical?]
valgrind on a 32bit Fedora 14 nightly reported this during shutdown:

==23727== Mismatched free() / delete / delete []
==23727==    at 0x4006951: operator delete(void*) (vg_replace_malloc.c:387)
==23727==    by 0x5DD4C77: js::RegExp::~RegExp() (jsregexpinlines.h:133)
==23727==    by 0x5DD89EE: void JSContext::delete_<js::RegExp>(js::RegExp*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/library/libxul.so)
==23727==    by 0x5DD5C76: js::RegExp::decref(JSContext*) (jsregexpinlines.h:568)
==23727==    by 0x5EE5F95: regexp_finalize(JSContext*, JSObject*) (jsregexp.cpp:393)
==23727==    by 0x5E498F3: JSObject::finalize(JSContext*) (jsobjinlines.h:141)
==23727==    by 0x5E50B61: bool js::gc::Arena::finalize<JSObject_Slots8>(JSContext*) (jsgc.cpp:224)
==23727==    by 0x5E47FE2: void js::gc::FinalizeArenas<JSObject_Slots8>(JSContext*, js::gc::ArenaHeader**) (jsgc.cpp:271)
==23727==    by 0x5E4DDC6: void js::gc::ArenaList::finalizeNow<JSObject_Slots8>(JSContext*) (jsgc.cpp:1216)
==23727==    by 0x5E45AF5: JSCompartment::finalizeObjectArenaLists(JSContext*) (jsgc.cpp:1966)
==23727==    by 0x5E46B09: MarkAndSweep(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2348)
==23727==    by 0x5E4743F: GCCycle(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2643)
==23727==  Address 0xfa47268 is 0 bytes inside a block of size 104 alloc'd
==23727==    at 0x40078AE: malloc (vg_replace_malloc.c:236)
==23727==    by 0x6047A93: js_malloc (jsutil.h:234)
==23727==    by 0x604FB0A: JSC::Yarr::BytecodePattern* js::OffTheBooks::new_<JSC::Yarr::BytecodePattern, JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*>(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/library/libxul.so)
==23727==    by 0x604D416: JSC::Yarr::ByteCompiler::compile(WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1461)
==23727==    by 0x6047AEE: JSC::Yarr::byteCompile(JSC::Yarr::YarrPattern&, WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1897)
==23727==    by 0x5DD5A80: js::RegExp::compileHelper(JSContext*, JSLinearString&) (jsregexpinlines.h:493)
==23727==    by 0x5DD5B2C: js::RegExp::compile(JSContext*) (jsregexpinlines.h:506)
==23727==    by 0x5DD5777: js::RegExp::create(JSContext*, JSString*, unsigned int) (jsregexpinlines.h:420)
==23727==    by 0x5EE6729: SwapRegExpInternals(JSContext*, JSObject*, js::Value*, JSString*, unsigned int) (jsregexp.cpp:570)
==23727==    by 0x5EE6FF9: CompileRegExpAndSwap(JSContext*, JSObject*, unsigned int, js::Value*, js::Value*) (jsregexp.cpp:758)
==23727==    by 0x5EE7186: regexp_construct(JSContext*, unsigned int, js::Value*) (jsregexp.cpp:797)
==23727==    by 0x5E66063: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, js::Value*), unsigned int, js::Value*) (jscntxtinlines.h:277)
==23727==
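The report above shows a block that was obtained through malloc (via js_malloc) being released with operator delete. A minimal sketch of keeping the two sides paired is below. `mallocNew`, `mallocDelete` and `Pattern` are hypothetical stand-ins, not the SpiderMonkey helpers and not the patch that landed for this bug.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical allocation helper: take raw storage from malloc and construct
// a T in it with placement new. Returns nullptr if malloc fails.
template <typename T, typename... Args>
T* mallocNew(Args&&... args) {
    void* memory = std::malloc(sizeof(T));
    if (!memory) {
        return nullptr;
    }
    return new (memory) T(std::forward<Args>(args)...);
}

// Matching release helper: run the destructor explicitly, then hand the
// storage back to free() rather than to operator delete.
template <typename T>
void mallocDelete(T* object) {
    if (!object) {
        return;
    }
    object->~T();
    std::free(object);
}

// Hypothetical payload type that counts live instances, so that the pairing
// of construction and destruction can be observed.
struct Pattern {
    explicit Pattern(int groups) : groupCount(groups) { ++liveCount; }
    ~Pattern() { --liveCount; }
    int groupCount;
    static int liveCount;
};
int Pattern::liveCount = 0;
```

Releasing an object created by `mallocNew` with a bare `delete` would be the kind of mismatch valgrind flags here, since the storage did not come from operator new.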
Gal, who should look at a mismatched free/delete in js::RegExp::~RegExp() ?
cdleary knows the code well.
dmandelin, this was one of the // YYY things from the update YARR patch with the bare delete. What are those // YYYs about?
Attachment #538264 - Flags: review?(dmandelin)
Comment on attachment 538264 [details] [diff] [review]
Match new and delete.

The YYY was a tag I was using during development to mark things that needed to be fixed up in some way before landing. There would have been some in the intermediate patches, but there should have been none in the final patch. No idea how that one got through.
Attachment #538264 - Flags: review?(dmandelin) → review+
Hope bc can retest when this patch lands to make sure it fixes the bug he's seeing.
will do.
http://hg.mozilla.org/tracemonkey/rev/e39bcd6cba32

I'm not going to mark as fixed-in-tracemonkey because I'm unsure this valgrind spew reflects the actual cause of the original error.
Crash Signature: [@ xExecutionState::end]
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/e39bcd6cba32
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
I'll test the patch on tracemonkey to see if the valgrind issues were related.
I don't crash locally on either 32 bit or 64 bit when just running the browser, but do crash in both (in different places) when running under valgrind.

32 bit crashes just after a use and read of an uninitialized value in nsDOMOfflineResourceList::SwapCache() (nsDOMOfflineResourceList.cpp:543)
64 bit crashes during UnwindBacktrace with js::ExternalGetOrSet(JSContext*, JSObject*, jsid, js::Value const&, JSAccessMode, unsigned int, js::Value*, js::Value*) (jsinterp.cpp:845) on the stack after an invalid read
Jonas, can you look into this? Is there anything XSLT related going on here, or is this something else?
Jonas, ping...
blocking2.0: --- → -
status2.0: --- → wontfix
Whiteboard: [sg:critical?] → [sg:critical?] [waiting on jonas]
Peter, could you look into this XSLT critical crash bug?
Assignee: jonas → peterv
Whiteboard: [sg:critical?] [waiting on jonas] → [sg:critical?]
This won't make it for the 6 train.
I can't reproduce this in a trunk build on linux or OS X.
I just retested the url using the automation on Windows XP, Windows 7, Mac OS X 10.5, Fedora 14 32bit and 64 bit and could only reproduce crashes on Windows XP and Windows 7 for beta, aurora and nightly. I didn't see the deleted memory in the crash address but Ted's exploitable tool for analyzing the minidumps showed high for aurora and nightly but none for beta for some reason. I'm running it locally under msvc with a slightly differently configured xp but haven't crashed yet. More news as I get it.
Ok, things have changed a bit in terms of how to reproduce. I haven't been able to crash just loading the page in either Windows XP or Mac OS X 10.5 on beta, aurora or nightly. I can reproduce on Windows XP and Mac OS X 10.5 on beta, aurora and nightly, at least if I:

1. start browser and attach debugger
2. start loading http://nfomation.net/sitemap.xml
3. exit the browser before the file has completed transferring.

I haven't seen any indication of deleted memory. Not sure if that is meaningful or not. I'll try it out on Linux with and without valgrind to see if there is anything useful there.
If you need the files from when this bug was first reported, I think I have a copy of that state. Let me know and I can put it up on people.m.o or something.
I was able to reproduce this in a Windows XP VM using the latest trunk nightly. https://crash-stats.mozilla.com/report/index/bp-10f3bb0f-1a61-4bb5-8e90-d4eee2110811 is my crash report.
Attached patch v1
Don't really know how we could make an automated testcase for this: we need to either run out of memory or quit while loading the source file.
Crash Signature: [@ xExecutionState::end] → [@ txExecutionState::end]
Summary: Crash [@ xExecutionState::end] → Crash [@ txExecutionState::end]
Peter, do you think we should just go ahead and take this fix w/o tests given the lack of an environment where we can realistically test this? If so, is this patch ready to review, or are you still working on it? We're running very low on time for fixes for 7, and this would be great to get in there...
Comment on attachment 555522 [details] [diff] [review]
v1

If we stop before we have source and stylesheet then txExecutionState::init fails, but that was leaving mOutputHandler as a dangling pointer.
Attachment #555522 - Flags: review?(jonas)
Comment on attachment 555522 [details] [diff] [review]
v1

Review of attachment 555522 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #555522 - Flags: review?(jonas) → review+
Pushed to inbound:

http://hg.mozilla.org/integration/mozilla-inbound/rev/c3d40a5579b1

And I realized I forgot to add r=sicking in the commit message :(
Clearly you must mean no Burrata :)
Fixed!

http://hg.mozilla.org/mozilla-central/rev/c3d40a5579b1
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment on attachment 555522 [details] [diff] [review]
v1

This is effectively changing us from accessing uninitialized memory to initializing a pointer and adding a null check. Very safe fix, and a sg:critical bug. We should IMO take this on both aurora and beta.
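The shape of that change can be sketched as follows. This is not the attached patch, whose contents are not shown in this thread. `ExecutionState` and `OutputHandler` are hypothetical stand-ins, and only the member name `mOutputHandler` and the init/end split come from the comments above.

```cpp
#include <cassert>

// Hypothetical stand-in for an output handler; it counts how often the
// document was finished so the behaviour can be checked.
struct OutputHandler {
    int endDocumentCalls = 0;
    void endDocument() { ++endDocumentCalls; }
};

// Hypothetical stand-in for the execution state.
class ExecutionState {
public:
    // Initialize the pointer so it is never left holding garbage.
    ExecutionState() : mOutputHandler(nullptr) {}

    // Fails when source and stylesheet are not both available; in that case
    // the handler pointer stays null.
    bool init(bool haveSourceAndStylesheet, OutputHandler* handler) {
        if (!haveSourceAndStylesheet) {
            return false;
        }
        mOutputHandler = handler;
        return true;
    }

    // Safe to call even after a failed init(): the null check skips the
    // handler instead of dereferencing an invalid pointer.
    bool end() {
        if (!mOutputHandler) {
            return false;
        }
        mOutputHandler->endDocument();
        return true;
    }

private:
    OutputHandler* mOutputHandler;
};
```

With this shape, calling `end()` after a failed `init()` is a no-op, which matches the case of quitting before the source file has finished loading.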
Attachment #555522 - Flags: approval-mozilla-beta?
Attachment #555522 - Flags: approval-mozilla-aurora?
Comment on attachment 555522 [details] [diff] [review]
v1

Approved for mozilla-aurora and mozilla-beta
Attachment #555522 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #555522 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [sg:critical?] → [sg:critical?],[qa+]
We should take this safe fix in 3.6.x as well.
blocking1.9.2: --- → ?
blocking1.9.2: ? → .23+
Changing to qa- as this is an assertion.
Whiteboard: [sg:critical?],[qa+] → [sg:critical?],[qa-]
Group: core-security