Bug 660453 - Crash [@ txExecutionState::end]
Keywords: crash, reproducible
Product: Core
Classification: Components
Component: XSLT (show other bugs)
Version: 5 Branch
Platform: x86 All
Importance: -- critical
Target Milestone: ---
Assigned To: Peter Van der Beken [:peterv]
QA Contact: Andrew Overholt [:overholt]
Depends on:
Blocks: 532972
Reported: 2011-05-28 08:01 PDT by Bob Clary [:bc:]
Modified: 2015-10-16 11:38 PDT (History)

crash report linux (50.85 KB, text/plain)
2011-05-28 08:01 PDT, Bob Clary [:bc:]
no flags
crash report xp (44.08 KB, text/plain)
2011-05-28 08:04 PDT, Bob Clary [:bc:]
no flags
Match new and delete. (925 bytes, patch)
2011-06-09 08:19 PDT, Chris Leary [:cdleary] (not checking bugmail)
dmandelin: review+
32 bit valgrind run on tracemonkey (6.35 KB, application/x-gzip)
2011-06-14 10:14 PDT, Bob Clary [:bc:]
no flags
64 bit valgrind run on tracemonkey (8.12 KB, application/x-gzip)
2011-06-14 10:17 PDT, Bob Clary [:bc:]
no flags
v1 (1.63 KB, patch)
2011-08-24 13:42 PDT, Peter Van der Beken [:peterv]
jonas: review+
christian: approval‑mozilla‑aurora+
christian: approval‑mozilla‑beta+

Description Bob Clary [:bc:] 2011-05-28 08:01:53 PDT
Created attachment 535858
crash report linux

2. Crash after timing out

Windows XP and Linux 32bit beta so far.

One instance on Linux so far.

Operating system: Linux
                  0.0.0 Linux #1 SMP Tue May 3 13:29:55 UTC 2011 i686
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash reason:  SIGSEGV
Crash address: 0x3db5

Thread 0 (crashed)
 0!txExecutionState::end [txExecutionState.cpp : 202 + 0x6]
    eip = 0x016edce2   esp = 0xbf9bc3a0   ebp = 0xbf9bc3c8   ebx = 0x03282054
    esi = 0x0a8198c0   edi = 0x014a0cf6   eax = 0x00003db5   ecx = 0x0a78c220
    edx = 0x00b2a3a0   efl = 0x00010246
    Found by: given as instruction pointer in context
 1!txMozillaXSLTProcessor::TransformToDoc [txMozillaXSLTProcessor.cpp : 718 + 0x14]
    eip = 0x017319ef   esp = 0xbf9bc3d0   ebp = 0xbf9bc5a8   ebx = 0x03282054
    esi = 0x0a8198c0   edi = 0x014a0cf6
    Found by: call frame info
 2!nsTransformBlockerEvent::Run [txMozillaXSLTProcessor.cpp : 595 + 0x25]
    eip = 0x01735e82   esp = 0xbf9bc5b0   ebp = 0xbf9bc5c8   ebx = 0x03282054
    esi = 0xbf9bc61c   edi = 0x00000000
    Found by: call frame info
 3!nsThread::ProcessNextEvent [nsThread.cpp : 618 + 0x16]
    eip = 0x0245a965   esp = 0xbf9bc5d0   ebp = 0xbf9bc658   ebx = 0x03282054
    esi = 0xbf9bc61c   edi = 0x00000000
    Found by: call frame info
 4!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 250 + 0x1f]
    eip = 0x023ebabd   esp = 0xbf9bc660   ebp = 0xbf9bc698   ebx = 0x03282054
    esi = 0x00000001   edi = 0x0a00e018
    Found by: call frame info
 5!mozilla::ipc::MessagePump::Run [MessagePump.cpp : 110 + 0x15]
    eip = 0x022ca908   esp = 0xbf9bc6a0   ebp = 0xbf9bc6e8   ebx = 0x03282054
    esi = 0x00000001   edi = 0x0a00e018
    Found by: call frame info
Comment 1 Bob Clary [:bc:] 2011-05-28 08:04:06 PDT
Created attachment 535859
crash report xp

damn, I didn't notice the address at first.

Operating system: Windows NT
                  5.1.2600 Service Pack 3
CPU: x86
     GenuineIntel family 6 model 44 stepping 2
     1 CPU

Crash address: 0xffffffffddddddf1

Thread 0 (crashed)
 0  xul.dll!txExecutionState::end(unsigned int) [txExecutionState.cpp : 202 + 0x12]
    eip = 0x109e3688   esp = 0x0012d4f4   ebp = 0x0012d4fc   ebx = 0x00000001
    esi = 0x01987af8   edi = 0x00000000   eax = 0x0012d540   ecx = 0x036a4e80
    edx = 0xdddddddd   efl = 0x00010286
    Found by: given as instruction pointer in context
 1  xul.dll!txMozillaXSLTProcessor::TransformToDoc(nsIDOMDocument *,nsIDOMDocument * *) [txMozillaXSLTProcessor.cpp : 718 + 0x11]
    eip = 0x10a2ff29   esp = 0x0012d504   ebp = 0x0012d6d0
    Found by: call frame info
 2  xul.dll!nsTransformBlockerEvent::Run() [txMozillaXSLTProcessor.cpp : 595 + 0x15]
    eip = 0x10a2f749   esp = 0x0012d6d8   ebp = 0x0012d6e0
    Found by: call frame info
 3  xul.dll!nsThread::ProcessNextEvent(int,int *) [nsThread.cpp : 618 + 0x18]
    eip = 0x118cb174   esp = 0x0012d6e8   ebp = 0x0012d748
    Found by: call frame info
 4  xul.dll!NS_ProcessNextEvent_P(nsIThread *,int) [nsThreadUtils.cpp : 250 + 0x15]
    eip = 0x1184d113   esp = 0x0012d750   ebp = 0x0012d764
    Found by: call frame info
 5  xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [MessagePump.cpp : 110 + 0xd]
    eip = 0x1171161d   esp = 0x0012d76c   ebp = 0x0012d798
    Found by: call frame info
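
A note on the address that was easy to miss at first: the sketch below only does the arithmetic, under the assumption (not stated in the report) that 0xDD is the byte the MSVC debug heap writes over freed blocks. If that holds, edx = 0xdddddddd would be a pointer loaded from already-freed memory, and the crash address would be a read at a small offset from it. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption (not stated in the report): 0xDD is the fill byte the MSVC
// debug heap writes over freed blocks.
constexpr std::uint32_t kFreedFill = 0xDDDDDDDDu;

// Hypothetical helper: true if a register holds nothing but the fill byte.
constexpr bool LooksLikeFreedFill(std::uint32_t reg) {
    return reg == kFreedFill;
}

// Hypothetical helper: offset of the faulting access relative to a pointer
// value that consists entirely of the fill byte.
constexpr std::uint32_t OffsetFromFreedFill(std::uint32_t crashAddress) {
    return crashAddress - kFreedFill;
}

// Values taken from the Windows XP report: edx, and the low 32 bits of the
// crash address 0xffffffffddddddf1.
static_assert(LooksLikeFreedFill(0xDDDDDDDDu), "edx matches the fill pattern");
static_assert(OffsetFromFreedFill(0xDDDDDDF1u) == 0x14u,
              "the access is at base + 0x14");
```

Under that assumption the crash would be a load at offset 0x14 from a stale pointer, which would point to a use-after-free rather than a plain null dereference.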
Comment 2 Daniel Veditz [:dveditz] 2011-05-29 09:11:10 PDT
Couldn't reproduce on a Mac Aurora (release, not debug). I doubt we fixed anything in xslt between Fx5 and Fx6. The document is a huge sitemap and takes forever to process, maybe you've got less memory dedicated to your VM?
Comment 3 Daniel Veditz [:dveditz] 2011-05-29 09:21:39 PDT
I was going to zip up the site .xml and .xsl files and attach them, but even zipped they're 6 MB. Uncompressed, the sitemap is 50 MB.
Comment 4 Bob Clary [:bc:] 2011-05-29 09:32:02 PDT
Yeah, the original VMs this ran on have only 1G for XP and 2G for Linux. I only reproduced this with beta though, not aurora or nightly. I'll try locally.
Comment 5 Bob Clary [:bc:] 2011-05-31 12:53:25 PDT
I tried to reproduce locally with a 1G vm on WinXP and Linux but have been unable to so far. I can reproduce it in the automation vms though. I'll keep trying. I do have working valgrind on linux now that I've disabled elfhack. I'll try that too.
Comment 6 Bob Clary [:bc:] 2011-06-01 09:25:38 PDT
automation retested with Windows XP, Windows 7, Linux 32bit and 64bit using 2.0.0, beta, aurora, nightly.

signature                               address             os           branch

txExecutionState::end                   0xffffffffddddddf1  Windows XP   beta
                                        0xffffffffddddddf1  Windows 7    beta
                                        0x6cf3              Linux 32bit  beta
                                        0x0                 Linux 64bit  nightly

txMozillaXSLTProcessor::TransformToDoc  0x1                 Windows XP   aurora/nightly
                                        0x1                 Windows 7    aurora/nightly

trying a valgrind run on 64bit linux now.
Comment 7 Bob Clary [:bc:] 2011-06-07 08:59:30 PDT
Valgrind on a 32-bit Fedora 14 nightly build reported the following during shutdown:

==23727== Mismatched free() / delete / delete []
==23727==    at 0x4006951: operator delete(void*) (vg_replace_malloc.c:387)
==23727==    by 0x5DD4C77: js::RegExp::~RegExp() (jsregexpinlines.h:133)
==23727==    by 0x5DD89EE: void JSContext::delete_<js::RegExp>(js::RegExp*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/library/
==23727==    by 0x5DD5C76: js::RegExp::decref(JSContext*) (jsregexpinlines.h:568)
==23727==    by 0x5EE5F95: regexp_finalize(JSContext*, JSObject*) (jsregexp.cpp:393)
==23727==    by 0x5E498F3: JSObject::finalize(JSContext*) (jsobjinlines.h:141)
==23727==    by 0x5E50B61: bool js::gc::Arena::finalize<JSObject_Slots8>(JSContext*) (jsgc.cpp:224)
==23727==    by 0x5E47FE2: void js::gc::FinalizeArenas<JSObject_Slots8>(JSContext*, js::gc::ArenaHeader**) (jsgc.cpp:271)
==23727==    by 0x5E4DDC6: void js::gc::ArenaList::finalizeNow<JSObject_Slots8>(JSContext*) (jsgc.cpp:1216)
==23727==    by 0x5E45AF5: JSCompartment::finalizeObjectArenaLists(JSContext*) (jsgc.cpp:1966)
==23727==    by 0x5E46B09: MarkAndSweep(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2348)
==23727==    by 0x5E4743F: GCCycle(JSContext*, JSCompartment*, JSGCInvocationKind, js::GCTimer&) (jsgc.cpp:2643)
==23727==  Address 0xfa47268 is 0 bytes inside a block of size 104 alloc'd
==23727==    at 0x40078AE: malloc (vg_replace_malloc.c:236)
==23727==    by 0x6047A93: js_malloc (jsutil.h:234)
==23727==    by 0x604FB0A: JSC::Yarr::BytecodePattern* js::OffTheBooks::new_<JSC::Yarr::BytecodePattern, JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*>(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Vector<JSC::Yarr::ByteDisjunction*, 0u>, JSC::Yarr::Ref<JSC::Yarr::YarrPattern>, WTF::BumpPointerAllocator*) (in /work/mozilla/builds/nightly/mozilla/firefox-debug/toolkit/libr
==23727==    by 0x604D416: JSC::Yarr::ByteCompiler::compile(WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1461)
==23727==    by 0x6047AEE: JSC::Yarr::byteCompile(JSC::Yarr::YarrPattern&, WTF::BumpPointerAllocator*) (YarrInterpreter.cpp:1897)
==23727==    by 0x5DD5A80: js::RegExp::compileHelper(JSContext*, JSLinearString&) (jsregexpinlines.h:493)
==23727==    by 0x5DD5B2C: js::RegExp::compile(JSContext*) (jsregexpinlines.h:506)
==23727==    by 0x5DD5777: js::RegExp::create(JSContext*, JSString*, unsigned int) (jsregexpinlines.h:420)
==23727==    by 0x5EE6729: SwapRegExpInternals(JSContext*, JSObject*, js::Value*, JSString*, unsigned int) (jsregexp.cpp:570)
==23727==    by 0x5EE6FF9: CompileRegExpAndSwap(JSContext*, JSObject*, unsigned int, js::Value*, js::Value*) (jsregexp.cpp:758)
==23727==    by 0x5EE7186: regexp_construct(JSContext*, unsigned int, js::Value*) (jsregexp.cpp:797)
==23727==    by 0x5E66063: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, js::Value*), unsigned int, js::Value*) (jscntxtinlines.h:277)
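
For readers unfamiliar with this class of warning: the trace shows a block obtained through malloc (via js_malloc) being released with operator delete. A minimal sketch of the matched pattern follows. MallocNew, MallocDelete and Pattern are hypothetical stand-ins written for illustration; they are not SpiderMonkey's actual helpers.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical malloc-backed allocation helper (illustration only).
template <typename T, typename... Args>
T* MallocNew(Args&&... args) {
    void* mem = std::malloc(sizeof(T));
    if (!mem) {
        return nullptr;
    }
    // Construct the object in the malloc'd storage with placement new.
    return new (mem) T(std::forward<Args>(args)...);
}

// The matching release: run the destructor, then hand the storage to free().
template <typename T>
void MallocDelete(T* p) {
    if (!p) {
        return;
    }
    p->~T();
    std::free(p);
}

// Hypothetical object that counts live instances through a caller-owned counter.
struct Pattern {
    explicit Pattern(int* liveCount) : live(liveCount) { ++*live; }
    ~Pattern() { --*live; }
    int* live;
};

// Creates and destroys one Pattern; returns the number still alive afterwards.
int RoundTrip() {
    int live = 0;
    Pattern* p = MallocNew<Pattern>(&live);
    if (!p) {
        return -1;
    }
    // Writing `delete p;` here would be the mismatch valgrind reports:
    // storage from malloc() released through operator delete.
    MallocDelete(p);
    return live;
}
```

The point of the sketch is only that the release path must mirror the allocation path; a bare delete on malloc'd storage is undefined behavior even when it appears to work.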
Comment 8 Bob Clary [:bc:] 2011-06-07 09:01:01 PDT
Gal, who should look at a mismatched free/delete in js::RegExp::~RegExp()?
Comment 9 Andreas Gal :gal 2011-06-07 09:02:54 PDT
cdleary knows the code well.
Comment 10 Chris Leary [:cdleary] (not checking bugmail) 2011-06-09 08:19:28 PDT
Created attachment 538264
Match new and delete.

dmandelin, this was one of the // YYY things from the update YARR patch with the bare delete. What are those // YYYs about?
Comment 11 David Mandelin [:dmandelin] 2011-06-09 11:12:58 PDT
Comment on attachment 538264
Match new and delete.

The YYY was a tag I was using during development to mark things that needed to be fixed up in some way before landing. There would have been some in the intermediate patches, but there should have been none in the final patch. No idea how that one got through.
Comment 12 Daniel Veditz [:dveditz] 2011-06-09 13:29:44 PDT
Hope bc can retest when this patch lands to make sure it fixes the bug he's seeing.
Comment 13 Bob Clary [:bc:] 2011-06-09 13:46:39 PDT
will do.
Comment 14 Chris Leary [:cdleary] (not checking bugmail) 2011-06-09 15:03:07 PDT

I'm not going to mark as fixed-in-tracemonkey because I'm unsure this valgrind spew reflects the actual cause of the original error.
Comment 15 Chris Leary [:cdleary] (not checking bugmail) 2011-06-13 10:58:48 PDT
cdleary-bot mozilla-central merge info:
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
Comment 16 Bob Clary [:bc:] 2011-06-13 11:04:17 PDT
I'll test the patch on tracemonkey to see if the valgrind issues were related.
Comment 17 Bob Clary [:bc:] 2011-06-14 10:14:44 PDT
Created attachment 539240
32 bit valgrind run on tracemonkey

I don't crash locally on either 32 bit or 64 bit when just running the browser, but do crash in both (in different places) when running under valgrind.

32 bit crashes just after a use and read of an uninitialized value in nsDOMOfflineResourceList::SwapCache() (nsDOMOfflineResourceList.cpp:543)
Comment 18 Bob Clary [:bc:] 2011-06-14 10:17:17 PDT
Created attachment 539242
64 bit valgrind run on tracemonkey

64 bit crashes during UnwindBacktrace with js::ExternalGetOrSet(JSContext*, JSObject*, jsid, js::Value const&, JSAccessMode, unsigned int, js::Value*, js::Value*) (jsinterp.cpp:845) on the stack after an invalid read
Comment 19 Johnny Stenback (:jst) 2011-06-23 13:29:46 PDT
Jonas, can you look into this? Is there anything XSLT related going on here, or is this something else?
Comment 20 Johnny Stenback (:jst) 2011-07-07 13:49:54 PDT
Jonas, ping...
Comment 21 Johnny Stenback (:jst) 2011-07-28 13:31:12 PDT
Peter, could you look into this XSLT critical crash bug?
Comment 22 Johnny Stenback (:jst) 2011-08-04 13:09:40 PDT
This won't make it for the 6 train.
Comment 23 Peter Van der Beken [:peterv] 2011-08-10 05:03:31 PDT
I can't reproduce this in a trunk build on linux or OS X.
Comment 24 Bob Clary [:bc:] 2011-08-10 09:47:31 PDT
I just retested the url using the automation on Windows XP, Windows 7, Mac OS X 10.5, Fedora 14 32bit and 64 bit and could only reproduce crashes on Windows XP and Windows 7 for beta, aurora and nightly. I didn't see the deleted memory in the crash address but Ted's exploitable tool for analyzing the minidumps showed high for aurora and nightly but none for beta for some reason. I'm running it locally under msvc with a slightly differently configured xp but haven't crashed yet. More news as I get it.
Comment 25 Bob Clary [:bc:] 2011-08-10 15:58:38 PDT
OK, things have changed a bit in terms of how to reproduce. I haven't been able to crash just by loading the page on either Windows XP or Mac OS X 10.5 with beta, aurora or nightly. I can reproduce on Windows XP and Mac OS X 10.5 with beta, aurora and nightly, at least if I:

1. start browser and attach debugger
2. start loading
3. exit the browser before the file has completed transferring.

I haven't seen any indication of deleted memory. Not sure if that is meaningful or not. I'll try it out on Linux with and without valgrind to see if there is anything useful there.
Comment 26 Daniel Veditz [:dveditz] 2011-08-11 13:38:17 PDT
If you need the files from when this bug was first reported, I think I have a copy of that state. Let me know and I can put it up on people.m.o or something.
Comment 27 Marcia Knous [:marcia - use ni] 2011-08-11 13:41:17 PDT
I was able to reproduce this in a Windows XP VM using the latest trunk nightly. is my crash report.
Comment 28 Peter Van der Beken [:peterv] 2011-08-24 13:42:17 PDT
Created attachment 555522

Don't really know how we could make an automated testcase for this; we need to either run out of memory or quit while loading the source file.
Comment 29 Johnny Stenback (:jst) 2011-09-01 13:17:04 PDT
Peter, do you think we should just go ahead and take this fix w/o tests given the lack of an environment where we can realistically test this? If so, is this patch ready to review, or are you still working on it? We're running very low on time for fixes for 7, and this would be great to get in there...
Comment 30 Peter Van der Beken [:peterv] 2011-09-02 07:22:47 PDT
Comment on attachment 555522

If we stop before we have both the source and the stylesheet, then txExecutionState::init fails, but that was leaving mOutputHandler as a dangling pointer.
Comment 31 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-09-02 09:53:08 PDT
Comment on attachment 555522

Review of attachment 555522:

Comment 32 Johnny Stenback (:jst) 2011-09-02 15:08:15 PDT
Pushed to inbound:

And I realized I forgot to add r=sicking in the commit message :(
Comment 33 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-09-02 15:15:09 PDT
Comment 34 Johnny Stenback (:jst) 2011-09-02 16:50:32 PDT
Clearly you must mean no Burrata :)
Comment 35 Johnny Stenback (:jst) 2011-09-04 22:10:54 PDT
Comment 36 Johnny Stenback (:jst) 2011-09-04 22:14:39 PDT
Comment on attachment 555522

This is effectively changing us from accessing uninitialized memory to initializing a pointer and adding a null check. Very safe fix, and a sg:critical bug. We should IMO take this on both aurora and beta.
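
The shape of that change, as described in comment 30 and above, can be sketched roughly as follows. The patch itself is not quoted in this thread, so this is an assumed illustration of the pattern (initialize the pointer, check it before use), not the actual diff; ExecutionState, OutputHandler and the two helper functions are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for the output handler interface.
struct OutputHandler {
    int endCalls = 0;
    void endDocument() { ++endCalls; }
};

// Hypothetical stand-in for the execution state; not the real class.
class ExecutionState {
public:
    // Part 1 of the pattern: the pointer starts out null instead of holding
    // whatever happened to be in memory.
    ExecutionState() : mOutputHandler(nullptr) {}

    // Fails when the source or stylesheet is missing; in that case the
    // handler is never assigned.
    bool init(bool haveSource, bool haveStylesheet, OutputHandler* handler) {
        if (!haveSource || !haveStylesheet) {
            return false;
        }
        mOutputHandler = handler;
        return true;
    }

    // Part 2 of the pattern: a null check before the handler is used.
    // Returns true if the handler was notified.
    bool end() {
        if (!mOutputHandler) {
            return false;
        }
        mOutputHandler->endDocument();
        return true;
    }

private:
    OutputHandler* mOutputHandler;
};

// Calls end() after a failed init(); safe only because of the two parts above.
bool EndAfterFailedInit() {
    OutputHandler handler;
    ExecutionState es;
    es.init(false, false, &handler);
    return es.end();
}

// Calls end() after a successful init(); the handler sees exactly one call.
bool EndAfterSuccessfulInit() {
    OutputHandler handler;
    ExecutionState es;
    es.init(true, true, &handler);
    return es.end() && handler.endCalls == 1;
}
```

Without the constructor initialization, end() after a failed init() would read an indeterminate pointer, which is consistent with the varying crash addresses reported earlier in this bug.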
Comment 37 christian 2011-09-06 14:14:23 PDT
Comment on attachment 555522

Approved for mozilla-aurora and mozilla-beta
Comment 39 Daniel Veditz [:dveditz] 2011-09-18 08:15:35 PDT
We should take this safe fix in 3.6.x as well.
Comment 41 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-26 09:20:03 PDT
Changing to qa- as this is an assertion.
