Closed Bug 805296 Opened 13 years ago Closed 12 years ago

crash in AncestorFilter::PushAncestor

Categories

(Core :: CSS Parsing and Computation, defect)

17 Branch
PowerPC
Linux
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: stevensn, Unassigned)

Details

(Keywords: crash)

Crash Data

Attachments

(4 files)

Attached file out.trace
User Agent: Mozilla/5.0 (X11; Linux ppc; rv:9.0.1) Gecko/20120123 Firefox/9.0.1
Build ID: 20120123201830

Steps to reproduce:
I started Firefox, went to www.google.com, and entered 'firefox' in the search box. When the results rendered, I clicked back into the search field and started to type more words.

Actual results:
Firefox generally crashes when I follow the steps above. It sometimes crashes a bit earlier or a bit later. I can't isolate it to any particular keystroke or mouse click.

Expected results:
Firefox shouldn't have crashed.
You posted with Firefox 9, which has known vulnerabilities: http://www.mozilla.org/security/known-vulnerabilities/firefox.html Please update it. Does the crash happen in Safe Mode (see https://support.mozilla.org/kb/troubleshoot-firefox-issues-using-safe-mode)? Does it happen with a new profile (see https://support.mozilla.org/kb/profile-manager-create-and-remove-firefox-profiles)?
Severity: normal → critical
Crash Signature: [@ AncestorFilter::PushAncestor]
Component: Untriaged → Style System (CSS)
Flags: needinfo?(steve)
Keywords: crash
Product: Firefox → Core
Hardware: Other → PowerPC
Updating from Firefox 9 is exactly what I'm *trying* to do. Firefox 16 was periodically crashing with symptoms that matched a bug that was claimed to be fixed in Firefox 17. Firefox 17 doesn't stay up very long. The stack trace I sent was generated in safe mode (I should have mentioned this). I get this with both a new profile and an existing one.
Flags: needinfo?(steve)
Is it Mozilla's Firefox or a Firefox from your distro? Do you have all required libraries (see http://www.mozilla.org/firefox/17.0beta/system-requirements/)?

> Is it Mozilla's Firefox
There are no PPC/Linux builds produced by Mozilla. Steve, what's the actual crashing address, if you can get gdb to give you that? Do you have the option of a build with full debugging symbols? What revision was your build built from? Line 3292 doesn't even exist in nsCSSRuleProcessor.cpp on the FIREFOX_9_0_1_RELEASE tag... so I have no idea what code your build is running and why.
Note that line 3292 does exist on tip, but it's a for loop header which is just incrementing stack integers. Not likely to crash.
This bug is with Firefox 17 beta, NOT with Firefox 9. I just happen to be using Firefox 9 to file the bug because it works, and I think Bugzilla automatically included its user agent. In this version (firefox-17.0b2.source.tar.bz2) the line in question is:

3292 mHashes.AppendElement(classes->AtomAt(i)->hash());

(gdb) p classes
$2 = (const nsAttrValue *) 0x5f7df510
(gdb) p *classes
$3 = {static sEnumTableArray = 0x48257794, mBits = 1600686801}
(gdb) p mHashes
$4 = {<nsTArray_base<nsTArrayDefaultAllocator>> = { mHdr = 0x5f79dd00}, <nsTArray_SafeElementAtHelper<unsigned int, nsTArray<unsigned int, nsTArrayDefaultAllocator> >> = {<No data fields>}, <No data fields>}
(gdb)

I *suspect* that classes->AtomAt(i) returned a bad object pointer, but that is a pure guess.
Do you still have this in a debugger? What's the value of "i"? What's classCount? What value does classes->AtomAt(i) return? This should all be pretty straightforward: "classes" holds an array of atoms and we're just walking through it...
I have the core file in the debugger (post crash):

(gdb) p classCount
$5 = 88
(gdb) p i
$6 = 1
(gdb) p classes->AtomAt(i)
You can't do that without a process to debug.

Tomorrow night I will see if I can get the value of classes->AtomAt(i).
Thanks! That classCount is already looking pretty suspect. Looking at your *classes, this nsAttrValue is of type eOtherBase. Which means 0x5f688ad0 (mBits with the low two bits cleared) should be a MiscContainer*. If you cast it to that type and then examine the resulting struct, what does it look like?
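The decoding step described here can be sketched in a few lines. This is an illustration only, not Gecko code: the names `kBaseTypeMask`, `BaseTag` and `PointerBits` are hypothetical. The only facts taken from the thread are that the tag lives in the low two bits of mBits and that the remaining bits are the MiscContainer address. A 32-bit value is assumed, matching the PowerPC addresses in the gdb output.

```cpp
#include <cstdint>

// Hypothetical mask for the two low tag bits of nsAttrValue::mBits
// (the thread describes the pointer as "mBits with the low two bits cleared").
constexpr std::uint32_t kBaseTypeMask = 0x3;

// Returns the tag stored in the low two bits.
constexpr std::uint32_t BaseTag(std::uint32_t bits) {
  return bits & kBaseTypeMask;
}

// Returns the address bits with the tag cleared.
constexpr std::uint32_t PointerBits(std::uint32_t bits) {
  return bits & ~kBaseTypeMask;
}

// mBits from the gdb session above: 1600686801 == 0x5f688ad1.
static_assert(BaseTag(1600686801u) == 1u, "low two bits are 01");
static_assert(PointerBits(1600686801u) == 0x5f688ad0u,
              "address to cast to MiscContainer* in gdb");
```

The same arithmetic can be done directly in gdb by printing mBits in hex and clearing the last two bits by hand.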
(gdb) p *(const nsAttrValue::MiscContainer*)classes
$3 = {mType = 1600686801, mStringBits = 1602041264, {mInteger = 0, mColor = 0, mEnumValue = 0, mPercent = 0, mCSSStyleRule = 0x0, mURL = 0x0, mImage = 0x0, mAtomArray = 0x0, mDoubleValue = 0, mIntMargin = 0x0, mSVGAngle = 0x0, mSVGIntegerPair = 0x0, mSVGLength = 0x0, mSVGLengthList = 0x0, mSVGNumberList = 0x0, mSVGNumberPair = 0x0, mSVGPathData = 0x0, mSVGPointList = 0x0, mSVGPreserveAspectRatio = 0x0, mSVGStringList = 0x0, mSVGTransformList = 0x0, mSVGViewBox = 0x0}}

> p *(const nsAttrValue::MiscContainer*)classes
No. You want: p *(const nsAttrValue::MiscContainer*)0x5f688ad0
(gdb) p *(const nsAttrValue::MiscContainer*)0x5f688ad0
$1 = {mType = nsAttrValue::eAtomArray, mStringBits = 1602090210, {mInteger = 1464116576, mColor = 1464116576, mEnumValue = 1464116576, mPercent = 1464116576, mCSSStyleRule = 0x5744a560, mURL = 0x5744a560, mImage = 0x5744a560, mAtomArray = 0x5744a560, mDoubleValue = 2.4825863350733872e+112, mIntMargin = 0x5744a560, mSVGAngle = 0x5744a560, mSVGIntegerPair = 0x5744a560, mSVGLength = 0x5744a560, mSVGLengthList = 0x5744a560, mSVGNumberList = 0x5744a560, mSVGNumberPair = 0x5744a560, mSVGPathData = 0x5744a560, mSVGPointList = 0x5744a560, mSVGPreserveAspectRatio = 0x5744a560, mSVGStringList = 0x5744a560, mSVGTransformList = 0x5744a560, mSVGViewBox = 0x5744a560}}
(gdb)

> mType = nsAttrValue::eAtomArray
OK. In that case, $1.mAtomArray (so 0x5744a560) should be an nsCOMArray<nsIAtom>*... And then you want to look into what the memory layout of that nsCOMArray looks like.
I have split the problem line as follows:

for(uint32_t i = 0; i < classCount; ++i) {
3292 nsIAtom * id = classes->AtomAt(i);
3293 uint32_t h = id->hash();
3294 mHashes.AppendElement(classes->AtomAt(i)->hash());
3295 }

It is crashing on line 3293, the id->hash() call. The nsCOMArray looks as follows (note: memory addresses have changed from the previous core):

3293 uint32_t h = id->hash();
(gdb) p id
$1 = <value optimized out>
(gdb) l
3288 const nsAttrValue *classes = aElement->GetClasses();
3289 if (classes) {
3290 uint32_t classCount = classes->GetAtomCount();
3291 for(uint32_t i = 0; i < classCount; ++i) {
3292 nsIAtom * id = classes->AtomAt(i);
3293 uint32_t h = id->hash();
3294 mHashes.AppendElement(classes->AtomAt(i)->hash());
3295 }
3296 }
3297
(gdb) p classes
$2 = (const nsAttrValue *) 0x58477050
(gdb) p *classes
$3 = {static sEnumTableArray = 0x48257794, mBits = 1393757281}
(gdb) p i
$4 = 1
(gdb) p *( const nsAttrValue::MiscContainer* )0x53130C60
$5 = {mType = nsAttrValue::eAtomArray, mStringBits = 1481076770, {mInteger = 1491691252, mColor = 1491691252, mEnumValue = 1491691252, mPercent = 1491691252, mCSSStyleRule = 0x58e966f4, mURL = 0x58e966f4, mImage = 0x58e966f4, mAtomArray = 0x58e966f4, mDoubleValue = 2.0498356191576885e+120, mIntMargin = 0x58e966f4, mSVGAngle = 0x58e966f4, mSVGIntegerPair = 0x58e966f4, mSVGLength = 0x58e966f4, mSVGLengthList = 0x58e966f4, mSVGNumberList = 0x58e966f4, mSVGNumberPair = 0x58e966f4, mSVGPathData = 0x58e966f4, mSVGPointList = 0x58e966f4, mSVGPreserveAspectRatio = 0x58e966f4, mSVGStringList = 0x58e966f4, mSVGTransformList = 0x58e966f4, mSVGViewBox = 0x58e966f4}}
(gdb) p (*( const nsAttrValue::MiscContainer* )0x53130C60).mAtomArray
$6 = (class nsTArray<nsCOMPtr<nsIAtom>, nsTArrayDefaultAllocator> *) 0x58e966f4
(gdb) p *(*( const nsAttrValue::MiscContainer* )0x53130C60).mAtomArray
$7 = {<nsTArray_base<nsTArrayDefaultAllocator>> = { mHdr = 0x590cd240}, <nsTArray_SafeElementAtHelper<nsCOMPtr<nsIAtom>, nsTArray<nsCOMPtr<nsIAtom>, nsTArrayDefaultAllocator> >> = {<nsTArray_SafeElementAtSmartPtrHelper<nsIAtom, nsTArray<nsCOMPtr<nsIAtom>, nsTArrayDefaultAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}
(gdb)
Hmm. id->hash() should be just an inline member get. That suggests that dereferencing id itself is the problem. It's a bit unfortunate that that exact thing is optimized out.... It'd be interesting to know what "id" and "*id" look like (assuming gdb can even read the latter). Since you're compiling yourself, could you try doing it with --disable-optimize?
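To illustrate why an inline getter points the finger at the pointer itself, here is a minimal stand-in. `MockAtom` and `HashAll` are hypothetical and are not the Gecko classes; the sketch only assumes, as the comment above says, that hash() is a plain member read, so the only thing that can fault in the call is the load through `id`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for nsIAtom: hash() is a plain inline member read.
struct MockAtom {
  std::uint32_t mHash;
  std::uint32_t hash() const { return mHash; }  // reads this->mHash, nothing else
};

// Mirrors the shape of the loop in the thread: fetch each atom, read its hash,
// append it. A null entry is skipped here; a non-null garbage pointer cannot be
// detected this way and would still fault on the read inside hash().
std::vector<std::uint32_t> HashAll(const std::vector<const MockAtom*>& atoms) {
  std::vector<std::uint32_t> hashes;
  for (const MockAtom* id : atoms) {
    if (!id) continue;
    hashes.push_back(id->hash());
  }
  return hashes;
}
```

With valid pointers the loop is trivially safe, which is why a crash on this line suggests that the array contents are already corrupt by the time the loop runs.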
With the --disable-optimize build I am unable to reproduce the error. It is pretty easy to reproduce with the optimized build.
Interesting. What compiler are you using? What optimization settings? At first glance, it's miscompiling the code....
c++ --version
c++ (Debian 4.4.5-8) 4.4.5
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

c++ -o nsCSSRuleProcessor.o -c -fvisibility=hidden -DMOZ_GLUE_IN_PROGRAM -DMOZILLA_INTERNAL_API -D_IMPL_NS_COM -DEXPORT_XPT_API -DEXPORT_XPTC_API -D_IMPL_NS_GFX -D_IMPL_NS_WIDGET -DIMPL_XREAPI -DIMPL_NS_NET -DIMPL_THEBES -DSTATIC_EXPORTABLE_JS_API -DEXCLUDE_SKIA_DEPENDENCIES -DOS_LINUX=1 -DOS_POSIX=1 -D_IMPL_NS_LAYOUT -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/ipc/chromium/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/ipc/glue -I../../ipc/ipdl/_ipdlheaders -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../base -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../generic -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../xul/base/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../../content/base/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../../content/html/content/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../../content/xbl/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/../../content/xul/document/src -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style -I. -I../../dist/include -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/build-output/dist/include/nspr -I/media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/build-output/dist/include/nss -fPIC -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wno-ctor-dtor-privacy -Wno-overlength-strings -Wno-invalid-offsetof -Wno-variadic-macros -Wcast-align -Wno-long-long -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fshort-wchar -pthread -pipe -DNDEBUG -DTRIMMED -g -Os -freorder-blocks -fomit-frame-pointer -DMOZILLA_CLIENT -include ../../mozilla-config.h -MD -MF .deps/nsCSSRuleProcessor.o.pp /media/nfs/usb_drive_src/firefox/clean_test/mozilla-beta/layout/style/nsCSSRuleProcessor.cpp

It could also be that the non-optimized builds have a different memory layout or different timings and just don't trigger the bug. I will try to reproduce the error on an optimized build with just that one file compiled non-optimized with debug symbols, so I can print id.
The element returned by AtomAt() is corrupt.

3293 uint32_t h = id->hash();
(gdb) p id
$1 = (class nsIAtom *) 0x670062
(gdb) p *id
Cannot access memory at address 0x670062
(gdb) l
3288 const nsAttrValue *classes = aElement->GetClasses();
3289 if (classes) {
3290 uint32_t classCount = classes->GetAtomCount();
3291 for(uint32_t i = 0; i < classCount; ++i) {
3292 nsIAtom * id = classes->AtomAt(i);
3293 uint32_t h = id->hash();
3294 mHashes.AppendElement(classes->AtomAt(i)->hash());
3295 }
3296 }
3297
(gdb) p i
$2 = 0
(gdb) p classes
$3 = (const nsAttrValue *) 0x58901970
(gdb) p *classes
$4 = {static sEnumTableArray = 0x48257794, mBits = 1486461841}
(gdb) p 0x58999B90
$5 = 1486461840
(gdb) p *(const nsAttrValue::MiscContainer*) 0x58999B90
$6 = {mType = nsAttrValue::eAtomArray, mStringBits = 1484329538, {mInteger = 1509917280, mColor = 1509917280, mEnumValue = 1509917280, mPercent = 1509917280, mCSSStyleRule = 0x59ff8260, mURL = 0x59ff8260, mImage = 0x59ff8260, mAtomArray = 0x59ff8260, mDoubleValue = 3.3327039069259748e+125, mIntMargin = 0x59ff8260, mSVGAngle = 0x59ff8260, mSVGIntegerPair = 0x59ff8260, mSVGLength = 0x59ff8260, mSVGLengthList = 0x59ff8260, mSVGNumberList = 0x59ff8260, mSVGNumberPair = 0x59ff8260, mSVGPathData = 0x59ff8260, mSVGPointList = 0x59ff8260, mSVGPreserveAspectRatio = 0x59ff8260, mSVGStringList = 0x59ff8260, mSVGTransformList = 0x59ff8260, mSVGViewBox = 0x59ff8260}}
(gdb) p *((const nsAttrValue::MiscContainer*) 0x58999B90)->mAtomArray
$7 = {<nsTArray_base<nsTArrayDefaultAllocator>> = { mHdr = 0x586e93a0}, <nsTArray_SafeElementAtHelper<nsCOMPtr<nsIAtom>, nsTArray<nsCOMPtr<nsIAtom>, nsTArrayDefaultAllocator> >> = {<nsTArray_SafeElementAtSmartPtrHelper<nsIAtom, nsTArray<nsCOMPtr<nsIAtom>, nsTArrayDefaultAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}
(gdb) p *((const nsAttrValue::MiscContainer*) 0x58999B90)->mAtomArray->mHdr
$8 = {static sEmptyHdr = {static sEmptyHdr = <same as static member of an already seen type>, mLength = 0, mCapacity = 0, mIsAutoArray = 0}, mLength = 88, mCapacity = 741820884, mIsAutoArray = 0}
(gdb)
The $8 there looks very bogus. It claims to be an array of length 88, with capacity for 741820884 elements. But when you try to get the element at index 0 it returns a garbage value. There's just no way that should happen normally. Can you maybe try running this under valgrind?
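A quick back-of-the-envelope check shows why that capacity cannot be real. The sketch below assumes 4-byte elements (one 32-bit pointer per array entry, consistent with the 32-bit addresses in the gdb output); `HeaderSnapshot` and `CapacityBytes` are hypothetical names for illustration, not Gecko's.

```cpp
#include <cstdint>

// Hypothetical copy of the two fields gdb printed for the array header ($8).
struct HeaderSnapshot {
  std::uint32_t length;
  std::uint32_t capacity;
};

// Bytes the element storage would need if the capacity were genuine.
constexpr std::uint64_t CapacityBytes(const HeaderSnapshot& h,
                                      std::uint64_t elemSize) {
  return static_cast<std::uint64_t>(h.capacity) * elemSize;
}

// Values taken from the gdb output above.
constexpr HeaderSnapshot kFromThread{88, 741820884};

// 741820884 * 4 = 2967283536 bytes, roughly 2.76 GiB for one element's
// class list, which is not plausible in a 32-bit address space.
static_assert(CapacityBytes(kFromThread, 4) == 2967283536ULL,
              "capacity implies about 2.76 GiB of storage");
```

A header like this is consistent with memory that has been overwritten or freed and reused, though the arithmetic alone cannot say which.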
When I compile for valgrind per the wiki instructions firefox still crashes when I type a search query into www.google.com but this stack trace is different.
Attached is the valgrind output that corresponds to the stack trace I just attached. I think the stack trace is much more interesting than this.

==18154== Memcheck, a memory error detector
==18154== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==18154== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==18154== Command: build-output/dist/bin/firefox-bin
==18154==
==18156==
==18156== HEAP SUMMARY:
==18156== in use at exit: 35,288 bytes in 275 blocks
==18156== total heap usage: 2,055 allocs, 1,780 frees, 8,688,628 bytes allocated
==18156==
==18156== LEAK SUMMARY:
==18156== definitely lost: 5,440 bytes in 50 blocks
==18156== indirectly lost: 20,136 bytes in 90 blocks
==18156== possibly lost: 5,654 bytes in 108 blocks
==18156== still reachable: 4,058 bytes in 27 blocks
==18156== suppressed: 0 bytes in 0 blocks
==18156== Rerun with --leak-check=full to see details of leaked memory
==18156==
==18156== For counts of detected and suppressed errors, rerun with: -v
==18156== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 5)
==18154== Conditional jump or move depends on uninitialised value(s)
==18154== at 0xC85421C: inflateReset2 (in /usr/lib/libz.so.1.2.3.4)
==18154== by 0xC854383: inflateInit2_ (in /usr/lib/libz.so.1.2.3.4)
==18154== by 0xC8543DF: inflateInit_ (in /usr/lib/libz.so.1.2.3.4)
==18154== by 0xC7568CB: png_create_read_struct_2 (in /lib/libpng12.so.0.44.0)
==18154== by 0xB7D51BF: ??? (in /usr/lib/gtk-2.0/2.10.0/loaders/libpixbufloader-png.so)
==18154== by 0xCA8E41F: ??? (in /usr/lib/libgdk_pixbuf-2.0.so.0.2000.1)
==18154== by 0xCA8F2E7: gdk_pixbuf_new_from_file (in /usr/lib/libgdk_pixbuf-2.0.so.0.2000.1)
==18154== by 0xEA6CEA7: nsWindow::SetIcon(nsAString_internal const&) (nsWindow.cpp:1770)
==18154== by 0xEA69C1F: nsWindow::SetDefaultIcon() (nsWindow.cpp:4508)
==18154== by 0xEA70D43: nsWindow::Create(nsIWidget*, void*, nsIntRect const&, nsDeviceContext*, nsWidgetInitData*) (nsWindow.cpp:3421)
==18154== by 0xE8D2D23: nsWebShellWindow::Initialize(nsIXULWindow*, nsIXULWindow*, nsIURI*, int, int, bool, nsWidgetInitData&) (nsWebShellWindow.cpp:169)
==18154== by 0xE8CFFC3: nsAppShellService::JustCreateTopWindow(nsIXULWindow*, nsIURI*, unsigned int, int, int, bool, nsWebShellWindow**) (nsAppShellService.cpp:351)
That's pretty bizarre... :(
I can reproduce this issue on an aurora based branch as well. The same sequence of user-actions sometimes also produces this stack trace, other times the crash is in ::PushAncestor.
I get exactly the same crash in FF17 (and in FF16 when I tried a month ago). My build is made by building the Ubuntu lucid Firefox sources on a karmic distribution (I know it's old, but unfortunately I'm not able to update the rest of that system at this time). It segfaults on the same line: 3292 nsIAtom * id = classes->AtomAt(i);
I was pretty consistently able to reproduce this in FF17 and FF18 builds, but it no longer happens when I build mozilla-central (FF19). I can't say whether the problem has actually been fixed by someone or whether things have moved around in memory enough that I'm no longer hitting it.
Crash Signature: [@ AncestorFilter::PushAncestor] → [@ AncestorFilter::PushAncestor ]
I am marking this as fixed even though I can't point at the commit that fixed it. I haven't had any more crashes like this in the last 10 versions of Mozilla.
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
No specific fix = WFM
Resolution: FIXED → WORKSFORME