Closed Bug 458531 Opened 16 years ago Closed 16 years ago

Crash [@ nsAttrValue::ToString] with radialGradient, gradientTransform and command

Categories

(Core :: XUL, defect, P1)

defect

Tracking

()

VERIFIED FIXED

People

(Reporter: martijn.martijn, Assigned: smaug)

References

Details

(4 keywords, Whiteboard: [sg:critical?])

Crash Data

Attachments

(3 files)

Attached file testcase
See testcase, which crashes current trunk build and Firefox 3 within 100ms.
It doesn't crash in Firefox 2. I can look for a regression range, if wanted.

http://crash-stats.mozilla.com/report/index/c492d0cc-924e-11dd-b1de-001cc4e2bf68?p=1
0  	xul.dll  	nsAttrValue::ToString  	 content/base/src/nsAttrValue.cpp:325
1 	xul.dll 	nsXULDocument::AttributeChanged 	content/xul/document/src/nsXULDocument.cpp:983
Flags: blocking1.9.1?
This crash looks somewhat scary... setting to [sg:critical?] for now.  Feel free to lower the severity if you don't think this is exploitable.
Whiteboard: [sg:critical?]
In GDB, I tried placing a breakpoint at nsNodeUtils.cpp:107 (inside of nsNodeUtils::AttributeChanged).  I set this breapoint to simply continue each time it's hit.

When the "doe" function fires, I hit this breakpoint thousands of times in sequence, alternating between aModType = 3 (nsIDOMMutationEvent::ADDITION) and aModType = 2 (nsIDOMMutationEvent::REMOVAL).  I just let it run for a few minutes, and it hit the breakpoint about 2000 times so far.

Now, if I interrupt it and disable the breakpoint (or if I never set up the breakpoint in the first place), then I segfault almost immediately.

So, it looks like we've got a combination of an infinite loop and a race condition.
 - When the breakpoint is enabled, it works around the race condition (because of the tiny delay on each break), but this reveals the infinite loop.  
 - When the breakpoint is disabled, the race condition quickly triggers a crash, and we lose.
OS: Windows XP → All
Hardware: PC → All
(In reply to comment #3)
> So, it looks like we've got a combination of an infinite loop and a race
> condition.

I lied -- we're actually doing infinite recursion (and presumably running out of stack space).  Specifically, this is what happens:

* We start in nsGenericElement::UnsetAttr, with the following arguments:
    this == the nsSVGRadialGradientElement
    aName == nsGkAtoms::gradientTransform

* That method creates this local variable:
    mozAutoDocUpdate updateBatch(document, UPDATE_CONTENT_MODEL, aNotify);

* Just before we finish nsGenericElement::UnsetAttr, we run the destructor for this local variable.
 - The destructor ("~mozAutoDocUpdate") calls mDocument->EndUpdate()
 - That calls UnsetAttr on the nsSVGRadialGradientElement (with  aName == nsGkAtoms::gradientTransform)
 - That puts us in nsSVGElement::UnsetAttr, which calls nsGenericElement::UnsetAttr in its last line.

So, we end up in a recursive call to nsGenericElement::UnsetAttr.  That recursive call does all this same stuff, and this repeats until we crash.
Here's a backtrace, taken fairly early on, that shows the infinite recursion starting.  Specifically, stack levels 1-29 show several layers of the recursion, in its early stages.
(In reply to comment #0)
> I can look for a regression range, if wanted.

Martijn: Yes, a regression range would be quite helpful in determining what's causing the recursion / how to proceed, if you don't mind looking for one.  (Thanks!)
Thanks for that regression range!  I'm most suspicious of bug 320622, which added/tweaked a bunch of "UnsetAttr" functions in SVG code, and that's precisely what's involved in the infinite recursion here.

Tentatively marking dependency on that bug.
Blocks: 320622
Oh, yes, I was confused, I thought that one was backed out, but it was rechecked in.
Confirmed regression from bug 320622 -- I tried backing its patch out of a CVS checkout taken at the end of the regression-day, and that backout fixes this crash.
Perhaps bug 468176 is related? The stack looks the same. Although the testcase and regression range is different.
Yes -- and they both have "Crash Reason: EXCEPTION_STACK_OVERFLOW", which sounds like infinite recursion.
Ok, I did some analysis in GDB, comparing a CVS checkout from just before the regression vs a CVS checkout from just after the regression, and here's what I've got.

Consider these two chunks from nsGenericElement::UnsetAttr -- they are supposed to save us from this bug, as described below.

 4513   PRInt32 index = mAttrsAndChildren.IndexOfAttr(aName, aNameSpaceID);
 4514   if (index < 0) {
 4515     return NS_OK;
 4516   }
 ...
 4563   rv = mAttrsAndChildren.RemoveAttrAt(index, oldValue);

WHAT HAPPENS IN PRE-REGRESSION CODE:
The first time we hit nsGenericElement::UnsetAttr, we remove the attribute from mAttrsAndChildren at line 4563.  Then, a few stack-levels down, we end up making a recursive call to UnsetAttr, but it returns immediately (at line 4515), because the attribute is no longer in mAttrsAndChildren.  So we're fine.

WHAT HAPPENS IN POST-REGRESSION CODE:
The regressing patch ends up *re-inserting* the attribute into mAttrsAndChildren, *just* before our recursive call to nsGenericElement::UnsetAttr.  So when we end up making the recursive call, we find the attribute (at line 4513) and we skip the early return, continuing into a death-spiral.

The fatal re-insertion happens inside the call to "transform->Clear()", which bug 320622 added in nsSVGElement::UnsetAttr.  The added "Clear()" unfolds in the backtrace below, with "SetAndTakeAttr" at the bottom. (that does the actual re-insertion)

#1  0xb4abf6d2 in nsAttrAndChildArray::SetAndTakeAttr (this=0xa762288, aLocalName=0x99cd524, aValue=@0xbfdce92c) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/base/src/nsAttrAndChildArray.cpp:387
#2  0xb4b363cd in nsGenericElement::SetAttrAndNotify (this=0xa762270, aNamespaceID=0, aName=0x99cd524, aPrefix=0x0, aOldValue=@0xb7f02368, aParsedValue=@0xbfdce92c, aModification=0, aFireMutation=0, aNotify=1) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/base/src/nsGenericElement.cpp:3613
#3  0xb4f0eb73 in nsSVGElement::DidModifySVGObservable (this=0xa762270, aObservable=0xa4202bc, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGElement.cpp:708
#4  0xb4f8513e in nsSVGValue::NotifyObservers (this=0xa4202bc, f=&virtual table offset 20, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGValue.cpp:78
#5  0xb4f85208 in nsSVGValue::DidModify (this=0xa4202bc, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGValue.cpp:95
#6  0xb4f082e7 in nsSVGAnimatedTransformList::DidModifySVGObservable (this=0xa4202b8, observable=0xa46a750, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGAnimatedTransformList.cpp:180
#7  0xb4f8513e in nsSVGValue::NotifyObservers (this=0xa46a750, f=&virtual table offset 20, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGValue.cpp:78
#8  0xb4f85208 in nsSVGValue::DidModify (this=0xa46a750, aModType=nsISVGValue::mod_other) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGValue.cpp:95
#9  0xb4f80fa1 in nsSVGTransformList::Clear (this=0xa46a750) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGTransformList.cpp:239
#10 0xb4f0f473 in nsSVGElement::UnsetAttr (this=0xa762270, aNamespaceID=0, aName=0x99cd524, aNotify=1) at /scratch/work/builds/old_builds/trunk.08-12-09.13-50_patched/mozilla/content/svg/content/src/nsSVGElement.cpp:392

So, that's what initially caused the regression... And I imagine the basic problem is still the same in up-to-date code.
(In reply to comment #13)
> The fatal re-insertion happens inside the call to "transform->Clear()", which
...
> So, that's what initially caused the regression... And I imagine the basic
> problem is still the same in up-to-date code.

Yup -- I just explored this in up-to-date mozilla-central, and confirmed that this is still the root of the problem, though the code has shifted slightly. In particular, the call to nsSVGTransformList::Clear is now in a method called "nsSVGElement::ResetOldStyleBaseType", which is called by nsSVGElement::UnsetAttr.

Here's a call-stack, with a just-completed nsGenericElement::UnsetAttr at the highest level, and with us re-inserting the attribute at the lowest level.

#1  0xb35db814 in nsAttrAndChildArray::SetAndTakeAttr (this=0xb043f51c, aLocalName=0xb4947598, aValue=@0xbfe155f0) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/base/src/nsAttrAndChildArray.cpp:399
#2  0xb366d187 in nsGenericElement::SetAttrAndNotify (this=0xb043f500, aNamespaceID=0, aName=0xb4947598, aPrefix=0x0, aOldValue=@0xb7db4308, aParsedValue=@0xbfe155f0, aModification=0, aFireMutation=0, aNotify=1, aValueForAfterSetAttr=0x0) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/base/src/nsGenericElement.cpp:4298
#3  0xb3b330ff in nsSVGElement::DidModifySVGObservable (this=0xb043f500, aObservable=0xb1b2ea64, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGElement.cpp:997
#4  0xb3bacfb6 in nsSVGValue::NotifyObservers (this=0xb1b2ea64, f=&virtual table offset 20, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGValue.cpp:78
#5  0xb3bad080 in nsSVGValue::DidModify (this=0xb1b2ea64, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGValue.cpp:95
#6  0xb3b2b45f in nsSVGAnimatedTransformList::DidModifySVGObservable (this=0xb1b2ea60, observable=0xb1c1c600, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGAnimatedTransformList.cpp:180
#7  0xb3bacfb6 in nsSVGValue::NotifyObservers (this=0xb1c1c600, f=&virtual table offset 20, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGValue.cpp:78
#8  0xb3bad080 in nsSVGValue::DidModify (this=0xb1c1c600, aModType=nsISVGValue::mod_other) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGValue.cpp:95
#9  0xb3ba892b in nsSVGTransformList::Clear (this=0xb1c1c600) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGTransformList.cpp:246
#10 0xb3b3383a in nsSVGElement::ResetOldStyleBaseType (this=0xb043f500, svg_value=0xb1b2ea64) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGElement.cpp:640
#11 0xb3b33f2c in nsSVGElement::UnsetAttr (this=0xb043f500, aNamespaceID=0, aName=0xb4947598, aNotify=1) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/svg/content/src/nsSVGElement.cpp:597
#12 0xb38a760d in nsXULDocument::EndUpdate (this=0xb1b4a800, aUpdateType=1) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/xul/document/src/nsXULDocument.cpp:3294
#13 0xb33fe973 in ~mozAutoDocUpdate (this=0xbfe15a4c) at ../../../dist/include/content/mozAutoDocUpdate.h:66
#14 0xb366c822 in nsGenericElement::UnsetAttr (this=0xb043f500, aNameSpaceID=0, aName=0xb4947598, aNotify=1) at /scratch/work/builds/mozilla-central/mozilla-central.08-12-08.15-49/mozilla/content/base/src/nsGenericElement.cpp:4616
Assignee: nobody → Olli.Pettay
Patch for bug 468176 fixes this one too.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Verified fixed, using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20081220 Minefield/3.2a1pre
Status: RESOLVED → VERIFIED
has this fix made it to 1.9.1 branch.    sounds like we should try and get it there if it hasn't made it.
needs a blocking 1.91 plus
(In reply to comment #18)
> needs a blocking 1.91 plus

No it doesn't, actually.  Note that there's no fix posted here -- this was fixed by the patch for 468176, which has approval1.9.1+ and apparently landed as
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/95fa0d1e8913

(In reply to comment #17)
> has this fix made it to 1.9.1 branch
Per the link above, it looks like it has.
Flags: blocking1.9.1? → blocking1.9.1+
Keywords: fixed1.9.1
Priority: -- → P1
Verified on Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090204 Shiretoko/3.1b3pre Ubiquity/0.1.5
Crash Signature: [@ nsAttrValue::ToString]
Group: core-security
Moving to Core:XUL per https://bugzilla.mozilla.org/show_bug.cgi?id=1455336
Component: XP Toolkit/Widgets: XUL → XUL
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: