Closed Bug 292717 Opened 19 years ago Closed 19 years ago

[FIXr]Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts]

Categories

(Core :: XBL, defect, P1)

x86
Windows XP
defect

Tracking

()

RESOLVED FIXED
mozilla1.8beta3

People

(Reporter: bc, Assigned: bzbarsky)

References

()

Details

(Keywords: crash)

Crash Data

Attachments

(3 files)

cvs sm build from 2005-05-02

nsXBLPrototypeBinding::GetAllowScripts(int * 0x0012dca4) line 350 + 7 bytes
nsXBLBinding::AllowScripts() line 1171
nsXBLBinding::ExecuteDetachedHandler() line 774 + 8 bytes
nsXBLBinding::MarkForDeath() line 1258
nsBindingManager::RemoveLayeredBinding(nsBindingManager * const 0x03740868,
nsIContent * 0x03050e10, nsIURI * 0x037d9b70) line 685
nsDocument::RemoveBinding(nsDocument * const 0x037d9f60, nsIDOMElement *
0x03050e2c, const nsAString & {...}) line 2697 + 44 bytes
nsXMLPrettyPrinter::EndUpdate(nsIDocument * 0x037d9ec0, unsigned int 1) line 231
+ 48 bytes
nsDocument::EndUpdate(unsigned int 1) line 1944
mozAutoDocUpdate::~mozAutoDocUpdate() line 749
nsGenericElement::InsertChildAt(nsIContent * 0x03679350, unsigned int 0, int 1)
line 2719 + 15 bytes
nsGenericElement::InsertBefore(nsGenericElement * const 0x03050e10, nsIDOMNode *
0x0367936c, nsIDOMNode * 0x00000000, nsIDOMNode * * 0x0012e0c0) line 3041 + 26 bytes
nsXMLElement::InsertBefore(nsXMLElement * const 0x03050e10, nsIDOMNode *
0x0367936c, nsIDOMNode * 0x00000000, nsIDOMNode * * 0x0012e0c0) line 64 + 24 bytes
nsGenericElement::AppendChild(nsGenericElement * const 0x03050e10, nsIDOMNode *
0x0367936c, nsIDOMNode * * 0x0012e0c0) line 516
nsXMLElement::AppendChild(nsXMLElement * const 0x03050e10, nsIDOMNode *
0x0367936c, nsIDOMNode * * 0x0012e0c0) line 64 + 20 bytes
XPTC_InvokeByIndex(nsISupports * 0x03050e2c, unsigned int 18, unsigned int 2,
nsXPTCVariant * 0x0012e0b0) line 102
XPCWrappedNative::CallMethod(XPCCallContext & {...}, XPCWrappedNative::CallMode
CALL_METHOD) line 2065 + 43 bytes
XPC_WN_CallMethod(JSContext * 0x03352018, JSObject * 0x035cdc98, unsigned int 1,
long * 0x03646aec, long * 0x0012e380) line 1287 + 14 bytes
js_Invoke(JSContext * 0x03352018, unsigned int 1, unsigned int 0) line 1320 + 23
bytes
js_Interpret(JSContext * 0x03352018, unsigned char * 0x0355b46f, long *
0x0012ee34) line 3610 + 15 bytes
js_Invoke(JSContext * 0x03352018, unsigned int 1, unsigned int 2) line 1340 + 19
bytes
js_InternalInvoke(JSContext * 0x03352018, JSObject * 0x0335eb28, long 53871400,
unsigned int 0, unsigned int 1, long * 0x0012f020, long * 0x0012f01c) line 1417
+ 20 bytes
JS_CallFunctionValue(JSContext * 0x03352018, JSObject * 0x0335eb28, long
53871400, unsigned int 1, long * 0x0012f020, long * 0x0012f01c) line 3849 + 31 bytes
nsJSContext::CallEventHandler(JSObject * 0x0335eb28, JSObject * 0x03360328,
unsigned int 1, long * 0x0012f020, long * 0x0012f01c) line 1384 + 33 bytes
nsJSEventListener::HandleEvent(nsJSEventListener * const 0x02ec3580, nsIDOMEvent
* 0x036f5400) line 175 + 51 bytes
nsEventListenerManager::HandleEventSubType(nsListenerStruct * 0x0358e750,
nsIDOMEvent * 0x036f5400, nsIDOMEventTarget * 0x0335bfec, unsigned int 1,
unsigned int 7) line 1563 + 20 bytes
nsEventListenerManager::HandleEvent(nsEventListenerManager * const 0x03654678,
nsPresContext * 0x03567958, nsEvent * 0x0012f394, nsIDOMEvent * * 0x0012f304,
nsIDOMEventTarget * 0x0335bfec, unsigned int 7, nsEventStatus * 0x0012f3bc) line
1667
nsGlobalWindow::HandleDOMEvent(nsPresContext * 0x03567958, nsEvent * 0x0012f394,
nsIDOMEvent * * 0x0012f304, unsigned int 7, nsEventStatus * 0x0012f3bc) line 909
DocumentViewerImpl::LoadComplete(DocumentViewerImpl * const 0x035351e0, unsigned
int 0) line 987 + 35 bytes
nsDocShell::EndPageLoad(nsIWebProgress * 0x0335b604, nsIChannel * 0x0356a4a0,
unsigned int 0) line 4561
nsWebShell::EndPageLoad(nsIWebProgress * 0x0335b604, nsIChannel * 0x0356a4a0,
unsigned int 0) line 665
nsDocShell::OnStateChange(nsDocShell * const 0x0335b6b8, nsIWebProgress *
0x0335b604, nsIRequest * 0x0356a4a0, unsigned int 131088, unsigned int 0) line 4487
nsDocLoader::FireOnStateChange(nsIWebProgress * 0x0335b604, nsIRequest *
0x0356a4a0, int 131088, unsigned int 0) line 1195
nsDocLoader::doStopDocumentLoad(nsIRequest * 0x0356a4a0, unsigned int 0) line 832
nsDocLoader::DocLoaderIsEmpty() line 729
nsDocLoader::DocLoaderIsEmpty() line 732
nsDocLoader::OnStopRequest(nsIRequest * 0x0301db48, nsISupports * 0x00000000,
unsigned int 0) line 653
nsLoadGroup::RemoveRequest(nsLoadGroup * const 0x0306bf08, nsIRequest *
0x0301db48, nsISupports * 0x00000000, unsigned int 0) line 732 + 44 bytes
nsFileChannel::OnStopRequest(nsFileChannel * const 0x0301db50, nsIRequest *
0x0359a2a0, nsISupports * 0x00000000, unsigned int 0) line 555
nsInputStreamPump::OnStateStop() line 507
nsInputStreamPump::OnInputStreamReady(nsInputStreamPump * const 0x0359a2a4,
nsIAsyncInputStream * 0x0301d7b8) line 343 + 11 bytes
nsInputStreamReadyEvent::EventHandler(PLEvent * 0x0359a4bc) line 120
PL_HandleEvent(PLEvent * 0x0359a4bc) line 698 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00c18788) line 633 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x001201e0, unsigned int 49388, unsigned int 0,
long 12683144) line 1435 + 9 bytes
USER32! 77d48734()
USER32! 77d48816()
USER32! 77d489cd()
USER32! 77d48a10()
nsAppShell::Run(nsAppShell * const 0x00b83428) line 135
nsAppStartup::Run(nsAppStartup * const 0x00b83188) line 208
main1(int 3, char * * 0x002e2638, nsISupports * 0x00ad7d30) line 1272 + 32 bytes
main(int 3, char * * 0x002e2638) line 1763 + 37 bytes
mainCRTStartup() line 338 + 17 bytes
K
Summary: Crash on Hixie's Evil XML tests → Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts]
Hmm.. I can't reproduce this with either a Mozilla.org nightly from 05-02 or my
self-build from 04-30....  Bob, can you get any information on those last couple
of stack frames?  What's |this| and |*this|, for example?
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b2) Gecko/20050503
Firefox/1.0+

I am seeing the same crash, which occurs at
http://lxr.mozilla.org/seamonkey/source/content/xbl/src/nsXBLBinding.cpp#775

on the line 
775	mPrototypeBinding->BindingDetached(mBoundElement);

when mPrototypeBinding is null.

With my patch, the page renders as expected, so the warning may not be
necessary.
Attachment #182502 - Flags: review?
(In reply to comment #1)
> What's |this| and |*this|, for example?

nsresult
nsXBLPrototypeBinding::GetAllowScripts(PRBool* aResult)
{
=>  return mXBLDocInfoWeak->GetScriptAccess(aResult);
}

this is 0xdddddddd


PRBool
nsXBLBinding::AllowScripts()
{
  PRBool result;
=>  mPrototypeBinding->GetAllowScripts(&result);
  return result;
}

this is 0x0360a498
*this is 

-	*this	{...}
+	mRefCnt	{...}
+	mPrototypeBinding	0x00d90c0a
+	mContent	{...}
+	mNextBinding	{...}
+	mBoundElement	0xfdfdfdfd
+	mInsertionPointTable	0xdddddddd
	mIsStyleBinding	221 'Ý'
	mMarkedForDeath	1 ''


> when mPrototypeBinding is null.

That can't happen unless |this| is already deleted.

> this is 0xdddddddd

Which isn't the same as null, but does indicate that the nsXBLBinding is
deleted.... I guess now we get to sort out why we're making method calls on an
already-deleted binding.  :(
fyi, i found this running -layoutdebug baseline tests although the individual
stack in that case was different and occured in gc-land.
(In reply to comment #4)
> > when mPrototypeBinding is null.
> 
> That can't happen unless |this| is already deleted.

I'll bow out of this Bug, to make things less confusing. I was looking 3 frames 
up, and from a spackle point of view (robustness) refraining from calling
a method on a null pointer seemed sufficient.

It is possible, and indeed likely, that a fix (correctness) needs to be applied
elsewhere.

In my frame, |this| was an nsXBLBinding at 0xbe3c8c0 .

10 frames up, I had:
#10 0x20571bc4 in nsXMLElement::InsertBefore(nsIDOMNode*, nsIDOMNode*,
nsIDOMNode**) (this=0x0, aNewChild=0xbfffca60, aRefChild=0x0, aReturn=0x2) at
../../../../../src/content/xml/content/src/nsXMLElement.h:64
#11 0x20571c50 in nsXMLElement::AppendChild(nsIDOMNode*, nsIDOMNode**)
(this=0x0, aNewChild=0xbfffca60, aReturn=0x2) at
../../../../../src/content/base/src/nsGenericElement.h:514

which might indicate only that my my gdb doesn't always report |this| pointers
correctly, or it might be a place to look...

Anyway, thank you for the reply and I hope that I have been of some help.
> refraining from calling a method on a null pointer seemed sufficient.

The point is that the pointer is, in general, garbage, not null.  Except on
windows debug builds where it's probably (not guaranteed) 0xdddddddd because all
deallocated memory is stamped that way.
Hmm... I just tried setting all the memory for |this| to zero in ~nsXBLBinding
and I still get no crash (with a Linux debug build)...  Same if I set it to 0xdd...
Is the suggestion that this report (the crash at 
http://www.hixie.ch/tests/evil/xml/010.xml ) is platform specific? Videlicet:

Linux: No
Windows XP: Yes
Mac OS X: Yes

I was claiming that at 
http://lxr.mozilla.org/seamonkey/source/content/xbl/src/nsXBLBinding.cpp#775 
|this| had a reasonable value, but mPrototypeBinding did not.

I haven't investigated why this has happened: Actually, I thought that
mPrototypeBinding had not been initialised. Maybe it is more likely that
it has already been uninitialised.

No, the suggestion is that I have no idea why I can't reproduce this.

mPrototypeBinding can't be not initialized -- it's passed as an argument to the
only constructor of the nsXBLBinding class, and that constructor would crash if
null is passed (and asserts so).  So if it's null, the nsXBLBinding object is
most definitely dead.
Attached patch Output to stderrSplinter Review
With my patch, I get these messages only, and these include several fprintf
statements from elsewhere in my code.

I may be taking this a little further than we can be sure of, but perhaps
the truth table is 

mPrototypeBinding becomes null => crash
mPrototypeBinding never becomes null => no crash

I will see whether I can catch mPrototypeBinding's being set to null in gdb.
I have the test case in one tab. When I click refresh, the destructor for
nsXBLBinding is called 6 times, then the constructor is called 7 times.
The 4th time, the arguments are (this=0xffbe2d0, aBinding=0xd2ff3b0).
Next the destructor is called with |this| == 0xffbe2d0.

The next thing that I notice is that gdb breaks at
Breakpoint 3, nsXBLBinding::ExecuteDetachedHandler() (this=0xffbe2d0) at
../../../../../src/content/xbl/src/nsXBLBinding.cpp:779
and 
(gdb) p *this
$67 = {
  mRefCnt = {
    mValue = 266413339
  }, 
  mPrototypeBinding = 0x0, 
  mContent = {
    mRawPtr = 0xfe11060
  }, 
  mNextBinding = {
    mRawPtr = 0x0
  }, 
  mBoundElement = 0xd113440, 
  mInsertionPointTable = 0xff1b580, 
  mIsStyleBinding = 0 '\0', 
  mMarkedForDeath = 1 '\001'
}

So it does seem that an nsXBLBinding is being used after its destructor
has been called.
The crash seems to occur during the call to 
nsBindingManager::RemoveLayeredBinding at
http://lxr.mozilla.org/mozilla/source/content/xbl/src/nsBindingManager.cpp#641

I'll step through it
Is it possible that when SetBinding(aContent, nsnull); is called at
http://lxr.mozilla.org/mozilla/source/content/xbl/src/nsBindingManager.cpp#678
then something happens to the binding(s) in its mBindingTable, and the destructor
for the binding within it for which we hold a pointer acquired using 
nsBindingManager::GetBinding(aContent);
at http://lxr.mozilla.org/mozilla/source/content/xbl/src/nsBindingManager.cpp#643)
is called.

The next thing we do is call binding->MarkForDeath(); at
http://lxr.mozilla.org/mozilla/source/content/xbl/src/nsBindingManager.cpp#679
and this is where the evil happens.

I may be wrong, but I think that MarkForDeath( ) actually reaps the doomed
binding, and this may not be obvious from the name.

I'll put an fprintf statement in and recompile my lizard.
Very much as I expected:

Ahead of line 678 all is well
nsBindingManager::RemoveLayeredBinding About to call SetBinding(),
mPrototypeBinding is 0xb47c820

Within line 678, the destructor for our nsXBLBinding is called 
nsXBLBinding::~nsXBLBinding( )  mPrototypeBinding at 0xcf7b940

After line 678, per comment 10 (and others) our nsXBLBinding is dead 
nsBindingManager::RemoveLayeredBinding About to call 0xb47c820->MarkForDeath(),
mPrototypeBinding is 0x0

... We attempt to use it
WARNING: nsXBLBinding::ExecuteDetachedHandler() mPrototypeBinding was null, file
../../../../../src/content/xbl/src/nsXBLBinding.cpp, line 780


So far as I can tell this function was dates back to 2002, and my guess is
that the semantics of either nsBindingManager::SetBinding( ) or 
nsXBLBinding::MarkForDeath( ) has changed. Perhaps one of these is no longer
needed. The nsBindingManager is manipulating an nsXBLBinding by
remote control. Is there a way that it could simply tell the 
nsIContent* aContent to do the work ... ?
I distcleaned, and rebuilt both debug and opt ff and this is a debug only crash.
Ben or Bob, if in nsBindingManager::RemoveLayeredBinding you change the line:

  nsXBLBinding *binding = nsBindingManager::GetBinding(aContent);

to:

  nsRefPtr<nsXBLBinding> binding = nsBindingManager::GetBinding(aContent);

does the crash go away?
(In reply to comment #17)
> ...:
> 
>   nsRefPtr<nsXBLBinding> binding = nsBindingManager::GetBinding(aContent);
> 
> does the crash go away?

Yes. Or at least it certainly seems to fix the problem - my build does not crash
for me.
OK.  That's probably the right fix, then, with a comment explaining why we need
the ref there.  The problem is that when the binding is removed from the table
the last ref to the binding is dropped, so it gets destroyed.  Then we call a
method on it, and crash.
Attached patch FixSplinter Review
Attachment #182557 - Flags: superreview?(jst)
Attachment #182557 - Flags: review?(jst)
Ben, thanks for helping debug this!
Assignee: general → bzbarsky
Priority: -- → P1
Summary: Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts] → [FIX]Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts]
Target Milestone: --- → mozilla1.8beta2
(In reply to comment #20)
> Created an attachment (id=182557) [edit]
> Fix
> 

I applied this and rebuilt debug in objdir/content and still see the same crash. 
but touching the file and rebuilding from mozilla/ seems to make the crash go away.
Bob, content/ is linked into gklayout, so if you touch anything in content/ you
have to build in layout/build as well.
I see this crash in the DOM TS as well. It would be good to get this in for
1.8b2. jst, any chance of a quick review?
*** Bug 293471 has been marked as a duplicate of this bug. ***
Severity: major → normal
OS: Windows XP → Windows 2000
Priority: P1 → --
Target Milestone: mozilla1.8beta2 → ---
Sorry for changing the fields, trunk build seems to do that... 
Severity: normal → major
OS: Windows 2000 → Windows XP
Priority: -- → P1
Target Milestone: --- → mozilla1.8beta2
Comment on attachment 182557 [details] [diff] [review]
Fix

r+sr=jst
Attachment #182557 - Flags: superreview?(jst)
Attachment #182557 - Flags: superreview+
Attachment #182557 - Flags: review?(jst)
Attachment #182557 - Flags: review+
Comment on attachment 182557 [details] [diff] [review]
Fix

Requesting 1.8b3 review for simple crash fix -- hold a ref to the object we're
working with so it doesn't go away.
Attachment #182557 - Flags: approval1.8b3?
Summary: [FIX]Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts] → [FIXr]Crash on Hixie's Evil XML tests [@ nsXBLPrototypeBinding::GetAllowScripts]
Attachment #182557 - Flags: approval1.8b3? → approval1.8b3+
Fixed for 1.8b3.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Target Milestone: mozilla1.8beta2 → mozilla1.8beta3
Blocks: 194834
Attachment #182502 - Flags: review?
Crash Signature: [@ nsXBLPrototypeBinding::GetAllowScripts]
You need to log in before you can comment on or make changes to this bug.