Closed Bug 73331 Opened 23 years ago Closed 23 years ago

Browser asserts and crashes due to js-created elements with null document [@ nsBindingManager::WalkRules]

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: mike_jk, Assigned: jst)

References

()

Details

(Keywords: crash, helpwanted, testcase, Whiteboard: [HAVE FIX])

Crash Data

Attachments

(7 files)

When I visit this site, it loads fine. Then when I click the menu, the browser 
crashes. 

Build ID: 2001032304

MOZILLA caused an invalid page fault in
module GKCONTENT.DLL at 0177:011ac839.
Registers:
EAX=00000000 CS=0177 EIP=011ac839 EFLGS=00010246
EBX=01125e0e SS=017f ESP=0068d1b8 EBP=0068d1e0
ECX=00000000 DS=017f ESI=00000000 FS=61ef
EDX=0068d1d8 ES=017f EDI=028683f0 GS=0000
Bytes at CS:EIP:
8b 08 50 ff 51 10 8b 45 f8 68 50 ce 20 01 89 45 
Stack dump:
0068d1d8 0068d228 025fa560 60df9458 00000000 00000000 00000000 00000000 
00000000 02619780 0068d208 01126dd2 0261f364 025fa560 01125e0e 0068d228
confirming crashing (cvs20010324 winMe)
dont need to click, just moving mouse over menu does it
reproducability: always
Marking NEW.
Status: UNCONFIRMED → NEW
Component: Browser-General → Javascript Engine
Ever confirmed: true
Keywords: crash, testcase
Marking NEW.
Assignee: asa → rogerl
QA Contact: doronr → pschwartau
Confirming crash on debug builds 2001-03-25 on WinNT and Linux. 
OS Win98 --> All. 


Steps to reproduce: 

1. Bring up Mozilla; clear any errors in JavaScript Console
2. load http://www.geek.com
3. Notice there are no errors in JavaScript Console
4. In a debug build, you will see these assertions in the debug console: 


ASSERTION: Content has no document.: 'doc', file nsTextFrame.cpp, line 4445
Break: at file nsTextFrame.cpp, line 4445
ASSERTION: bad width: 'metrics.width>=0', file nsLineLayout.cpp, line 1025
Break: at file nsLineLayout.cpp, line 1025
ASSERTION: bad height: 'metrics.height>=0', file nsLineLayout.cpp, line 1026
Break: at file nsLineLayout.cpp, line 1026
nsLineLayout: Text(-1)@0x88fc00c didn't set max-element-size!
nsLineLayout: Text(-1)@0x88fc00c didn't set whad 0,0,-559038737,-559038737!
ASSERTION: Content has no document.: 'doc', file nsTextFrame.cpp, line 4445
Break: at file nsTextFrame.cpp, line 4445
ASSERTION: bad width: 'metrics.width>=0', file nsLineLayout.cpp, line 1025
Break: at file nsLineLayout.cpp, line 1025
ASSERTION: bad height: 'metrics.height>=0', file nsLineLayout.cpp, line 1026
Break: at file nsLineLayout.cpp, line 1026
nsLineLayout: Text(-1)@0x88fc00c didn't set whad 0,0,-559038737,-559038737!


5. Slowly mouseover the "Home" tab on the left
6. You then get this assertion from a debug build: 


ASSERTION: You can't dereference a NULL nsCOMPtr with operator->().: 
'mRawPtr != 0'
file ..\..\..\..\dist\include\nsCOMPtr.h
line 648


7. CRASH!!!
OS: Windows 98 → All
The function MM_findObj() in the attached JavaScript comes from MacroMedia:

http://www.macromedia.com/support/dreamweaver/programs/navbar_overview/
navbar_overview04.html


MM_findObj() has been controversial: see bug 51020, bug 43157, for example.
The function in its original form is not W3C-compliant:


function MM_findObj(n,d) { //v3.0
 var p,i,x;  

 if(!d) d=document; 
 if((p=n.indexOf("?"))>0&&parent.frames.length) 
 {
  d=parent.frames[n.substring(p+1)].document; 
  n=n.substring(0,p);
 }

 if(!(x=d[n])&&d.all) 
  x=d.all[n]; 
 for (i=0;!x&&i<d.forms.length;i++) 
  x=d.forms[i][n];
 for(i=0;!x&&d.layers&&i<d.layers.length;i++) 
  x=MM_findObj(n,d.layers[i].document); 
 return x;
}



This version returns a null x in Mozilla, because d.all and d.layers
are not supported. However, the version of the function being used here 
contains an extra line: 


function MM_findObj(n,d) { //v3.0
 var p,i,x;  

 if(!d) d=document; 
 if((p=n.indexOf("?"))>0&&parent.frames.length) 
 {
  d=parent.frames[n.substring(p+1)].document; 
  n=n.substring(0,p);
 }

 if(!(x=d[n])&&d.all) 
  x=d.all[n]; 
 for (i=0;!x&&i<d.forms.length;i++) 
  x=d.forms[i][n];
 for(i=0;!x&&d.layers&&i<d.layers.length;i++) 
  x=MM_findObj(n,d.layers[i].document);
 if(!x) x=document.getElementById(n);  <<<<<<<<<<<<<<<<<< FOR W3C, e.g. Mozilla
 return x;
}



These functions are hard to read. What makes them work is the continuing
test for !x beyond the first statement of the function. That is, succeeding 
statements are only applied if x has not already been found...
I don't see any problem with MM_findObj() in this bug; in fact, I have 
a Mozilla nightly binary from 2001-03-12 that runs this site perfectly
on WinNT, with no errors in the JavaScript console.


I do not see any JS Engine issues immediately apparent in the stack trace. 
Based on the asserts we're getting, and the stack trace, reassigning to the 
Layout component for further triage.


Since I don't crash with a 2001-03-12 binary, this may be a recent regression - 
Assignee: rogerl → karnaze
Component: Javascript Engine → Layout
QA Contact: pschwartau → petersen
Reassigning to jst.
Assignee: karnaze → jst
Component: Layout → DOM Content Models
Mark, isn't this a dup of one of the bugs I gave you a few days ago (setting
display: to hidden vs. visible crashes mozilla)?
Assignee: jst → attinasi
Bug 73587 is the one I was thinking about.
I don't see the similarities between this bug and bug 73587. In fact, this one
is crashing in the binding manager and not doing the infinite recursion thing
(apparently). Johnny, were you seeing something I am missing?
The assertions and the crash stem from the same problem: elements with null
documents. In the first case, with the text node assertions, the LineBox should
be catching the error and returning an error but it does not, so there are then
several more assertions. However, the crash happens later when a Div with no
document is in the binding manager and is just assumed to have a non-null
document, which is dereferenced and... crash.

I'm not sure how these elements with no document are getting created, but since
everything is generated by script I'm guessing we'll need some help from jst or
somebody else who knows about how that all works.

I'm CC'ing Hyatt as well in case he wants to look at the crash in the binding
manager - my guess is that the assumptions about having a document are valid and
that we really need to fix the null document problem. Oh, the PresShell has a
valid HTML document, as do most of the elements.

Updating summary to be a little more specific too ;)
Summary: Browser crashes → Browser asserts and crashes due to js-created elements with null document
Adding helpwanted kwd. If somebody can help narrow down the testcase it would be
greatly appreciated!
Keywords: helpwanted
Sending back to jst to check out: can you determine why there is no doc on an
inserted node? I'm not sure where to responsibility is, but I'm hoping you can
take a stab at the root of the problem here. Please feel free to send it back to
me if you want :)
Assignee: attinasi → jst
I tested with the second testcase and found something odd...
If i change the 4 spaces on line 210 into 2 tabs, or add a space, the crash
disappears for me.
Will be working on reducing testcase for a while (it's still a crasher)
using winMe cvs 20010416
Ok, need to change that... even if i only add a space to line 266 it doesnt
crash anymore, deleting that space will make it crash again.
This is one weird case to me
Attached file reduced testcase
OK, I've managed to reduce Ferdinand's testcase a bit further.  I think what I
have is pretty minimal.  Note the MM_findObj and company are no longer present,
so the blame lieth not with them.

If you open http://bugzilla.mozilla.org/showattachment.cgi?attach_id=31574 you
should see two input elements.  This is wrong.  There should only be one input
element.  Therein lies the heart of the crash.

Possibly relevant things:

1)  It does not matter what you set visibility to in crashme(). 'hidden' and
    'visible' both give the crash
2)  The <a> element is nice visually but irrelevant.  Putting that onMouseOver.
    on the <div> itself and deleting the <a> completely reproduces the crash
    just fine.
3)  The <font> tag can be replaced with <b> or <i> or <a> and we still get the
    crash.  Replacing with <strong> or <em> or <span> makes the crash go away.
    Seems odd..
4)  The business with <script> and document.write() calls is necessary
5)  The <form> is necessary
6)  The <span> can be replaced with other tags (tried <div> and <center>) and
    the crash still happens
7)  The <br> is necessary.

Also, document inspector only shows one <input> node in the DOM... why are we
drawing two of them??
Thanks Boris, my javascript knowledge is thin :)

-Note that the left input field is fixed, cant fill in anything, while the right
input field works as we would expect from an <input> tag.
*** Bug 75590 has been marked as a duplicate of this bug. ***
This bug has been around for a long time any possible resolution date?
Tested both TestCases as well as the link
WorksforMe on Windows 2000 Build 2001050812

Looks fine to me, too.
Marking fixed.
Scott Tran, please verify.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
*** Bug 79962 has been marked as a duplicate of this bug. ***
Worksforme also on Linux with a may 10 build.  I would bet that this was fixed
by the DOM rewrite.
There's no fix here, this should be worksforme
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
marking so
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → WORKSFORME
Ferdinand:
If you look at the discussion, there was definitely a reproducable bug here, 
which is now fixed - even if it was by accident :-)

So I'd mark this "fixed" rather than "worksforme".

I don't really care, though.
worskforme is also used when we don't know _what_ fixed the bug.  fixed is 
used when a known checkin fixed it.

In this case the "can't reproduce the bug, no clue why one would see it, no clue 
why I'm not seeing it" definition certainly applies.

verifying.
Status: RESOLVED → VERIFIED
>worskforme is also used when we don't know _what_ fixed the bug.  fixed is 
>used when a known checkin fixed it.

It is? Well, someone should write this on http://www.mozilla.org/bugs/. I'll 
mail Gerv.
While this appears to have been fixed by the DOM rewrite in that cursoring the
mouse over these items doesn't crash Mozilla anymore, it's still not totally
fixed, because now the menus have disappeared again in recent builds.  For
example, the 20010509 builds worked perfectly and displayed the drop down menus
on the geek.com web site, but subsequent builds don't display the drop down
menus on that site anymore.  While it's nice that cursoring the mouse over them
doesn't crash Mozilla anymore, it would be nicer if the drop down menus worked
again.
Philip: You should file a new bug about that problem. This bug is fixed, and 
there are already too many comments cluttering it.
Yes, file a new bug for other problems with that site.

However, this bug (as defined by Boris Zbarsky 2001-04-20 01:59 and the 
wacky testcase with doubled elements 
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=31574) has actually
reemerged. 

This did not have the second input element in 5/18 builds, and did not crash.
But in a 5/21 or 5/22 build (win2k), this has two input elements, crashes, and 
the stack is (essentially) the same as posted earlier in this bug.

It would be good to have a look at his before long (mozilla0.9.2). That 
duplicated content in the testcase is somewhat troubling.

Status: VERIFIED → REOPENED
Keywords: mozilla0.9.2
Resolution: WORKSFORME → ---
I think I have this one nailed down, I'll clean up the patch and attach the
patch tomorrow.
Nominated for 0.9.2, just curious whether this is still crash and therefore a
0.9.1 candidate?

/be
Dear Sir,

The testcase given by BZ 
(http://bugzilla.mozilla.org/showattachment.cgi?attach_id=31574)
is still crashing Mozilla (build 2001052320).  Ouch! Crash hurts...
In case you still need it, I just sent talkback-report

TB30864539Z

after hitting this bug.
Attached patch Proposed fix.Splinter Review
While I can't fully explain what the exact reason for this crash is (not knowing
the frame code in great detail), but here's basically what happens when we load
this page w/o the attached patch.

The html on the page contains absolutely positioned div's that are hidden
(visibility: hide) and contain input elements. As the page is parsed we realize
that the HTML on the page is not wellformed so we end up callin
DemoteContainer() in the html sink to clean up the document (by moving content
elements around in the document). In moving content around we take elements out
of the document and doing that ends up calling HTMLContentSink::BeginUpdate()
which ends up flushing content in the sink, which ends up creating frames for
more content. The content that is flushed is content that will be moved in the
document once we roll back to ::DemoteContainer() and the frames that were just
created will be (or should be) removed again.

The problem seems to be that the frames that are created (in DemoteContainer())
are not completely torn down (or initialized) after being created from within
DemoteContainer() but the frames are notified that they're about to go away so
the anonumous content for input elements, in this case, are notified that
they're going away so the anonymous content inside input elements are told that
their document is null, and so on. Later on when the hidden absolutely
positioned div's are shown the content in the div's is reflown and the frame for
an input is found and reflown, the content that's associated with this frame is
one of the anonumous elements in an input element whose frame has partially been
torn down, the document in this anonymous element is null, and that's where we
crash.

Now, the attached patch fixes this by avoiding creatinb the frames for the
content that is flushed in DemoteContainer() (through ::BeginUpdate()), in stead
of doing a full flush we just flush the content w/o creating the frames (which
would be immediately removed again later on in DemoteContainer() anyway), this
change has the nice side effect of speeding up page load times by ~3%, yay! The
attached patch also fixes a bunch of random weird DOM problems that prevented
this site (and others) from working correctly.

I'm nominating this for mozilla0.9.1 since this is a fix for a crash that can
happen on any page that happens to show and hide form controls, and this fix is
also a page load performance improvement.

Vidur, sr=? Harishd, r=?
Status: REOPENED → ASSIGNED
Component: DOM Content Models → DOM Level 0
Keywords: mozilla0.9.1
Hardware: PC → All
Whiteboard: [HAVE FIX]
Target Milestone: --- → mozilla0.9.1
Summary: Browser asserts and crashes due to js-created elements with null document → Browser asserts and crashes due to js-created elements with null document [@ nsBindingManager::WalkRules]
Adding [@ nsBindingManager::WalkRules] to summary for tracking.  This crash has 
been a topcrash in the past and showed up in recent Trunk builds, but I haven't 
seen any crashes with the nsBindingManager::WalkRules stack signature since 
build 2001052322.
Keywords: topcrash
Just to note: not all crashes in WalkRules are created equal. There have been
(are still) crashes at that code location; however, in this case, I believe
it's the case that that code is just the victim of a horked DOM tree. So, it
may be misleading to label this bug as "the" WalkRules topcrash. It's a 
special case.
thanks for the clarification john, i had a feeling this was a unique case and 
didn't mean to make this a topcrasher, so clearing the topcrash keyword.
Keywords: topcrash
In looking at Johnny's patch, it seems like we shouldn't flush at all if we're
in the process of demoting a container. The lines:

 if (mInScript && !mInNotification && mCurrentContext) {
-    result = mCurrentContext->FlushTags(PR_TRUE);
+    result = mCurrentContext->FlushTags(!mIsDemotingContainer);

should change to:

-if (mInScript && !mInNotification && mCurrentContext) {
+if (mInScript && !mInNotification && mCurrentContext && !mIsDemotingContainer) {
     result = mCurrentContext->FlushTags(PR_TRUE);

With that change, sr=vidur.
a=blizzard for 0.9.1
Harish sez r=harishd
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Marking verified in the May 30th build.
Status: RESOLVED → VERIFIED
*** Bug 82083 has been marked as a duplicate of this bug. ***
Crashtest added as part of http://hg.mozilla.org/mozilla-central/rev/5a6def05ccbc
Flags: in-testsuite+
Crash Signature: [@ nsBindingManager::WalkRules]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: