Last Comment Bug 348492 - Frames not destroyed with <mmultiscripts> and <mtd>
: Frames not destroyed with <mmultiscripts> and <mtd>
Status: RESOLVED FIXED
[sg:critical]
: crash, testcase, verified1.8.0.12, verified1.8.1.4
Product: Core
Classification: Components
Component: MathML (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- critical (vote)
: ---
Assigned To: rbs
: Hixie (not reading bugmail)
Mentors:
Depends on:
Blocks: 347580 framedest
  Show dependency treegraph
 
Reported: 2006-08-12 22:31 PDT by Jesse Ruderman
Modified: 2007-12-17 17:09 PST (History)
6 users (show)
dveditz: blocking1.8.1.4+
dveditz: wanted1.8.1.x+
dveditz: blocking1.8.0.12+
dveditz: wanted1.8.0.x+
jruderman: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase that does not crash (517 bytes, application/xhtml+xml)
2006-08-12 22:34 PDT, Jesse Ruderman
no flags Details
patch (21.44 KB, patch)
2006-08-14 17:42 PDT, rbs
bzbarsky: review+
bzbarsky: superreview+
dveditz: approval1.8.1.4-
dveditz: approval1.8.0.12-
Details | Diff | Review
patch in sync with the 1.8.1.x 1.8.0.x branches (21.60 KB, patch)
2007-04-18 18:08 PDT, rbs
no flags Details | Diff | Review
262065: patch in sync with the 1.8.1.x & 1.8.0.x branches (21.60 KB, patch)
2007-04-18 20:53 PDT, rbs
dveditz: approval1.8.1.4+
dveditz: approval1.8.0.12+
Details | Diff | Review

Description Jesse Ruderman 2006-08-12 22:31:41 PDT
Upon loading the testcase:

###!!! ASSERTION: unexpected child list: 'PR_FALSE', file /Users/admin/trunk/mozilla/layout/tables/nsTableOuterFrame.cpp, line 243
(this assertion also fires in bug 348126, fwiw)

Upon reloading:

###!!! ASSERTION: Some frame destructors were not called: 'mFrameCount == 0', file /Users/admin/trunk/mozilla/layout/base/nsPresShell.cpp, line 625
(if you have the patch from bug 334514 that adds this assertion)

Crash dereferencing 0xdadadaf6
(if you use the crashing testcase, and you're using a debug build)
Comment 1 Jesse Ruderman 2006-08-12 22:33:43 PDT
Created attachment 233434 [details]
testcase that crashes on the second reload
Comment 2 Jesse Ruderman 2006-08-12 22:34:40 PDT
Created attachment 233435 [details]
testcase that does not crash
Comment 3 rbs 2006-08-13 23:45:58 PDT
(This bug shows up because the CSS rule to disable 'mtd' isn't being applied. As I explained in bug 347355 comment 33, this has been a blessing in disguise. With that CSS rule in effect, the bug doesn't show up.)

After much debugging, this bug turns out to be a case of mistaken identity. What is happening is that the content model moves from
1)
<mmultiscripts>
  <mtd/>

to 
2)
<mmultiscripts>
  <mtd/>
  <img/>

And here is the scenario that leads to the crash:

In 1), the lone <mtd/> causes the frame construction code to generate pseudo frames to make a complete table hierarchy.

In 2), when <img/> is appended, the frame construction code searches the frame that should handle the append. This rightly returns the MathML's mmultiscripts frame.

Now, here is where the drama unfolds (it is a bit technical here...). The frame construction code has different handlings depending on the 'type' of the frame. So it calls frame->GetType(), and for our particular context with frame=mmultiscripts, frame->GetType() returns a bogus 'outerTable', and because of this type, it passes the handling to frame->FirstChild(), that is, mmultiscripts->FirstChild(), which is the pseudo outerTable frame!

Then the unintended outerTable frame is given the task to append the <img/>. Since it can't do that, it therefore fires the assertion 
###!!! ASSERTION: unexpected child list: 'PR_FALSE'

The img frame is left unattached to the frame tree. And at tear down, it becomes a zombie -- we are back to what whas happening in bug 347355.

========================
On the whole, the crux of the matter boild down to why does mmultiscripts->GetType() return outerTable ?!?

The answer to that turns out to be a dangerous overload of GetType(). I include the function full below:

// For MathML, the 'type' will be used to determine the spacing between frames
// Subclasses can override this method to return a 'type' that will give
// them a particular spacing
nsIAtom*
nsMathMLContainerFrame::GetType() const
{
  // see if this is an embellished operator (mapped to 'Op' in TeX)
  if (mEmbellishData.coreFrame) {
    return mEmbellishData.coreFrame->GetType();
  }

  // if it has a prescribed base, just fetch the type from there
  if (mPresentationData.baseFrame) {
HERE>>HERE>>>>>>>>>>>>>>>>>>>>>>
    In the case of is bug, baseFrame is the outeTable!!!
>>>>>>>>>>>>>>>>>>>>>>
    return mPresentationData.baseFrame->GetType();
  }

  // everything else is treated as ordinary (mapped to 'Ord' in TeX)
  return nsMathMLAtoms::ordinaryMathMLFrame;
}

================
The fix will consist therefore in not overloading GetType() for internal MathML purposes! I will have to introduce another separate function. This might give a large patch, but I hope the explanation/necessity above will ease the review.
Comment 4 rbs 2006-08-14 17:42:24 PDT
Created attachment 233698 [details] [diff] [review]
patch

This patch prevents the crash in the testcase here. But in a sense, it is a continuation of bug 347355. It is needed to really nail down bug 347355 with C++.

As I hinted in my previous comment, it is a chnage of API and does not really change the functionality. The nsIFrame's GetType() is now left alone, and what it was overloaded for has been passed on to a newly added GetMathMLFrameType().
Comment 5 Boris Zbarsky [:bz] (Out June 25-July 6) 2006-08-14 17:57:26 PDT
Comment on attachment 233698 [details] [diff] [review]
patch

>Index: layout/mathml/base/src/nsMathMLContainerFrame.cpp
>+    aFrame->QueryInterface(NS_GET_IID(nsIMathMLFrame), (void**)&mathMLFrame);

CallQueryInterface(aFrame, &mathMLFrame);

>Index: layout/mathml/base/src/nsMathMLContainerFrame.h
>+                                 eMathMLFrameType*    aMathMLFrameType = nsnull);

Document this argument, please.

>Index: layout/mathml/base/src/nsMathMLFrame.cpp
>@@ -179,11 +194,13 @@ nsMathMLFrame::GetPresentationDataFrom(n
>+      frame->QueryInterface(NS_GET_IID(nsIMathMLFrame), (void**)&mathMLFrame);

CallQI.

>Index: layout/mathml/base/src/nsMathMLFrame.h
>+      aFrame->QueryInterface(NS_GET_IID(nsIMathMLFrame), (void**)&mathMLFrame);

CallQI.

r+sr=bzbarsky
Comment 6 rbs 2006-08-14 21:52:57 PDT
Checked in, with the suggested review changes.
Comment 7 Daniel Veditz [:dveditz] 2006-11-20 00:25:31 PST
On FF2 I get "invalid markup" from the testcase and don't crash. Can we do without this fix on the old branches?
Comment 8 rbs 2006-11-20 05:04:32 PST
The reason it doesn't show up on the branch is because the CSS rule to disable 'mtd' is applied there, C.f. comment #3.

So the question boils down to whether it is good enough to have prevented the issue via that CSS (as opposed to having this C++ patch).
Comment 9 Boris Zbarsky [:bz] (Out June 25-July 6) 2006-11-20 07:40:53 PST
Given XBL, probably not....  Unless the MathML frame construction enforces that its nodes are really MathML-namespaced before application of XBL?
Comment 10 Jesse Ruderman 2006-11-20 10:45:50 PST
See bug 347355 comment 15 for more reasons not to rely on mathml.css rules for memory safety.
Comment 11 Samuel Sidler (old account; do not CC) 2007-03-26 16:59:57 PDT
Per comments 8 through 10, requesting blocking on the branches.
Comment 12 rbs 2007-04-18 13:56:56 PDT
Comment on attachment 233698 [details] [diff] [review]
patch

If security eyes are okay with this, I guess we can put it in the 1.8 branch.
Comment 13 Daniel Veditz [:dveditz] 2007-04-18 16:37:30 PDT
Comment on attachment 233698 [details] [diff] [review]
patch

This patch doesn't appear to apply to the branch(es). Please provide a merged patch, and if significantly different it might need a re-review.
Comment 14 rbs 2007-04-18 18:08:35 PDT
Created attachment 262065 [details] [diff] [review]
patch in sync with the 1.8.1.x 1.8.0.x branches
Comment 15 rbs 2007-04-18 20:53:13 PDT
Created attachment 262079 [details] [diff] [review]
262065: patch in sync with the 1.8.1.x & 1.8.0.x branches

Please consider this instead. (After porting the earlier patch, I tested that it compiled and didn't crash the testcase as expected, but then forgot to "cvs update" before doing the "cvs diff". This new diff is the proper one in sync as it comes after the cvs update.)
Comment 16 Daniel Veditz [:dveditz] 2007-04-19 10:23:57 PDT
Comment on attachment 262079 [details] [diff] [review]
262065: patch in sync with the 1.8.1.x & 1.8.0.x branches

approved for 1.8.0.12 and 1.8.1.4, a=dveditz for release-drivers
Comment 17 rbs 2007-04-19 11:25:40 PDT
Checked in the branches: fixed1.8.0.12, fixed1.8.1.4
Comment 18 Jay Patel [:jay] 2007-05-15 13:28:10 PDT
v.fixed on 1.8 branch with Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4, no crash with first testcase, but I also see the "invalid markup" that dveditz saw in comment #7 with both testcases.  

I am assuming the "invalid markup" is expected for those testcases?
Comment 19 Jay Patel [:jay] 2007-05-15 14:38:46 PDT
v.fixed on 1.8.0 branch with 1.5.0.12rc2
Comment 20 Jesse Ruderman 2007-12-17 17:09:42 PST
Crashtest checked in.

Note You need to log in before you can comment on or make changes to this bug.