Closed Bug 341858 Opened 18 years ago Closed 18 years ago

Crash [@ GetNifOrSpecialSibling] with display: table-caption and display: table-row-group;

Categories

(Core :: Layout: Tables, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: martijn.martijn, Assigned: MatsPalmgren_bugz)

References

Details

(4 keywords, Whiteboard: [sg:critical?] regression from bug 309322. verify bug 343270 also)

Crash Data

Attachments

(6 files, 1 obsolete file)

See upcoming testcase, which crashes current trunk Mozilla builds on load.
It could be that it's only crashing after second load.

It consists of this:
<table style="display: table-caption;">
<keygen style="display: table-caption;">
<span style="display: table-caption;">
<span style="display: table-row-group;">
<body style="display: table-row-group;">
<input>

Talkback ID: TB19965177Z
GetNifOrSpecialSibling   nsCSSFrameConstructor::FindFrameWithContent   nsCSSFrameConstructor::FindPrimaryFrameFor   nsFrameManager::GetPrimaryFrameFor   nsCSSFrameConstructor::FindPrimaryFrameFor   nsFrameManager::GetPrimaryFrameFor   nsCSSFrameConstructor::FindPrimaryFrameFor   nsFrameManager::GetPrimaryFrameFor   nsGenericElement::GetPrimaryFrameFor   nsGenericHTMLElement::GetFormControlFrame   nsHTMLInputElement::SaveState   0xed6ccfb8

This regressed between 2005-11-30 and 2005-12-15. Ria, you have builds inbetween to look for a narrower regression range?
This is the original file from which the testcase is derived from.

Talkback ID: TB19965429M
nsGenericHTMLElement::SetScrollLeft   nsGenericHTMLElementTearoff::SetScrollLeft   XPCWrappedNative::CallMethod  
As you can see, it's crashing in quite some different area. Not sure what that means.
With this file I crash also in a different part of the code.
Talkback ID: TB19965484Z
nsCachedStyleData::GetStyleData   nsGenericHTMLElement::GetOffsetRect   nsGenericHTMLElement::GetOffsetHeight   nsGenericHTMLElementTearoff::GetOffsetHeight   XPCWrappedNative::CallMethod
Regression between 1.9a1_2005120305 and 1.9a1_2005120315.
Can't reproduce the crash in the first build at least.
Thanks, Ria! I guess bug 309322 is the 'guilty' one. 
Blocks: 309322
I'm seeing some weird looking stacks here and in bug 343270 so I'm
marking this Security-Sensitive for now...

I have a fix for this bug and bug 343270 (which is similar) and it also
fixes the assertions in bug 337476.
I need sleep before attempting to explain it... (table frame construction)
Assignee: nobody → mats.palmgren
Group: security
OS: Windows XP → All
Hardware: PC → All
Blocks: 343270
Blocks: 337476
> I need sleep before attempting to explain it... (table frame construction)

this is a well known problem ;-)
Mats, could you please attach the patch even if it is half baked. I think I hit the issue with a less severe case at bug 343946. I really need the patch to advance with  bug 339128.
Attached patch Patch rev. 1Splinter Review
We're trying to insert a SELECT. In ContentInserted() we find an insertion
point which happens to have a CaptionFrame and thus |parentFrame| is the
outer table frame when we call ConstructFrame() for the child.
I guess this is reasonable since we haven't resolved the child's style yet
so we can't say much about the final frame for the child.

Fix: make AdjustParentFrame() deal with the outer table case, adjusting
to the inner table as needed and then look at the result in
ContentInserted() adjusting the insertion point data accordingly.
I'll attach a content dump + frame dump before and after ContentInserted()
so you can see the result with this patch. I think the frame tree is the
correct one.
(I also removed |tempItems| because it's not needed)
Attachment #228528 - Flags: superreview?(bzbarsky)
Attachment #228528 - Flags: review?(bernd_mozilla)
I understand the first chunk and think it is correct. You have however to null check the new parent frame. I have seen several occasions where the outer table was reflown without an inner. I've put there a 0 deref stop and I would like to see here one too. I would like to avoid the next fire drill. You might add an assert however. 

I don't get the 

+  // If the final parent frame (decided by AdjustParentFrame()) is different
+  // from the parent of the insertion point we calculated above then
+  // parentFrame/prevSibling/appendAfterFrame are now invalid (bug 341858).
+  if (frameItems.childList &&
+      frameItems.childList->GetParent() != parentFrame) {
+    prevSibling = nsnull;
+    isAppend = PR_TRUE;
+    parentFrame =
+      ::AdjustAppendParentForAfterContent(mPresShell->GetPresContext(),
+                                          aContainer,
+                                          frameItems.childList->GetParent(),
+                                          &appendAfterFrame);
+  }
+
part. Wouldn't we hit this thing with normal  (pseudo) frame construction also 
when we execute in nsCSSFrameConstructor::AdjustParentFrame(nsFrameConstructorState&     aState,

aParentFrame = aState.mPseudoFrames.mCellInner.mFrame; ? 

If so, why did this not cause issues before?

The problem here is that in former times we did forbid the creation of a second caption, however this causes crashes with fixed pos etc. Now that a caption can have a sibling content we need to put them sometimes on a different place.
(In reply to comment #11)
> I understand the first chunk and think it is correct.
> You have however to null check the new parent frame.

I'll add an assertion. I think we should maintain the
invariant that an outer table frame always has an inner.


> I have seen several occasions where the outer table
> was reflown without an inner.

Please file a bug and CC me when that occurs.


> + // If the final parent frame (decided by AdjustParentFrame()) is different
> + // from the parent of the insertion point we calculated above then
>
> Wouldn't we hit this thing with normal (pseudo) frame construction also ...
> If so, why did this not cause issues before?

In case we create pseudo frames |frameItems.childList| is the top pseudo,
otherwise it's just the primary frame for the child.
In both cases |frameItems.childList->GetParent()| should be |parentFrame|.
This is the case uptil now. This patch adds the possibility that the new
frames (pseudo or not) is actually created in the inner table frame when the
original |parentFrame| is an outer table frame (which would only occur
in the specific case we have in this bug: trying to add a sibling caption,
in all other cases ConstructFrame() will be given the inner table frame and
we will "adjust" captions after the fact by inserting them in the outer).

I could have missed something though, so please elaborate if you think
there is a case where this is not true.

I can add the following assertion in the if-block if you want:

NS_ASSERTION(parentFrame->GetType() == nsLayoutAtoms::tableOuterFrame &&
  frameItems.childList->GetParent()->GetType() == nsLayoutAtoms::tableFrame,
  "bad parent frame change");
So wait.  Isn't the real issue that we end up with the caption as a "sane" prevsibling or nextsibling for the select?  In my opinion this is wrong.  Maybe we should just special-case this and not use the prevsibling to get the parent if the type of the prevsibling is tableCaptionFrame?  Would that fix things too?  Seems like it'd be simpler.

That said, it seems like nsCSSFrameConstructor::IsValidSibling should handle that already.  How is it failing?
Wait.  Is this trunk-only?  On trunk, IsValidSibling is semi-busted.  See bug 347898 comment 7.  So maybe this patch is indeed the right way to go.
Blocks: 347898
Flags: blocking1.9?
(In reply to comment #14)
> Wait.  Is this trunk-only?

The testcases don't crash on the latest 1.8.1 build or 1.8.0.6 build, only trunk builds.

Only trunk accepts multiple captions, all branches have the double caption supression which has its own merrits (aka crash bugs)
Bernd, what do you think of this patch?  And my comments in bug 347898 comment 7?
Sorry for the delay but I need to debug this first to understand whats going on. However this will go after bug 347725 which is regression crash on the 1.8.1 branch.
>bz: On trunk, IsValidSibling is semi-busted.

Hmm regardless how many beer I have, I can't understand what is not busted in this function. Its plain wrong to me. It uses display instead of frame types, it does some fancy caching of things it shouldn't cache. Reading back bug 136848 where the whole was introduced is pretty entertaining. My bet is that the function is not fixable. It uses style before xbl resolution. Its completely special content agnostic. This was good enough to get NS6 out of the door but I guess it has done its duties. 

So I guess, the way Mats is heading is correct. Adjust the parent once we know (after xbl) where we are.
> Its plain wrong to me.

OK, I buy that too.  The problem is that we're determining the parentFrame before we resolve XBL stuff, and determining the parentFrame involves doing this IsValidSibling stuff, right?  Or some equivalent?
Comment on attachment 228528 [details] [diff] [review]
Patch rev. 1

Boris just head over to bug 136848 and look if you would r/sr the original patch there. (All my experience shows that the patch would be faster r-'ed than written).
IsValidSibling (
http://lxr.mozilla.org/seamonkey/source/layout/base/nsCSSFrameConstructor.cpp#8548)
is a misnomer, its just Chris'HackToGetTheTableParentRight stuffed later with the fieldset legend, which is  very similiar to tables as Chris wrote it. My prediction is that Mats approach makes the table part in IsValidSibling obsolete if not we should move the logic at the place where Mats suggested to adjust the parent. With that in mind I would prefer if we split the first part of the patch into a seperate function that would later harbor the functionality of IsvalidSibling.
Attachment #228528 - Flags: review?(bernd_mozilla) → review+
OK.  I'll try to review this in detail ASAP.  Thanks for looking into it!
Comment on attachment 228528 [details] [diff] [review]
Patch rev. 1

>Index: layout/base/nsCSSFrameConstructor.cpp
>+  PRBool childIsSpecialContent =
>+    IsSpecialContent(aChildContent, aTag, aNameSpaceID, aChildStyle);

So... IsSpecialContent() is not exactly cheap.  I'd like to avoid calling it if we can.  Could we just call it lazily (and keep a boolean that indicates we don't need to call it again or something)?  In particular, I'd like to not have IsSpecialContent() slowing down all the non-table codepaths that go through here.

>+    aParentFrame = aParentFrame->GetFirstChild(nsnull);

I'd prefer aParentFrame->GetContentInsertionFrame(), I think....

>@@ -9457,25 +9467,24 @@ nsCSSFrameConstructor::ContentInserted(n
>+  ConstructFrame(state, aChild, parentFrame, frameItems);

We should probably assert here that frameItems.firstChild == frameItems.lastChild, since we assume it in this code.

>+      frameItems = nsFrameItems();

Or frameItems.RemoveChild(frameItems.firstChild), though that may actually be slower...

>+    prevSibling = nsnull;
>+    isAppend = PR_TRUE;

So... why do we suddenly decide this is an append?  That doesn't make much sense to me.  I guess it's an easy fallback if we feel that all such cases are error cases anyway or something... But it feels like with this code it's easy to have the same DOM have two very different frame structures depending on the order of DOM operations to get to it, which is a little odd to me....
Attachment #228528 - Flags: superreview?(bzbarsky) → superreview-
Mats any news on this? Or shall I take and fix bz's comments?
>>Index: layout/base/nsCSSFrameConstructor.cpp
>>+  PRBool childIsSpecialContent =
>>+    IsSpecialContent(aChildContent, aTag, aNameSpaceID, aChildStyle);
> 
> So... IsSpecialContent() is not exactly cheap.  I'd like to avoid calling it if
> we can.  Could we just call it lazily (and keep a boolean that indicates we
> don't need to call it again or something)?  In particular, I'd like to not have
> IsSpecialContent() slowing down all the non-table codepaths that go through
> here.

fixed, now we call IsSpecialContent() only if the parent is outer table frame, which shall be very seldom and if so we reuse the flag.

>>+    aParentFrame = aParentFrame->GetFirstChild(nsnull);
> 
> I'd prefer aParentFrame->GetContentInsertionFrame(), I think....
> 
You are the boss, :-)

fixed
>>@@ -9457,25 +9467,24 @@ nsCSSFrameConstructor::ContentInserted(n
>>+  ConstructFrame(state, aChild, parentFrame, frameItems);
> 
> We should probably assert here that frameItems.firstChild ==
> frameItems.lastChild, since we assume it in this code.
> 
Assertion added
>>+      frameItems = nsFrameItems();
> 
> Or frameItems.RemoveChild(frameItems.firstChild), though that may actually be
> slower...
> 
>>+    prevSibling = nsnull;
>>+    isAppend = PR_TRUE;
> 
> So... why do we suddenly decide this is an append?  That doesn't make much
> sense to me.  I guess it's an easy fallback if we feel that all such cases are
> error cases anyway or something... But it feels like with this code it's easy
> to have the same DOM have two very different frame structures depending on the
> order of DOM operations to get to it, which is a little odd to me....
In principle you are right but in pseudo reality we handle non of the pseudo DOM operations correctly. So there is no hope at this place to recover the correct insertion point.
Attached patch Patch rev2 (obsolete) — Splinter Review
Patch with review comments addressed (I hope so).
Attachment #242765 - Flags: superreview?(bzbarsky)
https://bugzilla.mozilla.org/attachment.cgi?id=242765&action=diff#mozilla/layout/base/nsCSSFrameConstructor.cpp_sec2
is a reminder in my tree not wrong in general but not part of the issue here.
IIUIC there is no firstChild for frameitems.
Attachment #242775 - Flags: superreview?(bzbarsky)
Attachment #242765 - Attachment is obsolete: true
Attachment #242765 - Flags: superreview?(bzbarsky)
Comment on attachment 242775 [details] [diff] [review]
rev2 that even compiles

> So there is no hope at this place to recover the correct insertion point.

OK.  File a followup bug to fix that....  It should probably depend on some of the frame construction arch bugs we have.
Attachment #242775 - Flags: superreview?(bzbarsky) → superreview+
Mats, I am not certain what is going on your side but I need this thing moving. I would prefer if you would check it in as it is your patch. I have enough cvs blame that I prefer not to watch tinderbox ;-) and don't wan't to steal patches. If you could just comment whether there is a issue with the patch or you will check in and when, I would feel much better.

Thanks
fixed on trunk
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Flags: blocking1.9?
None of the three testcases crashed for me using build 2006-11-13-09 of SeaMonkey trunk under Windows XP.

Verified FIXED.
Status: RESOLVED → VERIFIED
If bug 309322 lands on the branches then this will be needed too.
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.12?
Whiteboard: [sg:critical?] regression from bug 309322
Flags: blocking1.8.1.4?
Flags: blocking1.8.1.4+
Flags: blocking1.8.0.12?
Flags: blocking1.8.0.12+
And if we do land bug 309322 on the branch we need to make sure bug 343270 come back or is still fixed by this patch.
Whiteboard: [sg:critical?] regression from bug 309322 → [sg:critical?] regression from bug 309322. verify bug 343270 also
Moving to 1.8.1.5 following bug 309322
Flags: blocking1.8.1.5+
Flags: blocking1.8.1.4+
Flags: blocking1.8.0.13+
Flags: blocking1.8.0.12+
Blocks: 330981
Moving to 1.8.1.6 following bug 309322
Flags: blocking1.8.1.5+ → blocking1.8.1.6+
Flags: in-testsuite?
Flags: blocking1.8.0.13+ → blocking1.8.0.14+
fix checked into branch with bug 309322
Keywords: fixed1.8.1.8
verified fixed 1.8.1.8 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv:1.8.1.8) Gecko/2007100816 Firefox/2.0.0.8 and Mozilla/5.0 (Macintosh; U;
Intel Mac OS X; ja-JP-mac; rv:1.8.1.8) Gecko/2007100816 Firefox/2.0.0.8 and the
testcase from this bug.

-> no crash on testcase -> adding verified keyword
Group: security
Flags: blocking1.8.0.14+ → blocking1.8.0.15?
Flags: blocking1.8.0.15? → blocking1.8.0.15+
Added crashtest: http://hg.mozilla.org/mozilla-central/rev/4d4e400e1301
Flags: in-testsuite? → in-testsuite+
Crash Signature: [@ GetNifOrSpecialSibling]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: