Closed Bug 812822 Opened 7 years ago Closed 7 years ago

"ASSERTION: Must be a box frame" and crash with fieldset, overflow: auto|scroll, & display: flex|inline-flex

Categories

(Core :: Layout, defect, critical)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: jruderman, Assigned: dholbert)

References

(Blocks 2 open bugs)

Details

(Keywords: assertion, crash, testcase, Whiteboard: [fuzzblocker])

Crash Data

Attachments

(7 files, 4 obsolete files)

Attached file testcase
With:
  user_pref("layout.css.flexbox.enabled", true);

The testcase hits:

###!!! ASSERTION: Must be a box frame!: '!mScrollCornerBox || mScrollCornerBox->IsBoxFrame()', file layout/generic/nsGfxScrollFrame.cpp, line 3506

###!!! ASSERTION: A box layout method was called but InitBoxMetrics was never called: 'metrics', file layout/generic/nsFrame.cpp, line 8124

Crash [@ nsFrame::BoxReflow]
Attached file stacks
Nightly: bp-dcdd7db0-a875-43a8-aa06-6b2fe2121117
Depends on: css3-flexbox
On Windows: bp-27231b3e-9f59-497f-b4f3-309de2121117.
Crash Signature: [@ nsFrame::BoxReflow(nsBoxLayoutState&, nsPresContext*, nsHTMLReflowMetrics&, nsRenderingContext*, int, int, int, int, bool)]
OS: Mac OS X → All
Hardware: x86_64 → All
This is a null-deref.

The null pointer is "metrics", whose value comes from GetBoxMetrics(), which returns the object that InitBoxMetrics() creates. (and in this case InitBoxMetrics was never called, per the second assertion in comment 0, so it's returning null)

I'll look into this more in the next few days. Taking.
Assignee: nobody → dholbert
Status: NEW → ASSIGNED
I think this happens because we don't do anything special for "overflow" on flex container frames. I suspect bug 782441 will fix this.
Depends on: 782441
Summary: "ASSERTION: Must be a box frame" and crash with inline-flex → "ASSERTION: Must be a box frame" and crash with fieldset, overflow: auto|scroll, & display: flex|inline-flex
Whiteboard: [Note dependency bug 782441]
OK, so it turns out the issue here isn't overflow-support-dependent after all.  We're actually never even creating a nsFlexContainerFrame, because <legend> mandates that it get a nsLegendFrame, so this testcase almost doesn't exercise any flexbox code at all.

The one chunk of relevant flexbox code that it *does* exercise is the ApplyStyleFixups() code that makes children-of-a-flex-container compute to "display:block".  This causes trouble because the <legend> frame tries to make itself into a XUL scrollframe, with its child-scrollcorners having NS_STYLE_DISPLAY_BOX, and our ApplyStyleFixups() code converts them to "display:block", and then things go wrong when we get around to reflowing them.

If I prevent NS_STYLE_DISPLAY_BOX from being blockified in ApplyStyleFixups(), then the assertions & crash here go away.

So, I think we need some way to figure out that, even though our parent is "display:flex", it's not actually going to be a flex container, so we don't want to blockify our style... or something.
Blocks: 783415
No longer depends on: 782441
Whiteboard: [Note dependency bug 782441]
Attached patch hackaround (not an actual fix) (obsolete) — Splinter Review
Here's a hackaround-patch that fixes the assertion & crash.

This is almost certainly not the right fix; I'm just posting it to illustrate my previous comment.
Also, for reference, the content nodes whose style-resolution hits this issue are anonymous-content-nodes generated from the NS_TrustedNewXULElement calls for scroll widgets around here:
https://mxr.mozilla.org/mozilla-central/source/layout/generic/nsGfxScrollFrame.cpp?rev=504d70c1ca9c#2633
Actually... maybe an extended version of the hackaround (e.g. using IsXULDisplayType) wouldn't be such a bad idea.  At least for the non-"inline" XUL display types, it seems like we should support them as flex items rather than force-converting them to "display:block".

The only reason we call EnsureBlockDisplay there is to honor this chunk of flexbox spec-text:
> The computed ‘display’ of a flex item is determined by
> applying the table in CSS 2.1 Chapter 9.7. [ http://www.w3.org/TR/CSS2/visuren.html#dis-pos-flo ]

...and technically XUL display types would fall into the "others: same as specified" category there, so it's not too crazy to make them skip the forced blockification here. (Though it'd make us diverge w.r.t. what "float"/positioning does vs. what being a flex item does)
Attached patch fix v1 (obsolete) — Splinter Review
Actually, maybe whitelisting DISPLAY_BOX (as the attached hackaround does) is better than whitelisting XUL elements in general... because we know there can be anonymous children with DISPLAY_BOX which need to keep that display type (these scrollbox controls), but that may not be true of other XUL display types. (I don't know) & probably better to err on the side of being conservative.

Or there may be a better solution altogether... bz/dbaron, any ideas?
Attachment #698333 - Attachment is obsolete: true
Attachment #698347 - Flags: review?(dbaron)
Flags: in-testsuite?
Duplicate of this bug: 827076
Whiteboard: [fuzzblocker]
Hmm.  So arguably, we shouldn't be applying any sort of fixups to the scrollbar stuff at all, since it's not normal layout...

Unfortunately we don't know at that point what element our style context is for, right?

I wonder whether it makes any sense to parent the scrollbar styles to the scrollframe's anon box style instead of the actual element style or something.
Comment on attachment 698347 [details] [diff] [review]
fix v1

It feels a little odd to me to skip display: -moz-box but not skip the other XUL display types (-moz-inline-box, -moz-grid, -moz-inline-grid, -moz-grid-group, -moz-grid-line, -moz-stack, -moz-inline-stack, -moz-deck, -moz-popup, -moz-groupbox).  But I also suspect we probably don't want to deal with having all of those things inside a flexbox.

Is there a way we could make the exception apply only to the scrollbar anonymous content styles, and not to -moz-box in general?  (The problem is the scrollbar anonymous content stuff, right?)
Attachment #698347 - Flags: review?(dbaron)
(In reply to David Baron [:dbaron] from comment #12)
> It feels a little odd to me to skip display: -moz-box but not skip the other
> XUL display types (-moz-inline-box, -moz-grid, -moz-inline-grid,
> -moz-grid-group, -moz-grid-line, -moz-stack, -moz-inline-stack, -moz-deck,
> -moz-popup, -moz-groupbox).

Yeah -- I initially targeted -moz-box because that was the specific type of anonymous content that we have here.

But you're right -- that's not specific enough -- we need this to work for anonymous "display: inline" content, too, as noted in bug 824297 comment 3 (for the helper-frames inside of a <button> w/ "display:flex").

> Is there a way we could make the exception apply only to the scrollbar
> anonymous content styles, and not to -moz-box in general?

I'm not sure, but I agree that's the right sort of thing to be targeting.

> (The problem is the scrollbar anonymous content stuff, right?)

Yes.
Would it work to check GetPseudoType() for an anonymous box pseudo-type?  (Maybe that's what you were getting at, at the end of comment 12?)
(er, sorry, I meant to say GetPseudo() and s/pseudo-type/pseudo-tag/)

Looks like that strategy (from comment 14) doesn't work -- not on its own, at least -- because the anonymous frames in question have a null pseudo-tag.  Maybe that's a bug?

How do we end up with a null pseudo-tag, you might ask? Well, several levels below nsCSSFrameConstructor::CreateAnonymousFrames(), we make a call to nsStylSet::GetContext with an explicitly-null pseudo-tag -- the "nullptr" in this call:
> 948   return GetContext(aParentContext, ruleNode, visitedRuleNode,
> 949                     nsCSSRuleProcessor::IsLink(aElement),
> 950                     nsCSSRuleProcessor::GetContentState(aElement, aTreeMatchContext).
> 951                       HasState(NS_EVENT_STATE_VISITED),
> 952                     nullptr, nsCSSPseudoElements::ePseudo_NotPseudoElement,
> 953                     true, aElement);

https://mxr.mozilla.org/mozilla-central/source/layout/style/nsStyleSet.cpp?rev=079d46ec65d2&mark=948-953#948

and we create the nsStyleContext inside of there.
These are frames for anonymous elements, which is not the same thing as anonymous boxes.... So yeah, they have a null pseudo-tag.
Perhaps we could just pass a flag from CreateAnonymousFrames() down through all the methods in the backtrace in comment 16, and have that flag control whether we honor the flexbox-specific chunk of ApplyStyleFixups...?  Not beautiful, but I suspect it'd do the job.
We could certainly set something in the frame constructor state or tree match context and thus sneak it all the way down to where we call ResolveStyleFor.  At that point we could just pass it in (possible as part of the tree match context).  And then the style set would need to not lose it internally.

I agree it's kinda hacky, but I have no better ideas so far.  :(
So, I'm going with passing a flag in the treeMatchContext.  To ultimately get that noticed by a newly-constructed styleContext, we have to pass a new bit of information into nsStyleSet::GetContext().  That function already takes two boolean values, and I'd rather not add a third (and add confusion about where each boolean value goes in the function-call).  In other cases (e.g. nsIFrame::GetSize()) where a function wants to take several boolean values, we merge the bools into a 'flags' argument, so that you can just pass in a bitfield w/ the right flags set and don't have to worry about order. I think we should make GetContext() use a 'flags' parameter like that -- I've got a patch to do that, and I'll post it in a helper-bug.
Depends on: 836604
(In reply to Daniel Holbert [:dholbert] from comment #20)
> That function already takes two boolean values,

(er, make that three boolean values. So this would be the fourth, if it weren't done as a flag.)
Attachment #698347 - Attachment is obsolete: true
(In reply to Daniel Holbert [:dholbert] from comment #20)
> To ultimately
> get that noticed by a newly-constructed styleContext, we have to pass a new
> bit of information into nsStyleSet::GetContext().

This first patch adds this flag "eIgnoreFlexItemStyleFixup" to GetContext()'s list of allowed flags, and makes GetContext() pass it down (as a bool) to NS_NewStyleContext(), which passes it to the nsStyleContext() constructor, which passes it to ApplyStyleFixups().

The next patch will add code above this, to actually pass eIgnoreFlexItemStyleFixup into GetContext().
Attachment #709352 - Flags: review?(bzbarsky)
Blocks: 824297
I don't think flex containers shouldn't have pseudo-element children.[1]

But it's possible we'll have an element that has "display:flex" and yet is not a flex container because it's got a mandated frame type, e.g. a <button>, and *those* elements might have pseudo-element children.  We need to make sure that any such children ignore flex item style fixup.  So, this patch makes us pass in the new flag "eIgnoreFlexItemStyleFixup" from the functions that create style contexts for pseudo elements.

This should fix bug 824297 & bug 827168 (but I'm posting the patch here, because it's tied up in the stuff we're fixing here).

[1] http://lists.w3.org/Archives/Public/www-style/2013Jan/0056.html
Attachment #709420 - Flags: review?(bzbarsky)
(In reply to Daniel Holbert [:dholbert] from comment #23)
> I don't think flex containers shouldn't have pseudo-element children.[1]

Sorry -- meant to say "I don't think flex containers should be able to have pseudo-element children."
...and this part adds a guard object "AutoFlexItemStyleFixupIgnorer" that sets a flag on the TreeMatchContext while it's in scope, and declares an instance of this object in CreateAnonymousFrames.

(This might be overkill -- we could also just have CreateAnonymousFrames directly instantiate a local AutoRestore<bool> that controls the flag.  Let me know if you'd prefer that.  I'm currently going with a helper-class because it looks like nsCSSFrameConstructor doesn't directly mess with any TreeMatchContext member-variables anywhere else, and I didn't want to break that layer of abstraction here.)
Attachment #709421 - Flags: review?(bzbarsky)
Attached patch part 4: crashtests (obsolete) — Splinter Review
This patch adds crashtests, using the testcase from this bug, bug 824297, and bug 827168.

I verified that they all fail before this bug's patches are applied, but pass after. (This bug's crashtest crashes, and the other two crashtests fail the assertions that are mentioned in the corresponding bugs.)
Attachment #709422 - Flags: review?(bzbarsky)
Blocks: 826532
Blocks: 827168
Added one more crashtest for bug 826532, which this also fixes.

(It's very similar to other crashtests here, but it triggers a slightly different set of assertions.)
Attachment #709422 - Attachment is obsolete: true
Attachment #709422 - Flags: review?(bzbarsky)
Attachment #709916 - Flags: review?(bzbarsky)
Sorry for the lag here... At work week, so slightly less review time than usual.  :(
Comment on attachment 709352 [details] [diff] [review]
part 1: Add flag "eIgnoreFlexItemStyleFixup" to nsStyleSet::GetContext()

I think aIgnoreFlexItemStyleFixup might be better named aSkipFlexItemStyleFixup.

And perhaps s/ignore/skip/ in the comments for the nsStyleContext constructor, and naming eIgnoreFlexItemStyleFixup "eSkipFlexItemStyleFixup" instead.

r=me with that.  And I'm really sorry for the long lag on this.  :(
Attachment #709352 - Flags: review?(bzbarsky) → review+
Comment on attachment 709420 [details] [diff] [review]
part 2: Pass eIgnoreFlexItemStyleFixup when making style context for a pseudo element

r=me
Attachment #709420 - Flags: review?(bzbarsky) → review+
Comment on attachment 709421 [details] [diff] [review]
part 3: add TreeMatchContext::AutoFlexItemStyleFixupIgnorer, and use it in CreateAnonymousFrames

So this is a bit worrisome because this will skip the fixup on all descendants of the anonymous content, no?  It may be better to only do the suppression while doing the AddFrameConstructionItemsInternal in ConstructFrame (which is the thing that resolves the style context for the anon content) but not during the ConstructFramesFromItem call there...

I'd like to see something a little more targeted here.
Attachment #709421 - Flags: review?(bzbarsky) → review-
Comment on attachment 709916 [details] [diff] [review]
part 4 v2: crashtests (with one more)

r=me
Attachment #709916 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky (:bz) from comment #32)
> I'd like to see something a little more targeted here.

So, we *could* just instantiate an AutoFlexItemStyleFixupIgnorer in ConstructFrame, around the AddFrameConstructionItemsInternal call.  (that seems to be what you were suggesting in comment 32)

That implicitly assumes that ConstructFrame is only used for anonymous content, which is currently a valid assumption since that's the sole caller[1], but I'm not sure if it always will be true.

Maybe I'll do that, and add an assertion that this is anonymous content, so we'll catch this if more callers are added later w/ different expectations about whether flex-item-fixup should happen.

[1] https://mxr.mozilla.org/mozilla-central/search?string=ConstructFrame%28&case=1
> but I'm not sure if it always will be true.

It should be; I've killed all the other uses.  We can add asserts to make sure it's working on native anon content if we want, and document that that's what it's for....
This does what I described in comment 34, w/ an assertion that we've got anonymous content.  I also changed "ignore" to "skip" in all of the variable/class-names & comments, per comment 30)
Attachment #714192 - Flags: review?(bzbarsky)
Attachment #709421 - Attachment is obsolete: true
Depends on: 841876
Comment on attachment 714192 [details] [diff] [review]
part 3 v2: more targeted, w/ assertion and s/ignore/skip/

>+  NS_PRECONDITION(aContent->HasFlag(NODE_IS_ANONYMOUS),

Can you make that aContent->IsRootOfNativeAnonymousSubtree() or at least aContent->IsInNativeAnonymousSubtree()?

r=me with that
Attachment #714192 - Flags: review?(bzbarsky) → review+
Changed to IsRootOfNativeAnonymousSubtree(). Try push w/ that changed:
  https://tbpl.mozilla.org/?tree=Try&rev=96c0190aadaf
Last try push had a typo in crashtest manifest. Fixed:
 https://tbpl.mozilla.org/?tree=Try&rev=9e8b428c00aa
Note to self: dependent bugs can likely all be marked fixed and in-testsuite+ when this is resolved (since the last cset in prev comment included crashtests for some or all of them).
Duplicate of this bug: 824297
Duplicate of this bug: 826532
Duplicate of this bug: 827168
Blocks: 844529
Blocks: 849407
Blocks: 841876
No longer depends on: 841876
Blocks: 851396
Blocks: 867454
Blocks: 926670
You need to log in before you can comment on or make changes to this bug.