Closed Bug 1099807 Opened 5 years ago Closed 5 years ago

Meet the specification for intra-level whitespace pairing in ruby

Categories

(Core :: Layout, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: xidorn, Assigned: xidorn)

References

()

Details

Attachments

(5 files, 7 obsolete files)

10.16 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
6.79 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
6.33 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
9.56 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
7.65 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
The spec defines some special pairing rules for intra-level whitespaces.
Assignee: nobody → quanxunzhen
Attachment #8530069 - Flags: review?(dholbert)
Attachment #8530072 - Flags: review?(dholbert)
Comment on attachment 8530069 [details] [diff] [review]
patch 1 - add common superclass for ruby base and ruby text

>+++ b/layout/generic/nsRubyContentFrame.h
>@@ -0,0 +1,29 @@
>+/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
>+/* vim: set ts=2 et sw=2 tw=80: */
>+/* This Source Code is subject to the terms of the Mozilla Public License
>+ * version 2.0 (the "License"). You can obtain a copy of the License at
>+ * http://mozilla.org/MPL/2.0/. */
>+
>+#ifndef nsRubyContentFrame_h___
>+#define nsRubyContentFrame_h___

We need a short header-comment between the license block and the #ifndef here, to describe the purpose of this class/file. This comment is particularly handy, because it gets picked up by MXR and displayed alongside the file in directory-listings -- see the comments on the right side here:
 https://mxr.mozilla.org/mozilla-central/source/layout/generic/

Using nsSplittableFrame & nsContainerFrame as a reference for an abstract frame base-class, the comment here should say something like:
 /*
  * base class for ruby rendering objects that directly contain content
  * (ruby text & ruby bases)
  */

The comment should go in both the .h and the .cpp file, just after the license block.

>+class nsRubyContentFrame : public nsRubyContentFrameSuper
>+{
>+public:
>+  NS_DECL_FRAMEARENA_HELPERS
>+  NS_DECL_QUERYFRAME_TARGET(nsRubyContentFrame)
>+  NS_DECL_QUERYFRAME

You don't need the QUERYFRAME stuff here, unless you're actually intending to use do_QueryFrame for safe (semi-expensive) casts to this frame-type via do_QueryFrame. I do see one do_QueryFrame usage in your next patch here (in an assertion), but I think we should just check GetType() there (more details on that in my comments on that patch).

So, I think we should drop the NS_DECL_QUERYFRAME lines here, as well as the NS_QUERYFRAME_* block in nsRubyContentFrame.cpp.

>diff --git a/layout/generic/nsRubyTextFrame.cpp b/layout/generic/nsRubyTextFrame.cpp
> // Frame class boilerplate
> // =======================
> 
> NS_QUERYFRAME_HEAD(nsRubyTextFrame)
>   NS_QUERYFRAME_ENTRY(nsRubyTextFrame)
>-NS_QUERYFRAME_TAIL_INHERITING(nsInlineFrame)
>+NS_QUERYFRAME_TAIL_INHERITING(nsRubyTextFrameSuper)

(FWIW: this change is still good & needed, in both of the concrete classes, even with the NS_QUERYFRAME stuff removed from the superclass.)
Comment on attachment 8530072 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

> # HG changeset patch
> # Parent ed4d0368bcb9972a0087072988076ff9a12f2bbd
> # User Xidorn Quan <quanxunzhen@gmail.com>
> Bug 1099807 - Implement intra-level whitespace pairing.

Maybe want "part 2" in the commit message? (and "part 1" in the first part)

> diff --git a/layout/generic/nsRubyBaseContainerFrame.cpp
b/layout/generic/nsRubyBaseContainerFrame.cpp
> --- a/layout/generic/nsRubyBaseContainerFrame.cpp
> +++ b/layout/generic/nsRubyBaseContainerFrame.cpp
[...]
> void
> PairEnumerator::Next()
> {
[...]
> +    nsRubyContentFrame* frame = mFrames[i];
> +    if (frame && (!mAtIntraLevelWhitespace ||
> +                  frame->IsIntraLevelWhitespace())) {
> +      nsIFrame* nextSibling = frame->GetNextSibling();
> +      MOZ_ASSERT(nextSibling ==
> + static_cast<nsRubyContentFrame*>(do_QueryFrame(nextSibling)));
> +      mFrames[i] = frame = static_cast<nsRubyContentFrame*>(nextSibling);

Per previous comment, I don't think we need do_QueryFrame here -- can we instead assert that nextSibling->GetType() == frame->GetType()?

(i.e. is it true that the next-sibling of a nsRubyTextFrame must *also* be nsRubyTextFrame, and the same for ruby base frames? If so, that seems like a stronger assertion, and it removes the need for nsRubyContentFrame to be QueryFrame-target.)

> +      if (!nextHaveIntraLevelWhitespace &&
> +          frame && frame->IsIntraLevelWhitespace()) {
> +        nextHaveIntraLevelWhitespace = true;
> +      }
>     }
>   }
> +  // An intra-level whitespace pair should never be followed by another.
> +  MOZ_ASSERT(!nextHaveIntraLevelWhitespace || !mAtIntraLevelWhitespace);

Where possible/convenient, it's best to pass a warning-message/explanation inside the MOZ_ASSERT() itself.

So, let's turn the comment into a string passed to the MOZ_ASSERT here -- e.g. maybe:
  MOZ_ASSERT(!nextHaveIntraLevelWhitespace || !mAtIntraLevelWhitespace,
             "Should never have adjacent intra-level whitespace pairs");
(In reply to Daniel Holbert [:dholbert] from comment #4)
> Comment on attachment 8530072 [details] [diff] [review]
> patch 2 - implement intra-level whitespace pairing
> 
> > # HG changeset patch
> > # Parent ed4d0368bcb9972a0087072988076ff9a12f2bbd
> > # User Xidorn Quan <quanxunzhen@gmail.com>
> > Bug 1099807 - Implement intra-level whitespace pairing.
> 
> Maybe want "part 2" in the commit message? (and "part 1" in the first part)

It seems other people do not require that... IMO it makes the first line longer without adding information, since patches in the same bug are usually in a same push.

> > diff --git a/layout/generic/nsRubyBaseContainerFrame.cpp
> b/layout/generic/nsRubyBaseContainerFrame.cpp
> > --- a/layout/generic/nsRubyBaseContainerFrame.cpp
> > +++ b/layout/generic/nsRubyBaseContainerFrame.cpp
> [...]
> > void
> > PairEnumerator::Next()
> > {
> [...]
> > +    nsRubyContentFrame* frame = mFrames[i];
> > +    if (frame && (!mAtIntraLevelWhitespace ||
> > +                  frame->IsIntraLevelWhitespace())) {
> > +      nsIFrame* nextSibling = frame->GetNextSibling();
> > +      MOZ_ASSERT(nextSibling ==
> > + static_cast<nsRubyContentFrame*>(do_QueryFrame(nextSibling)));
> > +      mFrames[i] = frame = static_cast<nsRubyContentFrame*>(nextSibling);
> 
> Per previous comment, I don't think we need do_QueryFrame here -- can we
> instead assert that nextSibling->GetType() == frame->GetType()?
> 
> (i.e. is it true that the next-sibling of a nsRubyTextFrame must *also* be
> nsRubyTextFrame, and the same for ruby base frames? If so, that seems like a
> stronger assertion, and it removes the need for nsRubyContentFrame to be
> QueryFrame-target.)

Yes, it's true. I'll change that.

> > +      if (!nextHaveIntraLevelWhitespace &&
> > +          frame && frame->IsIntraLevelWhitespace()) {
> > +        nextHaveIntraLevelWhitespace = true;
> > +      }
> >     }
> >   }
> > +  // An intra-level whitespace pair should never be followed by another.
> > +  MOZ_ASSERT(!nextHaveIntraLevelWhitespace || !mAtIntraLevelWhitespace);
> 
> Where possible/convenient, it's best to pass a warning-message/explanation
> inside the MOZ_ASSERT() itself.
> 
> So, let's turn the comment into a string passed to the MOZ_ASSERT here --
> e.g. maybe:
>   MOZ_ASSERT(!nextHaveIntraLevelWhitespace || !mAtIntraLevelWhitespace,
>              "Should never have adjacent intra-level whitespace pairs");

OK
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #5)
> It seems other people do not require that... IMO it makes the first line
> longer without adding information,

Sure -- not required, but just suggesting because I personally find it useful.  I think it definitely adds information, for the use-case where someone views your changeset directly (at hg.mozilla.org/mozilla-central/rev/whatever) -- which is an important use-case to consider (particularly for people going back through hg-blame / file-history in the future). "Part N" makes it clearer that this is just one component of a multi-stage bug, which hints that the viewer should dig further if they want to find out what the *real* point of the changeset is. (it's serving a larger goal than just what its commit message says)

> patches in the same bug are usually in a same push.

Not necessarily -- particularly in cases where one piece is backed out & re-landed, which does happen from time to time. Then the re-landed part won't necessarily be adjacent to the other parts in the commit-log, which then strongly gives the mistaken impression that it's the only commit for that bug.

Also: hg archeologists will generally be using m-c, where there aren't per-bug pushes -- just huge "merge m-i to m-c" pushes, which group together tons of csets. (so there's no way to view "just the push for this bug" there.)

Anyway: I do find it useful, but if you disagree, I won't push any more on it.
OK, that sounds reasonable. I'll add the part number.
Comment on attachment 8530072 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing


>+++ b/layout/generic/nsRubyBaseContainerFrame.cpp
> PairEnumerator::PairEnumerator(
[...]
> {
>   const uint32_t rtcCount = aTextContainers.Length();
>   mFrames.SetCapacity(rtcCount + 1);
[...]
>+  for (uint32_t i = 0; i <= rtcCount; i++) {
>+    nsRubyContentFrame* frame = mFrames[i];
>+    if (frame && frame->IsIntraLevelWhitespace()) {
>+      mAtIntraLevelWhitespace = true;
>+      break;
>+    }
>   }

Instead of the current "i <= rtcCount" loop-condition here, I'd prefer we use "i < mFrames.Length()" (or equivalently "iend = mFrames.Length(); i < iend" as you have elsewhere),

The current formulation feels like a slight footgun, since it looks very similar to the standard "iterate over an array, up to the array-length" idiom -- but it's subtly different, with "<=" instead of "<" and with rtcCount really being arrayLength - 1.  I'm worried that if someone tweaks/refactors this code & doesn't notice the unusual "<=" in the condition, we'll end up with an off-by-one bug and a potential security risk.  (Or if someone naively 'fixes' the "<=" to "<", we'll end up with an off-by-one bug in the other direction.) 

This change requires 1 extra call to Length(), which is cheap on a nsTArray -- it's a small price to pay for improved readability/consistency/foolproofing. :)

(Also: technically, you should be using "size_t" as the loop-counter/index-variable for nsTArrays -- not uint32_t -- per https://groups.google.com/d/msg/mozilla.dev.platform/ZAr3fHBpeLM/mpXAas07P8sJ . Doesn't matter too much -- we have lots of existing code that uses uint32_t, and it's mostly functionally equivalent to uint32_t in this context (since IIRC nsTArrays can't grow larger than max-uint32), but there are some cases where uint32_t will get you into trouble, as laid out in that post. Anyway, the uint32_t usage is fine for now, but maybe file a followup on doing this conversion for the ruby frame classes?)
I'm fine with changing that to size_t. And I also believe that we will transit to range-based for loop in most of such cases once all the compilers support :)
While you're reworking the PairEnumerator class, can you add documentation for it? In particular, it needs:
 - documentation for what the class is for, at a high level
 - documentation for its constructor (particularly the relationship between its args)
 - documentation for what the "Next()" function does (it looks like it advances every frame in mFrames to its next sibling, basically?)

Semi-tangent: I think the "PairEnumerator" class might really want to be called "SegmentEnumerator" or "RubyColumnEnumerator"...  The term "Pair" in "PairEnumerator" is a bit confusing, because it sounds like we're enumerating pairs of things, but we're not. (We are "pairing", on a many-to-one basis, but we're definitely not "enumerating pairs".)  The spec says:
  # Once pairing is complete, ruby "column" units are defined, each [with] a
  # single ruby base and one ruby annotation [...] from each annotation level.
...and it looks to me like this Enumerator is walking across these ruby "column" objects.  (I think they're also known as "Segments" -- I'm unclear on the difference (if any) between a segment and a column.) So maybe we should be using the term "Segment" or "Column" here? (That could be done in a separate bug; just mentioning it here while I'm trying to grok this code & how it maps to the spec.)
Ah... I think I should have found that term when I discussed with dbaron in bug 1052924... He suggested me use pair instead of column...
OK I think it worth an independent patch to rename "pair" back to "column".

To clarify, "column" is the object consists of one ruby base and all ruby annotations paired with it, and "segment" is one ruby base *container* and all ruby annotation *containers* paired with it. There is a SegmentEnumerator in nsRubyFrame for enumerating ruby segments.
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #11)
> Ah... I think I should have found that term when I discussed with dbaron in
> bug 1052924... He suggested me use pair instead of column...
> OK I think it worth an independent patch to rename "pair" back to "column".

Looks like he was concerned that it might be confused with multi-column terminology, per bug 1052924 comment 43.  I think that's a valid concern -- that's why I was thinking "RubyColumnEnumerator" (where "ruby columns" are the things being enumerated, not just "columns").

And if you add a link to http://dev.w3.org/csswg/css-ruby/#ruby-layout in the RubyColumnEnumerator's documentation, it'll connect this directly & clearly to a term defined in the spec ('ruby “column”'), which I think would make dbaron comfortable with this. (since he said he wanted the code to match the spec terminology.)

Anyway, this rename can all happen in a separate bug, right. (I'd still like to see some documentation added in this bug (or before this bug lands), though, per first part of comment 10, to make sure I'm understanding the code's assumptions/intentions/goals correctly.)

> To clarify, "column" is the object consists of one ruby base and all ruby
> annotations paired with it, and "segment" is one ruby base *container*

Gotcha -- thanks! My eyes keep accidentally filtering out the word "container" when I'm reading the spec. :)
Attachment #8537609 - Flags: review?(dholbert)
Comment on attachment 8530072 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

>+++ b/layout/generic/nsRubyBaseContainerFrame.cpp
>@@ -57,68 +57,111 @@ class MOZ_STACK_CLASS PairEnumerator
> {
[...]
>+  // The returned value includes the base level
>+  inline uint32_t GetLevelCount() const { return mFrames.Length(); }
>+
>+  inline nsRubyContentFrame* GetFrame(uint32_t aIndex) const;

Since GetFrame() isn't a one-liner (as of this patch), it probably shouldn't be made inline. (We should be able to trust the compiler to inline it automatically, if it's a good idea.  It's unclear to me that we gain anything by *forcing* it to be inline.)

> PairEnumerator::PairEnumerator(
>+  nsRubyBaseContainerFrame* aBaseContainer,
>+  const nsTArray<nsRubyTextContainerFrame*>& aTextContainers)
>+  : mAtIntraLevelWhitespace(false)
> {
[...]
>+  // The first pair can be an intra-level whitespace in two cases:
>+  // 1. an inter-segment whitespace;
>+  // 2. there was a line break before this intra-level whitespace.
>+  for (uint32_t i = 0; i <= rtcCount; i++) {
>+    nsRubyContentFrame* frame = mFrames[i];
>+    if (frame && frame->IsIntraLevelWhitespace()) {
>+      mAtIntraLevelWhitespace = true;
>+      break;
>+    }

I don't understand case (2) here (and I don't see how it relates to the code).

This comment seems to say:
  Something can be a $FOO in two cases:
   (1) ...
   (2) There was a line break before this $FOO.

This doesn't make sense to me -- I don't see how there being a linebreak before this $FOO is a way for something to "be a $FOO".

(In this case $FOO is "intra-level whitespace"; I'm just abstracting it away to show the logical confusion.)

Anyway -- can you clarify this comment?

> void
> PairEnumerator::Next()
> {
>+  bool nextHaveIntraLevelWhitespace = false;

"nextHave" seems incorrect, grammatically.  Maybe rename to "advancingToIntraLevelWhitespace"?  For me at least, that seems easier to understand in the context of this function.


>   for (uint32_t i = 0, iend = mFrames.Length(); i < iend; i++) {
>-    if (mFrames[i]) {
>-      mFrames[i] = mFrames[i]->GetNextSibling();
>+    nsRubyContentFrame* frame = mFrames[i];
>+    if (frame && (!mAtIntraLevelWhitespace ||
>+                  frame->IsIntraLevelWhitespace())) {

What I'm understanding from this check (after staring at it & the spec) is:
 - If we've got one (or more) intra-level whitespace frames at some level(s) in our current ruby column, then we'll "fake" an anonymous box for all the other levels, **for this column**. SO, when we advance *off* of the intra-level whitespace frame, we don't advance any of the other frames, because we're just pretending to advance those ones across a "fake" frame. Is that correct?

In any case, please add a comment here explaining what's going on here; this is too opaque to be un-documented.

>+nsRubyContentFrame*
>+PairEnumerator::GetFrame(uint32_t aIndex) const
>+{
>+  nsRubyContentFrame* frame = mFrames[aIndex];
>+  return !mAtIntraLevelWhitespace ||
>+         (frame && frame->IsIntraLevelWhitespace()) ? frame : nullptr;

Add a comment explaining what's going on here, too. (Again, after staring at this and the spec, I think the idea is: if we've got an intra-level whitespace in the current column, we pretend as if we have an anonymous (null) box paired with that whitespace, in every level except those that have the intra-level whitespace -- right?)

>+++ b/layout/generic/nsRubyContentFrame.cpp
>+bool
>+nsRubyContentFrame::IsIntraLevelWhitespace() const
>+{
>+  return StyleContext()->GetPseudo() && mFrames.OnlyChild() &&
>+         mFrames.FirstChild()->GetContent()->TextIsOnlyWhitespace();
>+}

Two points:
 - Per IRC, let's capture StyleContext()->GetPseudo() in a local-var, and assert that it's either null or one of the two expected pseudos here. (Or maybe early-return if it's null, and then assert that it's one of the two expected pseudos.)

 - I think I'd marginally prefer that we capture mFrames.OnlyChild() in a local-var and then (if it's non-null) use it for the TextIsOnlyWhitespace() call, instead of calling FirstChild(). (FirstChild() seems a little unnecessary, after OnlyChild has already returned that frame for us.)  This is a bit more verbose, though, so it's probably just a matter of taste; I think it's fine the way you've got it too.

>+++ b/layout/generic/nsRubyContentFrame.h
>@@ -16,14 +16,16 @@ class nsRubyContentFrame : public nsRuby
>   // nsIFrame overrides
>   virtual bool IsFrameOfType(uint32_t aFlags) const MOZ_OVERRIDE;
> 
>+  bool IsIntraLevelWhitespace() const;

Add documentation for this function, at least hinting at what "intra-level whitespace" means.  e.g.:
  // Indicates whether this is an "intra-level whitespace" frame, i.e. an
  // anonymous box that was created to contain raw whitespace inside of a
  // ruby container or a ruby base container, and wasn't discarded for being
  // "inter-level whitespace". (This impacts ruby pairing behavior.)
  // See http://dev.w3.org/csswg/css-ruby/#anon-gen-discard-space for more.

>+++ b/layout/reftests/css-ruby/intra-level-whitespace-ref.html
[...]
>+<body>
>+  <p><ruby>
>+    <rb>a</rb><rb pseudo><span> </span></rb><rb>b</rb>
>+    <rt>x</rt><rt pseudo><span> </span></rt><rt>y</rt>
>+  </ruby></p>

It looks like the "pseudo" attributes here are non-functional here, right? (looks like they only exist as hints for the reader, about what these elements are expecting to match in the testcase)

Please add an HTML <!--  --> comment briefly mentioning this, because otherwise it kinda looks like <rb pseudo> is expected to do something special.
Comment on attachment 8537609 [details] [diff] [review]
patch 0 - use "column" instead of "pair"

Thanks for doing this rename!

># Parent  8401afdb6e6ce386dabe1138af96f38d24de48b3
>Bug 1099807 part 0 - Use "column" instead of "pair" to match the spec term.

This is a bit too vague. (Seeing this in the hg log, I'd have no idea what part of the code it's even about.)  Should at least mention Ruby somewhere.

Maybe:
  Bug 1099807 part 0 - Use the term "column" instead of "pair" in some css ruby code, to match language in the spec.

>+++ b/layout/generic/nsRubyBaseContainerFrame.cpp
>-class MOZ_STACK_CLASS PairEnumerator
>+/**
>+ * Ruby column is a unit consists of one ruby base and all ruby
>+ * annotations paired with it.
>+ * See http://dev.w3.org/csswg/css-ruby/#ruby-pairing
>+ */
>+class MOZ_STACK_CLASS RubyColumnEnumerator

s/consists/consisting/

Also: this is a good comment -- it explains what a "ruby column" is -- but it doesn't quite function as a header-comment for the "RubyColumnEnumerator" class. Maybe add at the end (or beginning):
 This class assists in enumerating ruby columns, starting
  with the first ruby column in a ruby segment.

> public:
>-  PairEnumerator(nsRubyBaseContainerFrame* aRBCFrame,
>-                 const nsTArray<nsRubyTextContainerFrame*>& aRTCFrames);
>+  RubyColumnEnumerator(nsRubyBaseContainerFrame* aRBCFrame,
>+                       const nsTArray<nsRubyTextContainerFrame*>& aRTCFrames);

This still needs documentation (though maybe you're adding that in a different patch). I think the args here are supposed to be the frames for the first ruby column, right? (We should probably assert somewhere that this assumption holds up, too -- i.e. that none of the passed-in frames have prev-siblings.)

(And Next() still needs documentation here, too, though maybe that's coming in a different patch.)

>@@ -336,22 +342,22 @@ nsRubyBaseContainerFrame::Reflow(nsPresC
>-  // Reflow pairs excluding any span
>-  nscoord pairsISize = ReflowPairs(aPresContext, aReflowState,
>-                                   rtcReflowStates, aStatus);
>-  nscoord isize = pairsISize;
>+  // Reflow columns excluding any span
>+  nscoord columnsISize = ReflowColumns(aPresContext, aReflowState,
>+                                       rtcReflowStates, aStatus);
>+  nscoord isize = columnsISize;
> 
>-  // If there exists any span, the pairs must either be completely
>+  // If there exists any span, the columns must either be completely
>   // reflowed, or be not reflowed at all.

Use "ruby columns" in both comments here, to err on the side of being overly-explicit about the flavor of "column" that we're talking about. This applies to most mentions of "column" after this point in the patch (in code-comments) -- they should mostly be clarified to "ruby column" (mostly because side-by-side comments will be clearer from context), to avoid ambiguity. (& to make it clear that "ruby columns" are their own special thing, distinct from multicol columns)

(I'm less concerned about this higher-up in this patch, where we have "RubyColumnEnumerator" in the contextual code, because "RubyColumn" is close by the comment, clarifying the type of column. Feel free to clarify those too, though, if you like.)

> nscoord
>-nsRubyBaseContainerFrame::ReflowPairs(nsPresContext* aPresContext,
>-                                      const nsHTMLReflowState& aReflowState,
>-                                      nsTArray<nsHTMLReflowState*>& aReflowStates,
>-                                      nsReflowStatus& aStatus)
>+nsRubyBaseContainerFrame::ReflowColumns(
>+  nsPresContext* aPresContext, const nsHTMLReflowState& aReflowState,
>+  nsTArray<nsHTMLReflowState*>& aReflowStates, nsReflowStatus& aStatus)
> {

Could you keep 1-line-per-parameter on extreme-de-indented function prototypes, like the one here? With arguments starting at the far-left and extending across the full line, it's just too hard to visually parse (in my opinion at least).

> nscoord
>-nsRubyBaseContainerFrame::ReflowOnePair(nsPresContext* aPresContext,
>-                                        const nsHTMLReflowState& aReflowState,
>-                                        nsTArray<nsHTMLReflowState*>& aReflowStates,
>-                                        nsIFrame* aBaseFrame,
>-                                        const nsTArray<nsIFrame*>& aTextFrames,
>-                                        nsReflowStatus& aStatus)
>+nsRubyBaseContainerFrame::ReflowOneColumn(
>+  nsPresContext* aPresContext, const nsHTMLReflowState& aReflowState,
>+  nsTArray<nsHTMLReflowState*>& aReflowStates, nsIFrame* aBaseFrame,
>+  const nsTArray<nsIFrame*>& aTextFrames, nsReflowStatus& aStatus)
> {

Same here -- I find the parameter list really hard to read/skim here.  It was much cleaner-looking before -- let's keep this one param per line (while de-indenting it).

r=me with the above.
Attachment #8537609 - Flags: review?(dholbert) → review+
(In reply to Daniel Holbert [:dholbert] from comment #14)
> Comment on attachment 8530072 [details] [diff] [review]
> patch 2 - implement intra-level whitespace pairing
> 
> > PairEnumerator::PairEnumerator(
> >+  nsRubyBaseContainerFrame* aBaseContainer,
> >+  const nsTArray<nsRubyTextContainerFrame*>& aTextContainers)
> >+  : mAtIntraLevelWhitespace(false)
> > {
> [...]
> >+  // The first pair can be an intra-level whitespace in two cases:
> >+  // 1. an inter-segment whitespace;
> >+  // 2. there was a line break before this intra-level whitespace.
> >+  for (uint32_t i = 0; i <= rtcCount; i++) {
> >+    nsRubyContentFrame* frame = mFrames[i];
> >+    if (frame && frame->IsIntraLevelWhitespace()) {
> >+      mAtIntraLevelWhitespace = true;
> >+      break;
> >+    }
> 
> I don't understand case (2) here (and I don't see how it relates to the
> code).
> 
> This comment seems to say:
>   Something can be a $FOO in two cases:
>    (1) ...
>    (2) There was a line break before this $FOO.
> 
> This doesn't make sense to me -- I don't see how there being a linebreak
> before this $FOO is a way for something to "be a $FOO".
> 
> (In this case $FOO is "intra-level whitespace"; I'm just abstracting it away
> to show the logical confusion.)
> 
> Anyway -- can you clarify this comment?

I just want to justify that it is necessary to check whether the first column is intra-level whitespace here. Maybe I should add this conclusion to the comment.

> >+++ b/layout/reftests/css-ruby/intra-level-whitespace-ref.html
> [...]
> >+<body>
> >+  <p><ruby>
> >+    <rb>a</rb><rb pseudo><span> </span></rb><rb>b</rb>
> >+    <rt>x</rt><rt pseudo><span> </span></rt><rt>y</rt>
> >+  </ruby></p>
> 
> It looks like the "pseudo" attributes here are non-functional here, right?
> (looks like they only exist as hints for the reader, about what these
> elements are expecting to match in the testcase)
> 
> Please add an HTML <!--  --> comment briefly mentioning this, because
> otherwise it kinda looks like <rb pseudo> is expected to do something
> special.

For now, I think they do affect the behavior, at least <rt pseudo> does: it inherits styles from the parent instead of having font-size/line-height itself. This behavior is not desirable though. They might be useless after bug 1111463 lands.
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #16)
> I just want to justify that it is necessary to check whether the first
> column is intra-level whitespace here. Maybe I should add this conclusion to
> the comment.

Maybe just elaborate a bit on why a line-break triggers this? I don't immediately see any direct connection between line-breaks and intra-level whitespace at http://dev.w3.org/csswg/css-ruby/.  (Maybe you're saying the line-break itself is the whitespace? I'm not sure.)

(Also, I'd at least adjust the comment to avoid calling it "intra-level whitespace" in case (2).  As noted above, it's logically confusing to already call something a $FOO, when you're clarifying the conditions under which it might be considered a $FOO.)

> > It looks like the "pseudo" attributes here are non-functional here, right?
[...]
> For now, I think they do affect the behavior, at least <rt pseudo> does: it
> inherits styles from the parent

Ah, that's from a rule in common.css -- ok.
Comment on attachment 8530072 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

Marking "r-" on part to, to keep my review queue straight. Though I think this is near-r+ with comment 14 addressed (w/ clarifications in comment 17).
Attachment #8530072 - Flags: review?(dholbert) → review-
Comment on attachment 8537609 [details] [diff] [review]
patch 0 - use "column" instead of "pair"

This has been done as part of bug 1116037.
Attachment #8537609 - Attachment is obsolete: true
Add a macro undef in nsTextFrame to avoid conflict with windows.h in unified compiling. Also see bug 1117223.
Attachment #8530069 - Attachment is obsolete: true
Attachment #8543689 - Flags: review?(dholbert)
Attachment #8530072 - Attachment is obsolete: true
Attachment #8543690 - Flags: review?(dholbert)
Comment on attachment 8543690 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

There is a line-breaking problem in this patch.
Attachment #8543690 - Flags: review?(dholbert)
Attachment #8543690 - Attachment is obsolete: true
Attachment #8543771 - Flags: review?(dholbert)
Attached patch patch 5 - add reftests (obsolete) — Splinter Review
Attachment #8543776 - Flags: review?(dholbert)
Comment on attachment 8543689 [details] [diff] [review]
patch 1 - add common superclass for ruby base and ruby text

>diff --git a/layout/generic/nsRubyTextFrame.cpp b/layout/generic/nsRubyTextFrame.cpp
>@@ -91,9 +83,9 @@ nsRubyTextFrame::Reflow(nsPresContext* a
> 
>   if (GetStateBits() & NS_RUBY_TEXT_FRAME_AUTOHIDE) {
>     // Reset the ISize. The BSize is not changed so that it won't
>     // affect vertical positioning in unexpected way.
>     WritingMode lineWM = aReflowState.mLineLayout->GetWritingMode();
>     aDesiredSize.ISize(lineWM) = 0;
>     aDesiredSize.SetOverflowAreasToDesiredBounds();
>   }
>-}
>+}
>\ No newline at end of file

This patch is removing the newline at the end of this file. Please add the newline back (which should result in this patch-chunk going away).

>diff --git a/layout/generic/nsTextFrame.h b/layout/generic/nsTextFrame.h
[...]
>+#ifdef DrawText
>+#undef DrawText
>+#endif
>+
> class nsTextPaintStyle;

Two nits:
 (1) This needs a code comment mentioning windows.h, to hint at what's going on.
 (2) Since it's a windows-specific issue, we should wrap it in a windows-specific #if check.

So, we should match, for example: https://mxr.mozilla.org/mozilla-central/source/dom/media/webspeech/recognition/SpeechRecognition.cpp#27, which has the following:
> 27 // Undo the windows.h damage
> 28 #if defined(XP_WIN) && defined(GetMessage)
> 29 #undef GetMessage
> 30 #endif

You want basically that exact chunk of code, with s/GetMessage/DrawText/, I think.

r=me with the above addressed.
Attachment #8543689 - Flags: review?(dholbert) → review+
Comment on attachment 8543689 [details] [diff] [review]
patch 1 - add common superclass for ruby base and ruby text

># HG changeset patch
># User Xidorn Quan <quanxunzhen@gmail.com>
># Date 1418709071 -39600
>#      Tue Dec 16 16:51:11 2014 +1100

Also -- it looks like all of these patches have a datestamp header -- e.g. this one is from December. You probably want to remove that header, or else that'll be displayed as the commit date when you push the patch to mozilla-central, and that can cause confusion for people trying to track down when stuff landed.  (though the "pushloghtml" view still shows the push date correctly, since that's tracked outside of hg)
(In reply to Daniel Holbert [:dholbert] from comment #29)
> Comment on attachment 8543689 [details] [diff] [review]
> patch 1 - add common superclass for ruby base and ruby text
> 
> ># HG changeset patch
> ># User Xidorn Quan <quanxunzhen@gmail.com>
> ># Date 1418709071 -39600
> >#      Tue Dec 16 16:51:11 2014 +1100
> 
> Also -- it looks like all of these patches have a datestamp header -- e.g.
> this one is from December. You probably want to remove that header, or else
> that'll be displayed as the commit date when you push the patch to
> mozilla-central, and that can cause confusion for people trying to track
> down when stuff landed.  (though the "pushloghtml" view still shows the push
> date correctly, since that's tracked outside of hg)

But that's a field automatically tracked by hg...
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #30)
> But that's a field automatically tracked by hg...

(Closing the loop for anyone not following along on IRC: given that you're using "hg graft" / "hg rebase" for patch management & landing instead of MQ, it sounds like you just need to add "-D" to your hg graft commands.)
Comment on attachment 8543771 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

Review of attachment 8543771 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on part 2, with the following addressed:

::: layout/generic/nsRubyBaseContainerFrame.cpp
@@ +87,4 @@
>    void GetColumn(RubyColumn& aColumn) const;
>  
>  private:
> +  nsAutoTArray<nsRubyContentFrame*, RTC_ARRAY_SIZE + 1> mFrames;

Add a comment here warning that the frames in this array are *not* necessarily part of the current column.  When in doubt, it should only be accessed via GetFrame/GetBaseFrame/GetTextFrame. (and maybe "see ::GetFrame and ::Next, which manage this array, for more info")

(One would naturally expect this array to contain only the frames in this column, but as of this patch, that expectation doesn't hold up.)

@@ +100,5 @@
>    const uint32_t rtcCount = aTextContainers.Length();
>    mFrames.SetCapacity(rtcCount + 1);
> +
> +  nsIFrame* rbFrame = aBaseContainer->GetFirstPrincipalChild();
> +  MOZ_ASSERT(!rbFrame || rbFrame->GetType() == nsGkAtoms::rubyBaseFrame);

For readability & useful error messages, it's generally best to include a message in your MOZ_ASSERT calls.

This one probably wants something like: "base container should only have ruby base frames as children"

@@ +108,5 @@
>      // If the container is for span, leave a nullptr here.
>      // Spans do not take part in pairing.
>      nsIFrame* rtFrame = !container->IsSpanContainer() ?
> +      container->GetFirstPrincipalChild() : nullptr;
> +    MOZ_ASSERT(!rtFrame || rtFrame->GetType() == nsGkAtoms::rubyTextFrame);

As above, please include a message with this assertion.

@@ +112,5 @@
> +    MOZ_ASSERT(!rtFrame || rtFrame->GetType() == nsGkAtoms::rubyTextFrame);
> +    mFrames.AppendElement(static_cast<nsRubyContentFrame*>(rtFrame));
> +  }
> +
> +  // It is necessary to check whether the first pair is an intra-level

Should this say "first column" now? (instead of "first pair")

@@ +116,5 @@
> +  // It is necessary to check whether the first pair is an intra-level
> +  // whitespace, since it could happen in two cases:
> +  // 1. an inter-segment whitespace;
> +  // 2. there was a line break before this intra-level whitespace.
> +  // Handling the second case properly is important for correctness.

This formulation is still confusing to me, as noted in my $FOO reduction in comment 14, though I think I understand what it's saying now. (I was initially reading it as saying "these are the reasons we might classify this thing as intra-level whitespace", but really you were saying "these are the reasons something that we've already classified as intra-level whitespace might end up in this particular position".)

I think something like the following would be clearer, and would require less staring back and forth between the code & the spec to understand (for me at least). :)

  // We have to init mAtIntraLevelWhitespace to be correct for our first column.
  //
  // There are two ways that we could end up with intra-level whitespace in our
  // first column:
  // (1) If there's *inter-segment* whitespace, at the beginning of the segment;
  //     e.g. whitespace between <rbc> and <rb>.
  // (2) If our ruby segment is split across multiple lines, and some intra-level
  //     whitespace happens to fall right after a line-break. Each line will get
  //     its own nsRubyBaseContainerFrame, and the nsRubyBaseContainerFrame right
  //     after the line-break will end up with its first column containing that
  //     intra-level whitespace.

If this makes sense to you, I'd prefer something like this. (feel free to copypaste, though please tweak to correct anything that I'm mistaken on here.)

@@ +141,5 @@
> +    if (frame && (!mAtIntraLevelWhitespace ||
> +                  frame->IsIntraLevelWhitespace())) {
> +      nsIFrame* nextSibling = frame->GetNextSibling();
> +      MOZ_ASSERT(!nextSibling || nextSibling->GetType() == frame->GetType());
> +      mFrames[i] = frame = static_cast<nsRubyContentFrame*>(nextSibling);

As above, please add a message for the assertion. Something like: "Ruby content frames' siblings should be same type of frame"

(and maybe also "If this fails, the static_cast below may be invalid!", in the assertion-message or in a comment, to explain why we're bothering to assert it -- since otherwise, it's non-obvious why we care to assert about this here.)

@@ +150,4 @@
>      }
>    }
> +  MOZ_ASSERT(!advancingToIntraLevelWhitespace || !mAtIntraLevelWhitespace,
> +             "Should never have adjacent intra-level whitespace pairs");

I think "pairs" really wants to say "columns" here, now.

@@ +165,5 @@
>    return true;
>  }
>  
> +nsRubyContentFrame*
> +RubyColumnEnumerator::GetFrame(uint32_t aIndex) const

Maybe rename this to "GetFrameAtLevel" (or something similar), to make it clearer at the call-sites what the parameter means.  (and also to make it clearer that there are *multiple* frames, not just a single frame with a getter -- which "GetFrame()" seems to imply)

(I think it makes sense to consider "level 0" of a column to be the base frame, but if I'm wrong on that, maybe "AtLevel" is the wrong suffix.)

@@ +169,5 @@
> +RubyColumnEnumerator::GetFrame(uint32_t aIndex) const
> +{
> +  // If the current ruby column is for intra-level whitespaces, we
> +  // return nullptr for the fake frames in levels which do not have
> +  // corresponding intra-level whitespaces.

I'd prefer we reword this to avoid using "fake frames", since it's not really defined anywhere, except sort of in a comment in ::Next() -- but that's not adjacent to this function so you have to go looking for it. In context here, it's unclear what it means.  It'd also be nice to to hint at what the frames we're skipping actually represent. (the next column)

Something like this:
  // If the current ruby column is for intra-level whitespaces, then
  // we return nullptr for any levels that do not have an *actual*
  // intra-level whitespace frame in this column. This nullptr
  // represents an anonymous empty intra-level whitespace box.
  // (In this case, it's important that we *not* return mFrames[aIndex],
  // because it's really part of the next column -- not the current one.)

(Note that "anonymous empty intra-level whitespace box" is the term used in the spec, so it's a bit better-grounded/defined than "fake" is.)

::: layout/generic/nsRubyContentFrame.cpp
@@ +41,5 @@
> +  if (!pseudoType) {
> +    return false;
> +  }
> +  MOZ_ASSERT(pseudoType == nsCSSAnonBoxes::rubyBase ||
> +             pseudoType == nsCSSAnonBoxes::rubyText);

As above, this assertion needs a message to explain why we think it should hold. *But*, now that I think about it, I suspect it's not actually a valid assertion -- I think it could be made to fail with ::before or ::after. I'm pretty sure that this style...
  div::before {
    display: ruby-base;
  }

...will generate a nsRubyBaseFrame whose StyleContext()->GetPseudo() is equal to nsCSSPseudoElements::before. (inside of any div).  And if we can get ourselves to call IsIntraLevelWhitespace() on that frame, we'd fail this assertion (and incorrectly treat the frame as intra-level whitespace, in opt builds, I think).

So, I think this function's logic needs tweaking -- probably just to convert the assertion into an early-return for non-rubyBase/rubyText psuedoTypes.  (Also -- you can trip this assertion with a testcase, please include an automated version of it -- even just a crashtest, maybe -- to give us a chance of catching stuff like this in the testsuite, in the future.)
Attachment #8543771 - Flags: review?(dholbert) → review+
(In reply to Daniel Holbert [:dholbert] from comment #32)
> Comment on attachment 8543771 [details] [diff] [review]
> patch 2 - implement intra-level whitespace pairing
> 
> Review of attachment 8543771 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me on part 2, with the following addressed:
> 
> ::: layout/generic/nsRubyBaseContainerFrame.cpp
> @@ +100,5 @@
> >    const uint32_t rtcCount = aTextContainers.Length();
> >    mFrames.SetCapacity(rtcCount + 1);
> > +
> > +  nsIFrame* rbFrame = aBaseContainer->GetFirstPrincipalChild();
> > +  MOZ_ASSERT(!rbFrame || rbFrame->GetType() == nsGkAtoms::rubyBaseFrame);
> 
> For readability & useful error messages, it's generally best to include a
> message in your MOZ_ASSERT calls.
> 
> This one probably wants something like: "base container should only have
> ruby base frames as children"

I agree that it is generally a good idea to include a message in MOZ_ASSERT calls, however in these cases, I think the code itself has been clear enough, then the message might be kind of redundant. This kind of asserts appear all the time in this file, and in most of the cases, I think the context of code has been clear enough for readers.
Address most of the comments.
Attachment #8543771 - Attachment is obsolete: true
Attachment #8545045 - Flags: review?(dholbert)
Attached patch patch 5 - add reftests (obsolete) — Splinter Review
Add a reftest for pseudo element as ruby box.
Attachment #8543776 - Attachment is obsolete: true
Attachment #8543776 - Flags: review?(dholbert)
Attachment #8545047 - Flags: review?(dholbert)
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #33)
> I agree that it is generally a good idea to include a message in MOZ_ASSERT
> calls, however in these cases, I think the code itself has been clear
> enough, then the message might be kind of redundant.

I'm OK with omitting it in the case you quoted (and the analogous rtFrame case), though I disagree that it's redundant there.  Technically, the assertion is simply checking that "either there are no children, or the first child is a ruby base frame", which is a pretty limited-in-scope thing to claim.  The  assertion-message that I suggested adding ("base container should only have ruby base frames as children") isn't just a re-stating of the assertion condition (if it were, I'd agree RE redundancy) -- instead, it expresses the higher-level reason that we expect the assertion to hold.

But anyway, it seems like you're leaning against a message on that one, so I won't push more on it, since I agree it's not too tricky to figure out why that particular assertion should hold up.
Comment on attachment 8545045 [details] [diff] [review]
patch 2 - implement intra-level whitespace pairing

r=me on part 2.
Attachment #8545045 - Flags: review?(dholbert) → review+
Comment on attachment 8543772 [details] [diff] [review]
patch 3 - use specific frame type in RubyColumn

>+++ b/layout/generic/nsRubyBaseContainerFrame.cpp
> void
> RubyColumnEnumerator::GetColumn(RubyColumn& aColumn) const
> {
>-  aColumn.mBaseFrame = GetFrame(0);
>+  nsRubyContentFrame* rbFrame = GetFrame(0);

(Looks like this needs some s/GetFrame/GetFrameAtLevel/ unbitrotting, to work with most recent version of patch 2; but you probably already noticed that.)

r=me
Attachment #8543772 - Flags: review?(dholbert) → review+
Comment on attachment 8543775 [details] [diff] [review]
patch 4 - handle line breaking properly

Review of attachment 8543775 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsRubyBaseContainerFrame.cpp
@@ +588,5 @@
> +    bool nextColumnAvailable = false;
> +    if (column.mIsIntraLevelWhitespace && !e.AtEnd()) {
> +      e.Next();
> +      e.GetColumn(nextColumn);
> +      nextColumnAvailable = true;

I don't think I need this extra variable... I should just check column.mIsIntraLevelWhitespace below. They are always equal.
Actually, I was going to suggest wrapping nextColumn in a Maybe<>, which would make the code-chunk you quoted:
   nextColumn.emplace();
   e.Next();
   e.GetColumn(nextColumn);

and later usages would just check "nextColumn" itself for truthiness instead of nextColumnAvailable. (And use "->" instead of ".")

This has the (small) benefit of skipping the constructor for nextColumn & its nsAutoTArray member-var, if it doesn't end up being needed; and it also builds in sanity checks that we *only* attempt to use nextColumn if we've actually initialized it via the clause that you quoted.
(I also don't think it's true that nextColumnAvailable must be equal to column.mIsIntraLevelWhitespace, in the case where e.AtEnd() returns true.)
(In reply to Daniel Holbert [:dholbert] from comment #41)
> (I also don't think it's true that nextColumnAvailable must be equal to
> column.mIsIntraLevelWhitespace, in the case where e.AtEnd() returns true.)

Hmm, you're right. I guess that's what I thought when I wrote this code yesterday...
Comment on attachment 8543775 [details] [diff] [review]
patch 4 - handle line breaking properly

Review of attachment 8543775 [details] [diff] [review]:
-----------------------------------------------------------------

Since it sounds like you're working on this code right now, I'll post what I've got so far for part 4. (I just need to review the last bit, nsRubyBaseContainerFrame::PullOneColumn -- I'll post any notes on that in a bit.)

::: layout/generic/nsRubyBaseContainerFrame.cpp
@@ +580,5 @@
>      aStatus = NS_INLINE_LINE_BREAK_AFTER(aStatus);
>      MOZ_ASSERT(NS_FRAME_IS_COMPLETE(aStatus) || aReflowState.mAllowLineBreak);
>  
> +    // If we are on an intra-level whitespace column, nullptr might not
> +    // mean end of the container. It is likely that there are frames in

Please clarify the first sentence here -- it's unclear what variable(s) you're talking about being null.

I think you really mean to say (after the comma): "null values in column.mBaseFrame and in column.mTextFrames don't represent the end of the frame-sibling-chain at that level -- instead, they represent an anonymous empty intra-level whitespace box."

@@ +581,5 @@
>      MOZ_ASSERT(NS_FRAME_IS_COMPLETE(aStatus) || aReflowState.mAllowLineBreak);
>  
> +    // If we are on an intra-level whitespace column, nullptr might not
> +    // mean end of the container. It is likely that there are frames in
> +    // the next non-intra-level-whitespace column. Those frames should

The phrase "next non-intra-level-whitespace column" here is a bit misleading -- it suggests that the next column could itself be intra-level whitespace, and we might have to look several columns ahead to find the "next *non-intra-level whitespace* column". (but that's not the case, because such columns can't be adjacent)

Please clarify to say something like:
  "the next column (which can't be intra-level-whitespace)."

@@ +584,5 @@
> +    // mean end of the container. It is likely that there are frames in
> +    // the next non-intra-level-whitespace column. Those frames should
> +    // be pushed as well.
> +    RubyColumn nextColumn;
> +    bool nextColumnAvailable = false;

As noted in comment 39, these two local-vars should probably be combined as:
  Maybe<RubyColumn> nextColumn; // constructed below, if needed & available

(You'll need to add #include "mozilla/Maybe.h" up top, too.)
Comment on attachment 8543775 [details] [diff] [review]
patch 4 - handle line breaking properly

Review of attachment 8543775 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the following & comment 43 addressed:

::: layout/generic/nsRubyBaseContainerFrame.cpp
@@ +781,5 @@
> +
> +  aColumn.mIsIntraLevelWhitespace = pullingIntraLevelWhitespace;
> +  if (pullingIntraLevelWhitespace) {
> +    // We are pulling an intra-level whitespace. Drop all frames which
> +    // are not part of this intra-level whitespace column.

Add: " (Those frames are really part of the *next* column, after the pulled one.)" to make it clearer that we're not just dropping them on the floor.

@@ +796,5 @@
> +
> +  // Pull the frames of this column.
> +  if (aColumn.mBaseFrame) {
> +    PullNextInFlowChild(aPullFrameState.mBase);
> +  }

Add an assertion that the PullNextInFlowChild() calls here return the same frame that we null-checked (the frame that we're expecting to pull). e.g.:

  DebugOnly<nsIFrame*> pulledFrame = PullNextInFlowChild(aPullFrameState.mBase);
  MOZ_ASSERT(pulledFrame == aColumn.mBaseFrame, "pulled the wrong frame?");

(Same goes for the textContainers PullNextInFlowChild() call lower down.)
Attachment #8543775 - Flags: review?(dholbert) → review+
Comment on attachment 8545047 [details] [diff] [review]
patch 5 - add reftests

Review of attachment 8545047 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/css-ruby/intra-level-whitespace-2-ref.html
@@ +1,5 @@
> +<!DOCTYPE html>
> +<html lang="en">
> +<head>
> +  <meta charset="UTF-8">
> +  <title></title>

This <title> is empty -- you probably meant for it to contain something.

@@ +16,5 @@
> +  <div class="container"><ruby><rb>12345</rb><rb>67890</rb><rt>09876</rt> <rt>54321</rt></ruby></div>
> +  <div class="container"><ruby><rb>12345</rb> <rb>67890</rb><rt>09876</rt> <rt>54321</rt></ruby></div>
> +  <script type="text/javascript">
> +    var width2 = ref2.getBoundingClientRect().width + 'px';
> +    style.textContent = '.container { width: ' + width2 + '; }';

Please add a comment explaining what you're trying to achieve with this scripting here.

(also, general note: I'm a bit uneasy with using the automatically-created global variables "style", "ref2", etc in tests like this. Most tests that I've seen explicitly use getElementById(), which makes them less fragile & more declarative/readable -- i.e. it's more obvious what "style" refers to if it's set via getElementById, and if someone were to change that element's "id" attribute for some reason, they know that they should similarly adjust all getElementById() calls that pass in the same string.  I won't mandate that you adjust this in these tests, because I'm not aware of this being an official requirement in our tests -- I'm mostly mentioning it because I think it's a best-practice for making tests maintainable, particularly when they're more complex.)

::: layout/reftests/css-ruby/intra-level-whitespace-2.html
@@ +1,5 @@
> +<!DOCTYPE html>
> +<html lang="en">
> +<head>
> +  <meta charset="UTF-8">
> +  <title></title>

As above, this <title> is empty.

@@ +17,5 @@
> +  <div class="container"><ruby><rb>12345</rb> <rb>67890</rb><rt>09876</rt> <rt>54321</rt></ruby></div>
> +  <script type="text/javascript">
> +    var width1 = ref1.getBoundingClientRect().width + 'px';
> +    var width2 = ref2.getBoundingClientRect().width + 'px';
> +    style.textContent = '.container { width: ' + width1 + '; }';

This (combined with the ruby markup above it) is a bit complicated. Can you add a comment somewhere explaining what scenario(s) you're trying to test here and how you're achieving it with this scripting?

@@ +18,5 @@
> +  <script type="text/javascript">
> +    var width1 = ref1.getBoundingClientRect().width + 'px';
> +    var width2 = ref2.getBoundingClientRect().width + 'px';
> +    style.textContent = '.container { width: ' + width1 + '; }';
> +    document.body.getBoundingClientRect();  // force reflow

Nit: Normally we use offsetHeight (or one of the other "offset*" variables) to trigger reflows in tests. So, unless there's a particular reason you prefer getBoundingClientRect, it's probably better to be consistent (& more concise) and use:
  document.body.offsetHeight; // force reflow

::: layout/reftests/css-ruby/intra-level-whitespace-3.html
@@ +10,5 @@
> +      content: "before";
> +    }
> +    rbc::after {
> +      display: ruby-base;
> +      content: " ";

This test needs a comment somewhere explaining its connection to intra-level whitespace (since there isn't actually any intra-level whitespace in the test at all, which makes its filename/title confusing).

e.g. <!-- This test ensures that we don't treat all-whitespace ::before/::after content the same as intra-level whitespace -->

Also: Could you extend this test to check that we don't pair ::before/::after whitespace with intra-level whitespace? I think something like:
  <ruby>
    <rbc> <rb>base</rb></rbc>
    <rtc><rt>text</rt> </rtc>
  </ruby
(the idea being to make sure we don't pair the rbc::after whitespace-node with the intra-level-whitespace after the rt. Though I'm actually not sure that this example would 100% prove that we're not doing that, since you can't actually see where the ::after node ends up. So maybe it's not even possible to test. Still, probably worth including if only to exercise the pairing code with ::before/::after being actually intermixed with intra-level whitespace.)

::: layout/reftests/css-ruby/reftest.list
@@ +20,5 @@
>  == inlinize-blocks-3.html inlinize-blocks-3-ref.html
>  == inlinize-blocks-4.html inlinize-blocks-4-ref.html
>  == inlinize-blocks-5.html inlinize-blocks-5-ref.html
> +== intra-level-whitespace-1.html intra-level-whitespace-1-ref.html
> +== intra-level-whitespace-2.html intra-level-whitespace-2-ref.html

So I viewed these tests locally, and it actually looks like intra-level-whitespace-2.html already "passes" (matches its reference) in current Nightly builds, with ruby enabled. (I assume because we mis-render both the testcase and the reference case in the same way.)

That's kinda undesirable, and it means we're relying on this code here to render the testcase *and* the reference case. Is it possible to tweak the reference case to produce the expected rendering, without relying on the code that's being added in this bug?  If not, it's probably OK, but it's worth striving for.

(Incidentally, intra-level-whitespace-3.html also already passes, but I think that's because it doesn't actually include any intra-level whitespace & hence doesn't rely on the patches here, as noted above in my comments for that test.)
(In reply to Daniel Holbert [:dholbert] from comment #45)
> Is it possible to tweak the
> reference case to produce the expected rendering, without relying on the
> code that's being added in this bug?

(Or at least, something closer to the expected rendering than what the testcase produces.)
(Tests patch is mostly r+, but I'd like to see the requested comments & tweaks before I'm confident that I understand the tests correctly, so I'm holding off on r+ until there's an updated patch w/ comment 45 addressed.)
(In reply to Daniel Holbert [:dholbert] from comment #45)
> Comment on attachment 8545047 [details] [diff] [review]
> patch 5 - add reftests
> 
> Review of attachment 8545047 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: layout/reftests/css-ruby/intra-level-whitespace-3.html
> @@ +10,5 @@
> > +      content: "before";
> > +    }
> > +    rbc::after {
> > +      display: ruby-base;
> > +      content: " ";
> 
> Also: Could you extend this test to check that we don't pair
> ::before/::after whitespace with intra-level whitespace? I think something
> like:
>   <ruby>
>     <rbc> <rb>base</rb></rbc>
>     <rtc><rt>text</rt> </rtc>
>   </ruby
> (the idea being to make sure we don't pair the rbc::after whitespace-node
> with the intra-level-whitespace after the rt. Though I'm actually not sure
> that this example would 100% prove that we're not doing that, since you
> can't actually see where the ::after node ends up. So maybe it's not even
> possible to test. Still, probably worth including if only to exercise the
> pairing code with ::before/::after being actually intermixed with
> intra-level whitespace.)

Actually it is impossible to test this, because we never have intra-level whitespace at the beginning or the end of a ruby level container -- any leading/trailing pure whitespaces are trimmed in frame-constructing stage. However, ::before/::after can only be at those places.

> ::: layout/reftests/css-ruby/reftest.list
> @@ +20,5 @@
> >  == inlinize-blocks-3.html inlinize-blocks-3-ref.html
> >  == inlinize-blocks-4.html inlinize-blocks-4-ref.html
> >  == inlinize-blocks-5.html inlinize-blocks-5-ref.html
> > +== intra-level-whitespace-1.html intra-level-whitespace-1-ref.html
> > +== intra-level-whitespace-2.html intra-level-whitespace-2-ref.html
> 
> So I viewed these tests locally, and it actually looks like
> intra-level-whitespace-2.html already "passes" (matches its reference) in
> current Nightly builds, with ruby enabled. (I assume because we mis-render
> both the testcase and the reference case in the same way.)

Yes because before all the patches, intra-level whitespaces are not handled at all. The second test was failed after patch 2, but then be fixed by patch 4.

> That's kinda undesirable, and it means we're relying on this code here to
> render the testcase *and* the reference case. Is it possible to tweak the
> reference case to produce the expected rendering, without relying on the
> code that's being added in this bug?  If not, it's probably OK, but it's
> worth striving for.
> 
> (Incidentally, intra-level-whitespace-3.html also already passes, but I
> think that's because it doesn't actually include any intra-level whitespace
> & hence doesn't rely on the patches here, as noted above in my comments for
> that test.)

This test crashed with attachment 8543771 [details] [diff] [review].
(In reply to Xidorn Quan [:xidorn] (UTC+11) from comment #48)
> Actually it is impossible to test this, because we never have intra-level
> whitespace at the beginning or the end of a ruby level container -- any
> leading/trailing pure whitespaces are trimmed in frame-constructing stage.
> However, ::before/::after can only be at those places.

Ah, ok -- right. I was misreading the "previous box/next box" whitespace table when I wrote my sample-extra-chunk for this test.  Never mind about extending this test, then. (though it still needs a comment explaining its connection to intra-segment whitespace)

> Yes because before all the patches, intra-level whitespaces are not handled
> at all. The second test was failed after patch 2, but then be fixed by patch
> 4.

OK, makes sense. Never mind about that part then.

> > (Incidentally, intra-level-whitespace-3.html also already passes, but I
> This test crashed with attachment 8543771 [details] [diff] [review].

(Yup, I figured; I wasn't as concerned with why that test already passes in trunk.)
Attachment #8545047 - Attachment is obsolete: true
Attachment #8545561 - Flags: review?(dholbert)
Comment on attachment 8545561 [details] [diff] [review]
patch 5 - add reftests

Review of attachment 8545561 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the below addressed:

::: layout/reftests/css-ruby/intra-level-whitespace-2-ref.html
@@ +18,5 @@
> +  <script type="text/javascript">
> +    var ref2 = document.getElementById('ref2');
> +    var style = document.getElementById('style');
> +    var width2 = ref2.getBoundingClientRect().width + 'px';
> +    style.textContent = '.container { width: ' + width2 + '; }';

This script still needs a brief explanatory comment, to indicate what we're doing here.

Something like: "This is the same as the script in the corresponding testcase, except that here we skip the intermediate size and jump straight to the final size."

::: layout/reftests/css-ruby/intra-level-whitespace-2.html
@@ +21,5 @@
> +    var style = document.getElementById('style');
> +    var width1 = ref1.getBoundingClientRect().width + 'px';
> +    var width2 = ref2.getBoundingClientRect().width + 'px';
> +    // Change the width of the container to test whether line-breaking
> +    // is handled correctly with intra-level whitespaces.

Still needs a little more detail to explain what's going on.  This isn't just trying to test line-breaking; it's specifically trying to test incremental layout, with intra-level whitespace columns being pulled up *across* a line-break (I think).  Please extend the comment a bit more to mention this.

(The fact that we're testing incremental layout is important, since it explains why the reference case is so similar and is just missing some middle steps.)
Attachment #8545561 - Flags: review?(dholbert) → review+
You need to log in before you can comment on or make changes to this bug.