Implement longhand East Asian counter styles in CSS3

RESOLVED FIXED in mozilla28

Status

()

Core
CSS Parsing and Computation
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: xidorn, Assigned: xidorn)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {css3, dev-doc-complete})

Trunk
mozilla28
css3, dev-doc-complete
Points:
---
Dependency tree / graph
Bug Flags:
sec-review +
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(9 attachments, 26 obsolete attachments)

1.36 KB, text/html
Details
1.96 KB, text/html
Details
322 bytes, text/html
Details
26.55 KB, image/png
Details
3.69 KB, text/html
Details
33.94 KB, patch
xidorn
: review+
Details | Diff | Splinter Review
17.93 KB, patch
xidorn
: review+
Details | Diff | Splinter Review
11.20 KB, patch
xidorn
: review+
Details | Diff | Splinter Review
26.31 KB, patch
xidorn
: review+
Details | Diff | Splinter Review
Including -moz-{simp,trad}-chinese-{formal,informal}, CJK ideographic numbers are not used for negative number.

When use
<ol start="-1">
  <li>minus one
  <li>zero
  <li>one
</ol>

It gives:
-1. minus one
零. zero
一. one

but it should give:
负一. minus one
零. zero
一. one
Assignee: nobody → quanxunzhen
Created attachment 826255 [details] [diff] [review]
bug934072.patch
Attachment #826255 - Flags: review?(mwu)
Comment on attachment 826255 [details] [diff] [review]
bug934072.patch

The table in the spec requires different negative signs for traditional and simplified Chinese:
http://dev.w3.org/csswg/css-counter-styles-3/#valuedef-simp-chinese-informal
and yet further options for the Japanese and Korean styles (though we don't implement the Korean ones).

This could also use tests, e.g., in layout/reftests/counters/.

(Also, local style involves putting {} around single-line if bodies.)

Otherwise the patch looks good -- thanks.  Are you willing to make the needed revisions?
Attachment #826255 - Flags: review-
Oh, I didn't find this spec. I'm interested in implementing the whole section 7.1 (except supporting number outside int32, which seems to be a limitation in the whole core). Is there an exist bug about this? Or should I change this bug to "Implement longhand east Asian counter styles"?
I don't know of another bug, but you could search, or this one could be a perfectly fine bug for it.
Blocks: 843718
Component: Layout → CSS Parsing and Computation
Keywords: css3
Summary: "list-style-type: cjk-ideographic" don't support negative number → Implement longhand East Asian counter styles in CSS3
Attachment #826255 - Attachment is obsolete: true
Attachment #826255 - Flags: review?(mwu)
Created attachment 826391 [details] [diff] [review]
patch 1 - bullet suffix

This patch allows using other suffix for predefined counter style.

It uses space char in suffix instead of margin-end in ua.css since the margin makes the space look too wild for some suffix like "\3001".
Attachment #826391 - Flags: review?(dbaron)
Created attachment 826397 [details] [diff] [review]
patch 2 - cjk-decimal
Attachment #826397 - Flags: review?(dbaron)
Created attachment 826565 [details] [diff] [review]
patch 1 v2 - bullet suffix

Compared to previous version in comment 5, this optimizes strings and adds one reftest.
Attachment #826391 - Attachment is obsolete: true
Attachment #826391 - Flags: review?(dbaron)
Attachment #826565 - Flags: review?(dbaron)
Created attachment 826576 [details] [diff] [review]
patch 3 - implement complex cjk counter styles (final part)

Note that this implementation generates a slightly different output with the current spec draft (20131103) according to the mailing list discussion, replies from my native friends, and result of Google Translate.

For japanese-informal and korean-hanja-informal, this impl, compared to the draft, removes "one" before "thousand", but preserves it if "one thousand" is just before 10 thousand unit. So it generates 一万千百十一 and 萬 千百十一 instead of 一万一千百十一 and 萬 一千百十一 for 11,111 in japanese-informal and korean-hanja-informal respectively, and generates 一千万 for 10,000,000 in japanese-informal.

Compared to the current impl in Gecko, this new impl fixes a bug that "one" before "ten" is removed mistakenly in {simp,trad}-chinese-formal style.
Attachment #826576 - Flags: review?(dbaron)
Comment on attachment 826397 [details] [diff] [review]
patch 2 - cjk-decimal

I'm going to transfer these review requests to Jonathan; he's likely to have time to get to them before I will (hopefully within a few days).
Attachment #826397 - Flags: review?(dbaron) → review?(jfkthame)
Attachment #826565 - Flags: review?(dbaron) → review?(jfkthame)
Attachment #826576 - Flags: review?(dbaron) → review?(jfkthame)
(In reply to Xidorn Quan from comment #5)

> It uses space char in suffix instead of margin-end in ua.css since the
> margin makes the space look too wild for some suffix like "\3001".

Replacing the use of margin-end with a space character (included as part of the "suffix" text) will make the positioning of the visible text of the marker vary significantly, depending on the font in use. That's a significant change from current rendering for all numbered lists - and I'm not sure it's one we want.
Created attachment 826745 [details]
Testcase for item-number placement in various fonts

Testcase for bullet (list item number) placement in various fonts. On OS X, at least, the green (real <ol>) and red (which fakes the layout using a <table>) examples render virtually identically, as can be seen when they're overlaid with partial opacity.

Replacing the use of margin-end with a space results in significant variations in spacing.
Comment on attachment 826565 [details] [diff] [review]
patch 1 v2 - bullet suffix

Review of attachment 826565 [details] [diff] [review]:
-----------------------------------------------------------------

Generalizing the suffix support, instead of using a hard-coded ".", is a good thing; but I have a couple of concerns about the details here, see below. Hence marking r- for now; I think we need an alternative approach to this aspect of it.

Regarding the excessive spacing after the "ideographic comma" suffix, would it help to use U+FF64 (the halfwidth version of this character) instead of U+3001?

::: layout/generic/nsBulletFrame.cpp
@@ +1322,5 @@
>    if (success && aListStyle.mListStyleType == NS_STYLE_LIST_STYLE_HEBREW)
>      mTextIsRTL = true;
>  
> +  if (!success || suffix.IsEmpty())
> +    suffix.AssignLiteral(". ");

Nit: the style guide calls for braces around the if() body, even when it's a single statement. (I realize the existing code isn't consistent about this. If you want to fix the example immediately above while you're here, that'd be fine.)

However, I think it'd be better to initialize suffix to its default value -before- calling AppendCounterText(). Then if any style within that function ever wants to explicitly make it empty, it can do so without being overwritten here.

Two more serious issues here:

(1) This will not work together with the "fake bidi" handling for list item markers (see the code immediately following).

(2) Using a space instead of CSS-defined margin here will tend to disrupt spacing; see comment and example above.

So I think this aspect of the patch needs to be reconsidered.
Attachment #826565 - Flags: review?(jfkthame) → review-
Created attachment 826749 [details]
Testcase for rendering of list item numbers in bidi contexts

This illustrates the problems that arise with bidi text and the bullet-suffix patch.
Attachment #826745 - Attachment description: 934072-test-1.html → Testcase for item-number placement in various fonts
(In reply to Jonathan Kew (:jfkthame) from comment #12)
> Regarding the excessive spacing after the "ideographic comma" suffix, would
> it help to use U+FF64 (the halfwidth version of this character) instead of
> U+3001?

The two characters do not look the same at least on my machine, U+FF64 is only about 3/4 long as U+3001. It shows that at least in some fonts, they are different. And since U+3001 is used more widely, it should have been supported and further optimized in more fonts. So IMHO it is not a good idea.

> ::: layout/generic/nsBulletFrame.cpp
> @@ +1322,5 @@
> >    if (success && aListStyle.mListStyleType == NS_STYLE_LIST_STYLE_HEBREW)
> >      mTextIsRTL = true;
> >  
> > +  if (!success || suffix.IsEmpty())
> > +    suffix.AssignLiteral(". ");
> 
> Nit: the style guide calls for braces around the if() body, even when it's a
> single statement. (I realize the existing code isn't consistent about this.
> If you want to fix the example immediately above while you're here, that'd
> be fine.)

I will fix it. I just forgot what the style guide said and followed the style used in the file, sorry :)

All the patches have the same style problem, I will fix them.

> However, I think it'd be better to initialize suffix to its default value
> -before- calling AppendCounterText(). Then if any style within that function
> ever wants to explicitly make it empty, it can do so without being
> overwritten here.

In fact, I did use this way before, but I found that I had to write an extra if-statement for every branch to check whether the subroutine executed successfully, which seems to be redundant.

Maybe we do not need to check whether they are succeeded, just use the suffix directly?

> Two more serious issues here:
> 
> (1) This will not work together with the "fake bidi" handling for list item
> markers (see the code immediately following).

I see. I plan to add a string reverse for fake bidi.

BTW, does there exist any reverse function for nsString? I failed to find one.

> (2) Using a space instead of CSS-defined margin here will tend to disrupt
> spacing; see comment and example above.

I don't think it is a big problem. It should not be surprising for web designers that different fonts produce different spacing. And few people use different fonts in one list.

In fact, WebKit does something like what I do in this patch - it uses a font-dependent spacing instead of a fixed one. (the width seems to be about 1ex, wider than one space)

According to my experiment, it seems to be better to set 0.5ex to margin-end.
Created attachment 826797 [details]
Testcase for item-number placement in various fonts

In your 934072-test-1.html the two lists are not overlaid perfectly on my macine. I adjusted it a bit, and it should be ok now.
Attachment #826745 - Attachment is obsolete: true
Comment on attachment 826397 [details] [diff] [review]
patch 2 - cjk-decimal

Review of attachment 826397 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine. It'd be nice to add a reftest for the new counter style, too.
Attachment #826397 - Flags: review?(jfkthame) → review+
Created attachment 826800 [details]
Compare of U+3001 & U+FF64

This page shows the difference between U+3001 and U+FF64.
Created attachment 826801 [details]
Comparison of U+3001 & U+FF64

Sorry for spam. Fix a typo.
Attachment #826800 - Attachment is obsolete: true
(In reply to Xidorn Quan from comment #14)
> (In reply to Jonathan Kew (:jfkthame) from comment #12)
> > Regarding the excessive spacing after the "ideographic comma" suffix, would
> > it help to use U+FF64 (the halfwidth version of this character) instead of
> > U+3001?
> 
> The two characters do not look the same at least on my machine, U+FF64 is
> only about 3/4 long as U+3001. It shows that at least in some fonts, they
> are different. And since U+3001 is used more widely, it should have been
> supported and further optimized in more fonts. So IMHO it is not a good idea.

OK, fair enough.

(In the default font used on my system, the glyph shapes look identical, only the spacing differs. But if that's not necessarily the case for other systems, I agree we should stay with the more commonly used one.)

> > However, I think it'd be better to initialize suffix to its default value
> > -before- calling AppendCounterText(). Then if any style within that function
> > ever wants to explicitly make it empty, it can do so without being
> > overwritten here.
> 
> In fact, I did use this way before, but I found that I had to write an extra
> if-statement for every branch to check whether the subroutine executed
> successfully, which seems to be redundant.
> 
> Maybe we do not need to check whether they are succeeded, just use the
> suffix directly?

Hmm. Well, if CharListToText fails (i.e. falls back to decimal) then I don't think we'd want to use gCJKIdeographicSuffix as the suffix with it.

So I think you should still check the "success" of AppendCounterText, and reset the suffix to the default "." if it failed. Just remove the check for IsEmpty() here, and initialize the suffix string with the default when it's first created.

> 
> > Two more serious issues here:
> > 
> > (1) This will not work together with the "fake bidi" handling for list item
> > markers (see the code immediately following).
> 
> I see. I plan to add a string reverse for fake bidi.
> 
> BTW, does there exist any reverse function for nsString? I failed to find
> one.

Not AFAIK.

> 
> > (2) Using a space instead of CSS-defined margin here will tend to disrupt
> > spacing; see comment and example above.
> 
> I don't think it is a big problem. It should not be surprising for web
> designers that different fonts produce different spacing. And few people use
> different fonts in one list.

OK. As long as we're aware of the change in behavior here, I guess it's an acceptable result.

> In fact, WebKit does something like what I do in this patch - it uses a
> font-dependent spacing instead of a fixed one. (the width seems to be about
> 1ex, wider than one space)
> 
> According to my experiment, it seems to be better to set 0.5ex to margin-end.

It's generally better to express font-dependent horizontal spacing in terms of em rather than ex; maybe somewhere around 0.2 or 0.3em?

If you use a font-dependent margin-end setting, will you still need to use a multi-character suffix string ". " for the default style, or can we avoid the bidi complication there?
Comment on attachment 826576 [details] [diff] [review]
patch 3 - implement complex cjk counter styles (final part)

Review of attachment 826576 [details] [diff] [review]:
-----------------------------------------------------------------

Leaving r? as I haven't been through this fully yet, but I did have one question already....

::: layout/generic/nsBulletFrame.cpp
@@ +804,5 @@
> +    0x0000, 0x5341, 0x767e, 0x5343
> +  },
> +  .unit10K = {
> +    0x0000, 0x4e07, 0x5104
> +  }

Is this named-initializer syntax legal C++ and supported by all compilers (that are relevant for the mozilla codebase)? (I thought it was a C99 extension, and hence might not work in MSVC, at least - or is it now part of a newer C++ standard?)

If in doubt, it might be safer to avoid this construct and use traditional initializer syntax (with plenty of comments!) Of course, if you can confirm it's OK, that's fine.
(In reply to Jonathan Kew (:jfkthame) from comment #19)
> (In reply to Xidorn Quan from comment #14)
> >
> > In fact, I did use this way before, but I found that I had to write an extra
> > if-statement for every branch to check whether the subroutine executed
> > successfully, which seems to be redundant.
> > 
> > Maybe we do not need to check whether they are succeeded, just use the
> > suffix directly?
> 
> Hmm. Well, if CharListToText fails (i.e. falls back to decimal) then I don't
> think we'd want to use gCJKIdeographicSuffix as the suffix with it.
> 
> So I think you should still check the "success" of AppendCounterText, and
> reset the suffix to the default "." if it failed. Just remove the check for
> IsEmpty() here, and initialize the suffix string with the default when it's
> first created.

There is another thing should be take into account that the fallback for most complex cjk styles is cjk-decimal, though currently they never fail because of range of int32, and suffix for cjk-decimal is still gCJKIdeographicSuffix.

I have a new idea about how to implement it, will submit soon.

> > I see. I plan to add a string reverse for fake bidi.
> > 
> > BTW, does there exist any reverse function for nsString? I failed to find
> > one.
> 
> Not AFAIK.

So I have to submit another independent patch for such function if necessary.

> > In fact, WebKit does something like what I do in this patch - it uses a
> > font-dependent spacing instead of a fixed one. (the width seems to be about
> > 1ex, wider than one space)
> > 
> > According to my experiment, it seems to be better to set 0.5ex to margin-end.
> 
> It's generally better to express font-dependent horizontal spacing in terms
> of em rather than ex; maybe somewhere around 0.2 or 0.3em?

0.25em seems to be good. It makes the spacing nearly perfectly match the original fixed spacing in default font and font-size on my machine (8px roughly equals to 0.5em and space is about 0.25em), and the total spacing seems not to be too wide for the ideographic comma.

> If you use a font-dependent margin-end setting, will you still need to use a
> multi-character suffix string ". " for the default style, or can we avoid
> the bidi complication there?

I am not sure. I meant to move some of the spacing from margin-end to suffixes since ideographic comma has spacing itself but others do not. It is unrelated to font-dependence, and you can replace 0.25em with 4px here (the old value is 8px). It's ok for ideographic comma but might be too narrow for dot w/o space char.

Using font-dependence or not is another question. The current fixed spacing might be too narrow with large fonts but too wide with small fonts. You can adjust the font-size to 8px and 48px, and see the result.
(In reply to Jonathan Kew (:jfkthame) from comment #20)
> Comment on attachment 826576 [details] [diff] [review]
> patch 3 - implement complex cjk counter styles (final part)
> 
> Review of attachment 826576 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Leaving r? as I haven't been through this fully yet, but I did have one
> question already....
> 
> ::: layout/generic/nsBulletFrame.cpp
> @@ +804,5 @@
> > +    0x0000, 0x5341, 0x767e, 0x5343
> > +  },
> > +  .unit10K = {
> > +    0x0000, 0x4e07, 0x5104
> > +  }
> 
> Is this named-initializer syntax legal C++ and supported by all compilers
> (that are relevant for the mozilla codebase)? (I thought it was a C99
> extension, and hence might not work in MSVC, at least - or is it now part of
> a newer C++ standard?)
> 
> If in doubt, it might be safer to avoid this construct and use traditional
> initializer syntax (with plenty of comments!) Of course, if you can confirm
> it's OK, that's fine.

You are right that it is not legal in C++, though my compiler didn't complain about it. I will change it.
Created attachment 827378 [details] [diff] [review]
part 1 - nsAString reverse helpers
Attachment #827378 - Flags: review?(jfkthame)
Created attachment 827379 [details] [diff] [review]
part 2 v3 - bullet suffix
Attachment #826565 - Attachment is obsolete: true
Attachment #827379 - Flags: review?(jfkthame)
Created attachment 827381 [details] [diff] [review]
part 3 v2 - implement cjk-decimal
Attachment #826397 - Attachment is obsolete: true
Attachment #827381 - Flags: review?(jfkthame)
Created attachment 827382 [details] [diff] [review]
part 4 v2 - implement complex cjk counter styles
Attachment #827382 - Flags: review?(jfkthame)
Attachment #826576 - Attachment is obsolete: true
Attachment #826576 - Flags: review?(jfkthame)
Created attachment 827400 [details] [diff] [review]
part 3 v3 - implement cjk-decimal
Attachment #827381 - Attachment is obsolete: true
Attachment #827381 - Flags: review?(jfkthame)
Attachment #827400 - Flags: review?(jfkthame)
Comment on attachment 827378 [details] [diff] [review]
part 1 - nsAString reverse helpers

Review of attachment 827378 [details] [diff] [review]:
-----------------------------------------------------------------

I think this ought to be reviewed by an xpcom/string peer; dbaron, back to you (or whoever you think appropriate).

::: xpcom/string/src/nsReadableUtils.cpp
@@ +1134,5 @@
> +      return;
> +    }
> +
> +    auto old_dest_length = aDest.Length();
> +    aDest.GetMutableData(&dest, old_dest_length + length);

What happens if the addition here overflows?

@@ +1137,5 @@
> +    auto old_dest_length = aDest.Length();
> +    aDest.GetMutableData(&dest, old_dest_length + length);
> +    dest += old_dest_length;
> +
> +    for (int32_t i = length - 1; i >= 0; i--) {

Length of a string is a uint32_t; if the length exceeds 0x80000000, the loop here won't work. (Do we care about supporting such huge lengths in the string code, or do we assume string lengths always fit within int32_t?)
Attachment #827378 - Flags: review?(jfkthame) → review?(dbaron)
(In reply to Jonathan Kew (:jfkthame) from comment #28)
> Comment on attachment 827378 [details] [diff] [review]
> part 1 - nsAString reverse helpers
> 
> Review of attachment 827378 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think this ought to be reviewed by an xpcom/string peer; dbaron, back to
> you (or whoever you think appropriate).
> 
> ::: xpcom/string/src/nsReadableUtils.cpp
> @@ +1134,5 @@
> > +      return;
> > +    }
> > +
> > +    auto old_dest_length = aDest.Length();
> > +    aDest.GetMutableData(&dest, old_dest_length + length);
> 
> What happens if the addition here overflows?

Other functions in the same file are written in this way, so I think it should be ok.

> @@ +1137,5 @@
> > +    auto old_dest_length = aDest.Length();
> > +    aDest.GetMutableData(&dest, old_dest_length + length);
> > +    dest += old_dest_length;
> > +
> > +    for (int32_t i = length - 1; i >= 0; i--) {
> 
> Length of a string is a uint32_t; if the length exceeds 0x80000000, the loop
> here won't work. (Do we care about supporting such huge lengths in the
> string code, or do we assume string lengths always fit within int32_t?)

But if we change the type of i to uint32_t, it will become an infinite loop since unsigned intergers are never less than 0.
(In reply to Xidorn Quan from comment #29)
> > > +    auto old_dest_length = aDest.Length();
> > > +    aDest.GetMutableData(&dest, old_dest_length + length);
> > 
> > What happens if the addition here overflows?
> 
> Other functions in the same file are written in this way, so I think it
> should be ok.

Certainly it's not a concern for the specific use case (of 1-2 characters!) in this bug. So if that's an accepted pattern in the string code, I guess it's fine.

> > > +    for (int32_t i = length - 1; i >= 0; i--) {
> > 
> > Length of a string is a uint32_t; if the length exceeds 0x80000000, the loop
> > here won't work. (Do we care about supporting such huge lengths in the
> > string code, or do we assume string lengths always fit within int32_t?)
> 
> But if we change the type of i to uint32_t, it will become an infinite loop
> since unsigned intergers are never less than 0.

Right - so the loop would need to be adapted somewhat. But maybe we just don't even try to support such long strings correctly, in which case it'd be fine. Let's see what the string module guys think.
(In reply to Xidorn Quan from comment #21)
> 0.25em seems to be good. It makes the spacing nearly perfectly match the
> original fixed spacing in default font and font-size on my machine (8px
> roughly equals to 0.5em and space is about 0.25em), and the total spacing
> seems not to be too wide for the ideographic comma.
> 
> > If you use a font-dependent margin-end setting, will you still need to use a
> > multi-character suffix string ". " for the default style, or can we avoid
> > the bidi complication there?
> 
> I am not sure. I meant to move some of the spacing from margin-end to
> suffixes since ideographic comma has spacing itself but others do not. It is
> unrelated to font-dependence, and you can replace 0.25em with 4px here (the
> old value is 8px). It's ok for ideographic comma but might be too narrow for
> dot w/o space char.

I'm still not really comfortable with the use of a space character to provide (some of) the spacing between the item label and the list item itself.

The width of the space varies greatly between fonts; it's around 0.25em in a typical proportional, serif face such as Times, but is about 0.6em in several fixed-space/typewriter fonts I checked. This means that with such fonts, the label ends up much further away, and IMO that looks substantially worse than our current rendering.

I notice that Chrome does appear to behave this way, probably using the space character as a basis for the spacing here, at least in my testing on OS X.

Comparing the rendering of:

 data:text/html,
   <div style="font-family:serif;font-size:16px"><ol><li>one<li>two<li>three</div>
   <div style="font-family:monospace;font-size:16px"><ol><li>one<li>two<li>three

in Firefox and Chrome, I see that the "serif" lists look virtually identical; but the "monospace" list in Chrome has its labels much further to the left. I think the Firefox rendering looks better, so I'd be sad to lose that.

I wonder if there's some way we can avoid relying on the space character here, and find an alternative approach to control the space appropriately for the different styles? I'd like to think about this some more....

> Using font-dependence or not is another question. The current fixed spacing
> might be too narrow with large fonts but too wide with small fonts. You can
> adjust the font-size to 8px and 48px, and see the result.

Yes - I'm happy with allowing the space to be font-size-dependent (by expressing it in em rather than px units). I think that's strictly an improvement over the current behavior, and it shouldn't cause any significant disruption or controversy (especially as Chrome, for example, already behaves that way).
Attachment #827400 - Flags: review?(jfkthame) → review+
(In reply to Jonathan Kew (:jfkthame) from comment #31)
> (In reply to Xidorn Quan from comment #21)
> > 0.25em seems to be good. It makes the spacing nearly perfectly match the
> > original fixed spacing in default font and font-size on my machine (8px
> > roughly equals to 0.5em and space is about 0.25em), and the total spacing
> > seems not to be too wide for the ideographic comma.
> > 
> > > If you use a font-dependent margin-end setting, will you still need to use a
> > > multi-character suffix string ". " for the default style, or can we avoid
> > > the bidi complication there?
> > 
> > I am not sure. I meant to move some of the spacing from margin-end to
> > suffixes since ideographic comma has spacing itself but others do not. It is
> > unrelated to font-dependence, and you can replace 0.25em with 4px here (the
> > old value is 8px). It's ok for ideographic comma but might be too narrow for
> > dot w/o space char.
> 
> I'm still not really comfortable with the use of a space character to
> provide (some of) the spacing between the item label and the list item
> itself.
> 
> The width of the space varies greatly between fonts; it's around 0.25em in a
> typical proportional, serif face such as Times, but is about 0.6em in
> several fixed-space/typewriter fonts I checked. This means that with such
> fonts, the label ends up much further away, and IMO that looks substantially
> worse than our current rendering.
> 
> I notice that Chrome does appear to behave this way, probably using the
> space character as a basis for the spacing here, at least in my testing on
> OS X.
> 
> Comparing the rendering of:
> 
>  data:text/html,
>    <div
> style="font-family:serif;font-size:16px"><ol><li>one<li>two<li>three</div>
>    <div
> style="font-family:monospace;font-size:16px"><ol><li>one<li>two<li>three
> 
> in Firefox and Chrome, I see that the "serif" lists look virtually
> identical; but the "monospace" list in Chrome has its labels much further to
> the left. I think the Firefox rendering looks better, so I'd be sad to lose
> that.

Well, I agree with you. The spacing inside U+3001 has been 0.5em, which is in fact the current margin for bullet for the default font-size, and adding any extra margin will make it be less optimal. But for dot, it is necessary to append a 0.5em spacing.

> I wonder if there's some way we can avoid relying on the space character
> here, and find an alternative approach to control the space appropriately
> for the different styles? I'd like to think about this some more....

We may need to design and implement a much more complex mechanism for this problem.

The expected perfect result should be that, the renderer will recognize the width of right padding inside the suffix character, and produce a spacing make it be 0.5em in total. But it seems to be hard, if not impossible.

Are there any font specialists who can provide some suggestions?
(In reply to Jonathan Kew (:jfkthame) from comment #19)
> (In reply to Xidorn Quan from comment #14)
> > Maybe we do not need to check whether they are succeeded, just use the
> > suffix directly?
> 
> Hmm. Well, if CharListToText fails (i.e. falls back to decimal) then I don't
> think we'd want to use gCJKIdeographicSuffix as the suffix with it.
> 
> So I think you should still check the "success" of AppendCounterText, and
> reset the suffix to the default "." if it failed. Just remove the check for
> IsEmpty() here, and initialize the suffix string with the default when it's
> first created.

I rechecked the current draft of spec. In note of section 2, it says:

> This also implies that the prefix and suffix always come from the specified counter-style, even if the actual representation is constructed by a fallback style.

link: http://dev.w3.org/csswg/css-counter-styles-3/#counter-styles

So I may have to rollback my patch to the previous version. Nonetheless, I won't submit a new one until we make some progress on the spacing problem.
Another issue to keep in mind in the implementation here is how we will handle the behavior of counter styles defined with the @counter-style rule, when this is implemented (AFAIK, not yet). In this case, the suffix is provided as part of the @-rule, and is not expected to include a following space. So it will be necessary for the space to be produced by CSS margin (or some other mechanism), separately from the actual text of the suffix (if any).

Considering the various factors here, it looks like a "perfect" implementation is going to be quite tricky. To move forward, I think we should:

* change the existing margin-end from 8px to 0.5em, so that it respects the font size

* take your patch to customize the suffix instead of using hard-coded "." everywhere, but do not include trailing space in any of the suffix strings

* take your patch to implement the new counter styles, with "、" as suffix

This will provide the new functionality without regressing existing styles. The main problem at this point will be that the spacing after the CJK styles with "、" suffix is larger than we'd like.

To address this, a possible solution might be to apply the OpenType 'hwid' feature to the suffix string, if supported in the font. This should give improved rendering (without the excessive space) for fonts that support 'hwid'; for those that don't, we'd still have the larger space. IMO, that may be a reasonable compromise: basically, the rule is that the best typography is only achieved with full-featured fonts; older or lesser-quality fonts may give somewhat inferior results.

This would also work well in conjunction with @counter-style rules that might use a fullwidth CJK suffix such as "、" or "。", where mapping to the halfwidth "、" or "。" could reduce the excess white space.

However, AFAIK I don't think we currently have any architecture in place that will easily allow us to specify custom styling (the 'hwid' feature) just for the suffix of the counter string, which would be just one part of the content of the ::marker pseudo-element. And we certainly can't apply 'hwid' to the entire ::marker.

As a workaround for the time being, therefore, I come back to the suggestion from comment 12 of using the halfwidth ideographic comma U+FF64 in place of U+3001. While the glyph itself may not be identical in all fonts, I think it will normally be close enough; in many fonts the shape itself -is- identical, only the spacing is changed. Maybe you could try this, and compare the rendering across a number of fonts to see whether it would be an acceptable workaround for the time being, until a more sophisticated approach can be devised?
Comment on attachment 827382 [details] [diff] [review]
part 4 v2 - implement complex cjk counter styles

Review of attachment 827382 [details] [diff] [review]:
-----------------------------------------------------------------

This looks basically fine (I confess I didn't try to check all the data!) A few small notes below; in particular, please rearrange the CJK struct for more compact packing.

(Also, if you post an updated version here, my personal preference would be to have the tests split into a separate patch from the code changes.)

::: layout/generic/nsBulletFrame.cpp
@@ +790,1 @@
>  };

This struct ends up very inefficiently packed - on my (64-bit) build, for example, sizeof(CJKIdeographicData) is 56 bytes.

If you

(a) move the small lang and formal fields to the end of the struct, so they don't cause padding to be added for alignment before the pointer field

(b) declare lang as a uint8_t rather than the enum type

(c) omit the (never-used) zeroth element from unit[] and unit10K[], and adjust the index by -1 when accessing them

then its sizeof() drops to 40 bytes.

@@ +798,5 @@
> +  }, {
> +    0x0000, 0x5341, 0x767e, 0x5343
> +  }, {
> +    0x0000, 0x4e07, 0x5104
> +  }

And then please add comments to the fields in these initializers, as we don't have the actual field names any more. Something like

static const CJKIdeographicData gDataJapaneseInformal = {
  gJapaneseNegative,          // PRUnichar string for negative
  {                           // chars for digits 0-9
    0x3007, 0x4e00, 0x4e8c, 0x4e09, 0x56db,
    0x4e94, 0x516d, 0x4e03, 0x516b, 0x4e5d
  },
  { 0x5341, 0x767e, 0x5343 }, // units markers
  { 0x4e07, 0x5104 },         // unit10K markers
  JAPANESE,                   // language-specific handling
  false                       // formal?
};

@@ +933,5 @@
> +        if (!data.formal) {
> +          switch (data.lang) {
> +            case CHINESE:
> +              if (unitidx == 1 &&
> +                  (ordinal == 1 || (pos > 4 && ordinal % 1000 == 1))) {

Adding parens around (ordinal % 1000) here, like in the Korean case below, would be good for readability IMO (even though it's not necessary for operator-precedence reasons).

@@ +1265,1 @@
>      case NS_STYLE_LIST_STYLE_MOZ_TRAD_CHINESE_INFORMAL: 

I assume we plan on removing the -moz-prefixed versions of these styles eventually (though we should keep them available for a transition period, to give authors time to adapt).

So please file a followup bug about removing the -moz- versions of the styles for which you're implementing standard properties here.
Attachment #827382 - Flags: review?(jfkthame) → review+
(In reply to Jonathan Kew (:jfkthame) from comment #34)
> Another issue to keep in mind in the implementation here is how we will
> handle the behavior of counter styles defined with the @counter-style rule,
> when this is implemented (AFAIK, not yet). In this case, the suffix is
> provided as part of the @-rule, and is not expected to include a following
> space. So it will be necessary for the space to be produced by CSS margin
> (or some other mechanism), separately from the actual text of the suffix (if
> any).
> 
> Considering the various factors here, it looks like a "perfect"
> implementation is going to be quite tricky. To move forward, I think we
> should:
> 
> * change the existing margin-end from 8px to 0.5em, so that it respects the
> font size
> 
> * take your patch to customize the suffix instead of using hard-coded "."
> everywhere, but do not include trailing space in any of the suffix strings
> 
> * take your patch to implement the new counter styles, with "、" as suffix
> 
> This will provide the new functionality without regressing existing styles.
> The main problem at this point will be that the spacing after the CJK styles
> with "、" suffix is larger than we'd like.

I'd like to do the basic support as you mentioned above, and lay the spacing problem aside for now. And I think we can add ::marker support first (which seems to be identical with -moz-list-number or -moz-list-bullet), so that users can adjust the margin themselves if they think it is too wide. When we start work on @counter-style, we may have new considerations about this.

> To address this, a possible solution might be to apply the OpenType 'hwid'
> feature to the suffix string, if supported in the font. This should give
> improved rendering (without the excessive space) for fonts that support
> 'hwid'; for those that don't, we'd still have the larger space. IMO, that
> may be a reasonable compromise: basically, the rule is that the best
> typography is only achieved with full-featured fonts; older or
> lesser-quality fonts may give somewhat inferior results.
> 
> This would also work well in conjunction with @counter-style rules that
> might use a fullwidth CJK suffix such as "、" or "。", where mapping to the
> halfwidth "、" or "。" could reduce the excess white space.
> 
> However, AFAIK I don't think we currently have any architecture in place
> that will easily allow us to specify custom styling (the 'hwid' feature)
> just for the suffix of the counter string, which would be just one part of
> the content of the ::marker pseudo-element. And we certainly can't apply
> 'hwid' to the entire ::marker.

The problem may be even complex that what you mentioned here. Even if we are able to specify custom style for suffix, we cannot do so: there may be more than one characters in suffix of a custom counter style. We can only apply the style to the last character. And for such punctuations, IMHO, "halt" should be better than "hwid".

However, a bad news is that, I just tested such features in my system (Mac OS X 10.9), and the default Simplified Chinese font STHeiti has neither "hwid" nor "halt" support for "、", another font Hiragino Sans GB has only "halt" support for that character. A friend who is familiar with fonts told me it seems that most Chinese fonts do not support "halt", and most support of such features are from Japanese fonts.

> As a workaround for the time being, therefore, I come back to the suggestion
> from comment 12 of using the halfwidth ideographic comma U+FF64 in place of
> U+3001. While the glyph itself may not be identical in all fonts, I think it
> will normally be close enough; in many fonts the shape itself -is-
> identical, only the spacing is changed. Maybe you could try this, and
> compare the rendering across a number of fonts to see whether it would be an
> acceptable workaround for the time being, until a more sophisticated
> approach can be devised?

I would rather let it be too wide than use the halfwidth one as replacement for the time being. I'm using the default Chinese font, and U+FF64 is not as pretty as U+3001 for me. It's too tiny. I'll upload an screenshot of the comparison for reference.
Created attachment 828657 [details]
Screenshot of comparison between U+3001 and U+FF64
Hmm, yes, I agree that it doesn't look like U+FF64 is a good option. It also seems that it's not supported in many fonts, so then it falls back to a different font that may be stylistically quite wrong.

Another consideration, though, is that the glyph for U+3001 in some fonts is centered within the em-square, rather than being positioned to the left with >0.5em of space on its right-hand side.

E.g., compare the zh-cn and zh-hk renderings (on OS X, at least) for

data:text/html,<div style="font:120px sans-serif">
  <span lang="zh-cn" style="color:green;border:1px solid green">&%23x3001;</span><br>
  <span lang="zh-hk" style="color:red;border:1px solid red">&%23x3001;

Here, the "、" in the zh-hk content (similarly for zh-tw also) is centered in its square. A solution that assumes the U+3001 glyph always includes a half-em of whitespace as its right sidebearing could result in worse (too-close) spacing for this case.
(In reply to Jonathan Kew (:jfkthame) from comment #35)
> Comment on attachment 827382 [details] [diff] [review]
> part 4 v2 - implement complex cjk counter styles
> 
> Review of attachment 827382 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks basically fine (I confess I didn't try to check all the data!) A
> few small notes below; in particular, please rearrange the CJK struct for
> more compact packing.
> 
> (Also, if you post an updated version here, my personal preference would be
> to have the tests split into a separate patch from the code changes.)
> 
> ::: layout/generic/nsBulletFrame.cpp
> @@ +790,1 @@
> >  };
> 
> This struct ends up very inefficiently packed - on my (64-bit) build, for
> example, sizeof(CJKIdeographicData) is 56 bytes.
> 
> If you
> 
> (a) move the small lang and formal fields to the end of the struct, so they
> don't cause padding to be added for alignment before the pointer field
> 
> (b) declare lang as a uint8_t rather than the enum type
> 
> (c) omit the (never-used) zeroth element from unit[] and unit10K[], and
> adjust the index by -1 when accessing them
> 
> then its sizeof() drops to 40 bytes.

Does the space so important? It's only 14 bytes decrease per instance, and since they are global constants, there are only 14 * 9 = 126 bytes more. Moving lang and formal to the end is acceptable, but using uint8_t instead of enum type and omitting zeroth elements are over-optimization for me. Especially the latter which will increase both the complexity of code and number of instructions to execute.

> @@ +798,5 @@
> > +  }, {
> > +    0x0000, 0x5341, 0x767e, 0x5343
> > +  }, {
> > +    0x0000, 0x4e07, 0x5104
> > +  }
> 
> And then please add comments to the fields in these initializers, as we
> don't have the actual field names any more. Something like
> 
> static const CJKIdeographicData gDataJapaneseInformal = {
>   gJapaneseNegative,          // PRUnichar string for negative
>   {                           // chars for digits 0-9
>     0x3007, 0x4e00, 0x4e8c, 0x4e09, 0x56db,
>     0x4e94, 0x516d, 0x4e03, 0x516b, 0x4e5d
>   },
>   { 0x5341, 0x767e, 0x5343 }, // units markers
>   { 0x4e07, 0x5104 },         // unit10K markers
>   JAPANESE,                   // language-specific handling
>   false                       // formal?
> };

OK.

> @@ +933,5 @@
> > +        if (!data.formal) {
> > +          switch (data.lang) {
> > +            case CHINESE:
> > +              if (unitidx == 1 &&
> > +                  (ordinal == 1 || (pos > 4 && ordinal % 1000 == 1))) {
> 
> Adding parens around (ordinal % 1000) here, like in the Korean case below,
> would be good for readability IMO (even though it's not necessary for
> operator-precedence reasons).

OK.

> @@ +1265,1 @@
> >      case NS_STYLE_LIST_STYLE_MOZ_TRAD_CHINESE_INFORMAL: 
> 
> I assume we plan on removing the -moz-prefixed versions of these styles
> eventually (though we should keep them available for a transition period, to
> give authors time to adapt).
> 
> So please file a followup bug about removing the -moz- versions of the
> styles for which you're implementing standard properties here.

In fact I'm wondering whether I should implement them without -moz- prefix. Since the Counter Style Level 3 has not been CR, it might be better to add prefix for all those things... Hope the spec would become CR before my patches get merged :)
(In reply to Xidorn Quan from comment #39)
> (In reply to Jonathan Kew (:jfkthame) from comment #35)
> > Comment on attachment 827382 [details] [diff] [review]
> > part 4 v2 - implement complex cjk counter styles
> > 
> > Review of attachment 827382 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > This looks basically fine (I confess I didn't try to check all the data!) A
> > few small notes below; in particular, please rearrange the CJK struct for
> > more compact packing.
> > 
> > (Also, if you post an updated version here, my personal preference would be
> > to have the tests split into a separate patch from the code changes.)
> > 
> > ::: layout/generic/nsBulletFrame.cpp
> > @@ +790,1 @@
> > >  };
> > 
> > This struct ends up very inefficiently packed - on my (64-bit) build, for
> > example, sizeof(CJKIdeographicData) is 56 bytes.
> > 
> > If you
> > 
> > (a) move the small lang and formal fields to the end of the struct, so they
> > don't cause padding to be added for alignment before the pointer field
> > 
> > (b) declare lang as a uint8_t rather than the enum type
> > 
> > (c) omit the (never-used) zeroth element from unit[] and unit10K[], and
> > adjust the index by -1 when accessing them
> > 
> > then its sizeof() drops to 40 bytes.
> 
> Does the space so important? It's only 14 bytes decrease per instance, and
> since they are global constants, there are only 14 * 9 = 126 bytes more.

(16 * 9 = 144, actually.) It's not much, I know, but as a general rule I think we should be careful to avoid needlessly wasting space. If we keep ignoring the waste of 100 bytes here and there, across dozens of patches, it soon starts adding up to kilobytes... which might not matter on a typical desktop, but on a small B2G device every byte is precious.

> Moving lang and formal to the end is acceptable, but using uint8_t instead
> of enum type

Without the enum->uint8_t change, the enum gets allocated 4 bytes, even though it only has three possible values; this prevents it being packed into the same word with either the bool or the preceding PRUnichar.

> and omitting zeroth elements are over-optimization for me.
> Especially the latter which will increase both the complexity of code and
> number of instructions to execute.

Not significantly; it just requires an added "- 1" in the lines that use data.unit10K[] and data.unit[], which occurs in just one place for each field, IIRC. (And the compiler ought to be able to optimize this away at compile time anyway.)

So I think reducing the size of the global data here by 144 bytes (which will all need to be initialized during loading, even if most of it is never referenced) is worth the slight fiddling with the code.


> In fact I'm wondering whether I should implement them without -moz- prefix.
> Since the Counter Style Level 3 has not been CR, it might be better to add
> prefix for all those things... Hope the spec would become CR before my
> patches get merged :)

Good point. The patches are really close to being ready to land, IMO, but we should check on the status of that.
Attachment #827378 - Attachment is obsolete: true
Attachment #827378 - Flags: review?(dbaron)
Created attachment 829235 [details] [diff] [review]
part 1 v4 - bullet suffix

This new version creates a new function for getting suffix, and appends a inner padding for builtin suffixes which have no spacing inside.

Though I think this is the best method up to now, it still have some problems. The most outstanding one is that it will be hard for web developers to remove the spacing. I tried to append the spacing only when padding of bullet is not set explicitly, but I failed.

Any suggestion?
Attachment #827379 - Attachment is obsolete: true
Attachment #827379 - Flags: review?(jfkthame)
Attachment #829235 - Flags: review?(jfkthame)
Created attachment 829236 [details] [diff] [review]
part 2 v4 - implement cjk-decimal
Created attachment 829240 [details] [diff] [review]
part 3 v3 - implement complex cjk counter styles
Attachment #827382 - Attachment is obsolete: true
Attachment #827400 - Attachment is obsolete: true
Created attachment 829241 [details] [diff] [review]
part 4 - reftests
Attachment #829241 - Flags: review?(jfkthame)
Attachment #829236 - Flags: review?(jfkthame)
Attachment #829240 - Flags: review?(jfkthame)
Note that, this implementation is still slightly different with the spec draft.

The difference and reasons are explained in the following mails:
http://lists.w3.org/Archives/Public/www-style/2013Nov/0101.html
http://lists.w3.org/Archives/Public/www-style/2013Nov/0102.html
(In reply to Jonathan Kew (:jfkthame) from comment #40)
> (In reply to Xidorn Quan from comment #39)
> > In fact I'm wondering whether I should implement them without -moz- prefix.
> > Since the Counter Style Level 3 has not been CR, it might be better to add
> > prefix for all those things... Hope the spec would become CR before my
> > patches get merged :)
> 
> Good point. The patches are really close to being ready to land, IMO, but we
> should check on the status of that.

I saw the mail. Since the editor said that he would request CR after TPAC which will end after a week, I think it should not be a problem to use the unprefixed version.
(In reply to Xidorn Quan from comment #41)
> Created attachment 829235 [details] [diff] [review]
> part 1 v4 - bullet suffix
> 
> This new version creates a new function for getting suffix, and appends a
> inner padding for builtin suffixes which have no spacing inside.
> 
> Though I think this is the best method up to now, it still have some
> problems. The most outstanding one is that it will be hard for web
> developers to remove the spacing.

Also, the result is not good if the CJK font being used has a "centered" rather than "left-aligned" glyph for U+3001. This may depend on many factors: platform, default font settings, fonts specified in CSS, language, ..., but at least on OS X, I'm seeing a centered form for U+3001 when the page is tagged as zh-hk or zh-tw, while the fonts used for zh-cn and ja both have left-aligned forms.
Created attachment 831632 [details]
Lists tagged with various languages

Testcase for list items displayed under various language settings.

In current Firefox, the list items all have "." and padding, and look OK (although "." isn't really the right suffix to be using).

With the current bullet-suffix patch providing U+3001 as suffix, the zh-cn and ja cases look OK on OS X, but the zh-hk and zh-tw cases don't, as there's insufficient space separating the (centered) comma from the actual text of the item.

I think you should revert to using 0.5em of margin in the CSS to provide the standard spacing, and drop the "smart padding" attempt, as it doesn't really know what the suffix looks like - in addition to the problem that it would be difficult for authors to customize or override.

This will give spacing that is arguably wider than necessary after 、 in the zh-cn and ja cases, but that's better than allowing it to become excessively narrow.
(In reply to Jonathan Kew (:jfkthame) from comment #47)
> (In reply to Xidorn Quan from comment #41)
> > Created attachment 829235 [details] [diff] [review]
> > part 1 v4 - bullet suffix
> > 
> > This new version creates a new function for getting suffix, and appends a
> > inner padding for builtin suffixes which have no spacing inside.
> > 
> > Though I think this is the best method up to now, it still have some
> > problems. The most outstanding one is that it will be hard for web
> > developers to remove the spacing.
> 
> Also, the result is not good if the CJK font being used has a "centered"
> rather than "left-aligned" glyph for U+3001. This may depend on many
> factors: platform, default font settings, fonts specified in CSS, language,
> ..., but at least on OS X, I'm seeing a centered form for U+3001 when the
> page is tagged as zh-hk or zh-tw, while the fonts used for zh-cn and ja both
> have left-aligned forms.

No, the result is good even if U+3001 is a centered glyph. In Chinese, whether in zh-cn or zh-tw, we hardly, if not never, use space between fullwidth characters. For reference, this is the official version of constitution of ROC, which is certainly written in zh-tw: http://www.na.gov.tw/ch/law/LawAllView.jsp?itemid=8 . You can find that the ideographic comma is used without any space around.

So the only problem of the "smart padding" is the difficulty of customizing, which is a common problem of all our past solution, too.
In addition, I think we can defer the solving of the customizing problem until we implement the standard ::marker pesudo-element defined in CSS Lists and Counters Module Lever 3. Before that, authors have no standard method to change the padding & margin of bullets.
(In reply to Xidorn Quan from comment #49)
> (In reply to Jonathan Kew (:jfkthame) from comment #47)
> > (In reply to Xidorn Quan from comment #41)
> > > Created attachment 829235 [details] [diff] [review]
> > > part 1 v4 - bullet suffix
> > > 
> > > This new version creates a new function for getting suffix, and appends a
> > > inner padding for builtin suffixes which have no spacing inside.
> > > 
> > > Though I think this is the best method up to now, it still have some
> > > problems. The most outstanding one is that it will be hard for web
> > > developers to remove the spacing.
> > 
> > Also, the result is not good if the CJK font being used has a "centered"
> > rather than "left-aligned" glyph for U+3001. This may depend on many
> > factors: platform, default font settings, fonts specified in CSS, language,
> > ..., but at least on OS X, I'm seeing a centered form for U+3001 when the
> > page is tagged as zh-hk or zh-tw, while the fonts used for zh-cn and ja both
> > have left-aligned forms.
> 
> No, the result is good even if U+3001 is a centered glyph. In Chinese,
> whether in zh-cn or zh-tw, we hardly, if not never, use space between
> fullwidth characters. For reference, this is the official version of
> constitution of ROC, which is certainly written in zh-tw:
> http://www.na.gov.tw/ch/law/LawAllView.jsp?itemid=8 . You can find that the
> ideographic comma is used without any space around.

Yes, within a line of text that's how it is used. The glyph sits within a standard character-sized square, whether it is centered in that square or placed to the left, with more space on the right; and no additional space is expected within running text.

However, when it is part of a list item marker, my assumption (which may be faulty!) is that there should be some *additional* space separating the marker from the beginning of the item text itself.

When I view attachment 831632 [details] on OS X, it seems to me that the zh-hk and zh-tw lists really need some additional space between marker and text. I'm not sure something like http://www.na.gov.tw/ch/law/LawAllView.jsp?itemid=8 is a good comparison, as the "lists" there are not really formatted as such; they're just simple paragraphs/lines of text that happen to begin with numbers. This is particularly obvious when the numbers become multi-character: they are not right-aligned like list item markers would be, but rather run-in to the line of text.
(In reply to Jonathan Kew (:jfkthame) from comment #51)
>
> However, when it is part of a list item marker, my assumption (which may be
> faulty!) is that there should be some *additional* space separating the
> marker from the beginning of the item text itself.
> 
> When I view attachment 831632 [details] on OS X, it seems to me that the
> zh-hk and zh-tw lists really need some additional space between marker and
> text. I'm not sure something like
> http://www.na.gov.tw/ch/law/LawAllView.jsp?itemid=8 is a good comparison, as
> the "lists" there are not really formatted as such; they're just simple
> paragraphs/lines of text that happen to begin with numbers. This is
> particularly obvious when the numbers become multi-character: they are not
> right-aligned like list item markers would be, but rather run-in to the line
> of text.

Without the support from typesetting system, no one will right-align list item markers manually, so I can never find an example which matches your requirement here perfectly. But you can hardly find one for your standpoint as well.

The usage of punctuation in lists in such text can reflect the usual practice. e.g. You will never write "1.something" manually anywhere but "1. something". Similarly, either in zh-cn or zh-tw, we always use the ideographic comma directly to connect the number and text without space around.
OK, let's not worry further about that for now. I'll accept a solution that includes the margin/padding for non-CJK markers, and omits it for the CJK ones with U+3001 as suffix, without regard for whether the U+3001 glyph is a "centered" or "left" form.

One remaining point, then: if possible, I'd like to keep the spacing that's added (for non-CJK suffixes) in the ua.css file, rather than hard-coding it in nsBulletFrame.

Maybe we can have it defined as margin-end:0.5em (or padding-end, I'm not sure which is better) in ua.css, but let GetListItemSuffix return a bool indicating whether to actually use the margin. So we'd have

  /* static */ bool
  nsBulletFrame::GetListItemSuffix(int32_t aListStyleType,
                                   nsString& aResult)
  {
    switch (aListStyleType) {
      case NS_STYLE_LIST_STYLE_NONE: // etc
        return true;
      case NS_STYLE_LIST_STYLE_CJK_DECIMAL: // etc
        aResult = 0x3001;
        return false;
      case NS_STYLE_LIST_STYLE_KOREAN_HANGUL_FORMAL: // etc
        aResult = ',';
        return true;
      default:
        aResult = '.';
        return true;
    }
  }

and then GetDesiredSize will use the return value to decide whether to let the padding (or margin?) from CSS be used as-is, or to suppress it.

For extra points, perhaps we should do this "magic" of suppressing the padding for CJK counter styles only if it's the default padding from ua.css; if the author has explicitly provided padding, we should assume it's deliberate and not try to change it. It might be possible to use nsPresContext::HasAuthorSpecifiedRules() to control this.
Created attachment 833293 [details] [diff] [review]
part 1 v5 - bullet suffix
Attachment #829235 - Attachment is obsolete: true
Attachment #829236 - Attachment is obsolete: true
Attachment #829240 - Attachment is obsolete: true
Attachment #829235 - Flags: review?(jfkthame)
Attachment #829236 - Flags: review?(jfkthame)
Attachment #829240 - Flags: review?(jfkthame)
Attachment #833293 - Flags: review?(jfkthame)
Created attachment 833294 [details] [diff] [review]
part 2 v5 - implement cjk-decimal
Attachment #833294 - Flags: review?(jfkthame)
Created attachment 833295 [details] [diff] [review]
part 3 v4 - implement complex cjk counter styles
Attachment #833295 - Flags: review?(jfkthame)
Created attachment 833296 [details] [diff] [review]
part 4 v2 - reftests
Attachment #829241 - Attachment is obsolete: true
Attachment #829241 - Flags: review?(jfkthame)
Attachment #833296 - Flags: review?(jfkthame)
Comment on attachment 833293 [details] [diff] [review]
part 1 v5 - bullet suffix

Review of attachment 833293 [details] [diff] [review]:
-----------------------------------------------------------------

Found some problems myself after submission. I'll fix them in the next version of patches.

::: layout/base/nsCounterManager.cpp
@@ +85,5 @@
>  
>      for (uint32_t i = stack.Length() - 1;; --i) {
>          nsCounterNode *n = stack[i];
> +        bool isTextRTL;
> +        nsAutoString suffix;

Oops, forgot to remove this unused variable.

::: layout/generic/nsBulletFrame.cpp
@@ +1309,5 @@
> +    case NS_STYLE_LIST_STYLE_MOZ_TRAD_CHINESE_FORMAL: 
> +    case NS_STYLE_LIST_STYLE_MOZ_SIMP_CHINESE_INFORMAL: 
> +    case NS_STYLE_LIST_STYLE_MOZ_SIMP_CHINESE_FORMAL: 
> +    case NS_STYLE_LIST_STYLE_MOZ_JAPANESE_INFORMAL: 
> +    case NS_STYLE_LIST_STYLE_MOZ_JAPANESE_FORMAL: 

Tailing spaces.
Comment on attachment 833293 [details] [diff] [review]
part 1 v5 - bullet suffix

Review of attachment 833293 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, thanks; appears to handle the various issues as well as we can reasonably manage. I noticed one small issue to fix, and it'd be good to have some comments added explaining what's going on with padding and the mSuppressPadding flag here, to help the next person reading this code.

r=jfkthame with these tweaks.

::: layout/generic/nsBulletFrame.cpp
@@ +1485,3 @@
>    SetFontSizeInflation(inflation);
>  
>    // Get the base size

Add to the comment here, something like: "This will also set mSuppressPadding appropriately (via GetListItemText()) for the CJK counter styles where the default padding from ua.css is not desired."

@@ +1499,5 @@
> +    mPadding.left += borderPadding.left;
> +  }
> +  aMetrics.width += mPadding.left + mPadding.right;
> +  aMetrics.height += mPadding.top + mPadding.bottom;
> +  aMetrics.ascent += mPadding.top + mPadding.bottom;

I don't think you meant to include mPadding.bottom in aMetrics.ascent here; that will misalign the number and item text, if top/bottom padding is used on the .

::: layout/generic/nsBulletFrame.h
@@ +123,5 @@
>  
>    nsSize mIntrinsicSize;
>    int32_t mOrdinal;
>    bool mTextIsRTL;
> +  bool mSuppressPadding;

Please include a comment explaining what this is for; something along the lines of "If set to true, any padding defined by the UA style sheet will be suppressed; this is used for CJK numbering styles where added space after the number is not desired. Any author-specified padding overriding the UA style sheet will NOT be suppressed, however."

::: layout/style/ua.css
@@ +79,5 @@
>  
>  *|*::-moz-list-bullet, *|*::-moz-list-number {
>    display: inline;
>    vertical-align: baseline;
> +  -moz-padding-end: 0.5em;

Also add a comment here:
/* Note that padding is suppressed for CJK numbering styles; see bug 934072 */
Attachment #833293 - Flags: review?(jfkthame) → review+
Attachment #833294 - Flags: review?(jfkthame) → review+
Comment on attachment 833295 [details] [diff] [review]
part 3 v4 - implement complex cjk counter styles

Review of attachment 833295 [details] [diff] [review]:
-----------------------------------------------------------------

One small suggestion (you can take it or leave it). r=jfkthame, anyhow.

::: layout/generic/nsBulletFrame.cpp
@@ +782,5 @@
> +  PRUnichar digit[10];
> +  PRUnichar unit[3];
> +  PRUnichar unit10K[2];
> +  uint8_t lang;
> +  bool formal;

What would you think of inverting this to be an 'informal' flag, rather than 'formal', given that it's only the informal case that has a block of special-case code? IMO it seems more natural to treat that as the marked option with a special flag.

I'll leave it up to you whether to make that swap, though; it doesn't really affect anything, just my personal preference for how the code reads.

@@ +906,5 @@
> +  bool isNegative = (ordinal < 0);
> +  bool needZero = (ordinal == 0);
> +  int32_t unitidx = 0, unit10Kidx = 0;
> +  if (isNegative)
> +    ordinal = -ordinal;

nit: braces around the if-statement body, please.
Attachment #833295 - Flags: review?(jfkthame) → review+
Comment on attachment 833296 [details] [diff] [review]
part 4 v2 - reftests

Review of attachment 833296 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine (I confess haven't attempted to manually verify all the numbers!)

As an extra, can we come up with reftests for the default -moz-padding-end:0.5em, the suppression of this padding in CJK styles, and overriding it in the author style?
Attachment #833296 - Flags: review?(jfkthame) → review+
Based on Tab's response[1] to my recent question, I think it'd be OK for us to go ahead and land these new counter styles, in the expectation that the spec will move to CR by the time this reaches our release channels.

David, are you comfortable with that, or do you think we should hold back until we see the spec actually progress?

[1] http://lists.w3.org/Archives/Public/www-style/2013Nov/0094.html
Flags: needinfo?(dbaron)
There is one more thing which might prevent these patches from being landed. As I mentioned in comment 45, this implementation does not perfectly match the current draft. I have proposed the changes, but they haven't made a decision. It might be better to wait for their response of the issue.

http://dev.w3.org/csswg/css-counter-styles/issues-lc-20130718.html#issue-11
Jonathan, what flags should I set for the modified patches which you have granted?
Created attachment 833341 [details] [diff] [review]
part 4 v3 - reftests

This new version improves the reftest of counter suffix.
Attachment #833296 - Attachment is obsolete: true
Attachment #833341 - Flags: review?(jfkthame)
Comment on attachment 833341 [details] [diff] [review]
part 4 v3 - reftests

Review of attachment 833341 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, thanks! (Do you have tryserver access, to check that the tests pass across all platforms? I think it's time we should do that - if you haven't, I'll push a try job for these patches when the trees are open again.)
Attachment #833341 - Flags: review?(jfkthame) → review+
(In reply to Xidorn Quan from comment #64)
> Jonathan, what flags should I set for the modified patches which you have
> granted?

If you're just fixing a few code-style issues, or making changes that were requested in an earlier review that gave "r+ with such-and-such things fixed", then feel free to set the flag to r+ directly, and mention "carrying forward r=jfkthame" when attaching the new patch.

If there are substantial modifications that you think I should look at, then setting r? again for the updated patch is appropriate; a comment mentioning what's been changed is also helpful. (As you've already done in earlier revisions - thanks.)
(In reply to Jonathan Kew (:jfkthame) from comment #62)
> Based on Tab's response[1] to my recent question, I think it'd be OK for us
> to go ahead and land these new counter styles, in the expectation that the
> spec will move to CR by the time this reaches our release channels.
> 
> David, are you comfortable with that, or do you think we should hold back
> until we see the spec actually progress?
> 
> [1] http://lists.w3.org/Archives/Public/www-style/2013Nov/0094.html

Sounds fine with me.
Flags: needinfo?(dbaron)
(In reply to Jonathan Kew (:jfkthame) from comment #66)
> Comment on attachment 833341 [details] [diff] [review]
> part 4 v3 - reftests
> 
> Review of attachment 833341 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good, thanks! (Do you have tryserver access, to check that the
> tests pass across all platforms? I think it's time we should do that - if
> you haven't, I'll push a try job for these patches when the trees are open
> again.)

I don't have the access, but I'd like to apply for it. I would appreciate it if you could be my voucher in bug 939473.
Created attachment 833381 [details] [diff] [review]
part 1 v6 - bullet suffix, r=jfkthame
Attachment #833293 - Attachment is obsolete: true
Attachment #833381 - Flags: review+
Created attachment 833382 [details] [diff] [review]
part 2 v5 - implement cjk-decimal, r=jfkthame
Attachment #833294 - Attachment is obsolete: true
Attachment #833382 - Flags: review+
Created attachment 833383 [details] [diff] [review]
part 3 v5 - implement complex cjk counter styles, r=jfkthame
Attachment #833295 - Attachment is obsolete: true
Attachment #833383 - Flags: review+
Tryserver run with current patches:
https://tbpl.mozilla.org/?tree=Try&rev=e051fc0cd612
Oops, it seems that this patchset makes some other reftests fail, and the reftest counter-suffix can not be passed in many platforms.
(In reply to Xidorn Quan from comment #74)
> Oops, it seems that this patchset makes some other reftests fail

That seems likely to be because the tests need to be fixed to match the new behavior.
In the counter-suffix test, we're seeing a line-spacing difference between testcase and reference for some reason; this may be easy to fix once we confirm exactly what's causing it.

The failure in the two font-inflation-related tests (bullet-1.html and list-1.html) looks to me like it may be a real problem; I suspect this is related to how padding is being handled, and we may need to reconsider that. But I haven't investigated in any depth yet...
(In reply to Jonathan Kew (:jfkthame) from comment #76)
> The failure in the two font-inflation-related tests (bullet-1.html and
> list-1.html) looks to me like it may be a real problem; I suspect this is
> related to how padding is being handled, and we may need to reconsider that.
> But I haven't investigated in any depth yet...

Right it is related to how padding is handled. I didn't take the font inflation into account when I wrote the code. The problem might be that we use padding, which is handled inside nsBulletFrame by some fragile code, instead of margin which is handled in nsBlockFrame. In other words, it is not really a bug introduced by my patch, but an undiscovered old bug.
Created attachment 833453 [details] [diff] [review]
part 1 v7 - bullet suffix

This patch fixes the font inflation problem.
Attachment #833381 - Attachment is obsolete: true
Attachment #833453 - Flags: review?(jfkthame)
(In reply to Jonathan Kew (:jfkthame) from comment #76)
> In the counter-suffix test, we're seeing a line-spacing difference between
> testcase and reference for some reason; this may be easy to fix once we
> confirm exactly what's causing it.

I have found the reason for the failure of the counter-suffix test. I don't know whether it is a bug or not.

The reason is that, the height (and ascent, maybe) of bullet will affect the height of content box of li element, and the height of bullet is set to the max height of the given font metrics, which may be larger than the given line-height in some fonts.

Simply increasing the line-height in counter-suffix test could fix the failure here, and I will do that. However, I think the behavior I mentioned above should be reconsider. If you agree that it is a bug, then we may create a new bug for it.
Created attachment 833473 [details] [diff] [review]
part 4 v4 - reftests

Increase the line-height in counter-suffix to 150%.
Attachment #833341 - Attachment is obsolete: true
Attachment #833473 - Flags: review?(jfkthame)
I've started a new tryserver run with these fixes:
https://tbpl.mozilla.org/?tree=Try&rev=53194ffb1111

I haven't really looked at the actual changes yet, but will try to do that shortly.

(By the way, if you add the bug number at the beginning of the commit messages when creating patches, that's one less thing to handle for whoever eventually checks them in.)
(In reply to Xidorn Quan from comment #79)
> (In reply to Jonathan Kew (:jfkthame) from comment #76)
> > In the counter-suffix test, we're seeing a line-spacing difference between
> > testcase and reference for some reason; this may be easy to fix once we
> > confirm exactly what's causing it.
> 
> I have found the reason for the failure of the counter-suffix test. I don't
> know whether it is a bug or not.
> 
> The reason is that, the height (and ascent, maybe) of bullet will affect the
> height of content box of li element, and the height of bullet is set to the
> max height of the given font metrics, which may be larger than the given
> line-height in some fonts.
> 
> Simply increasing the line-height in counter-suffix test could fix the
> failure here, and I will do that. However, I think the behavior I mentioned
> above should be reconsider. If you agree that it is a bug, then we may
> create a new bug for it.

The behavior does seem a bit surprising, but it's not directly related to the changes you're making here, so yes - filing a new bug would be best, I think. Then we can examine it more carefully and decide whether to make a change.

(In most cases, I expect it doesn't make any difference anyhow. The issue showed up because in the earlier suffix-test patch, you had explicitly set line-height to 100%, which is tighter than the usual default. But it turns out that was a good thing, as it made us aware of this other issue!)

For the purposes of the test here, simply setting the line-height larger is fine. Tests all look green on tryserver now. :)
Attachment #833453 - Flags: review?(jfkthame) → review+
Attachment #833473 - Flags: review?(jfkthame) → review+
Let's hold this for a few days to see if there's any further feedback from the CSS or HTML groups regarding the informal CJK styles; if nothing more comes up, I think we can land these patches soon. Thanks for your patience as we've worked through the various issues here!
Created attachment 8333660 [details] [diff] [review]
part 4 v5 - reftests, r=jfkthame

Add one more line in reftests which reflects the change I proposed for Chinese informal styles.
Attachment #833473 - Attachment is obsolete: true
Attachment #8333660 - Flags: review+
Created attachment 8340758 [details] [diff] [review]
part 1 v7 - bullet suffix, r=jfkthame

Prepend bug number to commit messages.
Attachment #833453 - Attachment is obsolete: true
Attachment #8340758 - Flags: review+
Created attachment 8340759 [details] [diff] [review]
part 2 v5 - implement cjk-decimal, r=jfkthame
Attachment #833382 - Attachment is obsolete: true
Attachment #8340759 - Flags: review+
Created attachment 8340760 [details] [diff] [review]
part 3 v5 - implement complex cjk counter styles, r=jfkthame
Attachment #833383 - Attachment is obsolete: true
Attachment #8340760 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/0ec4cc06f49f
https://hg.mozilla.org/integration/mozilla-inbound/rev/7580aeb129c6
https://hg.mozilla.org/integration/mozilla-inbound/rev/b88dfa4ac80c
https://hg.mozilla.org/integration/mozilla-inbound/rev/96a46dfbc2c1
Flags: in-testsuite+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/0ec4cc06f49f
https://hg.mozilla.org/mozilla-central/rev/7580aeb129c6
https://hg.mozilla.org/mozilla-central/rev/b88dfa4ac80c
https://hg.mozilla.org/mozilla-central/rev/96a46dfbc2c1
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Marking dev-doc-needed, as this adds a bunch of new values (without a -moz- prefix) for the list-style-type property[1] and the counter() function[2]; see also [3].

[1] https://developer.mozilla.org/en-US/docs/Web/CSS/list-style-type
[2] https://developer.mozilla.org/en-US/docs/Web/Guide/CSS/Counters
[3] http://dev.w3.org/csswg/css-counter-styles/#complex-predefined-counters
Keywords: dev-doc-needed
Flags: sec-review?(jruderman)

Comment 91

4 years ago
I added these list-style-types to the DOM fuzzer manually, but I'd prefer for property_database.js to be updated (bug 946895) so the DOM fuzzer can pick them up automatically.
Flags: sec-review?(jruderman) → sec-review+
These two documents have been updated:
https://developer.mozilla.org/en-US/docs/Web/CSS/list-style-type
and
https://developer.mozilla.org/en-US/Firefox/Releases/28
Keywords: dev-doc-needed → dev-doc-complete
(In reply to Jean-Yves Perrier [:teoli] from comment #92)
> These two documents have been updated:
> https://developer.mozilla.org/en-US/docs/Web/CSS/list-style-type
> and
> https://developer.mozilla.org/en-US/Firefox/Releases/28

Well, why did you think these new types has been supported by WebKit browsers... Firefox is the first browser which supports them!
Flags: needinfo?(jypenator)
You are right. My fault, I used the CompatVersionUnknown macro instead of the CompatUnknown. Somobody fixed it in-between so all should be ok right now. Thank you for the notice!
Flags: needinfo?(jypenator)

Updated

4 years ago
Depends on: 964900

Updated

4 years ago
Depends on: 964078
Well, in the end, the status of the spec is still unchanged. Hope it will go to CR before bug 966166 is landed in release :)
Depends on: 990318
Depends on: 1142400
You need to log in before you can comment on or make changes to this bug.