text-transform:capitalize behavior is incorrect per CSS3-Text spec

RESOLVED FIXED in mozilla14

Status

()

Core
Layout: Text
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: jfkthame, Assigned: jfkthame)

Tracking

Trunk
mozilla14
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Assignee)

Description

6 years ago
The method nsContentUtils::IsPunctuationMark is misleadingly named, because it is designed specifically to support CSS first-letter, but for this purpose it recognizes a specific _subset_ of punctuation marks, not all Unicode characters with GC=P*.

We should rename it to something like IsPunctuationForFirstLetter to make its specialized nature clear.

Currently, there appear to be two clients of this method: the implementation of first-letter, for which it does the right thing, and the implementation of text-transform:capitalize. In the latter case, it leads to the oddity shown by this example:

data:text/html;charset=utf-8,<div style="text-transform:capitalize">(this) "is" [a] –short– -test- «for» *the* ¿capitalize? ¡transform!

where Firefox does _not_ capitalize the words "short" or "test". Safari, OTOH, does capitalize them.

(The current CSS3 Text spec says that "capitalize" should put "the first _letter_ of each word in titlecase", where "letter" is defined as "a character belonging to one of the Letter or Number general categories". According to this, Safari is right and Firefox is wrong. Using a definition of IsPunctuationMark that recognizes all GC=P* characters would be an improvement, though )

So I think we should (a) modify the implementation of capitalize to avoid relying on nsContentUtils::IsPunctuationMark, as that is not the correct test; and (b) change the name of that method to clarify its specialized nature.
(Assignee)

Comment 1

6 years ago
(In reply to Jonathan Kew (:jfkthame) from comment #0)
> Using a definition
> of IsPunctuationMark that recognizes all GC=P* characters would be an
> improvement, though )

Oops, unfinished sentence! I was going to say that while a more complete IsPunctuationMark would be an improvement, it still ignores characters with Symbol classes; to follow CSS3 Text, we should be testing for !IsAlphanumeric rather than IsPunctuation to skip over leading characters.
(Assignee)

Comment 2

6 years ago
Created attachment 601574 [details] [diff] [review]
patch, rename IsPunctuationMark and don't use it in text-transform:capitalize
Assignee: nobody → jfkthame
Attachment #601574 - Flags: review?
(Assignee)

Updated

6 years ago
Attachment #601574 - Flags: review? → review?(smontagu)
(Assignee)

Comment 3

6 years ago
Created attachment 601575 [details] [diff] [review]
patch, reftest for text-transform:capitalize with punctuation

It's not an exhaustive test, obviously, but includes a few cases that fail currently and are fixed by the patch here.
Attachment #601575 - Flags: review?(smontagu)
(Assignee)

Comment 4

6 years ago
It turns out the patch here causes the text-transform/all-title.html reftest to fail, mostly because it changes behavior with various characters in the Symbol category (e.g. the circled letters at U+24Dx). So we'll need to consider carefully what changes are or aren't desirable - and perhaps take this up with the CSS WG.
Looking at bug 389707 comment 4, using the same punctuation classes in capitalization as in first-letter was by design. I think we should discuss this in www-style before making any change to behaviour here. I agree with the naming change, though, and in fact I was going to make that change anyway before checking in bug 731222 in response to your review comment there.
Comment on attachment 601574 [details] [diff] [review]
patch, rename IsPunctuationMark and don't use it in text-transform:capitalize

Review of attachment 601574 [details] [diff] [review]:
-----------------------------------------------------------------

r- until we get the spec sorted out.
Attachment #601574 - Flags: review?(smontagu) → review-
Comment on attachment 601575 [details] [diff] [review]
patch, reftest for text-transform:capitalize with punctuation

Review of attachment 601575 [details] [diff] [review]:
-----------------------------------------------------------------

... but feel free to check this in as a failing test, mutatis mutandis, in the meanwhile.
Attachment #601575 - Flags: review?(smontagu) → review+
(Assignee)

Comment 8

6 years ago
(In reply to Simon Montagu from comment #6)
> Comment on attachment 601574 [details] [diff] [review]
> patch, rename IsPunctuationMark and don't use it in text-transform:capitalize
> 
> Review of attachment 601574 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r- until we get the spec sorted out.

Yep, fair enough!

There was a lengthy www-style thread on text-transform:capitalize a year ago, actually, which I've been re-reading. It's not clear to me that there's much point in re-opening it - there are plenty of cases where no spec or algorithm will be 100% "right", and there's little interest in ratholing about edge cases that can never be fully solved - but I've posted a message basically asking for confirmation that the spec as it currently stands is what we should aim to implement, particularly with regard to the areas where we'll need to modify our current behavior.
(Assignee)

Comment 9

6 years ago
Seeing no dissent - or indeed any interest in more discussion on the topic! - in response to http://lists.w3.org/Archives/Public/www-style/2012Mar/0018.html, I think we should simply adjust our implementation to conform to the current css3-text spec, making it capitalize the first alphanumeric character rather than the first character that is not in specific (first-letter) punctuation categories.

This provides a more intuitive result for authors, as well as better matching what the other browsers are doing (though none of them currently follow the spec entirely, as described in the www-style message).
Summary: text-transform:capitalize behavior is incorrect, partly because name of nsContentUtils::IsPunctuationMark is misleading → text-transform:capitalize behavior is incorrect per CSS3-Text spec
(Assignee)

Comment 10

6 years ago
Created attachment 602840 [details] [diff] [review]
patch, adjust text-transform:capitalize to follow css3 text spec

Re-requesting review here (for updated patch), after going through the spec and past www-style discussions. This adjusts our implementation to conform to the spec regarding which character is treated as the "letter" that gets capitalized.

For completeness, I've included surrogate handling in SetupCapitalization, although text-transform:capitalize won't actually work for Deseret until we also fix bugs 210501 and 605021.
Attachment #601574 - Attachment is obsolete: true
Attachment #602840 - Flags: review?(smontagu)
Attachment #602840 - Flags: review?(smontagu) → review+
(Assignee)

Comment 11

6 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ac8b89ae5f0 (patch)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0362bad5569c (reftest)
Target Milestone: --- → mozilla14
(Assignee)

Comment 12

6 years ago
Created attachment 605689 [details] [diff] [review]
followup, add a reftest for Deseret text capitalization with punctuation marks
Attachment #605689 - Flags: review?(smontagu)
Attachment #605689 - Flags: review?(smontagu) → review+
(Assignee)

Comment 13

6 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/880369fd5beb
https://hg.mozilla.org/mozilla-central/rev/6ac8b89ae5f0
https://hg.mozilla.org/mozilla-central/rev/0362bad5569c
https://hg.mozilla.org/mozilla-central/rev/880369fd5beb
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.