first-letter should include both leading and trailing punctuation

RESOLVED FIXED

Status

()

Core
Layout: Text
RESOLVED FIXED
10 years ago
9 years ago

People

(Reporter: smontagu, Assigned: smontagu)

Tracking

({css2})

Trunk
Points:
---
Bug Flags:
wanted-next +
blocking1.9 -
wanted1.9 -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 3 obsolete attachments)

(Assignee)

Description

10 years ago
Created attachment 285035 [details]
Testcase

From http://www.w3.org/TR/CSS21/selector.html#first-letter :
'Punctuation (i.e, characters defined in Unicode in the "open" (Ps), "close" (Pe), "initial" (Pi). "final" (Pf) and "other" (Po) punctuation classes), that precedes or follows the first letter should be included'

Currently we include punctuation preceding the first letter, but not following. Not a regression, but requesting blocking1.9 for standards compliance.
Flags: blocking1.9?
(Assignee)

Comment 1

10 years ago
Just for the record, CSS2 only says 'precedes the first letter'.

Comment 2

10 years ago
Comment 1 refers to CSS2.0, which is no longer what "CSS2" means aiui.
Keywords: css2
(Assignee)

Comment 3

10 years ago
Created attachment 285139 [details] [diff] [review]
Patch
Assignee: nobody → smontagu
Status: NEW → ASSIGNED
Attachment #285139 - Flags: review?(roc)
Should probably advance to the end of the cluster containing any trailing punctuation.

Should probably call FindClusterEnd to do this (and use that to instead of the find-end-of-cluster-loop already in the FindFirstLetterRange).
(Assignee)

Comment 5

10 years ago
Created attachment 285290 [details] [diff] [review]
Patch

This passes all the reftests in layout/reftests/first-letter without assertions, plus some extra ones I added to cover trailing punctuation, and the testcase in bug 329069 (which we should really add a reftest for). I want to test more scenarios before asking for review, but interim comments are always welcome.
Attachment #285139 - Attachment is obsolete: true
Attachment #285139 - Flags: review?(roc)
(Assignee)

Comment 6

10 years ago
Comment on attachment 285290 [details] [diff] [review]
Patch

>+  // consume clusters that start with whitespace or punctuation
>   for (i = 0; i < length; ++i) {
>+    if (IsTrimmableSpace(aFrag, aOffset + i) ||
>+        nsTextFrameUtils::IsPunctuationMark(aFrag->CharAt(aOffset + i))) {
>+      iter.SetSkippedOffset(aOffset + i);
>+      FindClusterEnd(aTextRun, aOffset + length, &iter);
>+      i = iter.GetSkippedOffset() - aOffset;
>+    } else {
>       break;
>+    }

I'm not sure about this part of the algorithm. Do we really want to include punctuation interspersed with whitespace? Would this be better?

  for (i = GetTrimmableWhitespaceCount(aFrag, aOffset, length, 1);
       i < length; ++i) {
    if (nsTextFrameUtils::IsPunctuationMark(aFrag->CharAt(aOffset + i))) {

I'm also not sure if I should be using iter.[GS]etSkippedOffset or [GS]etOriginalOffset
> Do we really want to include punctuation interspersed with whitespace?

I think we might as well.

I think instead of making i the key variable, we should just use iter as the key variable. Just compare iter.GetOriginalOffset() to aOffset + aLength.
Flags: blocking1.9? → blocking1.9-
Whiteboard: [wanted-1.9]
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
Simon, were you planning to finish this?
Flags: wanted1.9-
Flags: wanted1.9+
Flags: wanted-next+
(Assignee)

Comment 9

9 years ago
After fiddling with this for some time, I'm fairly sure that we don't want to include punctuation interspersed with whitespace. Consider this case:

"*" is an asterisk.

The first-letter style would be applied to '"*" i', which seems totally wrong.

A question that bothers me more, though, is: when is a letter not a letter? CSS doesn't explicitly define what "letter" means in this context, though it does go out of its way to specify that the ':first-letter' also applies if the first letter is in fact a digit. To me this implies that if the first letter is in fact a punctuation mark (other than those specified to be included) or a symbol character, :first-letter should not apply.
(Assignee)

Comment 10

9 years ago
Created attachment 326912 [details]
Various edge cases
(Assignee)

Comment 11

9 years ago
Created attachment 326983 [details] [diff] [review]
Patch

This does what I believe is right, and only applies first-letter style to letters and numbers (with preceding and following punctuation). It also adds support for surrogate characters to IsPunctuationMark(). I'll be adding lots of reftests too.
Attachment #285290 - Attachment is obsolete: true
Attachment #326983 - Flags: superreview?(roc)
Attachment #326983 - Flags: review?(roc)
+  static nsCOMPtr<nsIUGenCategory> genCatService =
+    do_GetService(NS_UNICHARCATEGORY_CONTRACTID);

You might as well cache this with the other services that nsContentUtils caches.

+  // If the next character is not a letter or number, there is no first-letter.
+  // Return PR_TRUE so that we don't go on looking, but set aLength to 0.

So this means that '<span>"</span> F' does get first-letter style applied to the punctuation. That's unfortunate but I guess anything better is a lot harder, so I'm OK with it.

The code for "consume cluster that starts with punctuation" could be shared.

Looks good other than that.
(Assignee)

Comment 13

9 years ago
Created attachment 327312 [details] [diff] [review]
Addressed comments
Attachment #326983 - Attachment is obsolete: true
Attachment #327312 - Flags: superreview?(roc)
Attachment #327312 - Flags: review?(roc)
Attachment #326983 - Flags: superreview?(roc)
Attachment #326983 - Flags: review?(roc)
Attachment #327312 - Flags: superreview?(roc)
Attachment #327312 - Flags: superreview+
Attachment #327312 - Flags: review?(roc)
Attachment #327312 - Flags: review+
(Assignee)

Comment 14

9 years ago
Checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
(Assignee)

Comment 15

9 years ago
Note: if we take this on the 3.0 branch, the branch patch should include the fix to bug 445626
You need to log in before you can comment on or make changes to this bug.