Closed Bug 1054092 Opened 10 years ago Closed 10 years ago

CSS ::first-letter pseudo-element also grabbing following ampersand

Categories

(Core :: Layout: Text and Fonts, defect)

31 Branch
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: mvw1, Unassigned)

Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36

Steps to reproduce:

When using the ::first-letter pseudo-element, if the first letter is "R" and the following character is an ampersand, the ampersand is selected along with the letter. This happens whether it is a lone ampersand or one that is part of a character entity, even the entities for ampersands.  See the following example:

http://jsfiddle.net/zL596y0w/


Actual results:

The ::first-letter selector grabs not only the first letter but also the following character if it is an ampersand.  In my example (http://jsfiddle.net/zL596y0w/) R and & are grabbed instead of just R.



Expected results:

In English, I can't think of a case where requesting the first letter of an element should also grab a following ampersand.
Note that I'm seeing this in the latest released versions not only of Firefox but also of Chrome and Safari.
OS: Mac OS X → All
Hardware: x86 → All
http://www.w3.org/TR/CSS2/selector.html#first-letter says:

  Punctuation (i.e, characters defined in Unicode [UNICODE] in the "open" (Ps), "close" (Pe), "initial" (Pi). "final" (Pf) and "other" (Po) punctuation classes), that precedes or follows the first letter should be included

Unicode has the U+0026 AMPERSAND in the Po (other punctuation) class.  (This is also what the U+0022 QUOTATION MARK and U+0027 APOSTROPHE are.)

So this is as-specified.
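
The classification can be checked against the Unicode database. Below is a minimal Python sketch, not how any browser implements this: `first_letter_span` is a hypothetical helper that models only the quoted rule (leading punctuation, one character, trailing punctuation) and assumes the character after any leading punctuation is the "first letter". It ignores digits, whitespace, and language-specific cases.

```python
import unicodedata

# Punctuation classes named in the quoted CSS 2.1 text.
PUNCT_CLASSES = {"Ps", "Pe", "Pi", "Pf", "Po"}


def is_first_letter_punct(ch: str) -> bool:
    """True if ch falls in one of the punctuation classes listed above."""
    return unicodedata.category(ch) in PUNCT_CLASSES


def first_letter_span(text: str) -> str:
    """Simplified model of the quoted rule (hypothetical helper).

    Takes leading punctuation, then one character assumed to be the
    first letter, then any punctuation that directly follows it.
    """
    i = 0
    # Punctuation that precedes the first letter.
    while i < len(text) and is_first_letter_punct(text[i]):
        i += 1
    if i == len(text):
        return ""  # no letter found at all
    i += 1  # the first letter itself (assumed, not verified)
    # Punctuation that follows the first letter.
    while i < len(text) and is_first_letter_punct(text[i]):
        i += 1
    return text[:i]


print(unicodedata.category("&"))   # general category of U+0026
print(first_letter_span("R&D"))    # the letter plus the ampersand
print(first_letter_span("Rest"))   # just the letter
```

Under this model the ampersand after "R" is picked up for the same reason a closing quotation mark would be.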


It's not completely clear that it's desirable behavior, though.
Ah.  Fascinating!  I was aware of some subtleties in other languages, but wasn't considering the implications of ampersands being classified as punctuation right alongside quotation marks in Unicode.  I don't suppose there are any CSS-only workarounds?
Also, is there a chance we can make the case that, if the ampersand following ::first-letter is actually part of a legal HTML character entity, it shouldn't be considered part of the ::first-letter unless the rendered entity itself is also in one of the P* classes defined in the specification?
Disregard my last comment - looks like that behavior is correct. For example, if we have <p>R&eacute</p> then only the R is treated as the ::first-letter.
Re comment 3 - I'm not aware of any workarounds.
Not a CSS-only workaround, but if you're willing to touch the actual text, inserting a ZERO WIDTH SPACE (&#x200b;) before the ampersand should help.
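
If the text is generated or post-processed, the suggestion above could be applied mechanically. This is a rough Python sketch under stated assumptions: `separate_leading_letter` is a hypothetical helper, it only handles text that starts directly with the letter, and whether the inserted character actually changes rendering in a given browser still needs to be checked there.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE; general category Cf, so not punctuation

# Punctuation classes that CSS 2.1 attaches to ::first-letter.
PUNCT_CLASSES = {"Ps", "Pe", "Pi", "Pf", "Po"}


def separate_leading_letter(text: str) -> str:
    """Insert a ZWSP after the first character if punctuation follows it.

    Hypothetical helper: assumes text[0] is the letter to be styled.
    """
    if len(text) >= 2 and unicodedata.category(text[1]) in PUNCT_CLASSES:
        return text[0] + ZWSP + text[1:]
    return text


fixed = separate_leading_letter("R&D")
print([hex(ord(c)) for c in fixed])  # code points, since ZWSP is invisible
```

Text that does not have punctuation in second position is returned unchanged, so the helper can be run over every heading without adding stray characters.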
Jonathan, we do have control over the text, and the case is such a low occurrence that it will probably be dealt with editorially in the way you suggest - or in a very similar way.  I appreciate the tip!
David - since this is working according to spec, seems like the issue should be closed out.  Feel free to reopen if you think there's another angle that can be pursued (re: Comment 2, "It's not completely clear that it's desirable behavior, though.") - it sounds more like a specification challenge than a bug-fixing one, however.

Thanks!
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → INVALID