Closed Bug 208789 Opened 22 years ago Closed 2 years ago

"text-transform: capitalize" needs to understand word boundaries

Categories

(Core :: Layout: Text and Fonts, defect, P4)

RESOLVED FIXED
113 Branch
Tracking Status
firefox113 --- fixed

People

(Reporter: travis.seitler, Assigned: jfkthame)

Attachments

(4 files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6

In http://www.rnd.knoxcounty.org/purchasing/testing/supplierdiversityprogram.html, near the bottom of the page, the heading 'What are "small and disadvantaged businesses"?' has the 'text-transform: capitalize' rule applied, and so it should appear as:

What Are "Small And Disadvantaged Businesses"?

In Firebird 0.6, the word "small" is not capitalized. It appears that Firebird sees the quote character as the first letter of the word, and tries to "capitalize" the quote instead of the character following it.

Reproducible: Always

Steps to Reproduce:
1. Visit page with text-transform: capitalize applied to text within quotes, such as http://www.rnd.knoxcounty.org/purchasing/testing/supplierdiversityprogram.html

Actual Results:
The heading appeared as: What Are "small And Disadvantaged Businesses"?

Expected Results:
Recognize that the quote character is not a letter, and capitalize the letter following it instead.
Not a Mozilla Firebird-specific bug. It's most probably a (known?) Gecko bug.
Assignee: blaker → font
Component: General → Layout: Fonts and Text
Product: Phoenix → Browser
QA Contact: asa → ian
Version: unspecified → Trunk
Well, the bug is confirmed in Mozilla too (build 2003060503 on Mac OS X 10.2.6), but I can't find any comments in the CSS spec about it. It must be a cultural thing too, but I think that skipping any " or ' prefix might be OK. Maybe other characters too?

The code that has to be changed is in http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsTextTransformer.cpp (search for NS_STYLE_TEXT_TRANSFORM_CAPITALIZE). We might need something like:

    static void AsciiToTitleCase(unsigned char* aText, PRInt32 aWordLen)
    {
      // Skip leading quote characters, then capitalize the first
      // character that is not a quote. aText must advance on every
      // iteration, otherwise a leading quote is tested repeatedly.
      for (; aWordLen-- > 0; ++aText) {
        if ((*aText == '"') || (*aText == '\''))
          continue;
        *aText = toupper(*aText);
        break;
      }
    }

and we'll have to fix nsCaseConversionImp2::ToTitle() too (see http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/src/nsCaseConversionImp2.cpp#236)
Status: UNCONFIRMED → NEW
Ever confirmed: true
IIRC, there should already be code to achieve the same effect with :first-letter (ensuring that a letter following a punctuation character gets the :first-letter style), although I think it's only semi-functional right now. Sharing that code should probably be investigated.
Priority: -- → P4
Target Milestone: --- → Future
It isn't only quotation marks. Capitalization also fails to occur after other punctuation such as a period or an open parenthesis. For examples see: http://forums.mozillazine.org/viewtopic.php?t=46482&highlight=
Summary: "text-transform: capitalize" misses first word within quotes → "text-transform: capitalize" needs to understand word boundaries
*** Bug 273360 has been marked as a duplicate of this bug. ***
*** Bug 345456 has been marked as a duplicate of this bug. ***
Severity: trivial → minor
OS: Windows 2000 → All
Hardware: PC → All
Attached file test case
The present code is structured rather differently, and I'm not at all sure how it manages to work - but it does the Right Thing for all the examples I can find, except arguably "c.a.g.e" -> "C.a.g.e" (and not "C.A.G.E") - I attach a test case and reference.
Assignee: layout.fonts-and-text → nobody
QA Contact: ian → layout.fonts-and-text
Any status on this bug? It works for most layouts, but it still does not work properly in some instances, such as the example above: c.a.g.e still gives C.a.g.e (should be C.A.G.E), and super-size gives Super-size (should be Super-Size). It appears to work properly in Safari 5 and Chrome 6. A simple workaround might be to add a check for a few specific, common symbols and punctuation marks until a proper fix can be written.
< - BUMP ? - > Still not working in some instances in FF v17.0.1. Example: 6-figure ...should be... 6-Figure

IDEA? (I'm not a coder, outside of HTML/PHP/JavaScript ;-) but hopefully what I write will make sense to a coder...) There are only 26 lowercase letters on the English keyboard. What about checking something along the lines of:

    // if the first char is a lowercase letter
    if ("abcdefghijklmnopqrstuvwxyz".indexOf(theFirstLetter) !== -1) {
      theFirstLetter = theFirstLetter.toUpperCase();
    }

---------------------
Granted, this may be more difficult in other languages and may be processor heavy. Just an idea. Thanks for your great work! -Mike
Severity: minor → S4
Blocks: 1823451

I think we should consider adjusting the behavior here, so that a letter after word-internal punctuation gets capitalized (so "Word-Internal" rather than "Word-internal", if applied to this sentence). My sense is that this is more often the desired result than not, and it will align us more closely with how webkit/blink-based browsers behave.

(I'll note, though, that whatever we do, there will inevitably be cases where the result turns out to be inferior. Compare

data:text/html,<span style="text-transform: capitalize;">how does your country spell "colo[u]r"?</span>

between Firefox and the other browsers. There's no perfect solution here.)
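To make the trade-off concrete, here is a minimal sketch of the most aggressive policy: capitalize any letter that follows a non-alphanumeric character. It is not taken from any browser's source. The function name CapitalizeAfterAnyBoundary is hypothetical, and the sketch handles ASCII only, whereas real implementations work on Unicode text.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not Gecko or WebKit code.
// Assumption: every non-alphanumeric ASCII character ends the current word,
// so the next letter is capitalized. ASCII only.
std::string CapitalizeAfterAnyBoundary(const std::string& aText) {
  std::string result = aText;
  bool atBoundary = true;  // the start of the string counts as a boundary
  for (char& ch : result) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalnum(c)) {
      if (atBoundary && std::islower(c)) {
        ch = static_cast<char>(std::toupper(c));
      }
      atBoundary = false;  // now inside a word
    } else {
      atBoundary = true;  // punctuation or whitespace starts a new word
    }
  }
  return result;
}
```

Under these assumptions, "word-internal" becomes "Word-Internal" and "6-figure" becomes "6-Figure", but "colo[u]r" becomes "Colo[U]R", which illustrates why no single rule suits every input.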

No change in behavior; this just gives us our own version of the general category constants,
so we can avoid depending on ICU's constants elsewhere in the codebase.

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

This implements the adjustment to our behavior, bringing us closer to the other browsers
(although with a better result for examples like "colo[u]r", which we continue to treat
as one word rather than three).

Simple reftest included; for now, I've put it with our in-house tests rather than under
web-platform-tests because the exact behavior here is somewhat under-specified.
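
As a rough illustration of the behavior described above (and not the landed patch), the sketch below re-arms capitalization after whitespace, a hyphen, or a period, while quotes and brackets leave the state unchanged. The exact set of punctuation the real patch uses is not stated in this bug, so the choice of '-' and '.' is an assumption, as is the function name CapitalizeWords. The sketch is ASCII only.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch, not the code that landed in Gecko.
// Assumption: only whitespace, '-' and '.' re-arm capitalization; other
// punctuation (quotes, brackets) neither arms nor disarms it. ASCII only.
std::string CapitalizeWords(const std::string& aText) {
  std::string result = aText;
  bool capitalizeNext = true;  // armed at the start of the string
  for (char& ch : result) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isspace(c) || c == '-' || c == '.') {
      capitalizeNext = true;
    } else if (std::isalnum(c)) {
      if (capitalizeNext && std::islower(c)) {
        ch = static_cast<char>(std::toupper(c));
      }
      capitalizeNext = false;  // the rest of the word is left alone
    }
    // Any other character leaves capitalizeNext unchanged, so a leading
    // quote is skipped and a bracket inside a word does not split it.
  }
  return result;
}
```

With these assumptions, "tip-top" gives "Tip-Top", "what.three.words" gives "What.Three.Words", and "colo[u]r" in quotes gives "Colo[u]r" in quotes, treated as one word.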

Depends on D173203

Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cc59256f5ef0
patch 1 - Create an intl::GeneralCategory enum for UnicodeProperties::CharType() to return, to avoid directly referring to ICU4C constants or mapping via harfbuzz constants. r=platform-i18n-reviewers,nordzilla
https://hg.mozilla.org/integration/autoland/rev/c5bd0893afc5
patch 2 - Make text-transform:capitalize act on a letter that follows punctuation without a space, as in "Tip-Top" or "What.Three.Words". r=emilio
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Duplicate of this bug: 1823451
Target Milestone: Future → 113 Branch
Regressions: 1827604
Regressions: 1911550