Closed Bug 744357 Opened 12 years ago Closed 12 years ago

text-transform should use the mappings from Unicode SpecialCasing.txt

Categories

(Core :: Layout: Text and Fonts, defect)

RESOLVED FIXED
mozilla15

People

(Reporter: jfkthame, Assigned: jfkthame)

Details

(Keywords: intl)

Attachments

(2 files, 3 obsolete files)

In nsTextRunTransformations, we currently have hard-coded support for mapping ß -> SS, but there are actually 100 or so additional such one-to-many mappings defined in SpecialCasing.txt.

A simple example that's affected by this would be

data:text/html;charset=utf-8,<div style="text-transform:uppercase">ﬂying ﬁrefox

which currently renders (incorrectly) as ﬂYING ﬁREFOX, with the ligature characters U+FB02 and U+FB01 left unconverted.

In addition to ligatures like this, there are a number of accented letters for which no corresponding uppercase precomposed form is encoded, and therefore the uppercase transform needs to expand them to a decomposed sequence.
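For illustration only (this is not part of the patch), the difference between one-to-one and one-to-many uppercasing can be sketched in Python, whose built-in str.upper() applies the unconditional full case mappings. The helper simple_upper below is a hypothetical stand-in for a one-to-one mapping: it just leaves a character unchanged whenever its full uppercase form is longer than one code point, which only approximates the UnicodeData.txt mappings.

```python
import unicodedata

def simple_upper(ch):
    # Hypothetical approximation of a one-to-one uppercase mapping:
    # keep the character as-is if its full mapping expands to a sequence.
    up = ch.upper()
    return up if len(up) == 1 else ch

# "flying firefox" written with the U+FB02 and U+FB01 ligature characters.
text = "\ufb02ying \ufb01refox"

# One-to-one mapping only: the ligatures survive unconverted.
one_to_one = "".join(simple_upper(c) for c in text)

# Full mapping: each ligature expands to two capital letters.
full = text.upper()

# U+01F0 (j with caron) has no precomposed uppercase form, so the full
# mapping yields J followed by U+030C COMBINING CARON.
j_caron_upper = "\u01f0".upper()

# NFC normalization cannot recombine it, since no precomposed
# capital J with caron is encoded.
still_decomposed = unicodedata.normalize("NFC", j_caron_upper)
```

Here one_to_one keeps both ligatures in lowercase form, while full contains only plain capital letters.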

(See also bug 672042, which calls for support of the SpecialCasing.txt mappings in JavaScript's toLowerCase/toUpperCase.)
Depends on: 745454
This implements support in text-transform for the one-to-many mappings specified by Unicode, replacing and extending our existing special-case code for handling German ß->SS.

As there aren't all that many "special mappings", and they don't fit very readily into the main Unicode properties structure as they're variable-length, I've just put them into simple sorted arrays that we can binary-search. This isn't quite as performant as our usual multi-level array indexing, but it's simpler and more compact for this amount of data, and transformed text-runs are not an absolutely perf-critical use case (unlike basic case-folding, as used for many string comparisons).
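A minimal sketch of the sorted-array approach, in Python rather than the C++ of the actual patch. The table is a hand-picked excerpt, and the names SPECIAL_UPPER, special_upper, simple_upper and uppercase are hypothetical, chosen for this example only; the real data would be generated from SpecialCasing.txt.

```python
import bisect

# Hypothetical excerpt of the one-to-many uppercase mappings,
# kept sorted by code point so it can be binary-searched.
SPECIAL_UPPER = [
    (0x00DF, "SS"),        # LATIN SMALL LETTER SHARP S
    (0x0149, "\u02bcN"),   # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
    (0x01F0, "J\u030c"),   # LATIN SMALL LETTER J WITH CARON
    (0xFB01, "FI"),        # LATIN SMALL LIGATURE FI
    (0xFB02, "FL"),        # LATIN SMALL LIGATURE FL
]
SPECIAL_UPPER_KEYS = [cp for cp, _ in SPECIAL_UPPER]

def special_upper(ch):
    """Return the one-to-many uppercase mapping for ch, or None."""
    cp = ord(ch)
    i = bisect.bisect_left(SPECIAL_UPPER_KEYS, cp)
    if i < len(SPECIAL_UPPER_KEYS) and SPECIAL_UPPER_KEYS[i] == cp:
        return SPECIAL_UPPER[i][1]
    return None

def simple_upper(ch):
    # Stand-in for the regular one-to-one lookup (an assumption of this
    # sketch): use the uppercase form only if it is a single code point.
    up = ch.upper()
    return up if len(up) == 1 else ch

def uppercase(text):
    """Uppercase text, trying the special table before the simple mapping."""
    out = []
    for ch in text:
        special = special_upper(ch)
        out.append(special if special is not None else simple_upper(ch))
    return "".join(out)
```

Each lookup costs O(log n) comparisons over the table, which is the trade-off described above: slower than direct multi-level indexing, but the variable-length entries need no special packing.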

Note that this patch will cause the existing text-transform reftests to fail, as their reference files are based only on the one-to-one mappings from UnicodeData.txt. The following patch will update the reftests appropriately.
Attachment #616541 - Flags: review?(smontagu)
Attachment #616542 - Flags: review? → review?(smontagu)
Minor update as the previous version hit some MergeCharactersInTextRun assertions on tryserver, due to lacking adequate font coverage for some of the characters that get decomposed by case-mappings.
Attachment #616541 - Attachment is obsolete: true
Attachment #616541 - Flags: review?(smontagu)
Attachment #616929 - Flags: review?(smontagu)
Argh, uploaded the wrong patch - sorry for the spam! This one should be right...
Attachment #616929 - Attachment is obsolete: true
Attachment #616929 - Flags: review?(smontagu)
Attachment #616930 - Flags: review?(smontagu)
And refreshed the reftest patch to match (now using DejaVuSans for better font coverage).
Attachment #616932 - Flags: review?(smontagu)
Attachment #616542 - Attachment is obsolete: true
Attachment #616542 - Flags: review?(smontagu)
Attachment #616930 - Flags: review?(smontagu) → review+
Comment on attachment 616932
patch, update case-mapping reftests to account for SpecialCasing.txt mappings

Review of attachment 616932:
-----------------------------------------------------------------

::: layout/reftests/text-transform/reftest.list
@@ +8,5 @@
>  == lowercase-sigma-1.html lowercase-sigma-1-ref.html
>  == small-caps-1.html small-caps-1-ref.html
>  == uppercase-1.html uppercase-ref.html
>  == uppercase-szlig-1.html uppercase-szlig-ref.html
> +# these use LinLibertine via @font-face for consistency of results

Do they? I thought they used DejaVuSans
Attachment #616932 - Flags: review?(smontagu) → review+
(In reply to Simon Montagu from comment #6)
> Do they? I thought they used DejaVuSans

Just checking that you're alert!

(Ok, you caught me - I changed the font, and forgot to update the comment. Will fix.)