Bug 744357 - text-transform should use the mappings from Unicode SpecialCasing.txt
Status: RESOLVED FIXED
Keywords: intl
Product: Core
Classification: Components
Component: Layout: Text
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: mozilla15
Assigned To: Jonathan Kew (:jfkthame)
Depends on: 745454
Reported: 2012-04-11 03:49 PDT by Jonathan Kew (:jfkthame)
Modified: 2012-05-03 10:04 PDT

Attachments
patch, implement the one-to-many mappings from SpecialCasing.txt (37.64 KB, patch)
2012-04-19 06:30 PDT, Jonathan Kew (:jfkthame)
no flags
patch, update case-mapping reftests to account for SpecialCasing.txt mappings (49.52 KB, patch)
2012-04-19 06:31 PDT, Jonathan Kew (:jfkthame)
no flags
patch, implement the one-to-many mappings from SpecialCasing.txt (20.79 KB, patch)
2012-04-20 04:37 PDT, Jonathan Kew (:jfkthame)
no flags
patch, implement the one-to-many mappings from SpecialCasing.txt (38.44 KB, patch)
2012-04-20 04:39 PDT, Jonathan Kew (:jfkthame)
smontagu: review+
patch, update case-mapping reftests to account for SpecialCasing.txt mappings (49.52 KB, patch)
2012-04-20 04:40 PDT, Jonathan Kew (:jfkthame)
smontagu: review+

Description Jonathan Kew (:jfkthame) 2012-04-11 03:49:16 PDT
In nsTextRunTransformations, we currently have hard-coded support for mapping ß -> SS, but there are actually 100 or so additional such one-to-many mappings defined in SpecialCasing.txt.

A simple example that's affected by this would be

data:text/html;charset=utf-8,<div style="text-transform:uppercase">ﬂying ﬁrefox

which currently renders (incorrectly) as ﬂYING ﬁREFOX, leaving the ﬂ (U+FB02) and ﬁ (U+FB01) ligatures unconverted.

In addition to ligatures like this, there are a number of accented letters for which no corresponding uppercase precomposed form is encoded, and therefore the uppercase transform needs to expand them to a decomposed sequence.

(See also bug 672042, which calls for support of the SpecialCasing.txt mappings in JavaScript's toLowerCase/toUpperCase.)
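
For illustration only, the kind of one-to-many uppercase mapping described above can be observed from Python, whose str.upper() applies full case mappings (this assumes CPython 3.3 or later; it is not the Gecko code, and the helper name expands_on_uppercase is made up for this sketch):

```python
# Full (one-to-many) uppercase mappings, as applied by Python's str.upper().
examples = {
    "\uFB02ying \uFB01refox": "FLYING FIREFOX",  # fl (U+FB02) and fi (U+FB01) ligatures
    "stra\u00DFe": "STRASSE",                    # sharp s (U+00DF) -> SS
    "\u01F0": "J\u030C",                         # j with caron -> J + combining caron
}
for source, expected in examples.items():
    assert source.upper() == expected


def expands_on_uppercase(ch):
    """True if uppercasing the single character ch yields more than one character."""
    return len(ch.upper()) > 1


# Every code point whose uppercase form is longer than the original,
# according to the Unicode data bundled with the running interpreter.
expanding = [chr(cp) for cp in range(0x110000) if expands_on_uppercase(chr(cp))]
print(len(expanding), "code points expand when uppercased")
```

The exact count printed depends on the Unicode version shipped with the interpreter, so it is not quoted here.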
Comment 1 Jonathan Kew (:jfkthame) 2012-04-19 06:30:33 PDT
Created attachment 616541
patch, implement the one-to-many mappings from SpecialCasing.txt

This implements support in text-transform for the one-to-many mappings specified by Unicode, replacing and extending our existing special-case code for handling German ß->SS.

As there aren't all that many "special mappings", and they don't fit very readily into the main Unicode properties structure as they're variable-length, I've just put them into simple sorted arrays that we can binary-search. This isn't quite as performant as our usual multi-level array indexing, but it's simpler and more compact for this amount of data, and transformed text-runs are not an absolutely perf-critical use case (unlike basic case-folding, as used for many string comparisons).
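
A minimal sketch of the sorted-array-plus-binary-search idea, written in Python rather than the patch's C++. The names SPECIAL_UPPER, special_upper, simple_upper and uppercase_transform are hypothetical, and the table holds only four entries taken from SpecialCasing.txt; the actual patch's data layout may well differ.

```python
from bisect import bisect_left

# Hypothetical excerpt of a special-casing table: (code point, uppercase
# expansion), kept sorted by code point so it can be binary-searched.
SPECIAL_UPPER = [
    (0x00DF, "SS"),       # sharp s
    (0x01F0, "J\u030C"),  # j with caron -> J + combining caron
    (0xFB01, "FI"),       # fi ligature
    (0xFB02, "FL"),       # fl ligature
]
_KEYS = [cp for cp, _ in SPECIAL_UPPER]


def special_upper(ch):
    """Return the multi-character uppercase form of ch, or None if not listed."""
    cp = ord(ch)
    i = bisect_left(_KEYS, cp)
    if i < len(_KEYS) and _KEYS[i] == cp:
        return SPECIAL_UPPER[i][1]
    return None


def simple_upper(ch):
    """Stand-in for a one-to-one mapping: ignore any multi-character result."""
    up = ch.upper()
    return up if len(up) == 1 else ch


def uppercase_transform(text):
    """Uppercase text, consulting the special table before the simple mapping."""
    out = []
    for ch in text:
        special = special_upper(ch)
        out.append(special if special is not None else simple_upper(ch))
    return "".join(out)
```

With this excerpt, uppercase_transform("\uFB02ying \uFB01refox") gives "FLYING FIREFOX", whereas applying simple_upper alone would leave both ligatures untouched, which is the behaviour the bug describes.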

Note that this patch will cause the existing text-transform reftests to fail, as their reference files are based only on the one-to-one mappings from UnicodeData.txt. The following patch will update the reftests appropriately.
Comment 2 Jonathan Kew (:jfkthame) 2012-04-19 06:31:34 PDT
Created attachment 616542
patch, update case-mapping reftests to account for SpecialCasing.txt mappings
Comment 3 Jonathan Kew (:jfkthame) 2012-04-20 04:37:03 PDT
Created attachment 616929
patch, implement the one-to-many mappings from SpecialCasing.txt

Minor update: the previous version hit some MergeCharactersInTextRun assertions on tryserver because the fonts in use lacked adequate coverage for some of the characters that get decomposed by case-mappings.
Comment 4 Jonathan Kew (:jfkthame) 2012-04-20 04:39:29 PDT
Created attachment 616930
patch, implement the one-to-many mappings from SpecialCasing.txt

Argh, uploaded the wrong patch - sorry for the spam! This one should be right...
Comment 5 Jonathan Kew (:jfkthame) 2012-04-20 04:40:57 PDT
Created attachment 616932
patch, update case-mapping reftests to account for SpecialCasing.txt mappings

And refreshed the reftest patch to match (now using DejaVuSans for better font coverage).
Comment 6 Simon Montagu :smontagu 2012-04-24 08:33:59 PDT
Comment on attachment 616932
patch, update case-mapping reftests to account for SpecialCasing.txt mappings

Review of attachment 616932:
-----------------------------------------------------------------

::: layout/reftests/text-transform/reftest.list
@@ +8,5 @@
>  == lowercase-sigma-1.html lowercase-sigma-1-ref.html
>  == small-caps-1.html small-caps-1-ref.html
>  == uppercase-1.html uppercase-ref.html
>  == uppercase-szlig-1.html uppercase-szlig-ref.html
> +# these use LinLibertine via @font-face for consistency of results

Do they? I thought they used DejaVuSans
Comment 7 Jonathan Kew (:jfkthame) 2012-04-24 08:59:25 PDT
(In reply to Simon Montagu from comment #6)
> Do they? I thought they used DejaVuSans

Just checking that you're alert!

(Ok, you caught me - I changed the font, and forgot to update the comment. Will fix.)
