Bug 400938 - make MathML work with Unicode fonts
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: MathML
Version: Trunk
Platform: All All
Importance: P1 normal with 1 vote
Target Milestone: mozilla1.9beta2
Assigned To: Karl Tomlinson (:karlt)
QA Contact: Hixie (not reading bugmail)
Depends on: 289938 321438 382542 401988 403718 406927 408356 410284
Blocks: mathml-fonts 236880 324857 cambria-math 407059 407101 asana-math 483116
Reported: 2007-10-24 03:04 PDT by Karl Tomlinson (:karlt)
Modified: 2011-06-12 21:40 PDT
roc: blocking1.9+


Attachments
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in] (14.88 KB, patch)
2007-11-13 18:41 PST, Karl Tomlinson (:karlt)
roc: review+
roc: superreview+
mathfontUnicode.properties (6.30 KB, text/plain)
2007-11-22 15:45 PST, Karl Tomlinson (:karlt)
no flags
mathfontStandardSymbolsL.properties (3.81 KB, text/plain)
2007-11-22 15:57 PST, Karl Tomlinson (:karlt)
pavlov: review+
mathfontSTIXNonUnicode.properties (5.22 KB, text/plain)
2007-11-22 18:19 PST, Karl Tomlinson (:karlt)
pavlov: review+
mathfontSTIXSize1.properties (2.96 KB, text/plain)
2007-11-22 18:21 PST, Karl Tomlinson (:karlt)
pavlov: review+
Update list of property files and preferred fonts (6.75 KB, patch)
2007-11-22 19:01 PST, Karl Tomlinson (:karlt)
no flags
mathfontUnicode.properties support code (109.84 KB, patch)
2007-11-22 19:47 PST, Karl Tomlinson (:karlt)
no flags
mathfontUnicode.properties with l/r moustache (6.56 KB, text/plain)
2007-11-22 19:50 PST, Karl Tomlinson (:karlt)
pavlov: review+
Update list of property files and preferred fonts 1.1 (6.75 KB, patch)
2007-11-26 12:12 PST, Karl Tomlinson (:karlt)
pavlov: review+
use nsTArray<nsGlyphTable> (7.69 KB, patch)
2007-12-04 14:17 PST, Karl Tomlinson (:karlt)
no flags
mathfontUnicode.properties support code v2 (109.61 KB, patch)
2007-12-04 14:34 PST, Karl Tomlinson (:karlt)
pavlov: review+
roc: superreview+

Description Karl Tomlinson (:karlt) 2007-10-24 03:04:57 PDT
We now have or will soon have available fonts with more complete Unicode
mathematical symbol support, which enables MathML font handling to be done in
a more standard way.

I see four categories of fonts:

1. True Unicode fonts:

   Each of these has a Unicode charmap that maps Unicode code points to the
   font's glyphs for characters that have a Unicode code point assignment.  If
   Mozilla's MathML can be made to work with any one of these fonts then it
   will (mostly) work with any Unicode font available today or in the future.
   (If there are glyphs for characters that do not have a Unicode assignment
   then the Unicode Private Use Area (PUA) range is used for those glyphs.)

   Fonts in this category suitable for use in MathML include Cambria Math and
   the (hopefully) soon-to-be-released STIX fonts.  Meiryo, DejaVu Sans,
   Lucida Sans Unicode, Open Symbol, and Adobe's Symbol font (not the Monotype
   version) also fit in this category but have lesser coverage of the
   mathematical characters.  However, some of these fonts map certain
   characters only through the PUA even though the character has a Unicode
   code point and the font has a glyph for it (perhaps because the Unicode
   assignment didn't exist at the time the font was designed).

2. PUA-only Unicode fonts:

   These fonts have a Unicode charmap but there are only mappings to the PUA.

   Fonts in this category include TrueType versions of Math1 (not Math2),
   Math3, Math4, Math5, Mathematica1 (not Mathematica2), Mathematica3,
   Mathematica4 (not Mathematica5), Mathematica6, and Mathematica7.

3. Fonts with a Microsoft Symbol charmap:

   This charmap is really a PUA-only Unicode charmap (and so this category is
   in many ways the same as Category 2) but the charmap is labelled
   differently (perhaps to make it clear that there is no defined meaning for
   these code points).

   The difficulty with relying on these fonts is that the platform libraries
   often (usually?) do not select this charmap but instead use another charmap
   that is incorrect and so these fonts end up falling into Category 4, below.
   If we explicitly selected the Microsoft Symbol charmap, then these fonts
   can be treated the same way as fonts in Category 2, but the current
   decision to keep the new non-Unicode behaviour in Bug 399636 would seem to
   preclude doing this as a general rule.  Perhaps, making Thebes search all
   (or some subset of) charmaps when mapping the character code points to
   glyphs would mean that these fonts could be used like Category 2 fonts.
   (This seems to be what currently happens on Windows.)

   Fonts in this Category include Monotype's Symbol, Math2, Mathematica2,
   and Mathematica5.

4. Deceitful Fonts:

   These claim to have glyphs corresponding to certain characters (in a
   Unicode or other charmap) but in fact provide glyphs for completely
   different characters.  Fonts in this category are best not installed (as
   system fonts) because an application may innocently believe that
   the fonts provide these characters and not know that they are providing
   garbage.

   Fonts in this category include the TrueType versions of cmr10, cmmi10,
   cmsy10, and cmex10.


With fonts in Categories 2-4, using them to provide glyphs for mathematical
symbols requires explicit encoding tables to support each individual font.
Hopefully much of this can be avoided by adding general support for Category 1
fonts.
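
As a rough illustration of the first three categories (not code from any patch on this bug), a charmap could be classified from the code points it claims to cover. The names FontCategory, IsBmpPrivateUse and Classify, and the isSymbolCharmap flag, are hypothetical. Category 4 is deliberately absent: a deceitful font cannot be recognised from its charmap, since the charmap itself is what is wrong.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical labels for Categories 1-3 described above.
enum class FontCategory { TrueUnicode, PuaOnly, MicrosoftSymbol };

// True for code points in the BMP Private Use Area (U+E000..U+F8FF).
inline bool IsBmpPrivateUse(std::uint32_t cp) {
  return cp >= 0xE000 && cp <= 0xF8FF;
}

// Classify a font from the code points its charmap covers.
// isSymbolCharmap is an assumed input saying the charmap is labelled as
// Microsoft Symbol rather than Unicode.
FontCategory Classify(const std::vector<std::uint32_t>& mapped,
                      bool isSymbolCharmap) {
  assert(!mapped.empty() && "a charmap with no mappings cannot be classified");
  if (isSymbolCharmap) return FontCategory::MicrosoftSymbol;
  for (std::uint32_t cp : mapped) {
    // Any mapping outside the PUA means the font carries real Unicode
    // assignments, even if it also uses the PUA for unassigned glyphs.
    if (!IsBmpPrivateUse(cp)) return FontCategory::TrueUnicode;
  }
  return FontCategory::PuaOnly;
}
```

A font that mixes real assignments with PUA glyphs still counts as Category 1 here, matching the note above that true Unicode fonts use the PUA for glyphs with no Unicode assignment.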

There are Unicode code point assignments for the separate parts of some
stretchy characters.  These include all the parts of left and right
parentheses, square brackets, and curly brackets, as well as top and bottom
half integral and integral extension.  There is a radical symbol bottom that
seems designed to join up (vertically) with "box drawings light down and
right".
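
For concreteness, here is a minimal sketch of how those part assignments could be tabulated and used. The code points are the Unicode assignments for the left-hand delimiters and the integral; the StretchyParts struct and the ExtendersNeeded helper are hypothetical, and counting whole extender glyphs is a simplification of whatever the layout code really does.

```cpp
#include <cassert>
#include <cmath>

// Code points for the pieces of one stretchy character.
// A value of 0 means the character has no such piece.
struct StretchyParts {
  char32_t top;
  char32_t middle;
  char32_t bottom;
  char32_t extender;
};

// Unicode assignments for the parts (right-hand delimiters omitted).
constexpr StretchyParts kLeftParen   = {0x239B, 0, 0x239D, 0x239C};
constexpr StretchyParts kLeftBracket = {0x23A1, 0, 0x23A3, 0x23A2};
constexpr StretchyParts kLeftBrace   = {0x23A7, 0x23A8, 0x23A9, 0x23AA};
constexpr StretchyParts kIntegral    = {0x2320, 0, 0x2321, 0x23AE};

// Number of extender glyphs needed so that the fixed pieces (top, middle,
// bottom, total height fixedHeight) plus extenders reach targetHeight.
int ExtendersNeeded(double targetHeight, double fixedHeight,
                    double extenderHeight) {
  assert(extenderHeight > 0.0);
  if (targetHeight <= fixedHeight) return 0;
  return static_cast<int>(
      std::ceil((targetHeight - fixedHeight) / extenderHeight));
}
```

With fixed pieces totalling 4 units and a 2-unit extender, a 10-unit target needs three extenders under this simplified model.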

I assume there are some stretchy characters that do not have assignments for
the desired sub-parts, and the Unicode code points do not provide a range of
sizes of characters, nor a range of angles for stretchy radical symbols.
However, we can aim to make things work as well as possible with Unicode code
points so that any Unicode font will work, but include some set of font/PUA
mappings so that MathML looks most beautiful with some special set of fonts.


The way special mathematical characters in MathML worked for Mozilla 1.8 was
that Mozilla had its own PUA assignments which gfx then translated, through
explicit tables of the mappings for each of the fonts, to pick up the desired
glyphs.  It would be good to avoid having gfx assign its own meaning for PUA
code points as these code points may be used in web pages to select characters
with no Unicode assignments from the PUA of specific fonts.

With fonts in Categories 1 and 2 (and maybe 3), any required translation from
Mozilla-MathML code point assignments to font-specific code points can be done
in a cross-platform manner in the MathML implementation (rather than in
Thebes).
Comment 1 Karl Tomlinson (:karlt) 2007-11-13 18:41:38 PST
Created attachment 288601
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

NS_ASSERTION(ch != uchar, "glyph table incorrectly set -- duplicate found")
was firing when different sizes of a glyph (from different fonts) had the same
Unicode point, as needed for STIXSize* fonts.

This changes the assertion to check the entire nsGlyphCode (including font),
and implements an Exists method to make the implicit conversion to PRUnichar
unnecessary.
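
The idea can be sketched as follows. This is a simplified stand-in, not the actual nsGlyphCode: the struct name GlyphCode and its field layout are assumptions made for illustration.

```cpp
// Hypothetical stand-in for nsGlyphCode: a code point together with the
// index of the font the glyph must be drawn with.
struct GlyphCode {
  char16_t code;
  int font;

  // Explicit existence check, so callers never need an implicit (lossy)
  // conversion to the bare character, which would drop the font.
  bool Exists() const { return code != 0; }

  // Two glyph codes are duplicates only if both code point and font match.
  bool operator==(const GlyphCode& other) const {
    return code == other.code && font == other.font;
  }
  bool operator!=(const GlyphCode& other) const { return !(*this == other); }
};
```

With a comparison like this, two sizes of the same character taken from different fonts compare unequal, so a duplicate check keyed on the whole glyph code would no longer fire for them.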
Comment 2 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-11-13 20:10:58 PST
Comment on attachment 288601
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

+  return BigOf(aPresContext, aChar, 1).Exists() != 0;

lose the != 0
Comment 3 Karl Tomlinson (:karlt) 2007-11-15 14:15:21 PST
Comment on attachment 288601
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

Checked this patch in without the != 0:

1.131       mozilla/layout/mathml/base/src/nsMathMLChar.cpp
1.29        mozilla/layout/mathml/base/src/nsMathMLChar.h
Comment 4 Karl Tomlinson (:karlt) 2007-11-22 15:45:05 PST
Created attachment 289859
mathfontUnicode.properties

For mozilla/layout/mathml/base/src.

A glyph table for operators that can be stretched using only glyphs with Unicode code points.  This can be shared by Unicode fonts.
Comment 5 Karl Tomlinson (:karlt) 2007-11-22 15:57:41 PST
Created attachment 289861
mathfontStandardSymbolsL.properties

For mozilla/layout/mathml/base/src.

Fonts with entries in the PUA still need their own glyph tables.
No Mozilla-MathML-PUA to custom-font-code mapping is done in gfx now, so the tables need to specify the PUA codes for the font directly rather than using the \uF8FF indirect reference trick.  In some ways, this makes the system more flexible as it avoids the need to have the encoding compiled in.

This file is based on the mathfontSymbol.properties file but the entry for U+005F &lowbar; is changed to U+0332 &UnderBar; because &lowbar; is not stretchy.
Comment 6 Karl Tomlinson (:karlt) 2007-11-22 18:19:21 PST
Created attachment 289872
mathfontSTIXNonUnicode.properties

Some stretchy chars that can be built using STIXNonUnicode.

This provides for all the stretchy characters that could be built from parts
using the CMSY10 and CMEX10 properties files, but cannot be built from Unicode
chars, except for radicals.  (I need to add l/r moustache U+23B0/U+23B1 to the
Unicode table.)
Comment 7 Karl Tomlinson (:karlt) 2007-11-22 18:21:05 PST
Created attachment 289873
mathfontSTIXSize1.properties

Support for building radicals from parts, and some variant sizes of brackets.
There are more characters in the STIXSize* files but this gives a good start.
Comment 8 Karl Tomlinson (:karlt) 2007-11-22 19:01:29 PST
Created attachment 289876
Update list of property files and preferred fonts

It will be possible to use the Math1 and Math4 or preferably newer Mathematica1
and Mathematica4 fonts for stretchy chars if the property files are
updated/provided, but as they currently reference the old Mozilla-MathML PUA
assignments they are best disabled for now.

I have no intention of using the (category 4) TeX fonts at this stage.

It looks like MT Extra wasn't being used for any stretchy chars.

I'm removing Symbol from the mathml.css file as the Monotype version (for
Windows) is not Unicode (Bug 399636).  The Adobe and Apple versions of Symbol
will still be used for font fallback if no other fonts support the character.
Comment 9 Karl Tomlinson (:karlt) 2007-11-22 19:47:06 PST
Created attachment 289878
mathfontUnicode.properties support code

Enables use of the mathfontUnicode.properties glyph table for fonts that don't
have/need a special glyph table.  The algorithm used is to iterate through the
ordered list of preferred font families.  If the font family has its own glyph table then that table is used with that family to see if the character is of the correct size.  If the font does not have its own glyph table, then the Unicode table is used and glyphs are found using the complete list of font families.
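
The table selection described above can be sketched like this. It is an illustration only: GlyphTable, Lookup and PlanLookups are hypothetical names, and the real patch works on nsMathMLChar internals rather than on standard containers.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical glyph table; only a name is needed for this sketch.
struct GlyphTable {
  std::string name;
};

// One step of the search: which table to read glyph codes from, and which
// font families may supply the glyphs.
struct Lookup {
  const GlyphTable* table;
  std::vector<std::string> families;
};

// Walk the preferred families in order.  A family with its own table is
// searched with that table and that family alone; any other family falls
// back to the shared Unicode table and the complete family list.
std::vector<Lookup> PlanLookups(
    const std::vector<std::string>& preferred,
    const std::map<std::string, GlyphTable>& ownTables,
    const GlyphTable& unicodeTable) {
  std::vector<Lookup> plan;
  for (const std::string& family : preferred) {
    auto it = ownTables.find(family);
    if (it != ownTables.end()) {
      plan.push_back({&it->second, {family}});
    } else {
      plan.push_back({&unicodeTable, preferred});
    }
  }
  return plan;
}
```

Each Lookup would then be tried in turn until a glyph of the right size is found.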

The stretch searching algorithm is changed a little to prefer building
from parts in a preferred font over using variants in other fonts.
The search in variants within one font has been changed to continue searching
for the best even if an approximate fit was found.
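
A minimal sketch of the "keep searching for the best" part, assuming closeness is measured by absolute height difference (the actual fit criterion in nsMathMLChar.cpp may differ, and BestVariant is a hypothetical name):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Return the index of the variant whose height is closest to target, or -1
// if there are no variants.  Every variant is examined; the scan does not
// stop at the first one that is merely close enough.
int BestVariant(const std::vector<double>& heights, double target) {
  int best = -1;
  double bestError = 0.0;
  for (std::size_t i = 0; i < heights.size(); ++i) {
    const double error = std::fabs(heights[i] - target);
    if (best < 0 || error < bestError) {
      best = static_cast<int>(i);
      bestError = error;
    }
  }
  return best;
}
```

A first-fit search with a generous tolerance could settle on an earlier, poorer variant; scanning them all picks the closest one.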

The patch reads the preferred fonts only from preferences, rather than from
mathfont.properties, and doesn't cache the values, so changes take effect
without a restart.

The &le; &ge; and minus issues seemed related to TeX fonts, which are no longer
being used, so we shouldn't need special fonts for these.
Comment 10 Karl Tomlinson (:karlt) 2007-11-22 19:50:02 PST
Created attachment 289879
mathfontUnicode.properties with l/r moustache
Comment 11 Karl Tomlinson (:karlt) 2007-11-26 12:12:48 PST
Created attachment 290258
Update list of property files and preferred fonts 1.1

Added missing semicolons to all.js, and now prefer STIXSize1 to STIXGeneral,
so variant sizes are used, when available and suitable, in preference to
building from parts.
Comment 12 Stuart Parmenter 2007-11-26 13:06:45 PST
Comment on attachment 289861
mathfontStandardSymbolsL.properties

rubberstamp=me
Comment 13 Reed Loden [:reed] (use needinfo?) 2007-12-03 21:27:30 PST
So, can the .properties changes land now, or do they need to have the other patches in first?
Comment 14 Karl Tomlinson (:karlt) 2007-12-03 21:35:44 PST
(In reply to comment #13)

There's not much point really, as they won't be installed without attachment 290258.  And (as it is now) that attachment prevents mathfontPUA.properties from installing, which is needed until we have attachment 289878.
Comment 15 Stuart Parmenter 2007-12-04 11:24:05 PST
Comment on attachment 289878
mathfontUnicode.properties support code

would it be worth while to change the nsTArray<nsGlyphTable*> mTableList to be nsTArray<nsGlyphTable> and stop doing the extra allocation?


the rest of this looks ok, but the patch is huge and i'm sure I've missed something.  Might want to get another set of eyes to look it over real quick.
Comment 16 Karl Tomlinson (:karlt) 2007-12-04 13:00:27 PST
Comment on attachment 289878
mathfontUnicode.properties support code

roc, can you provide another set of eyes, please?

I don't think nsTArray<nsGlyphTable*> is any worse than the nsVoidArray that was there, but I think we can do a nsTArray<nsGlyphTable> by providing an Open or Init method.  I'll put together an incremental patch for that.
Comment 17 Karl Tomlinson (:karlt) 2007-12-04 14:17:49 PST
Created attachment 291523
use nsTArray<nsGlyphTable>

I now see nsTArray can AppendElement with any copy constructor, so we can sneak in without a separate Init method.
The separate memory allocation for mUnicodeTable can also be avoided.
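
Roughly, the shape of the change is as follows. This sketch uses std::vector in place of nsTArray, and the GlyphTable and GlyphTableList classes are simplified stand-ins rather than the patched code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for nsGlyphTable.
struct GlyphTable {
  std::string fontName;
  explicit GlyphTable(const std::string& name) : fontName(name) {}
};

class GlyphTableList {
 public:
  GlyphTableList() : mUnicodeTable("Unicode") {}

  // Append by value: the element is copy/move-constructed into the array,
  // so no per-table heap allocation and no separate Init step is needed.
  // The returned reference is only valid until the next AddTable call.
  GlyphTable& AddTable(const std::string& fontName) {
    mTables.push_back(GlyphTable(fontName));
    return mTables.back();
  }

  std::size_t Count() const { return mTables.size(); }
  const GlyphTable& UnicodeTable() const { return mUnicodeTable; }

 private:
  std::vector<GlyphTable> mTables;  // tables stored by value
  GlyphTable mUnicodeTable;         // plain member, not separately allocated
};
```

One consequence of storing by value is that references to elements can be invalidated when the array grows, so callers should not hold on to them across appends.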
Comment 18 Karl Tomlinson (:karlt) 2007-12-04 14:34:15 PST
Created attachment 291525
mathfontUnicode.properties support code v2

attachment 289878 + attachment 291523
Comment 19 Robert O'Callahan (:roc) (email my personal email if necessary) 2007-12-04 19:34:35 PST
Comment on attachment 291525
mathfontUnicode.properties support code v2

I had a look, but this is more of a rubber-stamp I'm afraid. We'll just have to go with it.
Comment 20 Karl Tomlinson (:karlt) 2007-12-04 21:13:55 PST
These have now been checked in:
1.48        layout/mathml/base/src/Makefile.in
1.19        layout/mathml/base/src/mathfont.properties
1.1         layout/mathml/base/src/mathfontSTIXNonUnicode.properties
1.1         layout/mathml/base/src/mathfontSTIXSize1.properties
1.1         layout/mathml/base/src/mathfontStandardSymbolsL.properties
1.1         layout/mathml/base/src/mathfontUnicode.properties
1.3         layout/mathml/base/src/mathml.pkg
1.133       layout/mathml/base/src/nsMathMLChar.cpp
1.30        layout/mathml/base/src/nsMathMLChar.h
1.34        layout/mathml/content/src/mathml.css
3.711       modules/libpref/src/init/all.js

I'll close this after making sure there are appropriate follow-up bugs.
Comment 21 Karl Tomlinson (:karlt) 2007-12-05 20:02:39 PST
The mathml.dtd should be updated to use Plane 1 and new Unicode code points, but that can all be done as part of bug 289938.
