Closed Bug 400938 Opened 17 years ago Closed 17 years ago

make MathML work with Unicode fonts

Categories

(Core :: MathML, defect, P1)

RESOLVED FIXED
mozilla1.9beta2

People

(Reporter: karlt, Assigned: karlt)

Attachments

(7 files, 4 obsolete files)

We now have, or will soon have, fonts with more complete Unicode mathematical
symbol support, which enables MathML font handling to be done in a more
standard way.

I see four categories of fonts:

1. True Unicode fonts:

   Each of these has a Unicode charmap that maps Unicode code points to the
   font's glyphs for characters that have a Unicode code point assignment.  If
   Mozilla's MathML can be made to work with any one of these fonts then it
   will (mostly) work with any Unicode font available today or in the future.
   (If there are glyphs for characters that do not have a Unicode assignment
   then the Unicode Private Use Area (PUA) range is used for those glyphs.)

   Fonts in this category suitable for use in MathML include Cambria Math and
   the (hopefully) soon-to-be-released STIX fonts.  Meiryo, DejaVu Sans,
   Lucida Sans Unicode, Open Symbol, and Adobe's Symbol font (not the Monotype
   version) also fit in this category but have lesser coverage of the
   mathematical characters.  However, some of these fonts don't necessarily
   have the Unicode mapping for some Unicode characters even though they have
   a glyph (a PUA mapping is used instead, perhaps because the Unicode
   assignment didn't exist at the time the font was designed).

2. PUA-only Unicode fonts:

   These fonts have a Unicode charmap but there are only mappings to the PUA.

   Fonts in this category include TrueType versions of Math1 (not Math2),
   Math3, Math4, Math5, Mathematica1 (not Mathematica2), Mathematica3,
   Mathematica4 (not Mathematica5), Mathematica6, and Mathematica7.

3. Fonts with a Microsoft Symbol charmap:

   This charmap is really a PUA-only Unicode charmap (and so this category is
   in many ways the same as Category 2) but the charmap is labelled
   differently (perhaps to make it clear that there is no defined meaning for
   these code points).

   The difficulty with relying on these fonts is that the platform libraries
   often (usually?) do not select this charmap but instead use another charmap
   that is incorrect and so these fonts end up falling into Category 4, below.
   If we explicitly selected the Microsoft Symbol charmap, then these fonts
   can be treated the same way as fonts in Category 2, but the current
   decision to keep the new non-Unicode behaviour in Bug 399636 would seem to
   preclude doing this as a general rule.  Perhaps, making Thebes search all
   (or some subset of) charmaps when mapping the character code points to
   glyphs would mean that these fonts could be used like Category 2 fonts.
   (This seems to be what currently happens on Windows.)

   Fonts in this category include Monotype's Symbol, Math2, Mathematica2,
   and Mathematica5.

4. Deceitful Fonts:

   These claim to have glyphs corresponding to certain characters (in a
   Unicode or other charmap) but in fact provide glyphs for completely
   different characters.  Fonts in this category are best not installed (as
   system fonts) because an application may innocently believe that
   the fonts provide these characters and not know that they are providing
   garbage.

   Fonts in this category include the TrueType versions of cmr10, cmmi10,
   cmsy10, and cmex10.


With fonts in Categories 2-4, using them to provide glyphs for mathematical
symbols requires explicit encoding tables to support each individual font.
Hopefully much of this can be avoided by adding general support for Category 1
fonts.
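
As a rough illustration of the four categories, here is a C++ sketch that
classifies a font from a summary of its charmap.  FontInfo, FontCategory, and
Classify are hypothetical names made up for this example, not Mozilla or
platform APIs.  The glyphsMatchCharmap flag stands in for knowledge that
cannot be detected automatically (it is what makes Category 4 fonts
troublesome), and only the BMP Private Use Area (U+E000 to U+F8FF) is checked.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one font; not a Mozilla or platform type.
struct FontInfo {
  bool hasUnicodeCharmap;        // the font offers a Unicode charmap
  bool hasSymbolCharmap;         // the font offers a Microsoft Symbol charmap
  bool glyphsMatchCharmap;       // false if glyphs depict other characters
  std::vector<char32_t> mapped;  // code points covered by the Unicode charmap
};

enum class FontCategory {
  TrueUnicode = 1,
  PuaOnly = 2,
  SymbolCharmap = 3,
  Deceitful = 4
};

// Private Use Area of the Basic Multilingual Plane.
inline bool IsPua(char32_t c) { return c >= 0xE000 && c <= 0xF8FF; }

FontCategory Classify(const FontInfo& f) {
  // A font with neither charmap is outside the scope of this sketch.
  assert(f.hasUnicodeCharmap || f.hasSymbolCharmap);
  if (!f.glyphsMatchCharmap) return FontCategory::Deceitful;
  if (!f.hasUnicodeCharmap) return FontCategory::SymbolCharmap;
  // Any mapping outside the PUA makes this a true Unicode font.
  for (char32_t c : f.mapped) {
    if (!IsPua(c)) return FontCategory::TrueUnicode;
  }
  return FontCategory::PuaOnly;
}
```

A font classified as TrueUnicode here could share one glyph table, while the
other three results would each need a font-specific table or be skipped.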

There are Unicode code point assignments for the separate parts of some
stretchy characters.  These include all the parts of left and right
parentheses, square brackets, and curly brackets, as well as top and bottom
half integral and integral extension.  There is a radical symbol bottom that
seems designed to join up (vertically) with "box drawings light down and
right".

I assume there are some stretchy characters that do not have assignments for
the desired sub-parts, and the Unicode code points do not provide a range of
sizes of characters, nor a range of angles for stretchy radical symbols.
However, we can aim to make things work as well as possible with Unicode code
points so that any Unicode font will work, but include some set of font/PUA
mappings so that MathML looks most beautiful with some special set of fonts.
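
To make the idea of building stretchy characters from Unicode parts concrete,
here is a small C++ sketch.  The code points are the Unicode assignments for
the bracket and integral pieces mentioned above; StretchyParts and
ExtensionCount are hypothetical names, and the heights are made-up units that
real code would take from font metrics.  Adding extensions in pairs around a
middle piece is an assumption of this sketch, not something the patches are
known to do.

```cpp
#include <cassert>
#include <cmath>

// Code points for the pieces of a vertically stretchy character.
// 'middle' is 0 when the character has no middle piece.
struct StretchyParts {
  char32_t top, middle, bottom, extension;
};

constexpr StretchyParts kLeftParen   {0x239B, 0,      0x239D, 0x239C};
constexpr StretchyParts kRightParen  {0x239E, 0,      0x23A0, 0x239F};
constexpr StretchyParts kLeftSquare  {0x23A1, 0,      0x23A3, 0x23A2};
constexpr StretchyParts kRightSquare {0x23A4, 0,      0x23A6, 0x23A5};
constexpr StretchyParts kLeftCurly   {0x23A7, 0x23A8, 0x23A9, 0x23AA};
constexpr StretchyParts kRightCurly  {0x23AB, 0x23AC, 0x23AD, 0x23AA};
constexpr StretchyParts kIntegral    {0x2320, 0,      0x2321, 0x23AE};

// Number of extension glyphs needed so that the assembled character is at
// least targetHeight tall.  fixedHeight is the combined height of the top,
// bottom, and (if present) middle pieces.
int ExtensionCount(const StretchyParts& parts, double fixedHeight,
                   double extensionHeight, double targetHeight) {
  assert(extensionHeight > 0);
  if (targetHeight <= fixedHeight) return 0;
  int count = static_cast<int>(
      std::ceil((targetHeight - fixedHeight) / extensionHeight));
  // Assumption: with a middle piece, keep the same number of extensions
  // above and below it so the character stays symmetric.
  if (parts.middle != 0 && count % 2 != 0) ++count;
  return count;
}
```

For example, with a fixed height of 20, an extension height of 5, and a target
of 42, a left parenthesis would need five extension glyphs.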


The way special mathematical characters in MathML worked for Mozilla 1.8 was
that Mozilla had its own PUA assignments which gfx then translated, through
explicit tables of the mappings for each of the fonts, to pick up the desired
glyphs.  It would be good to avoid having gfx assign its own meaning for PUA
code points as these code points may be used in web pages to select characters
with no Unicode assignments from the PUA of specific fonts.

With fonts in Categories 1 and 2 (and maybe 3), any required translation from
Mozilla-MathML code point assignments to font-specific code points can be done
in a cross-platform manner in the MathML implementation (rather than in
Thebes).
Flags: blocking1.9?
Depends on: 321438
Blocks: mathml-fonts
Blocks: 236880
Status: NEW → ASSIGNED
Assignee: nobody → mozbugz
Status: ASSIGNED → NEW
Depends on: 382542
Depends on: 401988
Blocks: cambria-math
Priority: -- → P1
Status: NEW → ASSIGNED
Flags: blocking1.9? → blocking1.9+
Depends on: 403718
NS_ASSERTION(ch != uchar, "glyph table incorrectly set -- duplicate found")
was firing when different sizes of a glyph (from different fonts) had the same
Unicode point, as needed for STIXSize* fonts.

This changes the assertion to check the entire nsGlyphCode (including font),
and implements an Exists method to make the implicit conversion to PRUnichar
unnecessary.
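
A minimal sketch of the idea, for readers without the patch at hand.
GlyphCode below is a hypothetical stand-in for nsGlyphCode (the real class may
differ in its members and types), and HasDuplicate is an illustrative helper,
not a function in nsMathMLChar.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for nsGlyphCode: a code point plus an index that
// says which font in the table's font list supplies the glyph.
struct GlyphCode {
  char16_t code;
  int font;

  // Explicit existence test.  An implicit conversion to the character type
  // would answer the same question but silently drop the font.
  bool Exists() const { return code != 0; }

  // Compare the whole glyph code, including the font.
  bool operator==(const GlyphCode& other) const {
    return code == other.code && font == other.font;
  }
  bool operator!=(const GlyphCode& other) const { return !(*this == other); }
};

// Returns true if two existing entries are identical in both code and font.
// The same code point coming from two different fonts is not a duplicate.
bool HasDuplicate(const GlyphCode* glyphs, int count) {
  assert(count >= 0);
  for (int i = 0; i < count; ++i) {
    for (int j = i + 1; j < count; ++j) {
      if (glyphs[i].Exists() && glyphs[i] == glyphs[j]) return true;
    }
  }
  return false;
}
```

With this comparison, three sizes of U+0028 taken from three different fonts
pass the check, which is the situation the STIXSize* fonts create.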
Attachment #288601 - Flags: review?(roc)
Comment on attachment 288601 [details] [diff] [review]
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

+  return BigOf(aPresContext, aChar, 1).Exists() != 0;

lose the != 0
Attachment #288601 - Flags: superreview+
Attachment #288601 - Flags: review?(roc)
Attachment #288601 - Flags: review+
Comment on attachment 288601 [details] [diff] [review]
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

Checked this patch in without the != 0:

1.131       mozilla/layout/mathml/base/src/nsMathMLChar.cpp
1.29        mozilla/layout/mathml/base/src/nsMathMLChar.h
Attached file mathfontUnicode.properties (obsolete) —
For mozilla/layout/mathml/base/src.

A glyph table for operators that can be stretched using only glyphs with Unicode code points.  This can be shared by Unicode fonts.
For mozilla/layout/mathml/base/src.

Fonts with entries in the PUA still need their own glyph tables.
No Mozilla-MathML-PUA to custom-font-code mapping is done in gfx now, so the tables need to specify the PUA codes for the font directly rather than using the \uF8FF indirect reference trick.  In some ways, this makes the system more
flexible as it avoids the need to have the encoding compiled in.

This file is based on the mathfontSymbol.properties file, but the entry for U+005F (low line, _) is changed to U+0332 (combining low line) because U+005F is not stretchy.
Some stretchy chars that can be built using STIXNonUnicode.

This provides for all the stretchy characters that could be built from parts
using the CMSY10 and CMEX10 properties files, but cannot be built from Unicode
chars, except for radicals.  (I need to add l/r moustache U+23B0/U+23B1 to the
Unicode table.)
Support for building radicals from parts, and some variant sizes of brackets.
There are more characters in the STIXSize* files but this gives a good start.
Attachment #289861 - Attachment mime type: application/octet-stream → text/plain
It will be possible to use the Math1 and Math4 fonts or, preferably, the newer
Mathematica1 and Mathematica4 fonts for stretchy chars if the property files
are updated/provided, but as they currently reference the old Mozilla-MathML
PUA assignments they are best disabled for now.

I have no intention of using the (category 4) TeX fonts at this stage.

It looks like MT Extra wasn't being used for any stretchy chars.

I'm removing Symbol from the mathml.css file as the Monotype version (for
Windows) is not Unicode (Bug 399636).  The Adobe and Apple versions of Symbol
will still be used for font fallback if no other fonts support the character.
Attachment #289876 - Flags: review?(pavlov)
Enables use of the mathfontUnicode.properties glyph table for fonts that don't
have/need a special glyph table.  The algorithm used is to iterate through the
ordered list of preferred font families.  If the font family has its own glyph table then that table is used with that family to see if the character is of the correct size.  If the font does not have its own glyph table, then the Unicode table is used and glyphs are found using the complete list of font families.
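
The table selection described above can be sketched roughly as follows.  This
is not the code in the patch: GlyphTable, SearchStep, and PlanSearch are
hypothetical names, and queuing the shared Unicode table only once (at the
first family without its own table) is a simplifying assumption of this
sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical glyph table, identified only by a name in this sketch.
struct GlyphTable {
  std::string name;
};

// One step of the search: a table plus the font families to look in.
struct SearchStep {
  const GlyphTable* table;
  std::vector<std::string> families;
};

std::vector<SearchStep> PlanSearch(
    const std::vector<std::string>& preferred,
    const std::map<std::string, GlyphTable>& ownTables,
    const GlyphTable& unicodeTable) {
  std::vector<SearchStep> plan;
  bool unicodeQueued = false;
  for (const std::string& family : preferred) {
    auto it = ownTables.find(family);
    if (it != ownTables.end()) {
      // The family has its own table: use it with that family only.
      plan.push_back({&it->second, {family}});
    } else if (!unicodeQueued) {
      // No special table: use the shared Unicode table and let glyphs
      // come from any family in the complete preferred list.
      plan.push_back({&unicodeTable, preferred});
      unicodeQueued = true;
    }
  }
  assert(plan.size() <= preferred.size());  // at most one step per family
  return plan;
}
```

The returned pointers refer to the tables passed in, so those must outlive the
plan.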

The stretch searching algorithm is changed a little to prefer building
from parts in a preferred font over using variants in other fonts.
The search among the variants within one font has been changed to continue
searching for the best fit even if an approximate fit has already been found.
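
The difference between stopping at an approximate fit and searching for the
best fit can be sketched like this.  Both functions and the absolute tolerance
are hypothetical, and the heights are made-up numbers; the real code works
with glyph metrics and its own notion of an acceptable fit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stops at the first size variant within 'tolerance' of the target height.
// Returns the variant's index, or -1 if none is close enough.
int FirstApproximateVariant(const std::vector<double>& heights,
                            double target, double tolerance) {
  assert(tolerance >= 0);
  for (std::size_t i = 0; i < heights.size(); ++i) {
    if (std::fabs(heights[i] - target) <= tolerance) {
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Examines every variant and returns the index of the closest one,
// or -1 if there are no variants.
int BestVariant(const std::vector<double>& heights, double target) {
  int best = -1;
  double bestError = 0;
  for (std::size_t i = 0; i < heights.size(); ++i) {
    double error = std::fabs(heights[i] - target);
    if (best < 0 || error < bestError) {
      best = static_cast<int>(i);
      bestError = error;
    }
  }
  return best;
}
```

With variant heights 10, 14, 18, 22 and a target of 17, an early exit with a
tolerance of 4 settles on the 14-unit variant, while the full search picks the
18-unit one.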

Preferred fonts now come only from preferences rather than from
mathfont.properties, and the values are not cached, so changes take effect
without a restart.

The ≤ ≥ and minus issues seemed related to TeX fonts, which are no longer
being used, so we shouldn't need special fonts for these.
Attachment #289878 - Flags: review?(pavlov)
Attachment #289879 - Flags: review?(pavlov)
Attachment #289859 - Attachment is obsolete: true
Attachment #289861 - Flags: review?(pavlov)
Attachment #289872 - Flags: review?(pavlov)
Attachment #289873 - Flags: review?(pavlov)
Added missing semicolons to all.js, and STIXSize1 is now preferred to
STIXGeneral so that variant sizes are used, when available and suitable, in
preference to building from parts.
Attachment #289876 - Attachment is obsolete: true
Attachment #290258 - Flags: review?(pavlov)
Attachment #289876 - Flags: review?(pavlov)
Attachment #288601 - Attachment description: don't implicitly lossy convert from nsGlyphCode to PRUnichar → don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]
Comment on attachment 289861 [details]
mathfontStandardSymbolsL.properties

rubberstamp=me
Attachment #289861 - Flags: review?(pavlov) → review+
Attachment #289872 - Flags: review?(pavlov) → review+
Attachment #289873 - Flags: review?(pavlov) → review+
Attachment #289879 - Flags: review?(pavlov) → review+
So, can the .properties changes land now, or do they need to have the other patches in first?
(In reply to comment #13)

There's not much point really, as they won't be installed without attachment 290258 [details] [diff] [review].  And (as it is now) that attachment prevents mathfontPUA.properties from installing, which is needed until we have attachment 289878 [details] [diff] [review].
Comment on attachment 289878 [details] [diff] [review]
mathfontUnicode.properties support code

Would it be worthwhile to change the nsTArray<nsGlyphTable*> mTableList to be nsTArray<nsGlyphTable> and stop doing the extra allocation?


The rest of this looks OK, but the patch is huge and I'm sure I've missed something.  Might want to get another set of eyes to look it over real quick.
Attachment #290258 - Flags: review?(pavlov) → review+
Comment on attachment 289878 [details] [diff] [review]
mathfontUnicode.properties support code

roc, can you provide another set of eyes, please?

I don't think nsTArray<nsGlyphTable*> is any worse than the nsVoidArray that was there, but I think we can do a nsTArray<nsGlyphTable> by providing an Open or Init method.  I'll put together an incremental patch for that.
Attachment #289878 - Flags: superreview?(roc)
Attached patch use nsTArray<nsGlyphTable> (obsolete) — Splinter Review
I now see nsTArray can AppendElement with any copy constructor, so we can sneak in without a separate Init method.
The separate memory allocation for mUnicodeTable can also be avoided.
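
For illustration only, the same idea using std::vector as a stand-in for
nsTArray (whose API is not reproduced here).  GlyphTable and GlyphTableList
are hypothetical simplifications of the real classes; the point is that
elements stored by value are copy-constructed into the array, so no separate
allocation per table object and no Init method are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified glyph table that is fully set up by its
// constructor and can be copied.
struct GlyphTable {
  explicit GlyphTable(const std::string& fontName) : mFontName(fontName) {}
  std::string mFontName;
};

class GlyphTableList {
 public:
  // Appends a table by value and returns its index.  Indexes stay valid
  // when the vector grows; pointers to elements would not.
  std::size_t AddTable(const std::string& fontName) {
    mTables.push_back(GlyphTable(fontName));
    return mTables.size() - 1;
  }

  const GlyphTable& TableAt(std::size_t index) const {
    assert(index < mTables.size());
    return mTables[index];
  }

  std::size_t Count() const { return mTables.size(); }

 private:
  std::vector<GlyphTable> mTables;  // tables live inside the array's storage
};
```

One consequence of storing by value is that callers should hold indexes (or
look tables up again) rather than keep raw pointers across appends.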
Attachment #291523 - Flags: review?(pavlov)
attachment 289878 [details] [diff] [review] + attachment 291523 [details] [diff] [review]
Attachment #289878 - Attachment is obsolete: true
Attachment #291523 - Attachment is obsolete: true
Attachment #291525 - Flags: superreview?(roc)
Attachment #291525 - Flags: review?(pavlov)
Attachment #291523 - Flags: review?(pavlov)
Attachment #289878 - Flags: superreview?(roc)
Attachment #289878 - Flags: review?(pavlov)
Attachment #291525 - Flags: review?(pavlov) → review+
Comment on attachment 291525 [details] [diff] [review]
mathfontUnicode.properties support code v2

I had a look, but this is more of a rubber-stamp I'm afraid. We'll just have to go with it.
Attachment #291525 - Flags: superreview?(roc) → superreview+
These have now been checked in:
1.48        layout/mathml/base/src/Makefile.in
1.19        layout/mathml/base/src/mathfont.properties
1.1         layout/mathml/base/src/mathfontSTIXNonUnicode.properties
1.1         layout/mathml/base/src/mathfontSTIXSize1.properties
1.1         layout/mathml/base/src/mathfontStandardSymbolsL.properties
1.1         layout/mathml/base/src/mathfontUnicode.properties
1.3         layout/mathml/base/src/mathml.pkg
1.133       layout/mathml/base/src/nsMathMLChar.cpp
1.30        layout/mathml/base/src/nsMathMLChar.h
1.34        layout/mathml/content/src/mathml.css
3.711       modules/libpref/src/init/all.js

I'll close this after making sure there are appropriate follow-up bugs.
Blocks: 407059
Target Milestone: --- → mozilla1.9 M10
Blocks: 407101
The mathml.dtd should be updated to use Plane 1 and new Unicode code points, but that can all be done as part of bug 289938.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Blocks: asana-math
Depends on: 408356
Depends on: 410284
Blocks: 483116
Depends on: 663740
No longer depends on: 663740