make MathML work with Unicode fonts

RESOLVED FIXED in mozilla1.9beta2

Status

Product: Core
Component: MathML
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Opened: 10 years ago
Modified: 6 years ago

People

(Reporter: karlt, Assigned: karlt)

Tracking

(Blocks: 1 bug)

Version: Trunk
Target Milestone: mozilla1.9beta2
Bug Flags: blocking1.9 +
Firefox Tracking Flags: (Not tracked)

Details

Attachments

(7 attachments, 4 obsolete attachments)

14.88 KB, patch (roc: review+)
3.81 KB, text/plain (Stuart Parmenter: review+)
5.22 KB, text/plain (Stuart Parmenter: review+)
2.96 KB, text/plain (Stuart Parmenter: review+)
6.56 KB, text/plain (Stuart Parmenter: review+)
6.75 KB, patch (Stuart Parmenter: review+)
109.61 KB, patch (Stuart Parmenter: review+)
(Assignee)

Description

10 years ago
We now have or will soon have available fonts with more complete Unicode
mathematical symbol support, which enables MathML font handling to be done in
a more standard way.

I see four categories of fonts:

1. True Unicode fonts:

   Each of these has a Unicode charmap that links Unicode code points to the
   font's glyphs for characters that have a Unicode code point assignment.  If
   Mozilla's MathML can be made to work with any one of these fonts then it
   will (mostly) work with any Unicode font available today or in the future.
   (If there are glyphs for characters that do not have a Unicode assignment
   then the Unicode Private Use Area (PUA) range is used for those glyphs.)

   Fonts in this category suitable for use in MathML include Cambria Math and
   the (hopefully) soon-to-be-released STIX fonts.  Meiryo, DejaVu Sans,
   Lucida Sans Unicode, Open Symbol, and Adobe's Symbol font (not the Monotype
   version) also fit in this category but have lesser coverage of the
   mathematical characters.  However, some of these fonts don't necessarily
   have the Unicode mapping for some Unicode characters even though they have
   a glyph (a PUA mapping is used instead, perhaps because the Unicode
   assignment didn't exist at the time the font was designed).

2. PUA-only Unicode fonts:

   These fonts have a Unicode charmap but there are only mappings to the PUA.

   Fonts in this category include TrueType versions of Math1 (not Math2),
   Math3, Math4, Math5, Mathematica1 (not Mathematica2), Mathematica3,
   Mathematica4 (not Mathematica5), Mathematica6, and Mathematica7.

3. Fonts with a Microsoft Symbol charmap:

   This charmap is really a PUA-only Unicode charmap (and so this category is
   in many ways the same as Category 2) but the charmap is labelled
   differently (perhaps to make it clear that there is no defined meaning for
   these code points).

   The difficulty with relying on these fonts is that the platform libraries
   often (usually?) do not select this charmap but instead use another charmap
   that is incorrect and so these fonts end up falling into Category 4, below.
   If we explicitly selected the Microsoft Symbol charmap, then these fonts
   can be treated the same way as fonts in Category 2, but the current
   decision to keep the new non-Unicode behaviour in Bug 399636 would seem to
   preclude doing this as a general rule.  Perhaps making Thebes search all
   (or some subset of) charmaps when mapping the character code points to
   glyphs would mean that these fonts could be used like Category 2 fonts.
   (This seems to be what currently happens on Windows.)

   Fonts in this category include Monotype's Symbol, Math2, Mathematica2,
   and Mathematica5.

4. Deceitful Fonts:

   These claim to have glyphs corresponding to certain characters (in a
   Unicode or other charmap) but in fact provide glyphs for completely
   different characters.  Fonts in this category are best not installed (as
   system fonts) because an application may innocently believe that
   the fonts provide these characters and not know that they are providing
   garbage.

   Fonts in this category include the TrueType versions of cmr10, cmmi10,
   cmsy10, and cmex10.


With fonts in Categories 2-4, using them to provide glyphs for mathematical
symbols requires explicit encoding tables to support each individual font.
Hopefully much of this can be avoided by adding general support for Category 1
fonts.
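
The distinction between Categories 1 and 2 can be sketched from the set of code points a font's Unicode charmap covers. This is only an illustration: the names CharmapKind, IsPrivateUse, and ClassifyCharmap are hypothetical, not part of gfx or Thebes. Categories 3 and 4 cannot be detected this way, since Category 3 depends on how the charmap is labelled and Category 4 fonts misreport what their glyphs are.

```cpp
#include <cassert>
#include <vector>

// Hypothetical labels; only Categories 1 and 2 are distinguishable here.
enum class CharmapKind { Empty, PuaOnly, HasUnicodeAssignments };

// True if cp lies in one of the three Unicode Private Use ranges.
// Precondition: cp is a valid Unicode code point.
inline bool IsPrivateUse(char32_t cp) {
  assert(cp <= 0x10FFFF);
  return (cp >= 0xE000 && cp <= 0xF8FF) ||     // BMP Private Use Area
         (cp >= 0xF0000 && cp <= 0xFFFFD) ||   // Supplementary PUA-A
         (cp >= 0x100000 && cp <= 0x10FFFD);   // Supplementary PUA-B
}

// Classify a font from the code points its Unicode charmap claims to cover.
inline CharmapKind ClassifyCharmap(const std::vector<char32_t>& mapped) {
  if (mapped.empty()) return CharmapKind::Empty;
  for (char32_t cp : mapped) {
    // A single non-PUA mapping is enough to rule out "PUA-only".
    if (!IsPrivateUse(cp)) return CharmapKind::HasUnicodeAssignments;
  }
  return CharmapKind::PuaOnly;
}
```

A font that maps U+221A plus some PUA code points is reported as HasUnicodeAssignments, which matches the note above that some Category 1 fonts also keep glyphs in the PUA.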

There are Unicode code point assignments for the separate parts of some
stretchy characters.  These include all the parts of left and right
parentheses, square brackets, and curly brackets, as well as top and bottom
half integral and integral extension.  There is a radical symbol bottom that
seems designed to join up (vertically) with "box drawings light down and
right".

I assume there are some stretchy characters that do not have assignments for
the desired sub-parts, and the Unicode code points do not provide a range of
sizes of characters, nor a range of angles for stretchy radical symbols.
However, we can aim to make things work as well as possible with Unicode code
points so that any Unicode font will work, but include some set of font/PUA
mappings so that MathML looks most beautiful with some special set of fonts.
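
As a rough sketch of building a stretchy character from Unicode parts, the following assembles a left parenthesis from U+239B (upper hook), U+239C (extension), and U+239D (lower hook). The PartHeights struct and BuildLeftParen function are hypothetical, and the heights are caller-supplied numbers rather than real font metrics. An actual implementation would also need to handle overlap and positioning of the pieces.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Unicode code points for the pieces of a left parenthesis.
constexpr char32_t kLeftParenUpperHook = 0x239B;
constexpr char32_t kLeftParenExtension = 0x239C;
constexpr char32_t kLeftParenLowerHook = 0x239D;

// Glyph heights in arbitrary units (hypothetical, supplied by the caller).
struct PartHeights {
  double top;
  double extension;
  double bottom;
};

// Returns the upper hook, enough extension pieces to reach at least
// targetHeight, then the lower hook, listed from top to bottom.
inline std::u32string BuildLeftParen(const PartHeights& h,
                                     double targetHeight) {
  assert(h.extension > 0.0);
  double gap = targetHeight - h.top - h.bottom;
  int extensions =
      gap > 0.0 ? static_cast<int>(std::ceil(gap / h.extension)) : 0;
  std::u32string parts;
  parts.push_back(kLeftParenUpperHook);
  parts.append(static_cast<std::size_t>(extensions), kLeftParenExtension);
  parts.push_back(kLeftParenLowerHook);
  return parts;
}
```

Because only assigned code points are used, the same sequence should work with any font that provides glyphs for U+239B to U+239D.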


The way special mathematical characters in MathML worked for Mozilla 1.8 was
that Mozilla had its own PUA assignments which gfx then translated, through
explicit tables of the mappings for each of the fonts, to pick up the desired
glyphs.  It would be good to avoid having gfx assign its own meaning for PUA
code points as these code points may be used in web pages to select characters
with no Unicode assignments from the PUA of specific fonts.

With fonts in Categories 1 and 2 (and maybe 3), any required translation from
Mozilla-MathML code point assignments to font-specific code points can be done
in a cross-platform manner in the MathML implementation (rather than in
Thebes).
(Assignee)

Updated

10 years ago
Flags: blocking1.9?
(Assignee)

Updated

10 years ago
Depends on: 321438
(Assignee)

Updated

10 years ago
Blocks: 295193
(Assignee)

Updated

10 years ago
Blocks: 236880
(Assignee)

Updated

10 years ago
Status: NEW → ASSIGNED
(Assignee)

Updated

10 years ago
Assignee: nobody → mozbugz
Status: ASSIGNED → NEW
(Assignee)

Updated

10 years ago
Depends on: 382542
(Assignee)

Updated

10 years ago
Depends on: 401988
(Assignee)

Updated

10 years ago
Blocks: 372351
(Assignee)

Updated

10 years ago
Priority: -- → P1
(Assignee)

Updated

10 years ago
Status: NEW → ASSIGNED
Flags: blocking1.9? → blocking1.9+
(Assignee)

Updated

10 years ago
Depends on: 403718
(Assignee)

Comment 1

10 years ago
Created attachment 288601 [details] [diff] [review]
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

NS_ASSERTION(ch != uchar, "glyph table incorrectly set -- duplicate found")
was firing when different sizes of a glyph (from different fonts) had the same
Unicode code point, as needed for STIXSize* fonts.

This changes the assertion to check the entire nsGlyphCode (including font),
and implements an Exists method to make the implicit conversion to PRUnichar
unnecessary.
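
The idea can be sketched with a simplified stand-in for nsGlyphCode (the GlyphCode struct below is illustrative, not the code from the patch): equality compares both the code point and the font, and Exists() replaces the implicit conversion to a bare character.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for nsGlyphCode: a code point plus an index saying
// which font the glyph comes from.
struct GlyphCode {
  char16_t code;
  std::int32_t font;

  // Explicit presence test, so callers need no lossy conversion to a
  // bare code unit.
  bool Exists() const { return code != 0; }

  // Two glyphs are duplicates only if both code point and font match.
  bool operator==(const GlyphCode& other) const {
    return code == other.code && font == other.font;
  }
  bool operator!=(const GlyphCode& other) const { return !(*this == other); }
};
```

With this comparison, two size variants of the same character taken from different fonts no longer look like duplicates.
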
Attachment #288601 - Flags: review?(roc)
Comment on attachment 288601 [details] [diff] [review]
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

+  return BigOf(aPresContext, aChar, 1).Exists() != 0;

lose the != 0
Attachment #288601 - Flags: superreview+
Attachment #288601 - Flags: review?(roc)
Attachment #288601 - Flags: review+
(Assignee)

Comment 3

10 years ago
Comment on attachment 288601 [details] [diff] [review]
don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

Checked this patch in without the != 0:

1.131       mozilla/layout/mathml/base/src/nsMathMLChar.cpp
1.29        mozilla/layout/mathml/base/src/nsMathMLChar.h
(Assignee)

Comment 4

10 years ago
Created attachment 289859 [details]
mathfontUnicode.properties

For mozilla/layout/mathml/base/src.

A glyph table for operators that can be stretched using only glyphs with Unicode code points.  This can be shared by Unicode fonts.
(Assignee)

Comment 5

10 years ago
Created attachment 289861 [details]
mathfontStandardSymbolsL.properties

For mozilla/layout/mathml/base/src.

Fonts with entries in the PUA still need their own glyph tables.
No Mozilla-MathML-PUA to custom-font-code mapping is done in gfx now, so the tables need to specify the PUA codes for the font directly rather than using the \uF8FF indirect reference trick.  In some ways, this makes the system more flexible as it avoids the need to have the encoding compiled in.

This file is based on the mathfontSymbol.properties file but the entry for U+005F _ is changed to U+0332 _ because _ is not stretchy.
(Assignee)

Comment 6

10 years ago
Created attachment 289872 [details]
mathfontSTIXNonUnicode.properties

Some stretchy chars that can be built using STIXNonUnicode.

This provides for all the stretchy characters that could be built from parts
using the CMSY10 and CMEX10 properties files, but cannot be built from Unicode
chars, except for radicals.  (I need to add l/r moustache U+23B0/U+23B1 to the
Unicode table.)
(Assignee)

Comment 7

10 years ago
Created attachment 289873 [details]
mathfontSTIXSize1.properties

Support for building radicals from parts, and some variant sizes of brackets.
There are more characters in the STIXSize* files but this gives a good start.
(Assignee)

Updated

10 years ago
Attachment #289861 - Attachment mime type: application/octet-stream → text/plain
(Assignee)

Comment 8

10 years ago
Created attachment 289876 [details] [diff] [review]
Update list of property files and preferred fonts

It will be possible to use the Math1 and Math4 or preferably newer Mathematica1
and Mathematica4 fonts for stretchy chars if the property files are
updated/provided, but as they currently reference the old Mozilla-MathML PUA
assignments they are best disabled for now.

I have no intention of using the (category 4) TeX fonts at this stage.

It looks like MT Extra wasn't being used for any stretchy chars.

I'm removing Symbol from the mathml.css file as the Monotype version (for
Windows) is not Unicode (Bug 399636).  The Adobe and Apple versions of Symbol
will still be used for font fallback if no other fonts support the character.
Attachment #289876 - Flags: review?(pavlov)
(Assignee)

Comment 9

10 years ago
Created attachment 289878 [details] [diff] [review]
mathfontUnicode.properties support code

Enables use of the mathfontUnicode.properties glyph table for fonts that don't
have/need a special glyph table.  The algorithm used is to iterate through the
ordered list of preferred font families.  If the font family has its own glyph table then that table is used with that family to see if the character is of the correct size.  If the font does not have its own glyph table, then the Unicode table is used and glyphs are found using the complete list of font families.
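
A minimal sketch of that table selection, using hypothetical names (LookupStep, PlanLookups) and plain strings in place of the real glyph table objects. It assumes the shared Unicode table only needs to be planned once, since repeating it with the same family list would give the same result; the patch itself may do this differently.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one lookup step: which glyph table to
// consult and which font families to try with it.
struct LookupStep {
  std::string table;
  std::vector<std::string> families;
};

// families:  ordered list of preferred font families.
// ownTables: family name -> name of its dedicated glyph table, if any.
inline std::vector<LookupStep> PlanLookups(
    const std::vector<std::string>& families,
    const std::map<std::string, std::string>& ownTables) {
  std::vector<LookupStep> steps;
  bool unicodePlanned = false;  // Assumption: plan the shared table once.
  for (const std::string& family : families) {
    auto it = ownTables.find(family);
    if (it != ownTables.end()) {
      // A dedicated table is used only with its own family.
      steps.push_back({it->second, {family}});
    } else if (!unicodePlanned) {
      // The shared Unicode table is searched across every family.
      steps.push_back({"Unicode", families});
      unicodePlanned = true;
    }
  }
  return steps;
}
```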

The stretch searching algorithm is changed a little to prefer building
from parts in a preferred font over using variants in other fonts.
The search in variants within one font has been changed to continue searching
for the best even if an approximate fit was found.
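
The change to the variant search can be illustrated as follows. This is a sketch only: BestVariant is a hypothetical helper, sizes are plain numbers, and "best" is taken to mean the smallest absolute difference from the target, which may not be the exact measure the patch uses. The point is that the loop examines every variant instead of stopping at the first approximate fit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the variant whose size is closest to target, or -1
// if there are no variants.  Every variant is examined; the loop does not
// stop at the first one that is merely close enough.
inline int BestVariant(const std::vector<double>& sizes, double target) {
  int best = -1;
  double bestError = 0.0;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    double error = std::fabs(sizes[i] - target);
    if (best < 0 || error < bestError) {
      best = static_cast<int>(i);
      bestError = error;
    }
  }
  return best;
}
```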

Preferred fonts are now read only from preferences rather than from
mathfont.properties, and the values are not cached, so changes take effect
without a restart.

The ≤ ≥ and minus issues seemed related to TeX fonts, which are no longer
being used, so we shouldn't need special fonts for these.
Attachment #289878 - Flags: review?(pavlov)
(Assignee)

Comment 10

10 years ago
Created attachment 289879 [details]
mathfontUnicode.properties with l/r moustache
Attachment #289879 - Flags: review?(pavlov)
(Assignee)

Updated

10 years ago
Attachment #289859 - Attachment is obsolete: true
(Assignee)

Updated

10 years ago
Attachment #289861 - Flags: review?(pavlov)
(Assignee)

Updated

10 years ago
Attachment #289872 - Flags: review?(pavlov)
(Assignee)

Updated

10 years ago
Attachment #289873 - Flags: review?(pavlov)
(Assignee)

Comment 11

10 years ago
Created attachment 290258 [details] [diff] [review]
Update list of property files and preferred fonts 1.1

Added missing semicolons to all.js, and preferring STIXSize1 to STIXGeneral,
so variant sizes are used, when available and suitable, in preference to
building from parts.
Attachment #289876 - Attachment is obsolete: true
Attachment #290258 - Flags: review?(pavlov)
Attachment #289876 - Flags: review?(pavlov)
(Assignee)

Updated

10 years ago
Attachment #288601 - Attachment description: don't implicitly lossy convert from nsGlyphCode to PRUnichar → don't implicitly lossy convert from nsGlyphCode to PRUnichar [checked in]

Comment 12

10 years ago
Comment on attachment 289861 [details]
mathfontStandardSymbolsL.properties

rubberstamp=me
Attachment #289861 - Flags: review?(pavlov) → review+

Updated

10 years ago
Attachment #289872 - Flags: review?(pavlov) → review+

Updated

10 years ago
Attachment #289873 - Flags: review?(pavlov) → review+

Updated

10 years ago
Attachment #289879 - Flags: review?(pavlov) → review+
So, can the .properties changes land now, or do they need to have the other patches in first?
(Assignee)

Comment 14

10 years ago
(In reply to comment #13)

There's not much point really, as they won't be installed without attachment 290258 [details] [diff] [review].  And (as it is now) that attachment prevents mathfontPUA.properties from installing, which is needed until we have attachment 289878 [details] [diff] [review].

Comment 15

10 years ago
Comment on attachment 289878 [details] [diff] [review]
mathfontUnicode.properties support code

Would it be worthwhile to change the nsTArray<nsGlyphTable*> mTableList to be nsTArray<nsGlyphTable> and stop doing the extra allocation?

The rest of this looks OK, but the patch is huge and I'm sure I've missed something.  Might want to get another set of eyes to look it over real quick.

Updated

10 years ago
Attachment #290258 - Flags: review?(pavlov) → review+
(Assignee)

Comment 16

10 years ago
Comment on attachment 289878 [details] [diff] [review]
mathfontUnicode.properties support code

roc, can you provide another set of eyes, please?

I don't think nsTArray<nsGlyphTable*> is any worse than the nsVoidArray that was there, but I think we can do a nsTArray<nsGlyphTable> by providing an Open or Init method.  I'll put together an incremental patch for that.
Attachment #289878 - Flags: superreview?(roc)
(Assignee)

Comment 17

10 years ago
Created attachment 291523 [details] [diff] [review]
use nsTArray<nsGlyphTable>

I now see nsTArray can AppendElement with any copy constructor, so we can sneak in without a separate Init method.
The separate memory allocation for mUnicodeTable can also be avoided.
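
The storage change can be sketched with std::vector standing in for nsTArray and a simplified GlyphTable standing in for nsGlyphTable (both are substitutions for illustration, not the patch itself). Elements are constructed directly inside the array from the constructor argument, so there is no separate heap allocation per table object and no pointers to delete by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for nsGlyphTable, constructible from a family name.
class GlyphTable {
 public:
  explicit GlyphTable(std::string family) : mFamily(std::move(family)) {}
  const std::string& Family() const { return mFamily; }

 private:
  std::string mFamily;
};

// Holds its tables by value in one array instead of an array of pointers.
class GlyphTableList {
 public:
  // Constructs the table in place from the family name.
  void AddTable(const std::string& family) { mTables.emplace_back(family); }

  std::size_t Count() const { return mTables.size(); }

  const GlyphTable& TableAt(std::size_t i) const {
    assert(i < mTables.size());
    return mTables[i];
  }

 private:
  std::vector<GlyphTable> mTables;
};
```

One consequence of storing by value is that references to elements can be invalidated when the array grows, so callers should not hold on to them across AddTable calls.
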
Attachment #291523 - Flags: review?(pavlov)
(Assignee)

Comment 18

10 years ago
Created attachment 291525 [details] [diff] [review]
mathfontUnicode.properties support code v2

attachment 289878 [details] [diff] [review] + attachment 291523 [details] [diff] [review]
Attachment #289878 - Attachment is obsolete: true
Attachment #291523 - Attachment is obsolete: true
Attachment #291525 - Flags: superreview?(roc)
Attachment #291525 - Flags: review?(pavlov)
Attachment #291523 - Flags: review?(pavlov)
Attachment #289878 - Flags: superreview?(roc)
Attachment #289878 - Flags: review?(pavlov)

Updated

10 years ago
Attachment #291525 - Flags: review?(pavlov) → review+
Comment on attachment 291525 [details] [diff] [review]
mathfontUnicode.properties support code v2

I had a look, but this is more of a rubber-stamp I'm afraid. We'll just have to go with it.
Attachment #291525 - Flags: superreview?(roc) → superreview+
(Assignee)

Comment 20

10 years ago
These have now been checked in:
1.48        layout/mathml/base/src/Makefile.in
1.19        layout/mathml/base/src/mathfont.properties
1.1         layout/mathml/base/src/mathfontSTIXNonUnicode.properties
1.1         layout/mathml/base/src/mathfontSTIXSize1.properties
1.1         layout/mathml/base/src/mathfontStandardSymbolsL.properties
1.1         layout/mathml/base/src/mathfontUnicode.properties
1.3         layout/mathml/base/src/mathml.pkg
1.133       layout/mathml/base/src/nsMathMLChar.cpp
1.30        layout/mathml/base/src/nsMathMLChar.h
1.34        layout/mathml/content/src/mathml.css
3.711       modules/libpref/src/init/all.js

I'll close this after making sure there are appropriate follow-up bugs.
Depends on: 406927
(Assignee)

Updated

10 years ago
Blocks: 407059
(Assignee)

Updated

10 years ago
Target Milestone: --- → mozilla1.9 M10
(Assignee)

Updated

10 years ago
Blocks: 407101
(Assignee)

Comment 21

10 years ago
The mathml.dtd should be updated to use Plane 1 and new Unicode code points, but that can all be done as part of bug 289938.
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
(Assignee)

Updated

10 years ago
Blocks: 407439

Updated

10 years ago
Depends on: 408356
(Assignee)

Updated

10 years ago
Depends on: 410284
(Assignee)

Updated

9 years ago
Blocks: 483116
(Assignee)

Updated

6 years ago
Depends on: 663740
(Assignee)

Updated

6 years ago
No longer depends on: 663740