MathML Script Alphabet and other symbols are rendered as hex boxes with STIX fonts

RESOLVED FIXED

Status

P3
normal
RESOLVED FIXED
11 years ago
11 years ago

People

(Reporter: distler, Assigned: jtd)

Tracking

Trunk
PowerPC
Mac OS X
Points:
---
Bug Flags:
blocking1.9 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

(Reporter)

Description

11 years ago
User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.5; en-US; rv:1.9b3pre) Gecko/2008020201 SeaMonkey/2.0a1pre
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.5; en-US; rv:1.9b3pre) Gecko/2008020201 SeaMonkey/2.0a1pre

With the STIX fonts installed, other Math alphabets (Double-Struck, Fraktur, ...) seem to work, but Script does not seem to work. Only the characters from the BMP show up. The characters from Plane-1 do not.

You can see them just fine when you "View Source."

Reproducible: Always

(Reporter)

Comment 1

11 years ago
Created attachment 301819 [details]
testcase
(Reporter)

Comment 2

11 years ago
Created attachment 301822 [details]
screenshot
This looks like the Mac version of bug 382542.

The STIXGeneral font family has glyphs for MATHEMATICAL SCRIPT characters U+1D49C - U+1D4CF in the font face for the italic subfamily but no corresponding glyphs in the roman subfamily.

Does this render any better?

data:text/html,<p%20style="font-style:italic;font-family:STIXGeneral">&#x1D49C;</p>

(Not that either this or mathvariant:italic is an appropriate work-around.)
Status: UNCONFIRMED → NEW
Depends on: 382542
Ever confirmed: true
Flags: blocking1.9?
Priority: -- → P2
(Reporter)

Comment 4

11 years ago
Yeah, sure, that renders just fine.

Omitting the "font-style:italic", as you'd expect, yields a placeholder character.

Updated

11 years ago
Component: MathML → GFX: Thebes
QA Contact: mathml → thebes

Updated

11 years ago
Blocks: 324857
Flags: blocking1.9? → blocking1.9+
Priority: P2 → P3

Updated

11 years ago
Summary: MathML Script Alphabet not supported → MathML Script Alphabet and other symbols are rendered as hex boxes with STIX fonts
This bug will affect more than just STIX fonts.  (see e.g. bug 382542)
(Assignee)

Comment 7

11 years ago
Font matching for the testcase textruns that generate hex boxen:

InitTextRun 3d1bf9f0 fontgroup 3d12ae30 (STIXGeneral,Cambria,Cambria Math,DejaVu Serif,DejaVu Sans,Times,Lucida Sans Unicode,OpenSymbol,Standard Symbols L,serif,serif) lang: x-western len 45 TEXTRUN " 
Assignee: nobody → jdaggett
(Assignee)

Comment 8

11 years ago
Wow, cool, surrogate pairs cause bugzilla to burp!


Font matching for the testcase textruns that generate hex boxen:

InitTextRun 3d1bf9f0 fontgroup 3d12ae30 (STIXGeneral,Cambria,Cambria Math,DejaVu Serif,DejaVu Sans,Times,Lucida Sans Unicode,OpenSymbol,Standard Symbols L,serif,serif) lang: x-western len 45 
InitTextRun font: STIXGeneral
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10ac30 match STIXGeneral (1-1)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (2-2)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (4-1)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (5-4)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (9-2)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (11-2)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (13-2)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (15-4)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (19-2)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (21-8)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (29-1)
InitTextRun 3d1bf9f0 fontgroup 3d12ae30 font 0 match <null> (30-16)

InitTextRun 3d480ae0 fontgroup 3d12ae30 (STIXGeneral,Cambria,Cambria Math,DejaVu Serif,DejaVu Sans,Times,Lucida Sans Unicode,OpenSymbol,Standard Symbols L,serif,serif) lang: x-western len 50 
InitTextRun font: STIXGeneral
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 3d10ac30 match STIXGeneral (1-1)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 0 match <null> (2-8)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (10-1)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 0 match <null> (11-2)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 3d48b340 match MS-PMincho (13-1)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 0 match <null> (14-14)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 3d10b9e0 match DejaVuSans (28-1)
InitTextRun 3d480ae0 fontgroup 3d12ae30 font 0 match <null> (29-22)

For the \mathcal{ABCDEFGHIJKLMNOPQRSTUVWXYZ} sequence, the MathML code is generating the following codepoints:

ABCDEF ==> d835 dc9c 212c d835 dc9e d835 dc9f 2130 2131

The screen output is <hexbox 01D49C> B <hexbox 01D49E> <hexbox 01D49F> E F
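
For reference, the UTF-16 code units listed above can be turned back into Unicode code points with the standard surrogate-pair arithmetic. The following is a minimal Python sketch, not code from Gecko; the function name decode_utf16_units is made up for illustration.

```python
def decode_utf16_units(units):
    """Combine a list of UTF-16 code units into Unicode code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # A high surrogate followed by a low surrogate encodes one
            # code point outside the BMP.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # Any other unit is a BMP code point on its own.
            code_points.append(unit)
            i += 1
    return code_points


# The code units generated for ABCDEF in the \mathcal testcase.
units = [0xD835, 0xDC9C, 0x212C, 0xD835, 0xDC9E, 0xD835, 0xDC9F, 0x2130, 0x2131]
print([f"U+{cp:04X}" for cp in decode_utf16_units(units)])
```

This yields U+1D49C, U+212C, U+1D49E, U+1D49F, U+2130, U+2131: script B, E and F are BMP characters from the Letterlike Symbols block, while A, C and D are Plane 1 characters, which matches the pattern of hex boxes in the screen output.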

Unicode has codepoints for specific font faces?!?  ick...
Yeah, bold B, italic B, fraktur B, and script B, all have different meanings in maths so they get different code points.

These simple testcases may have the same problem (they do on windows):

data:text/html,<p%20style="font-family:STIXGeneral">&#x212C;</p>
data:text/html,<p%20style="font-style:italic;font-family:STIXGeneral">&#x221E;</p>
(Assignee)

Comment 10

11 years ago
(In reply to comment #5)
> This bug will affect more than just STIX fonts.  (see e.g. bug 382542)

This is not the same as that bug: the font matching code on the Mac does cmap matching on a per-*face* basis.  Not sure why the system font fallback code isn't finding STIXGeneral-Italic in this case, since that face contains the codepoints in its cmap.  Debugging in progress...
No longer depends on: 382542
(Assignee)

Comment 11

11 years ago
Created attachment 303449 [details] [diff] [review]
patch, fix utterly boneheaded coding error

Argh, bailing on first face with missing cmap entry, should be continuing to the next.
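
The kind of error described here can be illustrated with a small Python sketch. This is not the actual patch (the real code is C++ in Gecko); the function names, face names and cmap contents below are assumptions made up for illustration.

```python
def find_face_buggy(faces, code_point):
    """Gives up at the first face whose cmap lacks the code point."""
    for name, cmap in faces:
        if code_point not in cmap:
            return None  # bails out instead of trying the next face
        return name
    return None


def find_face_fixed(faces, code_point):
    """Skips faces whose cmap lacks the code point and keeps searching."""
    for name, cmap in faces:
        if code_point not in cmap:
            continue  # move on to the next face
        return name
    return None


# Hypothetical, heavily abbreviated cmaps for two faces of one family.
faces = [
    ("STIXGeneral-Regular", {0x0041, 0x1D504}),
    ("STIXGeneral-Italic", {0x0041, 0x1D49C}),
]

print(find_face_buggy(faces, 0x1D49C))
print(find_face_fixed(faces, 0x1D49C))
```

With these assumed cmaps, the buggy variant finds no face for U+1D49C because the regular face is checked first, while the fixed variant goes on to the italic face.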
Attachment #303449 - Flags: superreview?(roc)
Attachment #303449 - Flags: review?(roc)
Attachment #303449 - Flags: superreview?(roc) → superreview+
Attachment #303449 - Flags: review?(roc) → review+
(Assignee)

Comment 12

11 years ago
The patch fixes the problem, but the rendering is not particularly good-looking,
since the style set by the MathML code includes both STIXGeneral and DejaVu Sans.
The regular face of STIXGeneral has no entry for U+212C, but DejaVu Sans does, so
DejaVu Sans supplies that glyph.  For non-BMP codepoints, only the italic face of
STIXGeneral has entries, so these get picked up in the system-wide search.

Yick, patooey...
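
The matching order described in this comment can be modelled roughly as a two-stage lookup. The Python sketch below is a simplified model under stated assumptions, not the Gecko implementation: match_font, the (family, style) keys and the tiny cmaps are all hypothetical.

```python
def match_font(code_point, style, font_list, system_faces):
    """Return the (family, style) of the face chosen for one code point."""
    # Stage 1: walk the font-family list, looking only at the face that
    # matches the requested style.
    for family in font_list:
        cmap = system_faces.get((family, style))
        if cmap is not None and code_point in cmap:
            return (family, style)
    # Stage 2: system-wide fallback over every installed face, any style.
    for (family, face_style), cmap in system_faces.items():
        if code_point in cmap:
            return (family, face_style)
    return None  # nothing covers it: a hex box would be drawn


# Hypothetical, heavily abbreviated cmaps.
system_faces = {
    ("STIXGeneral", "normal"): {0x0041},
    ("STIXGeneral", "italic"): {0x0041, 0x1D49C},
    ("DejaVu Sans", "normal"): {0x0041, 0x212C},
}
font_list = ["STIXGeneral", "DejaVu Sans"]

print(match_font(0x212C, "normal", font_list, system_faces))
print(match_font(0x1D49C, "normal", font_list, system_faces))
```

Under these assumptions U+212C is taken from DejaVu Sans in the first stage, while U+1D49C is only found in the second stage, in the italic face of STIXGeneral. That mix of two fonts within one script alphabet is what makes the rendering uneven.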
Status: NEW → ASSIGNED
(Assignee)

Comment 13

11 years ago
Created attachment 303459 [details]
additional testcase

Shows strange interactions between codepoints and styles.  Note how the use of DejaVu Sans influences the rendering of the B glyph: DejaVu Sans has glyphs for some of the symbol characters that live in the BMP but not for those outside the BMP.
(Assignee)

Comment 14

11 years ago
Created attachment 303460 [details]
screenshot of additional testcase rendering
(Assignee)

Comment 15

11 years ago
checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
(Assignee)

Comment 16

11 years ago
mail to the STIX fonts folks (betatest@stixfonts.org):

Hi,

I just wanted to provide some brief feedback on the STIX fonts that were released in beta form last October.  I work as a developer on Firefox and recently I've been doing a lot of work on our font handling system on Mac OS X.  I've noticed there are a couple of problematic ways in which these STIX fonts are structured.

In browsers, fonts are typically accessed via the font family name.  CSS style attributes like font-weight and font-style dictate the particular font face to choose in a family.  When matching fonts with text, these are determined by the containing element.  If a given font face lacks a given character, fallback occurs: the browser searches through all fonts on the system to find one that contains that character.  The problem I came across is that fonts within a given family don't have matching character maps.

Specifically, the STIXGeneral family of fonts has glyphs for the Mathematical Alphanumeric Symbols Unicode range (U+1D400-U+1D7FF) spread across individual faces.  The regular face contains glyphs for Fraktur, Double-struck, Sans-serif and Monospace symbols while the italic face contains glyphs for Italic, Script, Sans-serif Italic and Italic Greek symbols.  In effect, one needs to use the *union* of these faces to have a font that will work properly for all Unicode values.  When dealing with a sequence of Latin text and characters from the mathematical symbols Unicode range the Latin text varies depending upon the style settings (bold, italic) but the mathematical symbols should *always* render the same way, no matter what the style settings.  If an application sets the font face based on the style settings as most text rendering software does, the user will see unrenderable ranges of characters because the glyphs for that range are in a face associated with a different style.

The solution, I think, is to include the glyphs for all codepoints in the Mathematical Alphanumeric Symbols Unicode range in each of the STIXGeneral faces, rather than separating them out based on the implied style of a given subrange.  This will work much better with existing text rendering software.

If you have any questions about this, feel free to contact me.

Cheers,

John Daggett
Mozilla Japan
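
The coverage problem raised in the mail can be sketched in a few lines of Python: each face covers only part of the Mathematical Alphanumeric Symbols range, and only the union of the faces covers the whole set. The cmaps below are hypothetical and contain one sample code point per alphabet; the helper names are made up for illustration.

```python
# Hypothetical, heavily abbreviated cmaps for two faces of one family.
faces = {
    "regular": {0x1D504, 0x1D538},  # Fraktur A, Double-struck A
    "italic": {0x1D434, 0x1D49C},   # Italic A, Script A
}


def family_coverage(faces):
    """Code points covered by at least one face of the family."""
    return set().union(*faces.values())


def missing_in_face(faces, face):
    """Code points the family covers but this particular face does not."""
    return family_coverage(faces) - faces[face]


for face in faces:
    missing = sorted(missing_in_face(faces, face))
    print(face, [f"U+{cp:04X}" for cp in missing])
```

A renderer that picks a single face from the style settings sees only that face's cmap, so every code point reported as missing for that face needs fallback. If every face carried the full range, as the mail proposes, each of these sets would be empty.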
(Assignee)

Comment 17

11 years ago
From the STIX fonts faq:

http://www.aip.org/stixfonts/STIXfaq.html

"Q. Why are the Unicode letter-like symbols and Plane 1 alphanumerics distributed across different fonts rather than being all in one place?

A. It was felt, perhaps unwisely, that an ordinary user would look for an italic letter in an italic font, a bold letter in a bold font, etc. We have rethought this decision, and in the production release will place the full Unicode complement of these two blocks in a single font."

*Yay!*