Closed Bug 161137 Opened 22 years ago Closed 15 years ago

MathML renders wrong symbols if TeX fonts are installed

Categories

(Core :: MathML, defect)

PowerPC
macOS
defect
Not set
major

RESOLVED WONTFIX

People

(Reporter: hsivonen, Assigned: rbs)

Attachments

(19 files, 5 obsolete files)

Build ID: 2002080503 FizzillaCFM

Reproducible: Always.

Steps to reproduce:
1) Install the Mathematica fonts. I used the versions that came with
the Classic version of Mathematica.
2) Install the Type 1 versions of Computer Modern (OzTeX or Textures
package--doesn't matter) from AMS.
3) Load http://www.mozilla.org/projects/mathml/start.xml

Expected results:
Expected a rendering similar to the reference screenshots.

Actual results:
The rendering falls apart and wrong glyphs are shown.

Additional info:
The rendering is correct with the Mathematica fonts but without the TeX fonts.
Summary: MathML rendering break is TeX fonts are installed → MathML rendering breaks if TeX fonts are installed
MathML-fonts and full LaTeX2e installation (all TeX fonts) installed
-using win32, build 2002080504 (no problem)
-using linux kernel2.4.16 same build (no problem)

I'm not used to Mac OS X, but I work with Linux, and the font management must be
similar.

Try checking whether the font names (the Mathematica and TeX ones) are similar or identical.
> 2) Install the Type 1 versions of Computer Modern

Ah, if that's what you guys have been doing, then it's asking for trouble.

The problem with these TeX fonts is that the encoding is different when you use
Type1 or TrueType. The ucvmath module has an #ifdef PLATFORM to pick one of the
two:

On Windows/Mac/OS2, it picks the TrueType encoding.
On Linux/Unix, it picks the Type1 encoding.

So if one subsequently uses a Type1 version instead of TrueType, a mismatch
arises. Conversely, Linux people often report strange rendering because they
install the TrueType versions instead of Type1 - bug 35236.

The difficulty of fixing the problem lies in the fact that GFX doesn't know
which physical font was actually loaded by the OS. Indeed, if that was possible
to know that the loaded font is Type1 or TrueType, then the right encoding could
be associated on the fly.

In the meantime, try the TrueType versions to see if that solves the problem.
(There is no problem with the Mathematica fonts because the encoding is the
same.)
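
To make the mismatch concrete, here is a small Python sketch. It is not the ucvmath code: the table and function names are hypothetical, and the only real data are two cmsy10 code points taken from the listings at http://www.mozilla.org/projects/mathml/fonts/encoding/cmsy.html. It models a converter table chosen by platform alone, then shows what comes out when the installed font uses the other format.

```python
# Unicode code point -> font byte, per font format, for two cmsy10 glyphs.
TRUETYPE_ENCODING = {0x2212: 0xA1, 0x00B1: 0xA7}  # minus sign, plus-minus sign
TYPE1_ENCODING = {0x2212: 0x00, 0x00B1: 0x06}

# Which glyph each installed font format draws for a given byte (illustrative).
TRUETYPE_GLYPHS = {0xA1: "minus", 0xA7: "plus-minus"}
TYPE1_GLYPHS = {0x00: "minus", 0x06: "plus-minus"}


def encoding_for_platform(platform):
    """Mimic the compile-time #ifdef: the table depends on the OS only."""
    if platform in ("linux", "unix"):
        return TYPE1_ENCODING
    return TRUETYPE_ENCODING


def render(platform, installed_glyphs, codepoint):
    """Encode with the platform's table, then look up what the font draws."""
    byte = encoding_for_platform(platform)[codepoint]
    return installed_glyphs.get(byte, "wrong or missing glyph")


print(render("mac", TRUETYPE_GLYPHS, 0x2212))  # table and font agree
print(render("mac", TYPE1_GLYPHS, 0x2212))     # table and font disagree
```

The second call is the situation reported in this bug: the Mac build encodes for TrueType, but a Type1 font is what actually gets loaded.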
>Indeed, if that was possible to know that the loaded font
>is Type1 or TrueType, then the right encoding could be 
>associated on the fly.

Actually, there is a way to tell. I don't have my code in front of me, but I'll
post something in the morning.
In that case, the Mac could be protected once and for all from this problem.

If that was possible on Linux too, then a protection could be added there as 
well.
Do you "recover" the character maps of the TrueType TeX fonts when you visit:
http://www.mozilla.org/projects/mathml/fonts/encoding/
perhaps when we create the converter object we could pass in what we know about
the font's format as another parameter...
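
A rough Python sketch of that idea, under stated assumptions: the registry, the `create_converter` name, and the platform keys are all hypothetical, and the "ttf"/"t1" labels simply echo the suffixes used in fontEncoding.properties. The detected font format is an optional extra parameter, and the platform default is used only when the format is unknown.

```python
# Hypothetical registry of Unicode -> font byte tables, keyed by
# (font name, font format). Only two cmsy10 entries are filled in.
CONVERTERS = {
    ("CMSY10", "ttf"): {0x2212: 0xA1, 0x00B1: 0xA7},
    ("CMSY10", "t1"): {0x2212: 0x00, 0x00B1: 0x06},
}

# The current behaviour: the format is implied by the platform.
PLATFORM_DEFAULT_FORMAT = {
    "windows": "ttf",
    "mac": "ttf",
    "os2": "ttf",
    "linux": "t1",
    "unix": "t1",
}


def create_converter(font_name, platform, font_format=None):
    """Return the table for font_name, preferring a detected format."""
    fmt = font_format if font_format is not None else PLATFORM_DEFAULT_FORMAT[platform]
    return CONVERTERS[(font_name, fmt)]


# Without detection the Mac gets the TrueType table; with it, the Type1 one.
default_table = create_converter("CMSY10", "mac")
detected_table = create_converter("CMSY10", "mac", font_format="t1")
print(hex(default_table[0x2212]), hex(detected_table[0x2212]))
```

The open question in this bug is whether GFX can supply `font_format` at all; the sketch only shows where it would plug in.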
Iteration to illustrate how to proceed, missing the CopyCString2Pascal()
business. schofield, something for you to chew upon.

However, it is still not clear to me if installing the TrueType versions fixed
the problems (comment 8). There is something else to note. Even if the TrueType
font is there, there are other internal data in the font file that are meant
for either the Mac or Windows, but not both. So, if installing the TrueType versions
didn't fix the problem (e.g., the internal data for the Mac are missing), we
may have to request people to exclusively install Type1. In which case, this
elaborate patch won't be necessary. It will only be necessary to s/.ttf/.t1/g
in fontEncoding.properties and adjust the #ifdef in ucvmath.
Attachment #94259 - Attachment is obsolete: true
> Do you "recover" the character maps of the TrueType TeX fonts when you visit:
> http://www.mozilla.org/projects/mathml/fonts/encoding/

With the BaKoMa fonts, some chars show up correctly (mostly the first halves of
the fonts). Greek chars tend to fall back to Lucida Grande. Some chars show up
as rectangles.
Now that bug 161481 is over, it would be great to finish off this one... Are you
willing to do some experiments?

I am going to attach a patch to try with the Type1 TeX fonts. The whole point of
the experiment is to confirm/deny the suitability of these versions on the Mac.
These fonts are already known to work fine on Unix. So if it turns out that they
are also okay out-of-the-box on the Mac, then these could be recommended by
default, and the code might then take step to do the detection properly.

Keywords: review
The patch isn't quite ready for review yet. I am just calling on Mac folks with 
Mac builds, rob, hsivonen, for an initial attempt to hook Type1 TeX fonts, and 
see how they fare.
Keywords: review
I applied the patch and rebuilt intl and gfx. I installed Type 1 versions of the
CM fonts and uninstalled the BaKoMa fonts. I rebooted to make sure the BaKoMa
fonts weren't cached anywhere. I'm still seeing gibberish with the CM fonts
installed. :-(
Do you have any utility on the Mac to graphically visualize the character map of 
a font? If so, could you attach a screenshot of what you see on, say, CMSY10?
did you want the true type, type 1, or both?
for completeness, let's see the true type too.
s/true type/type 1/
hmmm.  i downloaded the type 1 fonts from ftp://ftp.ams.org/pub/tex/psfonts/cm/cmps-unix.tar.gz
but i can't seem to get them to work.  could someone who has them working
point me in the right direction?
I think OS X doesn't support data fork Type 1 fonts (.pfb, .pfa). The resource
fork-based versions are ftp://ftp.ams.org/pub/tex/psfonts/cm/cmps-oztex.hqx and
ftp://ftp.ams.org/pub/tex/psfonts/cm/cmps-textures.hqx
I am a bit in the dark due to the lack of a Mac build environment. But it could 
be that there is an issue with the handling of the |script|. The following
document describes related problems:
http://www.mactech.com/articles/mactech/Vol.14/14.09/MultiscriptEnvironment/
Maybe it is necessary to "lock" the system into a "symbolic script" mode so that 
the system processes the glyph indices "as-is", and doesn't try to re-resolve 
the glyphs into something else.

===
Also, while re-reading the code, I noted that GetBoundingMetrics() is doing 
something that doesn't look quite right: Looking at 
nsUnicodeRenderingToolkit::GetTextSegmentBoundingMetrics(), it does

  ::TextFont(fontNum);
  ScriptCode script = ::FontToScript(fontNum);
  [...]
        GetScriptTextBoundingMetrics(buf, outLen, script, segBoundingMetrics);

But FontToScript() doesn't have a meaningful value for a symbolic font. So the 
value should be fetched from |fontMapping| which is already caching them. For 
example fontMapping.ConvertUnicodeToGlyphs() could be changed to return the 
script code. Or there could be a function fontmap.GetFontScriptCode(fontnum);

Then, some special-casing will be needed to avoid calling CharacterByteType() 
with BAD_SCRIPT in GetScriptTextBoundingMetrics().
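
The shape of that change can be modelled in a few lines of Python. This is only a language-neutral sketch: `FontMapping`, `get_font_script_code`, and `count_characters` are hypothetical names, the value of `BAD_SCRIPT` is a placeholder, `character_byte_type` is a stand-in for the Toolbox call, and treating every byte of a symbolic font as one glyph index is an assumption.

```python
BAD_SCRIPT = -1  # placeholder value; the real constant is not given in this bug


class FontMapping:
    """Caches the script code per font number (stand-in for |fontMapping|)."""

    def __init__(self):
        self._script_by_font = {}

    def register(self, font_num, script_code):
        self._script_by_font[font_num] = script_code

    def get_font_script_code(self, font_num):
        # Unknown fonts report BAD_SCRIPT instead of a guessed script.
        return self._script_by_font.get(font_num, BAD_SCRIPT)


def character_byte_type(buf, index, script):
    """Stand-in for CharacterByteType(); it must never see BAD_SCRIPT."""
    if script == BAD_SCRIPT:
        raise ValueError("CharacterByteType called with BAD_SCRIPT")
    return "single"


def count_characters(buf, script):
    """Special-case symbolic fonts before consulting the script system."""
    if script == BAD_SCRIPT:
        return len(buf)  # assumption: each byte is used as-is as a glyph index
    return sum(
        1 for i in range(len(buf))
        if character_byte_type(buf, i, script) == "single"
    )


fontmap = FontMapping()
fontmap.register(1, 0)  # an ordinary font with script code 0
print(count_characters(b"\xa1\xa7", fontmap.get_font_script_code(99)))
```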
mac folks, any additional insights here?
Attachment #94269 - Attachment is obsolete: true
Attachment #95170 - Attachment is obsolete: true
Attachment #96383 - Attachment is obsolete: true
Attached image ATSUI's view of ASY10
Attached image ATSUI's view of INE10
A couple of points: The glyph-to-character mappings of the LaTeX font family
fonts are completely bogus. I think we shouldn't instruct people to install
fonts like this since they might disturb ATSUI.

The mappings of Computer Modern Plain are partially correct. There are some
mismapped glyphs: the ff ligature maps to U+21B5 DOWNWARDS ARROW WITH CORNER
LEFTWARDS, which explains an anomaly I thought was a regression in Mozilla. The
ffl ligature maps to U+270F PENCIL. The dotless j maps to PLACE OF INTEREST
SIGN. The grave accent maps to CHECK MARK. The glyph to the left of the
exclamation mark maps to U+0009. The right curly double and single quotes map
to ASCII QUOTATION MARK and APOSTROPHE. The upside-down exclamation mark maps
to LESS-THAN SIGN. The upside-down question mark maps to GREATER-THAN SIGN. The
left double quote maps to REVERSE SOLIDUS. The dot above maps to LOW LINE. The
left single quote maps to GRAVE ACCENT. Also, the glyphs after z have bogus
mappings.

IMO, it would make more sense to fix the font than to try to special-case it in
Mozilla, because it would disturb ATSUI in other apps anyway.
And the reason why the Wolfram fonts don't cause trouble with other apps is that
as far as ATSUI is concerned, all the glyphs of those fonts are mapped to PUA
code points.
Is it related to bug 147704? I don't know what fonts are installed for TeX on my
machine, but I have the packages from the Debian distribution.
The screenshots I've attached show two things: The TeX fonts are
ATSUI-incompatible. However, their character repertoire is for the most part
covered by the fonts that come with OS X (at least Jaguar).

Since the fonts could cause trouble with other apps, I think we shouldn't
instruct users to install the fonts and, therefore, shouldn't try to work around
the issues in Mozilla. I think it would make more sense to make the fonts
ATSUI-compatible by importing the Type 1 outlines in a properly coded OpenType
wrapper. If I understood the AMS copyright statement correctly, doing that is OK
if the new font isn't misrepresented as an AMS font. However, I don't know what
tools or skills are needed for making OpenType fonts out of premade outlines.
ATSUI seems to be aware of code points up to 2FFFF, so coding the non-BMP chars
probably isn't a problem from the ATSUI point of view, although it could be from
the Mozilla point of view.

Would this make sense?
 * Making Mozilla not complain if the TeX fonts are missing.
 * Documenting that the user should not install the old TeX fonts on Mac.
 * Trying to figure out how to convert the fonts to properly mapped 
   OpenType fonts.

Disclaimer: I haven't tested any of this stuff on Mac OS 9.

(This bug and bug 147704 don't seem to be related.)

> A couple of points: The glyph-to-character mappings of the LaTeX font family
> fonts are completely bogus. I think we shouldn't instruct people to install
> fonts like this since they might disturb ATSUI.

These fonts are "symbol" fonts and so they don't have any predefined mapping 
other than as defined by the application. It seems to me that it is ATSUI (like 
the Windows GDI or the X-font server) who should be special-casing these fonts 
by labelling them as "symbol", in the same category as other symbol fonts such 
as Webdings, Wingdings.

To stop Mozilla from popping up the missing font dialog about TeX fonts, just 
set this pref in your prefs.js (or your user.js):

user_pref("font.mathfont-family", "Math1, Math2, Math4, Symbol");
*** Bug 171423 has been marked as a duplicate of this bug. ***
Summary: MathML rendering breaks if TeX fonts are installed → MathML renders wrong symbols if TeX fonts are installed
I am not sure where this all stands, but since I was given the hint to remove
the TeX fonts, I did so... but the fonts are still corrupted.
Attached is a screenshot of
http://www.w3.org/Math/testsuite/testsuite/Content/TheoryOfSets/notin/notin2.xml
which definitely shows that there are a few corrupted things.
Note: I'm not on Jaguar (yet). I have the Math1 to Math5 fonts installed and no TeX
fonts.
Does 10.2 make a difference?
Is there a bug in the UMSS? Using the "View MathML Source" feature on Win32 with
the example, I see that the presentation MathML that is fed to Mozilla is:

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <notin/>
    <mo/>
    <mfenced open="(" close=")" separators=",">
      <mrow>
        <mi>a</mi>
      </mrow>
      <mrow>
        <mi>A</mi>
      </mrow>
    </mfenced>
  </mrow>
</math>

This renders as "(a, A)".
Depends on: 228804
This patch does a special casing for TeX fonts to use ATSUI engine instead of raw QuickDraw functions, in GetTextSegmentDimensions, GetTextSegmentBoundingMetrics and DrawTextSegment.
The cause of this problem was that QD's DrawText expects the data to be encoded according to the current font's script system while ucvmath output was UCS2-truncated-to-8bit.
I first thought of using SetScriptVariable to set the script's encoding to smUninterp, but it didn't work.
This patch is not without problems: e.g. performance hurt by fontname comparison, bad layout for some glyphs including u2212 (minus sign, though u002D is not affected) and u00B1 (plusminus)....
> I first thought of using SetScriptVariable to set the script's encoding to
> smUninterp, but it didn't work.

1/ If you do ::FontToScript(fontNum) when fontNum is a Mathematica font, do you get smUninterp as the script? If not, could you try setting the value that you see with the Mathematica fonts on the TeX fonts too?

2/ Is there any difference when using the Type1 versions? Note that the font-encoding mappings are very different. So you will need some of the earlier patches here to map the Type1 versions to the right glyphs.
http://www.mozilla.org/projects/mathml/fonts/encoding/cmsy.html

In particular, looking at the listings of TrueType vs Type1 on the page, minus and plusminus have totally different encoding points:

TrueType:                                Type1
  0xA1  0x2212, ...  #Minus sign            0x00  0x2212, ...  #Minus sign
  0xA7  0x00B1  #Plus-minus sign            0x06  0x00B1  #PLUS-MINUS SIGN
  ^^^^                                      ^^^^
Both cmsy10 and Math1 have script code 0 and encoding 0, lang 0
Here, "script code" means the result of FontToScript(fontNum), "encoding" the result of
::GetScriptVariable(smCurrentScript, smScriptEncoding);
and "lang" the result of 
::GetScriptVariable(smCurrentScript, smScriptLang);
The rendering does not change when I change these to encoding=32 (smUninterp), lang=32767 (langUnspecified) for both fonts.
please wait a bit for type1s.
I've installed the CM-type1-for-texture distribution from ams ftp
ftp://ftp.ams.org/pub/tex/psfonts/cm/
and uninstalled bakoma.
looks like Firefox 1.5 rc1 renders MathML just fine with them, so that the only MathML bug remaining in that release is the "Symbol missing" alert...
So you don't get the gibberish that Henri mentioned in Comment #16? Maybe you guys have different OS versions or something? Anyway, it answers the question: these TrueType fonts are not appropriate for the Mac -- perhaps because they miss bits of the internal font data that the MacOS wants, as I speculated earlier.

Let's forget about the TrueType, and let's focus now on enabling the Type1. Note that they still cannot work properly without the .t1 converters in ucvmath. You can check this by removing the Mathematica fonts so that they don't kick in, and/or setting the font.mathfont-family to "CMSY10, CMEX10", and visiting the torture test. 

Also, you should "recover" the character maps of the TeX fonts when you visit:
http://www.mozilla.org/projects/mathml/fonts/encoding/
BTW... just to spell it out, be sure not to let your special-casing patch get in the way, of course.
cm-type1-for-texture fonts are working without Mathematica fonts, once they are recognized by the OS (for reproducers: I had to copy the file "CM/PS screen" from the "CM screen" folder, together with CMEX10...CMSY10, to get them recognized).
cmsy10-type1-for-texture has a cmap table declaring "Mac platform, Roman script, format 0 (256 glyphs array)" with e.g. a 0xA1 - Minus glyph correspondence.
IMU this is why it goes well with the TrueType converter (and treating its output as MacRoman data).
Do you mean that the character map of, say, Type1 CMSY10 is *exactly*
attachment 95294? Or
http://www.mozilla.org/projects/mathml/fonts/encoding/cmsy-ttf.gif
Attached file mapping of cmsy for texture (obsolete)
With ATSFontGetTable for CMSY10 I get two cmap tables, one for "Unicode platform" and another for "Mac platform." The Mac table agrees with that of attachment 95294 when you read that as "Minus glyph is in the place of MacRoman-0xA1, or U+00B0, BigCircle glyph is in the place of MacRoman-0xB0 or U+221E..."
I put the whole mapping of the Mac table in the attachment. Do you need the code generating it?
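
For readers who want to check the MacRoman reading above, a short Python sketch follows. It uses Python's built-in `mac_roman` codec as a stand-in for the MacRoman-to-Unicode conversion; the helper name is hypothetical, and which TeX glyph sits in each slot (Minus, BigCircle) comes from the comment, not from the code.

```python
def macroman_to_unicode(byte_value):
    """Return the Unicode code point that MacRoman assigns to one byte."""
    return ord(bytes([byte_value]).decode("mac_roman"))


# The two slots named in the comment: Minus at 0xA1, BigCircle at 0xB0.
for byte_value in (0xA1, 0xB0):
    codepoint = macroman_to_unicode(byte_value)
    print(f"MacRoman 0x{byte_value:02X} -> U+{codepoint:04X}")
```

This prints U+00B0 for 0xA1 and U+221E for 0xB0, which is why a sparse "Unicode platform" table derived from the Mac table places the TeX glyphs at unrelated code points.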
It appears therefore that those two tables were deliberately put in the font data to enable using the same font on the Windows platform too.

I assume that it is the Unicode table that is 0-based and matches the type1 converter in ucvmath, right? In the case of CMR10 (which also has a type1 converter in ucvmath), it was reported that it is indeed 0-based, see attachment 96705. Is it possible to use these Unicode maps instead in the measuring and drawing functions? This way, it won't be necessary to set up brand new font mapping tables in ucvmath to resolve the issues with the minus, big circle, etc.

[I gather from this thread that the additional map in the type1 file must have been added after-the-fact. The initial 0-based mappings cannot work on Windows because this OS reverses some slots in the character map. For example, position 0 is reversed for the replacement glyph, etc.]
attachment 96705 does not reproduce for me. CMR10's A appears at U+0041 in the Characters Panel.
The Unicode map of CMSY is just a copy of the Mac table (it's sparse due to the MacRoman-Unicode conversion). The "glyph index" (1-based) matches the ucvmath-type1 output offset by 1. Glyph 0 means "nonexistent", as in Windows.
Interestingly, Math1 and Symbol do trickier things: they have a 0x03B1->'alpha' mapping in the Unicode table and 0x61->'alpha' in the Mac table (although their Mac table claims to be in the Roman script code...), see my attachment.
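
The off-by-one relation for CMSY can be written down as a tiny Python sketch. The function names are hypothetical, and the direction of the offset (glyph index = Type1 code + 1) is my reading of the comment, with glyph 0 reserved for "nonexistent".

```python
NO_GLYPH = 0  # glyph index 0 means "nonexistent"


def glyph_index_for_type1_code(code):
    """Map a 0-based ucvmath Type1 output byte to a 1-based glyph index."""
    if not 0 <= code <= 0xFF:
        raise ValueError("Type1 code must fit in one byte")
    return code + 1


def type1_code_for_glyph_index(glyph_index):
    """Inverse mapping; returns None for the 'nonexistent' glyph."""
    if glyph_index == NO_GLYPH:
        return None
    return glyph_index - 1


# The Type1 listing puts the minus sign at 0x00 and plus-minus at 0x06.
print(glyph_index_for_type1_code(0x00), glyph_index_for_type1_code(0x06))
```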
Attachment #202959 - Attachment is obsolete: true
Why is it that things seem to work on your system whereas others reported gibberish? What is on your system that others do not have?!?

Can other Mac people test the fonts again to confirm that they now work for them too?
jshin, it appears from makoto's observations that the Type1 TeX fonts now work just fine out-of-the-box on the Mac. Is that your experience too?!?
Attachment 96705 seems to have been taken with the 'Character Palette' with the glyph view selected, and I got exactly the same result, which is identical to the visual encoding of CMR10 except for the off-by-one difference as already reported in comment #54 and shown in attachment 96705. I'm not sure what Yamashita-san meant when he wrote that he couldn't reproduce attachment 96705.

 Latin letters and numbers in the ASCII range have the correct mapping (I can get their Unicode positions when I hover the cursor over), but Greek letters have completely wrong Unicode positions mapped. Presumably, this means that ATSUI would be 'disturbed' with CMR installed as Henri wrote.

I wonder how attachment 95294 was obtained.
Oh, I just didn't know about the glyph view in the palette.
That surely reproduces in the glyph view, but the view looking that way is not a problem in itself, right? It's true that CM fonts have *incorrect* Unicode mappings, but I don't think this affects the experience outside MathML. Naturally, MacRoman characters are covered by so many fonts that it's unlikely CM glyphs are used for them.
Moreover, I don't think this "incorrect mapping" theory explains the cause of the original problem... They do match Mozilla's expectation (if we interpret the output of ucvmath as the code points for the Mac table).
So, if CM fonts work for Mozilla MathML now, the options are
1) Keep telling people not to install CM fonts, because theoretically they *can* disturb the rendering of characters in the upper half of MacRoman. And look for PUA-mapped CM fonts.
2) Close this bug
I definitely prefer 2.
> I wonder how attachment 95294 was obtained.

It seems from the screenshot to have been produced by a freeware tool. Maybe (highly speculative maybe) they made their code scan the Windows data in the TrueType font as well. I have been speculating that the Windows TrueType versions may not be working on the Mac due to the same issue that you mentioned in bug 254585 comment 6.

But I am confused about the conflicting messages from you guys with the Type1 versions. First, the one-off shift creates a difference re: the .t1 converter used in Unix non-Xft builds. Next, the "disturbance" of ATSUI makes me wonder what other applications are doing. Surely, TeXtures and friends do work on the Mac. What are they doing that we _cannot_ do? Next, what makes Makoto see that Mozilla renders MathML fine out-of-the-box for him?
To be clear, we aren't using ATSUI for CM fonts. When CM* is chosen, the string data get converted and go to DrawScriptText (line 1202 of nsUnicodeRenderingToolkit.cpp), which in turn calls QD's ::DrawText, which refers to the "macroman" cmap table.
jshin, it's not clear from your comment what happens to you when you open
http://www.mozilla.org/projects/mathml/start.xhtml
with Firefox rc3 (or whatever recent official 1.5 release you have) and cm-type1-for-texture installed on your system. Now I have two working cases other than my own mac, so please see how it goes. If that's fine, refrain of #58. If not, is it like attachment 94050?
Is this WONTFIX at this point?
QA Contact: ian → mathml
(In reply to comment #62)
> Is this WONTFIX at this point?

I suggest closing this bug, since TeX fonts are no longer the recommended fonts for mozilla 1.9:

http://www.mozilla.org/projects/mathml/fonts/
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX