Unix/X11 PS printing problem with UTF-8 encoded PS/CID fonts






(Reporter: jshin, Assigned: masaki.katakai)







(4 attachments)



19 years ago
When the following lines are put in unix.js,
  pref("print.psnativecode.ko", "utf-8");
  pref("print.psnativefont.ko", "Munhwa-Regular-UniKS-UTF8-H");
instead of
  pref("print.psnativecode.ko", "euc-kr");
  pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");
(where Munhwa-Regular-UniKS-UTF8-H is a CID-keyed font with UTF-8
CMap and Munhwa-Regular-KSC-EUC-H is a CID-keyed font with EUC-KR
CMap), Hangul syllables in EUC-KR encoded pages are printed as
hollow boxes. (Linux build 2000090608)

Comment 1

19 years ago
Hi Jungshik, thank you for testing PS printing in Korean locale.

I sent you the wrong information, sorry. I provided a separate
preference for Unicode-based fonts. Can you try again with
the following prefs?

 pref("print.psunicodefont.ko", "Munhwa-Regular-UniKS-UCS2-H");

 pref("print.psnativecode.ko", "euc-kr");
 pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");

"UCS2" encoded font should be used.
When "Munhwa-Regular-UniKS-UCS2-H" is not available on the system,
"Munhwa-Regular-KSC-EUC-H" will be used instead.

In my environment, it works.
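
The fallback order described above could be sketched roughly as follows. This is only an illustration in Python, not Mozilla's actual code: the function name pick_ps_font, the available_fonts set, and the "ucs2" label are hypothetical, and only the pref names are taken from this report.

```python
def pick_ps_font(prefs, lang, available_fonts):
    """Return (font_name, encoding) for a language, or None.

    Hypothetical sketch: prefer the Unicode (UCS2) font when it is
    installed, otherwise fall back to the native-encoding font.
    """
    unicode_font = prefs.get(f"print.psunicodefont.{lang}")
    if unicode_font and unicode_font in available_fonts:
        # "ucs2" is an assumed label for the Unicode font's encoding.
        return unicode_font, "ucs2"

    native_font = prefs.get(f"print.psnativefont.{lang}")
    native_code = prefs.get(f"print.psnativecode.{lang}")
    if native_font and native_code:
        return native_font, native_code
    return None


prefs = {
    "print.psunicodefont.ko": "Munhwa-Regular-UniKS-UCS2-H",
    "print.psnativecode.ko": "euc-kr",
    "print.psnativefont.ko": "Munhwa-Regular-KSC-EUC-H",
}

# UCS2 font not installed: the native EUC-KR font is chosen instead.
print(pick_ps_font(prefs, "ko", {"Munhwa-Regular-KSC-EUC-H"}))
```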



Comment 2

19 years ago
Hi Masaki,
Thank you for your reply.

Putting the following line, however,
   pref("print.psunicodefont.ko", "Munhwa-Regular-UniKS-UCS2-H");
in addition to the following two lines

   pref("print.psnativecode.ko", "euc-kr");
   pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");

doesn't help me print UTF-8 encoded pages. Worse still,
with 'print.psunicodefont.ko' defined, even EUC-KR encoded pages
don't get printed properly. Again, ko_ls is defined but
never gets invoked; only default_ls is invoked. With only the latter
two lines, at least EUC-KR encoded pages print fine.

Could you send me or attach the PS files generated with the above three lines
in unix.js from both

Thank you,
Component: ActiveX Wrapper → Printing

Comment 3

19 years ago
Created attachment 15420 [details]
output files in my environment


19 years ago
QA Contact: cpratt → shrir

Comment 4

19 years ago
I attached the outputs from my environment:

prefs.js - my prefs.js
euckr.ps - euckr output
utf8_c.ps - proc_utf8.html from Mozilla started in C locale
utf8_ja.ps - proc_utf8.html from Mozilla started in ja locale
utf8_ko.ps - proc_utf8.html from Mozilla started in ko locale

I think you run Mozilla in C locale, right?

It seems that the Mozilla GFX layer passes the language group of the
locale Mozilla is running in when the document is UTF-8 encoded.
When I was working on this, as I described in


"x-unicode" language group was passed when document is UTF-8.

Please see the outputs utf8_ja.ps and utf8_c.ps, which are
outputs from Mozilla started in the ja and C locales, respectively.
In utf8_c.ps, "default_ls" is defined. In utf8_ja.ps,
"ja_ls" is used. I'm not sure whether this is the correct behavior;
I have to ask Erik.

Erik, is this behavior correct? I could not get "x-unicode"
from UTF-8 document.

I got utf8_ko.ps from Mozilla running in ko locale on Solaris.
It seems fine. What happens in your environment when you start
Mozilla in ko locale?

Comment 5

19 years ago
Yes, we now pass the locale's language group down to the font engines when the
document is a Unicode-based one.

Comment 6

18 years ago
Very sorry for the late response.

Jungshik, what is the result when you start Mozilla in ko locale?
As Erik said, we now cannot get language group "ko" for a UTF-8 page
unless we start Mozilla in ko locale.

Comment 7

18 years ago
By the way, the layout engine will also pass "ko" down to the font engine if the
document has a LANG attribute with "ko" in one or more of the elements. (Of
course, it only passes "ko" for the element that has that attribute.)

Comment 8

18 years ago
Sorry for the late reply. I tried a few 'latest' builds over the last couple
of weeks, but none of them started (each died right after I launched
it). So, I haven't been able to test it. However, if I remember correctly,
I did test it under LANG=ko and got the same result as I reported. Could
the cause of my problem be that I set LC_MESSAGES to C although LANG is set
to ko? Once I find a working build, I'll try again with LC_MESSAGES unset
(or set to ko as well). If this turns out to be the cause (that is,
setting LC_MESSAGES to ko solves the problem), I guess I have to file
a separate bug about the language group identification (LC_MESSAGES should be
given a lower priority than LANG, shouldn't it?).

Comment 9

18 years ago
Hi Jungshik,

Do the recent builds still crash if you move your ~/.mozilla to some other

LANG is the default, while LC_MESSAGES and the other LC_* override LANG for each
of those categories. LC_ALL overrides all of them.
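
The precedence just described can be sketched as a small lookup. This is a minimal illustration in Python, assuming the usual POSIX rule that unset or empty variables are skipped; the function name effective_locale is hypothetical and this is not how Mozilla itself resolves the locale.

```python
def effective_locale(category, env):
    """Return the locale name that applies to one category.

    Order of precedence: LC_ALL, then the category's own variable
    (for example LC_MESSAGES), then LANG. Unset or empty variables
    are skipped. "C" is assumed as the final default.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# The reporter's usual setting: LANG=ko with LC_MESSAGES=C.
env = {"LANG": "ko", "LC_MESSAGES": "C"}
print(effective_locale("LC_MESSAGES", env))
print(effective_locale("LC_CTYPE", env))
```

With this setting, LC_MESSAGES resolves to C while LC_CTYPE still resolves to ko, so a program that consults LC_MESSAGES would not see the ko locale.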

Comment 10

18 years ago
Hi Erik,
Thank you for the tip. With ~/.mozilla renamed to something else, I was able to avoid
the crash.

As I suspected, Mozilla's determination of the language group (and possibly
locale) has something to do with this problem.
When I unset LC_MESSAGES with LANG set to ko, it works fine. I suspected this
because a few months ago, when Mozilla printed the locale information at
start-up (it no longer does by default), I noticed it misidentified the
current locale when LC_MESSAGES was set to C with LANG set to ko (which is
my usual setting). Which LC_*/LANG variables are consulted to determine
the (global) language group? Does LC_MESSAGES play any special role?
(With LANG set to ko, setting LC_* other than LC_MESSAGES to C doesn't
lead to what I reported.)
I couldn't find any hint of that in the source, although LC_MESSAGES seems to be
treated a little differently (approximated with LC_CTYPE).

Comment 11

18 years ago
Mozilla shouldn't crash, no matter what you have in your ~/.mozilla. So, if you
don't mind, would you please create a new bug report and attach the contents of
your ~/.mozilla (by tar and gzip). Use the application/octet-stream type for the
attachment. Please assign it to me. Thanks.

Mozilla uses LC_MESSAGES to determine the locale's language group for font
fallbacks, since that is probably the most appropriate category. What category
would you propose? Remember, we should not look at $LANG directly. That is not
the normal way to get the locale. We are supposed to use setlocale with one of
the existing categories.

Comment 12

18 years ago
Thank you for your comment. So, LC_MESSAGES does play a special role.
I have a slightly different "opinion" about which of
LC_*/LANG/LC_ALL to refer to when picking fonts (or, more appropriately, the language group).
If there were a clear choice among the POSIX locale categories, there'd be no doubt that we have to use it
if it's defined and not overridden by LC_ALL. However, in this particular case,
I don't think there's a clear, indisputable category to choose for the job (picking fonts).
Therefore, I guess it's better to resort to the value of LANG (it's even more fitting
considering that Mozilla uses the term 'language group') instead of the not-so-closely-related (in my
eyes) LC_MESSAGES. In summary, I think the determination of the language group
should not depend on any LC_* other than LC_ALL *when* LANG and/or LC_ALL is defined.
As for Mozilla crashing on start-up with ~/.mozilla, I'll file a bug
with my ~/.mozilla directory attached (tarred and gzipped).

I can't print Czech with Mozilla because it uses iso-8859-1 fonts in PostScript
and iso-8859-2 fonts are needed to display all characters. I think it's the same bug.
Anyway, the page is automatically loaded as iso-8859-2, so there should be no
dependence on the environment.
As for LC_*, I think that the correct one is LC_CTYPE. As far as I know, it was
*designed* to specify which character set applications use. When I want
to _write_ (or print) Czech, I set LC_CTYPE to cs_CZ, leaving the other LC_* unset.
Here's an example of my communication with bash which shows the function of
LC_MESSAGES: it changes the way the program _speaks_. Of course, you have to
use the correct fonts to read the messages.
11:14 msuc8339@u-pl3/1:~$ unset LC_MESSAGES
11:15 msuc8339@u-pl3/1:~$ bash Y
Y: Y: No such file or directory
11:15 msuc8339@u-pl3/1:~$ export LC_MESSAGES=cs_CZ
11:15 msuc8339@u-pl3/1:~$ bash Y
Y: Y: není souborem ani adresářem

Comment 15

18 years ago
hramrach, thank you for the report. Could you add a comment describing your exact problem?
Which characters are missing? If possible, please attach a snapshot.

Actually, in my environment, the white
characters couldn't be printed because the background color is white. However, after I changed
the background color to white and the text color to black, I seem to get the
correct PostScript file. I'll attach the file, so please evaluate it.

Comment 16

18 years ago
Created attachment 20724 [details]
output after changing background and text color
I guess it's caused by our web server: when your locale isn't Czech -> the browser
doesn't claim to accept the iso-8859-2 encoding -> iso-8859-1 is used, with the offending
characters converted to non-diacritic ones. I'm going to attach the page
rendered as HTML (PNG format; unfortunately the colors are wrong) and the
PostScript output.
Created attachment 20751 [details]
The page using iso-8859-2 encoding (png); to get the same, try the link "Kodovani"
Created attachment 20753 [details]
Postscript - viewable with ImageMagick's `display' program, white on white in gv

Comment 20

18 years ago
spam : changing qa to sujay (new qa contact for Printing)
QA Contact: shrir → sujay

Comment 21

18 years ago
Assignee: tajima → katakai

Comment 22

18 years ago
A temporary patch has been attached to bug 75930.

I can now use UTF-8 fonts with the following settings:

user_pref("print.psnativecode.ja", "UTF-8");
user_pref("print.psnativefont.ja", "GothicBBB-Medium-UniJIS-UTF8-H");

Comment 23

18 years ago
The original problem of this bug report has been fixed in bug 75930.

For the iso-8859-2 issue, I'll file a new one.

*** This bug has been marked as a duplicate of 75930 ***
Last Resolved: 18 years ago
Resolution: --- → DUPLICATE
