Closed Bug 53477 Opened 24 years ago Closed 23 years ago

Unix/X11 PS printing problem with UTF-8 encoded PS/CID fonts

Status:

VERIFIED DUPLICATE of bug 75930

People

(Reporter: jshin, Assigned: masaki.katakai)

Details

Attachments

(4 files)

output files in my environment, 24 years ago, Masaki Katakai, 5.56 KB, application/octet-stream
output after changing background and text color, 24 years ago, Masaki Katakai, 19.61 KB, application/octet-stream
The page using iso-88592-2 encoding (png); to get the same try the link "Kodovani", 24 years ago, Michal 'hramrach' Suchanek, 62.46 KB, image/png
Postscript - viewable with ImageMagick's `display' program, white on white in gv, 24 years ago, Michal 'hramrach' Suchanek, 232.42 KB, application/postscript

Jungshik Shin

Reporter

Description

•

24 years ago

When the following lines are put in unix.js,
  pref("print.psnativecode.ko", "utf-8");
  pref("print.psnativefont.ko", "Munhwa-Regular-UniKS-UTF8-H");
instead of
  pref("print.psnativecode.ko", "euc-kr");
  pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");
(where Munhwa-Regular-UniKS-UTF8-H is a CID-keyed font with a UTF-8
CMap and Munhwa-Regular-KSC-EUC-H is a CID-keyed font with an EUC-KR
CMap), Hangul syllables in EUC-KR encoded pages are printed as
hollow boxes. (Linux build 2000090608)

Masaki Katakai

Assignee

Comment 1

•

24 years ago

Hi Jungshik, thank you for testing PS printing in the Korean locale.

Sorry, I sent you the wrong information. There is a separate
preference for Unicode-based fonts. Can you try again with
the following prefs?

 pref("print.psunicodefont.ko", "Munhwa-Regular-UniKS-UCS2-H");

 pref("print.psnativecode.ko", "euc-kr");
 pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");

"UCS2" encoded font should be used.
When "Munhwa-Regular-UniKS-UCS2-H" is not available on the system,
"Munhwa-Regular-KSC-EUC-H" will be used instead.

In my environment, it works.
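
The fallback described in this comment could be sketched roughly as follows. This is only an illustration in Python, not Mozilla's actual code: the function `pick_ps_font`, the `installed_fonts` set, and the "UCS-2" encoding label are assumptions made for the example. Only the pref names come from this report.

```python
def pick_ps_font(prefs, lang, installed_fonts):
    """Return (font_name, encoding) for a language group, or None.

    Hypothetical helper: prefers the Unicode font pref when that font is
    installed, otherwise falls back to the native font/encoding pair.
    """
    unicode_font = prefs.get(f"print.psunicodefont.{lang}")
    if unicode_font and unicode_font in installed_fonts:
        # "UCS-2" is an assumed label for the Unicode-encoded font.
        return unicode_font, "UCS-2"

    native_font = prefs.get(f"print.psnativefont.{lang}")
    native_code = prefs.get(f"print.psnativecode.{lang}")
    if native_font and native_code:
        return native_font, native_code
    return None


prefs = {
    "print.psunicodefont.ko": "Munhwa-Regular-UniKS-UCS2-H",
    "print.psnativecode.ko": "euc-kr",
    "print.psnativefont.ko": "Munhwa-Regular-KSC-EUC-H",
}

# Only the EUC font is installed, so the native pair is chosen.
print(pick_ps_font(prefs, "ko", {"Munhwa-Regular-KSC-EUC-H"}))
```

With both fonts installed, the same call would return the UCS2 font instead.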

Hidetoshi Tajima

Updated

•

24 years ago

Status: NEW → ASSIGNED

Jungshik Shin

Reporter

Comment 2

•

24 years ago

Hi Masaki,
Thank you for your reply.

Putting the following line, however,
   pref("print.psunicodefont.ko", "Munhwa-Regular-UniKS-UCS2-H");
in addition to the following two lines

   pref("print.psnativecode.ko", "euc-kr");
   pref("print.psnativefont.ko", "Munhwa-Regular-KSC-EUC-H");

doesn't help me with printing UTF-8 encoded pages. Worse still,
with 'print.psunicodefont.ko' defined, even EUC-KR encoded pages
don't get printed properly. Again, ko_ls is defined but
never gets invoked; only default_ls is invoked. With only the latter
two lines, at least EUC-KR encoded pages get printed fine.

Could you send me or attach the PS files generated from both of the
following pages with the above three lines in unix.js?
   http://ykga.org/~jungshik/proc_euckr.html
      and
   http://ykga.org/~jungshik/proc_utf8.html

Thank you,

Component: ActiveX Wrapper → Printing

Masaki Katakai

Assignee

Comment 3

•

24 years ago

Attached file output files in my environment — Details

cpratt

Updated

•

24 years ago

QA Contact: cpratt → shrir

Masaki Katakai

Assignee

Comment 4

•

24 years ago

I attached the outputs from my environment:

prefs.js - my prefs.js
euckr.ps - euckr output
utf8_c.ps - proc_utf8.html from Mozilla started in C locale
utf8_ja.ps - proc_utf8.html from Mozilla started in ja locale
utf8_ko.ps - proc_utf8.html from Mozilla started in ko locale

I think you run Mozilla in the C locale, right?

It seems that the Mozilla GFX layer now passes the language group of
the locale Mozilla is running in when the document is UTF-8 encoded.
When I was working on this, as I described in

http://village.infoweb.ne.jp/~katakai/mozilla/printing.htm

the "x-unicode" language group was passed when the document was UTF-8.

Please see the outputs utf8_c.ps and utf8_ja.ps, which are
from Mozilla started in the C locale and the ja locale, respectively.
In utf8_c.ps, "default_ls" is defined. In utf8_ja.ps,
"ja_ls" is used. I'm not sure whether this is the correct behavior,
so I have to ask Erik.

Erik, is this behavior correct? I could not get "x-unicode"
from a UTF-8 document.

I got utf8_ko.ps from Mozilla running in the ko locale on Solaris.
It seems fine. How about in your environment, when you start
Mozilla in the ko locale?

Erik van der Poel

Comment 5

•

24 years ago

Yes, we now pass the locale's language group down to the font engines when the
document is a Unicode-based one.

Masaki Katakai

Assignee

Comment 6

•

24 years ago

Very sorry for the late response.

Jungshik, what is the result when you start Mozilla in the ko locale?
As Erik said, we now cannot get the language group "ko" for a UTF-8 page
unless we start Mozilla in the ko locale.

Erik van der Poel

Comment 7

•

24 years ago

By the way, the layout engine will also pass "ko" down to the font engine if the
document has a LANG attribute with "ko" in one or more of the elements. (Of
course, it only passes "ko" for the element that has that attribute.)
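
Read together with comment 5, the selection of the language group might be sketched as below. This is an assumption-laden illustration in Python, not the layout engine's real logic: the function name, the charset table, the order of the checks, and the fallback for unknown charsets are all hypothetical.

```python
# Illustrative subsets only; the real tables are much larger.
UNICODE_CHARSETS = {"utf-8", "utf-16", "ucs-2"}
CHARSET_TO_GROUP = {"euc-kr": "ko", "euc-jp": "ja", "shift_jis": "ja"}


def language_group(doc_charset, locale_group, element_lang=None):
    """Pick the language group handed to the font engine (hypothetical)."""
    if element_lang:
        # A LANG attribute applies to the element that carries it.
        # Assumption: only the primary subtag is used, e.g. "ko-KR" -> "ko".
        return element_lang.split("-")[0].lower()
    if doc_charset.lower() in UNICODE_CHARSETS:
        # Unicode-based documents get the locale's language group.
        return locale_group
    # Assumption: legacy charsets imply a group; unknown ones use the locale's.
    return CHARSET_TO_GROUP.get(doc_charset.lower(), locale_group)


print(language_group("UTF-8", "x-western"))                     # x-western
print(language_group("UTF-8", "x-western", element_lang="ko"))  # ko
print(language_group("EUC-KR", "x-western"))                    # ko
```

Under these assumptions, a UTF-8 page with no LANG attribute, viewed from a non-Korean locale, never reaches the "ko" font settings, which matches the symptom in the original report.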

Jungshik Shin

Reporter

Comment 8

•

24 years ago

Sorry for the late reply. I tried a few 'latest' builds over the last couple
of weeks, but none of them would start (each died right after I launched
it), so I haven't been able to test it. However, if I remember correctly,
I did test it under LANG=ko and got the same result as I reported. Could
the cause of my problem be that I set LC_MESSAGES to C although LANG is set
to ko? Once I find a working build, I'll try again with LC_MESSAGES unset
(or set to ko as well). If this turns out to be the cause (that is,
setting LC_MESSAGES to ko solves the problem), I guess I have to file
a separate bug about the language group identification (LC_MESSAGES should be
given a lower priority than LANG, shouldn't it?).

Erik van der Poel

Comment 9

•

24 years ago

Hi Jungshik,

Do the recent builds still crash if you move your ~/.mozilla to some other
place?

LANG is the default, while LC_MESSAGES and the other LC_* override LANG for each
of those categories. LC_ALL overrides all of them.
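
The precedence described here can be sketched as a small Python function. It models only the environment-variable lookup order (LC_ALL, then the category's own variable, then LANG, then the "C" default) and does not call setlocale. The function name `effective_locale` is made up for this example.

```python
def effective_locale(category, env):
    """Return the locale name selected for `category` from `env`.

    Lookup order: LC_ALL, the category variable itself, LANG, then "C".
    Unset and empty variables are skipped.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# The reporter's usual setting: LANG=ko with LC_MESSAGES=C.
reporter_env = {"LANG": "ko", "LC_MESSAGES": "C"}
print(effective_locale("LC_MESSAGES", reporter_env))  # C
print(effective_locale("LC_CTYPE", reporter_env))     # ko
```

So a program that asks for LC_MESSAGES sees "C" in that environment, while one that asks for LC_CTYPE sees "ko".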

Jungshik Shin

Reporter

Comment 10

•

24 years ago

Hi Erik,
Thank you for the tip. With ~/.mozilla renamed to something else, I was able to avoid
the crash.

As I suspected, Mozilla's determination of the language group (and possibly
locale) has something to do with this problem.
When I unset LC_MESSAGES with LANG set to ko, it works fine. I suspected this
because a few months ago, when Mozilla printed the locale information at
start-up (it no longer does so by default), I noticed it misidentified the
current locale when LC_MESSAGES was set to C with LANG set to ko (which is
my usual setting). Which LC_*/LANG variables are referred to in order to determine
the (global) language group? Does LC_MESSAGES play any special role?
(With LANG set to ko, setting LC_* other than LC_MESSAGES to C doesn't
lead to what I reported.)
I couldn't find any hint of that in the source, although LC_MESSAGES seems
to be treated a little differently (approximated with LC_CTYPE).

Erik van der Poel

Comment 11

•

24 years ago

Mozilla shouldn't crash, no matter what you have in your ~/.mozilla. So, if you
don't mind, would you please create a new bug report and attach the contents of
your ~/.mozilla (by tar and gzip). Use the application/octet-stream type for the
attachment. Please assign it to me. Thanks.

Mozilla uses LC_MESSAGES to determine the locale's language group for font
fallbacks, since that is probably the most appropriate category. What category
would you propose? Remember, we should not look at $LANG directly. That is not
the normal way to get the locale. We are supposed to use setlocale with one of
the existing categories.

Jungshik Shin

Reporter

Comment 12

•

24 years ago

Erik,
Thank you for your comment. So, LC_MESSAGES does play a special role.
I have a slightly different "opinion" about which of
LC_*/LANG/LC_ALL to refer to when picking fonts (or, more appropriately, the language group).
If there were a clear choice among the POSIX locale categories, there'd be no doubt that we have to use it
if it's defined and not overridden by LC_ALL. However, in this particular case,
I don't think there's a clear, indisputable category to choose for the job (picking fonts).
Therefore, I guess it's better to resort to the value of LANG (it's even more fitting
considering that Mozilla uses the term 'language group') instead of the not-so-closely-related (in my
eyes) LC_MESSAGES. In summary, I think the determination of the language group
should not depend on any LC_* other than LC_ALL *when* LANG and/or LC_ALL is defined.
As for Mozilla crashing on start-up with my ~/.mozilla, I'll file a bug
with my ~/.mozilla directory attached (tarred and gzipped).

Michal 'hramrach' Suchanek

Comment 13

•

24 years ago

I can't print Czech with Mozilla because it uses iso-8859-1 fonts in PostScript
and iso-8859-2 fonts are needed to display all characters. I think it's the same bug.
Anyway, the page is automatically loaded as iso-8859-2, so there should be no
dependence on the environment.
As for LC_*, I think the correct one is LC_CTYPE. As far as I know, it was
*designed* to specify which character set is used by applications. When I want
to _write_ (or print) Czech, I set LC_CTYPE to cs_CZ, leaving the other LC_* unset.
Here's an example of my session with bash, which shows the function of
LC_MESSAGES: it changes the way the program _speaks_. Of course, you have to
use the correct fonts to read the messages.
11:14 msuc8339@u-pl3/1:~$ unset LC_MESSAGES
11:15 msuc8339@u-pl3/1:~$ bash Y
Y: Y: No such file or directory
11:15 msuc8339@u-pl3/1:~$ export LC_MESSAGES=cs_CZ
11:15 msuc8339@u-pl3/1:~$ bash Y
Y: Y: není souborem ani adresářem

Michal 'hramrach' Suchanek

Comment 14

•

24 years ago

The URL http://www.ms.mff.cuni.cz/index.html.cz

Masaki Katakai

Assignee

Comment 15

•

24 years ago

hramrach, thank you for the report. Could you add a comment describing exactly what your problem is?
Which characters are missing? If possible, please attach a screenshot.

Actually, in my environment, the white characters
couldn't be printed because the background color is white. However, after I changed
the background color to white and the text color to black, it seems that I get the
correct PostScript file. I'll attach the file, so please evaluate it.

Masaki Katakai

Assignee

Comment 16

•

24 years ago

Attached file output after changing background and text color — Details

Michal 'hramrach' Suchanek

Comment 17

•

24 years ago

I guess it's caused by our web server: when your locale isn't Czech, the browser
doesn't claim to accept the iso-8859-2 encoding, so iso-8859-1 is used, with the offending
characters converted to non-diacritic ones. I'm going to attach the page
rendered as HTML (png format; unfortunately the colors are wrong) and the
PostScript output.
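
The conversion guessed at in this comment (characters outside iso-8859-1 replaced by their non-diacritic base letters) could look like the Python sketch below. How the web server actually does this is not known from this report; the function name and the approach via Unicode decomposition are assumptions for illustration.

```python
import unicodedata


def to_latin1_without_diacritics(text):
    """Keep characters that fit iso-8859-1; strip diacritics from the rest."""
    out = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
            out.append(ch)  # representable as-is, e.g. "á" or "í"
        except UnicodeEncodeError:
            # Decompose (e.g. "ř" -> "r" + combining caron), drop the marks.
            decomposed = unicodedata.normalize("NFD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            # Anything still unrepresentable becomes "?".
            out.append(base.encode("iso-8859-1", "replace").decode("iso-8859-1"))
    return "".join(out)


print(to_latin1_without_diacritics("adresářem"))  # adresárem
```

Note that letters such as "á" survive because iso-8859-1 already contains them; only letters specific to iso-8859-2, such as "ř" or "č", lose their diacritics.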

Michal 'hramrach' Suchanek

Comment 18

•

24 years ago

Attached image The page using iso-88592-2 encoding (png); to get the same try the link "Kodovani" — Details

Michal 'hramrach' Suchanek

Comment 19

•

24 years ago

Attached file Postscript - viewable with ImageMagick's `display' program, white on white in gv — Details

shrirang khanzode

Comment 20

•

24 years ago

spam : changing qa to sujay (new qa contact for Printing)

QA Contact: shrir → sujay

Hidetoshi Tajima

Comment 21

•

23 years ago

reassign.

Assignee: tajima → katakai

Status: ASSIGNED → NEW

Masaki Katakai

Assignee

Comment 22

•

23 years ago

A temporary patch has been attached to bug 75930.

I can now use UTF-8 fonts with the following settings:

user_pref("print.psnativecode.ja", "UTF-8");
user_pref("print.psnativefont.ja", "GothicBBB-Medium-UniJIS-UTF8-H");

Masaki Katakai

Assignee

Comment 23

•

23 years ago

The original problem in this bug report has been fixed in bug 75930.

For the iso-8859-2 issue, I'll file a new one.


*** This bug has been marked as a duplicate of 75930 ***

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → DUPLICATE

sujay

Comment 24

•

23 years ago

verified.

Status: RESOLVED → VERIFIED
