Closed
Bug 86427
Opened 24 years ago
Closed 21 years ago
Vietnamese support is deficient (UTF-8 and VISCII)
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
INVALID
Future
People
(Reporter: fare+mozilla, Assigned: smontagu)
Details
(Keywords: intl)
Attachments
(2 files)
I've long tried to get a linux browser to read vietnamese correctly.
Painful attempts with Netscape 4 were unsuccessful.
My latest attempt with mozilla 0.9.1 is both encouraging
and unsatisfying.
I've converted some vietnamese text into UTF-8.
Works perfectly on Windows with a recent IE or Mozilla
(unicode fonts preinstalled; didn't check font auto-installation).
Mozilla 0.9.1 on Linux will kind of show it correctly,
but will use a really **** scaled raster font
for any character not in latin1,
even though I do my best to specify as "Unicode" font
a TrueType font that does have all required characters.
You can test mozilla with the following URL:
http://ciev.org/1984-vi-utf8-html/
By comparison, my previous attempt, using VISCII 1.1,
doesn't work with the most recent IE under windows (5.5)
or mozilla under Linux (0.9.1), but kind of works with
mozilla under Windows (if I re-specify the font every time).
http://ciev.org/1984-vi-viscii-html/
Mozilla 0.9.1 will kind of work if forced into VISCII encoding,
which is a global setting and uses the ugly unscaled raster font.
What's the Right Thing(tm) to declare VISCII as charset encoding?
Note: the œ character, essential to fully support french,
is also displayed in this ugly font under Linux.
Comment 2•24 years ago
I think the real problem is that we do not let users specify a Vietnamese font in
the font prefs, and we do not recognize Vietnamese fonts.
On Linux, what is the XLFD of a Vietnamese font? Where can we find one?
Reassigning to bstell@netscape.com and marking it as Future.
Assignee: nhotta → bstell
Target Milestone: --- → Future
Comment 3•24 years ago (Reporter)
I don't know zilch about XLFD. If you give me sensible URLs, I'm willing to have
a look.
Maybe selecting a vietnamese font might help; but even then, a main difficulty
with vietnamese is that it mixes latin characters in the 00-7F range, the
100-1FF range, and the 1E00-1EFF range, and you want words in a vietnamese text
to be displayed with the SAME font, for the sake of readability. So stubborn
per-character range-checking won't do it. But then, mixes of ranges happen for
other languages (e.g. european ones), so if these should work (french has the same
problem as vietnamese "thanks" to œ), then vietnamese should too, by the
same solution.
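To make the range mixing concrete, here is a minimal Python sketch that groups the characters of a word by the code point ranges named above. It is only an illustration: the `RANGES` table and the helper names are made up for this example, and the 80-FF row is an addition to the three ranges mentioned in the comment.

```python
from collections import defaultdict

# Informal labels for the code point ranges discussed in the comment.
# The 80-FF row (Latin-1 upper half) is an assumption added for completeness.
RANGES = [
    (0x0000, 0x007F, "00-7F"),
    (0x0080, 0x00FF, "80-FF"),
    (0x0100, 0x01FF, "100-1FF"),
    (0x1E00, 0x1EFF, "1E00-1EFF"),
]


def range_of(ch):
    """Return the label of the range containing ch, or "other"."""
    code_point = ord(ch)
    for low, high, label in RANGES:
        if low <= code_point <= high:
            return label
    return "other"


def ranges_in(word):
    """Group the characters of word by the range each one falls in."""
    groups = defaultdict(list)
    for ch in word:
        groups[range_of(ch)].append(ch)
    return dict(groups)


if __name__ == "__main__":
    # Two Vietnamese words and one French word, written with escapes so the
    # precomposed code points are unambiguous.
    for word in ["ng\u01b0\u1eddi", "\u0111\u01b0\u1ee3c", "c\u0153ur"]:
        groups = ranges_in(word)
        # Print code points rather than glyphs to stay terminal-independent.
        print({k: [f"U+{ord(c):04X}" for c in v] for k, v in groups.items()})
```

A single five-letter word can span three ranges, which is why choosing a font per range produces visibly mixed fonts inside one word.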
Why can't I have a way to just specify Bitstream CyberBit or Arial Unicode MS,
or some other TrueType font with large Unicode support, and get a clean uniform
result from my browser in any mix of language? I don't know how IE manages
things, but at least my pages display nicely with it, so something should be
possible. At the very least, the "current" font shouldn't be overridden with
unifont or anything if it contains the required characters. And there should be
a way to specify something better than unifont for the overriding font.
Additional difficulty with vietnamese fonts: so as to be compatible with legacy
software, many of them "cheat" with iso-10646-1 encoding, by faking the default
encoding of the system (iso-8859-1 or windows-1252). I don't know if the
TrueType versions contain disambiguating information (how can I check?).
Comment 4•24 years ago (Reporter)
BTW, to confirm my latter remark, the VISCII version works with mozilla and IE,
if I cheat and DO NOT specify VISCII as the encoding. If I DO specify VISCII as
the encoding, then I MUST NOT specify a VISCII font, and let IE choose its nice
font and Mozilla choose its ugly font. The problem being VISCII fonts cheating
with encoding by being incorrectly declared as standard windows or latin
encoding, for compatibility with legacy programs *and documents* (the latter
part being the most tricky since there is no universal conversion utility, even
less a fully automatic one). Indeed VISCII fonts existed before any office
application really supported Unicode.
BTW, MSIE seems to grok BASEFONT, and not Mozilla. Is it on purpose, or should I
file a bug report?
Updated•24 years ago
QA Contact: andreasb → ylong
Updated•24 years ago
Status: NEW → ASSIGNED
Comment 5•24 years ago (Reporter)
Ok, after a lot of attempts, I finally managed to get vietnamese working
properly. However, it was HELL getting it to work.
First problem and fix, I realized that although I had installed fonts that were
capable of displaying vietnamese (Verdana, etc), they weren't recognized as such
by X, and thus by Mozilla. Of course, Mozilla gave me no hint about it, and I
had to discover it painfully, by reading lots of documentation all over the net.
Declaring fonts as being iso10646-1 capable in the fonts.dir was a matter of
hacking a simple shell/perl script, which is definitely not something a newbie
can do. It might not be strictly Mozilla's fault, but I think the Mozilla
documentation should at the very least include some remarks about this in a
prominent way in the release notes, or else the unicode support in Mozilla will
prove mostly useless to most Linux users.
Next problem, configuring Mozilla so as to display all the vietnamese characters
using the SAME font. I spent HOURS trying to find a correct setting, because the
font selection code in Mozilla SUCKS big time. My! Go see how Konqueror does it
-- they do it MUCH MUCH better. Actually, I only got heart trying with Mozilla
thanks to Galeon's slow but much faster and much more usable selection code, so
I first got it to work with Galeon, and then migrated my settings to Mozilla.
At the worst moment, I had *4* classes of vietnamese characters, each displayed
with a different font:
* vietnamese letters present in western alphabets
* vietnamese letters o+ (U+01A1) and u+ (U+01B0) and some variants
* vietnamese letter with ?-shaped accent (da^'u hoi?)
* other vietnamese accented letters
Ultimately, I "fixed" the problem by selecting Verdana in all 12 proportional
font settings of Western, Unicode and User-Defined, and it worked.
Font selection is so slow and difficult (over 1 minute to make the slightest
change -- and it was much worse when I had those hundreds of fonts installed in
the server), that I stopped trying to identify a "minimal solution" once I got
things working. A simple way of previewing fonts without selecting them (as in
Konqueror), and of applying font settings from the preferences menu (as in
Konqueror) without having to spend one minute closing and reopening it, would
be great. Also, configuring fonts is all the more disheartening
since there is no user-available description (except the source) of how font
selection works, and thus of how users should be configuring them. Also, if it
ever was necessary to have lots of font settings, a simple way to copy/paste
from setting to setting would help a lot.
Note: the situation with vietnamese on MacOS9 seems desperate, so even though it
was hell on Linux, it could have been even worse.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 6•24 years ago
Mark it as verified for Linux by the previous comment. Please re-open if still
has problems.
Status: RESOLVED → VERIFIED
Comment 7•24 years ago (Reporter)
Yow! I am now using Mozilla 0.9.7, and Vietnamese is broken again. You can browse around http://ciev.org/1984-vi-utf8-html/ and see the results. It ought to be all of the same unicode font, say Verdana. I get many different fonts, at least 3 different after I set everything to Verdana, and maybe 5 or 6 different if settings differ (but I don't have the courage to test anymore, considering the utter slowness of testing font settings -- maybe a debugging tool to identify which font from which setting was used at a given point would help).
The following characters should (hopefully) span the whole range of character classes used by Mozilla: ASCII and Latin1 characters (a a a' á), latin-N characters (dd đ DD Đ), extended latin characters (o+ ơ u+ ư), characters with ? accent (a? ả), other vietnamese accent combinations (u+' ứ u+? ử e^? ể e~ ẽ o^~ ỗ).
I don't recognize which font is chosen for ASCII and Latin1 when I browse a vietnamese page, but it looks like none I selected in the settings, and certainly not the Verdana that I managed to configure for all the other character classes.
I don't know how Mozilla handles its fonts, but it seems overly complicated. Instead of building huge kludges, you should promote use of unicode fonts. With Internet Explorer, you have one font choice for all Latin and derived, and Verdana (or Times) and Courier work great. The Microsoft fonts are available on all platforms. Bitstream and B&H also have fonts available for everyone.
Konqueror also has complicated settings like Mozilla, but at least, it seems to work (plus it's anti-aliased!). If you really want to kluge something that doesn't depend on non-freely available fonts, I think that rather than have settings so complex no one understands how they work, you should have simple settings, and provide one or two "virtual fonts" made from existing free fonts. Or maybe manage to distribute Lucida Console or something like that as part of the Mozilla package or an associated package. It might pay to do the right thing and achieve the distribution of a real unicode font, rather than add kluges over kluges so as to handle latin characters.
Yours freely, -- #f ?
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
This image shows how Mozilla is rendering UTF-8 vietnamese for me.
I have installed MS-Arial ttf font, which does work for vietnamese
afaik (because it shows fine on windows mozilla)
It looks like each character is being rendered as a composition of similar
characters. For example, if I highlight a(? (U+1EB3), it is selected as
a single glyph, not as separate characters (which is good, except that it
is a terrible-looking rendering).
I wonder what causes this type of output, rather than using the
correct glyph in the unicode range U+1E00 – U+1EFF.
Someone mentioned getting Xfree to recognize iso10646-1, if they could
elaborate how they did that...
thx,
Comment 10•24 years ago (Reporter)
As for installing unicode fonts under XFree86.
First, you must be TrueType-ready. Install XFree 4.1.0 or later and be sure that
FreeType is enabled in the XF86Config-4: in Section "Module", Load "freetype".
Using older servers (or servers from an evil proprietary vendor), install xfsft,
or the latest xfs from XFree86 that already includes the freetype patches.
Then, you must install the fonts. Get the Microsoft web fonts from:
http://www.microsoft.com/typography/fontpack/default.htm
and unpack them all in /usr/X11R6/lib/fonts/TrueType/ or /usr/local/... or some
such -- using debian woody or later, you can just apt-get install msttcorefonts
which will do the job for you.
Then you must declare the fonts as Unicode-ready. The idea is to create a
fonts.dir file with for each font an entry with encoding iso10646-1 as in:
Verdana.ttf -Microsoft-Verdana-medium-r-normal--0-0-0-0-p-0-iso10646-1
So you must cd into the directory where you put the .ttf files, run mkfontdir,
edit the fonts.dir file it created, and add a line like that for every font.
Emacs macros or perl can help you there. Then copy the fonts.dir file into a
fonts.scale file, so as to ensure the fonts will work in all sizes. Note that
under debian, you must instead edit /etc/X11/fonts/TrueType/msttcorefonts.scale
and dpkg-reconfigure msttcorefonts (or manually run update-fonts-dir and
update-fonts-scale).
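The manual editing step above could be scripted roughly as follows. This is a sketch in Python rather than the perl or Emacs macros mentioned, and it rests on assumptions: the function name is made up, the input is an existing fonts.dir (a count on the first line, then one "file XLFD" entry per line), file names contain no whitespace, and every listed font really does carry Unicode glyphs, which the script does not check.

```python
UNICODE_CHARSET = ["iso10646", "1"]  # XLFD charset registry and encoding fields


def add_unicode_entries(fonts_dir_text):
    """Return fonts.dir text with an extra iso10646-1 line for every entry."""
    lines = [ln.strip() for ln in fonts_dir_text.splitlines() if ln.strip()]
    entries = lines[1:]  # the first line of fonts.dir is the entry count
    result = list(entries)
    seen = set(entries)
    for entry in entries:
        # Assumes the font file name contains no whitespace.
        filename, xlfd = entry.split(None, 1)
        fields = xlfd.split("-")
        # The last two XLFD fields are the charset, e.g. iso8859-1.
        fields[-2:] = UNICODE_CHARSET
        candidate = filename + " " + "-".join(fields)
        if candidate not in seen:  # keeps the function idempotent
            seen.add(candidate)
            result.append(candidate)
    # Rewrite the count line to match the new number of entries.
    return "\n".join([str(len(result))] + result) + "\n"


if __name__ == "__main__":
    sample = (
        "1\n"
        "Verdana.ttf -Microsoft-Verdana-medium-r-normal--0-0-0-0-p-0-iso8859-1\n"
    )
    print(add_unicode_entries(sample), end="")
```

The result would be written back as fonts.dir and copied to fonts.scale as described above. Symbol fonts with a private charset would wrongly gain a Unicode entry, so the output should be reviewed by hand.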
If you write a HOWTO, or better, a script (perl or whatever) that does
everything, and publish it on a web page, you'll do everyone a great service!
I'd add it to http://ciev.org/ that already points to a few pages that can help
you about Unicode or test your browser with vietnamese.
BTW, ciev.org is experiencing DNS problems, so you can try
http://206.63.100.249:8108/ instead. Similarly the first page I use to test
browser support of vietnamese (cited in a previous addendum to this bug report) is
http://ciev.org/1984-vi-utf8-html/1984-1-1.html
http://206.63.100.249:8108/1984-vi-utf8-html/1984-1-1.html
Comment 11•24 years ago (Reporter)
Here is a simple no-frills page containing Vietnamese data using Unicode, using
characters in the U+1Exx range rather than explicit composition of Unicode
characters (hum -- that would require another test case). You can find it on
http://ciev.org/1984-vi-utf8-html/1984-1-1.html
or (if the DNS is still down) on
http://206.63.100.249:8108/1984-vi-utf8-html/1984-1-1.html
The test is successful if and only if all characters show as correct Unicode
glyphs of the SAME FONT (say, Verdana), just like IE5 or Konqueror 2.2 do it.
Currently, Mozilla uses up to 5 different fonts for the text -- ugly.
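For the other test case alluded to above (explicit composition instead of U+1Exx characters), the two forms can be converted into each other with Unicode normalization. The sketch below uses Python's standard `unicodedata` module; the helper names are made up, and it says nothing about how any browser renders either form.

```python
import unicodedata


def to_decomposed(text):
    """Base letters followed by combining marks (normalization form NFD)."""
    return unicodedata.normalize("NFD", text)


def to_precomposed(text):
    """Single precomposed code points where they exist (form NFC)."""
    return unicodedata.normalize("NFC", text)


def describe(text):
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    precomposed = "\u1ec3"  # e with circumflex and hook above, from U+1Exx
    for line in describe(precomposed):
        print("precomposed:", line)
    for line in describe(to_decomposed(precomposed)):
        print("decomposed: ", line)
```

Running `to_decomposed` over the text of the test page would give a second page with the same content in explicit-composition form.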
Comment 12•24 years ago
I went ahead and added the iso10646-1 to my cyberbit font, and
restarted xfs. I am running Xfree 3.6, but it worked, and I'm rather
bothered that ttmkfontdir doesn't know how to generate that line.
Once i load the 1984 excerpt, xfs tries to rasterize the entire 13
megabyte font, which freezes up my desktop (and production server
(now you know why im running 3.6 still)) for about 30 seconds.
At least the glyphs are readable now, even if the letters with
two diacritics are in a different font than those without, i can
at least see the right letters. (Now to learn the language)
Now that Ive gotten it this far, Im willing to bet its just not
a Mozilla issue, though this has been the most productive forum
Ive found for getting this working.
This issue probably needs to be taken up with freetypes mkttfntdir
and some Xfree86 faq.
I cant wait for Xft and render, and it looks like mozilla will be ready.
I can't get to the ciev ip address at all, but I think the contents of this
bug should at least be harvested for a faq on the subject.
Comment 13•24 years ago
I think once you set the ttf font as ISO-8859-1, it should be able to see them.
The problem you saw is the last-resort transliteration code.
Which linux are you using?
Status: NEW → ASSIGNED
Comment 14•24 years ago (Reporter)
I'm using the latest packages from the latest (unstable) debian sid (well, now two weeks old). And yes, the fonts are also declared as iso-8859-1 (although in a different directory - I will retry with the same font in the same directory, later, in case it might matter). Anyway, once the font is declared as iso-10646-1, which subsumes iso-8859-1, why should mozilla care whether it is also iso-8859-1 at all??? Why does mozilla look in other fonts for characters that are actually present in the current font, and declared as such?
If you can't reproduce the bug (can you not?), I can set up a test system, running an Xvnc server, or whatever suits you.
Comment 15•24 years ago
Using
Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:0.9.9+) Gecko/20020318
the browser display is somewhat strange.
I have installed the ISO10646-1 fonts found at
http://www.cl.cam.ac.uk/~mgk25/unicode.html (TrueType WGL4 fonts don't work
correctly here) and tried to view attachment 66689 [details].
Unfortunately it seems that the presence of certain unicode characters triggers
a display mode where
- the Preferences/Appearance/Fonts setting for Unicode will be ignored (same as
comment 5)
- ISO8859-1 characters will be displayed in some monospace font
- most non-ISO8859-1 characters will be displayed in another monospace font
- characters containing a "dau hoi" diacritic will be last-resort-transliterated
- in the Tab bar, non-ISO8859-1 characters show up in a third font
- if you try to select last-resort-transliterated text the selection does not
follow the mouse cursor as it should
- in the print preview, the title in the header is displayed correctly in the
Adobe-Times ISO10646-1 font! (the rest remains messed up though)
In Windows, everything is displayed in the Times New Roman font as defined in
the Preferences dialog.
As said, this is very strange and if we could find out what is responsible for
this mode and fix it, Mozilla usability for displaying vietnamese content would
be greatly improved. The print preview header does The Right Thing(tm), why do
Navigator and Composer not?
Comment 17•21 years ago
> - the Preferences/Appearance/Fonts setting for Unicode will be ignored (same as
> comment 5)
See bug 256383. Also note that currently Vietnamese is regarded as 'x-western',
so you have to set the fonts for Western to the Vietnamese fonts you want to use
(or Latin fonts with sufficiently large coverage).
Besides, if you have a page in one of the Unicode encodings (e.g. utf-8), make
sure to specify 'lang' like this: <html lang="vi">. If only a part of the
document is in Vietnamese, use 'lang="vi"' only in that part (e.g.
<div lang="vi">, <span lang="vi">, <p lang="vi">, etc). If your pages are in one
of the Vietnamese encodings, Mozilla infers the language from the page encoding
and uses that unless it's overridden explicitly with the 'lang' attribute.
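As an illustration of the 'lang' advice, the following Python sketch generates a small UTF-8 test page tagged with lang="vi". It is hypothetical throughout: the function name, page title and sample labels are invented here, and the sample characters are the ones listed in comment 7.

```python
import html
import sys

# Sample characters taken from comment 7, written as escapes so the exact
# code points are unambiguous. The labels are informal.
SAMPLES = [
    ("ASCII and Latin-1", "a \u00e1"),
    ("Latin Extended-A", "\u0111 \u0110"),
    ("Latin Extended-B", "\u01a1 \u01b0"),
    ("hook above", "\u1ea3"),
    ("other accent combinations", "\u1ee9 \u1eed \u1ec3 \u1ebd \u1ed7"),
]


def build_test_page(lang="vi"):
    """Return an HTML page whose root element carries the given lang."""
    rows = "\n".join(
        f"<p>{html.escape(label)}: {html.escape(chars)}</p>"
        for label, chars in SAMPLES
    )
    return (
        f'<html lang="{html.escape(lang, quote=True)}">\n'
        "<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>Vietnamese font test</title>\n"
        "</head>\n"
        f"<body>\n{rows}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Emit raw UTF-8 bytes so the output does not depend on the terminal.
    sys.stdout.buffer.write(build_test_page().encode("utf-8"))
```

Redirecting the output to a file and opening it in the browser gives a quick check of whether all sample characters come out in the same font.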
This was not a bug per se but a documentation issue (it should have been made
clear that the fonts for Western have to be used to set fonts for Vietnamese in
one of the Vietnamese encodings). Perhaps we have to open a new bug about help
and other documentation issues on the font selection.
Status: NEW → RESOLVED
Closed: 24 years ago → 21 years ago
Resolution: --- → INVALID