Closed Bug 281283 Opened 20 years ago Closed 20 years ago

rendering of latin/unicode characters is ugly and unreadable when in gb2312 (chinese simplified), though it is fine in other encodings

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Linux
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: bugzilla.6.capybara, Unassigned)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

The example page http://paul.graysonfamily.org/~paul/firefox/pinyin.html
contains a mix of Simplified Chinese characters (encoding: gb2312), ascii, and
escaped latin unicode characters.

Reproducible: Always

Steps to Reproduce:
1. load the page
2. notice that the latin text is ugly
3. switch encoding to western or UTF-8 and notice the difference

Actual Results:  
At step 2, the latin text is in an unusual, somewhat ugly font (probably the
latin subset of the standard slackware linux gb2312 font), there is extra space
around the unicode characters, and the level of boldness varies from character
to character, making the page pretty much unreadable.

At step 3, the latin characters are displayed with the fonts that the browser
usually uses for latin text.

Expected Results:  
Steps 2 and 3 should use the same fonts for latin characters, independent of the
encoding used to view the page.  I realize that there is probably some obscure
reason for the difference, but it is completely opaque to me, since a letter "a"
is a letter "a" no matter what encoding the web page uses.

I am running Slackware linux 10, so this may have something to do with the
particular fonts I have installed.  I have not tried this on other machines, but
I have observed similar problems for a long time with mozilla/firefox, showing
up in all kinds of different encodings.  But since switching the encoding causes
the characters to look right, it is clear that firefox is capable of showing
them correctly.  It seems bizarre that the encoding (a property of the input
text) should affect the fonts (an independent property of the output text).

I will attach screenshots.

If we had to examine every character and choose a font according to what
language we thought that character belonged to, rendering would become painfully
slow. To get the results you want you should use markup with the "lang"
attribute, e.g.
 
<span lang="en">Chinese characters: </span><span lang="zh-hans">北京大学</span><br>

(view this bug in UTF-8 to see the example correctly)

I think this bug is WONTFIX.
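For pages that are generated rather than written by hand, the markup suggested above could be produced automatically. The following is only a minimal sketch: the helper names `is_cjk` and `tag_runs` are hypothetical, and treating just the CJK Unified Ideographs block (U+4E00 to U+9FFF) as Chinese is a simplifying assumption that ignores extension blocks and CJK punctuation.

```python
from html import escape
from itertools import groupby


def is_cjk(ch):
    # Assumption: only the basic CJK Unified Ideographs block counts as Chinese.
    return "\u4e00" <= ch <= "\u9fff"


def tag_runs(text, latin_lang="en", cjk_lang="zh-hans"):
    """Wrap each run of CJK / non-CJK characters in a span with a lang attribute."""
    parts = []
    # groupby yields consecutive characters that share the same is_cjk() result.
    for cjk, run in groupby(text, key=is_cjk):
        lang = cjk_lang if cjk else latin_lang
        parts.append('<span lang="%s">%s</span>' % (lang, escape("".join(run))))
    return "".join(parts)


if __name__ == "__main__":
    print(tag_runs("Chinese characters: \u5317\u4eac\u5927\u5b66"))
```

Doing this once when the page is authored keeps the per-character classification out of the browser's rendering path, which is the cost the comment above is concerned about.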
Summary: rendering of latin/unicode characters is ugly and unreadable when in gb2312 (chinese simplified), though it is fine in other encodings → rendering of latin/unicode characters is ugly and unreadable when in gb2312 (chinese simplified), though it is fine in other encodings
Component: General → Layout: Fonts and Text
Product: Firefox → Core
Version: unspecified → Trunk
(In reply to comment #0)
> It seems bizarre that the encoding (a property of the input
> text) should affect the fonts (an independent property of the output text).

I understand your point here. The logic of it is that if there is no language
information in the page, we fall back to using the encoding as a hint for the
language. Like all fallbacks, this only gives good results some of the time.
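The fallback described above can be sketched roughly as follows. This is an illustration only, not the actual Gecko code: the table `CHARSET_LANG_HINT`, the function `pick_lang_group`, the language-group names, and the western default are all assumptions made for the example.

```python
# Hypothetical charset -> language hint table (illustrative, not Gecko's real data).
CHARSET_LANG_HINT = {
    "gb2312": "zh-CN",
    "big5": "zh-TW",
    "shift_jis": "ja",
    "euc-kr": "ko",
    "iso-8859-1": "x-western",
}


def pick_lang_group(lang_attr, charset, default="x-western"):
    """Choose the language used for font selection.

    An explicit lang attribute wins; otherwise the page encoding is used as a
    hint; otherwise an assumed default applies.
    """
    if lang_attr:
        return lang_attr
    if charset:
        return CHARSET_LANG_HINT.get(charset.lower(), default)
    return default


if __name__ == "__main__":
    print(pick_lang_group(None, "GB2312"))  # encoding used as the hint
    print(pick_lang_group("en", "GB2312"))  # explicit lang takes priority
```

Under these assumptions, a gb2312 page with no lang information has all of its text, Latin included, treated as Simplified Chinese for font selection, while the same bytes viewed as UTF-8 carry no such hint.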
Assignee: firefox → nobody
QA Contact: general → layout.fonts-and-text
(In reply to comment #3)
> If we had to examine every character and choose a font according to what
> language we thought that character belonged to, rendering would become painfully
> slow. 

I agree (although Gfx:Win does something similar after all the options are
exhausted). Moreover, the result wouldn't necessarily be good, either.

Anyway, we achieved a similar result using 'fontset'. See bug 229394 (Xft) and
bug 227815. Note that this feature is not available in FF 1.0 on Linux (Xft),
but was enabled only a couple of weeks ago. It's in Mozilla 1.8a6 (Linux
gtk2+xft). FF 1.0 has the feature on Windows.

Btw, as Simon wrote, it's always a good idea to specify 'lang' (especially in
UTF-8 documents).

Simon, will you resolve this as you see fit? I'm not sure how to resolve it....
Resolving invalid.  The idea behind using chars from the same font as much as
possible is the assumption that the font is designed so they'll look
good together.  This is a reasonable assumption, imo.
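To make the "same font as much as possible" idea concrete, here is a minimal sketch of one way such a preference could work. It is not the real layout code: `assign_fonts` is a hypothetical helper, and fonts are modelled simply as an ordered list of (name, set of covered characters) pairs.

```python
def assign_fonts(text, fonts):
    """Split text into runs, using the first font in `fonts` that covers each char.

    fonts: ordered list of (name, coverage) pairs, most preferred first.
    Returns a list of (font_name, run_text); font_name is None if nothing covers.
    """
    runs = []
    for ch in text:
        # The preferred font is used whenever it has the glyph; others are fallbacks.
        name = next((n for n, cov in fonts if ch in cov), None)
        if runs and runs[-1][0] == name:
            runs[-1][1] += ch
        else:
            runs.append([name, ch])
    return [(n, s) for n, s in runs]


if __name__ == "__main__":
    # Toy coverage sets: the "cjk" font also has plain ASCII letters, but no i-macron.
    fonts = [("cjk", set("\u5317\u4eacpin ")), ("latin", set("pin\u012b "))]
    print(assign_fonts("\u5317\u4eac p\u012bn", fonts))
```

With these toy coverage sets the plain letters come from the CJK font and only the accented letter comes from the fallback, which would be consistent with the uneven weight and spacing described in the original report.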
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID