Open Bug 677919 Opened 13 years ago Updated 2 years ago

Inconsistent rendering of Chinese characters

Categories

(Core :: Layout: Text and Fonts, defect)

8 Branch
x86
Windows 7
defect

People

(Reporter: kaixiluo, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(9 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:8.0a1) Gecko/20110809 Firefox/8.0a1
Build ID: 20110809030751

Steps to reproduce:

I made a list containing Chinese characters with Workflowy (http://workflowy.com)


Actual results:

Firefox rendered the Chinese characters inconsistently, i.e. some characters appeared bolder than others, some other characters were fuzzier than the rest.


Expected results:

Chinese characters should be rendered in a uniform way.
Kaixi, could you attach HTML file to reproduce this?
I think I've been able to pinpoint the problem. When a web page doesn't specify a Chinese font in its CSS rules, Firefox attempts to render the page with 2 different fallback fonts, Simsun (which is the default Simplified Chinese font in Firefox) and MS PGothic.

In all those cases, Firefox should only use Simsun.
I can consistently reproduce this with a static HTML file that meets the above conditions.
Attached file testcase1
HTML file demonstrating the problem. I used the Fontinfo addon to find out the fonts used by Firefox to render the page.
A workaround for web developers is to put a Latin font first and then a Chinese font in their CSS file, i.e. font-family: Arial, Simsun;
Can you post screenshots from the two test cases?

I don't see any bold characters, but I am on WinXP. Nightly, fresh profile. Fontinfo addon says it is using Arial and SimSun only. Also tested "Namoroka/3.6.21pre" and "Firefox/6.0 SeaMonkey/2.3" and saw no bold.
Attached image testcase1 screenshot
Here's the screenshot of testcase1.
Attached image testcase2 screenshot
Here's the screenshot of what the page looks like after I change 

body { font-family: Arial; } 

to 

body { font-family: Arial, Simsun; }
btw, I am on Windows 7 Professional (English Edition). And neither Chrome nor IE9 has any problems rendering Chinese characters.
I just tested everything with a fresh profile on the latest Nightly 32-bit build and the problem is still the same.
I do not have the font "MS PGothic" on my U.S. WinXP so that is why I do not see the bug.

Microsoft says:

Products that supply this font
Office Mac 2008	5.02
Windows 7	5.01
Windows Server 2008	5.00
Windows Vista	5.00

http://www.microsoft.com/typography/fonts/font.aspx?FMID=1271
Attachment #552320 - Attachment mime type: text/plain → text/html
(In reply to kitchin from comment #11)
> I do not have the font "MS PGothic" on my U.S. WinXP so that is why I do not
> see the bug.

I've uploaded the MS PGothic font file to my Dropbox account. Could you download the font, install it on your system and test again?

http://dl.dropbox.com/u/1829895/msgothic.ttc
Confirming bug. After I installed the "MS PGothic" font to WinXp, the test page is displayed in a mixture of "MS PGothic" and "SimSun", resulting in some bold-looking characters, as in the screenshot. Is it due to the fact that Japanese uses a subset of Chinese characters? I think "MS PGothic" is a Japanese font.

Bug:
Firefox Nightly
Seamonkey 2.3 Beta (Firefox/6.0 SeaMonkey/2.3)
Namoroka/3.6.21pre

No bug:
IE8, latest release
Chrome, latest release
Opera, latest release
The test page has no indication of language; hence Firefox can't tell whether to default to the Japanese or Chinese (which locale?) fonts from preferences (for the CJK characters not supported by Arial, the font actually specified by the page). In my testing (on a US English system), it appears to give priority to the Japanese font prefs; but then if there are specific characters not supported in the default Japanese font (MS PGothic), it falls back to the Chinese one (SimSun).

This prioritization should depend on the browser's Accept-Languages settings, and on the user's system locale. If Chinese is given priority in Accept-Languages, for example, I think this should resolve the problem. (You need to restart the browser after adding to the list of "preferred languages" in Tools/Options/Content.)

Also, if you add a language tag such as  lang="zh" to the <body> element, the CJK characters all render with SimSun, as this provides the hint needed to tell the browser which font preference to use.

It's not clear to me how Firefox should be expected to guess the "best" font to use when no appropriate style or language information is available, however.
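The mixed rendering described above can be illustrated with a minimal sketch of per-character fallback. This is not Firefox's actual code: the coverage tables are made-up assumptions for illustration, and `pick_fonts` is a hypothetical helper. It only shows why a Japanese-first order yields two fonts for one Chinese string, while a Chinese-first order yields one.

```python
# Hypothetical coverage tables: which characters each font can draw.
# Real fonts cover thousands of glyphs; these sets are assumptions.
FONT_COVERAGE = {
    "Arial": set("abcdefghijklmnopqrstuvwxyz "),
    "MS PGothic": set("我的中文"),    # assumed to lack the simplified form 们
    "SimSun": set("我们的中文"),
}


def pick_fonts(text, font_order):
    """Return (char, font) pairs, taking the first font in font_order that covers each char."""
    result = []
    for ch in text:
        font = next((f for f in font_order if ch in FONT_COVERAGE[f]), None)
        result.append((ch, font))
    return result


japanese_first = ["Arial", "MS PGothic", "SimSun"]
chinese_first = ["Arial", "SimSun", "MS PGothic"]

mixed = pick_fonts("我们的中文", japanese_first)
uniform = pick_fonts("我们的中文", chinese_first)

print(mixed)    # most characters from MS PGothic, the uncovered one from SimSun
print(uniform)  # every character from SimSun
```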
Thanks for the explanation. I deliberately left out any language tags in the test cases. Why can't Firefox guess the best font to use if both Chrome and IE can?
Bisecting, Firefox 1.0 also has this bug.

Also, if I paste the MS PGothic text from Firefox Nightly to WordPad, it is converted to this:
Font [SimSun] Font Size [10] Font Script [Chinese_GB2312]

If I highlight the text in WordPad and try to change it to "MS PGothic," I cannot, unless I first change Font Script to "Western":
Font [MS PGothic] Font Size [10] Font Script [Western]

The fact that Opera, Chrome and IE agree with each other makes me think the WinAPI is being used differently. Perhaps alpha order plays a role, rightly or wrongly.

The alternative is that Opera, Chrome and IE are all using heuristics, to survey the characters. To test that, we need a test case with only the bold characters. Certainly Chrome uses heuristics for some purposes, because it offers to translate the document from Chinese (Simplified Han) to English!
See also:
Bug 543200 font fallback should try to use the same font for a complete character cluster or word
(In reply to Jonathan Kew from comment #14) 
> This prioritization should depend on the browser's Accept-Languages
> settings, and on the user's system locale. If Chinese is given priority in
> Accept-Languages, for example, I think this should resolve the problem. (You
> need to restart the browser after adding to the list of "preferred
> languages" in Tools/Options/Content.)
> 
> Also, if you add a language tag such as  lang="zh" to the <body> element,
> the CJK characters all render with SimSun, as this provides the hint needed
> to tell the browser which font preference to use.

Adding the language tag definitely makes Firefox render everything with SimSun but adding Chinese in Preferred Languages doesn't resolve the problem.
This is yet another testcase, which is an article from today's news on Xinhuanet (http://news.xinhuanet.com/photo/2011-08/12/c_121848818.htm). I copied the contents of the article to an HTML file which doesn't have any lang tag or CSS rule for Chinese characters.

Here Firefox uses 3 fallback fonts to render Chinese characters, Gulim, Simsun and MS PGothic.
confirmed on
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:8.0a1) Gecko/20110812 Firefox/8.0a1
I see a different scenario.

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0a1) Gecko/20110828 Firefox/9.0a1

In a new profile, extension Fontinfo is showing that testcase1 (sans workaround) and testcase3 are using only Arial (for English characters and numbers) and MingLiu_HKSCS (for Chinese characters, Traditional or simplified) on my machine. No Gulim, Simsun or MS PGothic are involved. I had all these fonts installed.
On the page in xinhuanet, my Nightly is using Mingliu_HKSCS and Arial on the first 3 paragraphs, and Simsun only for the remaining paragraphs.


My Windows 7 Sp1 English version settings FWIW:
Region and Language-
Format: Chinese (Traditional, Hong Kong S.A.R.)
Location: Hong Kong S.A.R.
Language for non-Unicode programs: Chinese (Traditional, Hong Kong S.A.R.)

Also My Nightly's chrome's font follows the setting of "Fonts for Traditional Chinese (Hong Kong)" in Options > Content > Font, which may differ from you guys.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment on attachment 598799 [details]
another example of ugly character rendering

I have found this inconsistent rendering of Chinese characters frustrating for quite some time now. Finally I decided to register here on Bugzilla, hoping you could fix this.

Why can Chrome or Safari display them perfectly, yet Firefox keeps having issues? (Mac user here.)

I believe the default rendering font for Chinese characters should be Heiti SC - I've altered my Firefox' userChrome.css to make the browser use this font when displaying Chinese characters in the tabs (since Chinese characters look uneven, ugly in tabs too, by default), and it works like a charm.

Here's what I use in the userChrome.css (through Stylish):

{font-family: "Lucida Grande", "Heiti SC" !important;}


I wish something like this could be done for text rendering on websites, too. Please, do something about it!
(In reply to Jonathan Kew (:jfkthame) from comment #14)

> It's not clear to me how Firefox should be expected to guess the "best" font
> to use when no appropriate style or language information is available,
> however.

Can we do what Safari and/or IE does in the same situation?
(In reply to Kaixi Luo from comment #15)
> Thanks for the explanation. I deliberately left out any language tags in the
> test cases. Why can't Firefox guess the best font to use if both Chrome and
> IE can?
Because it is not the "best" for Japanese text. As Japanese users, we have sometimes suffered from ugly rendering in IE and Chrome. Because of Han Unification, there is no solution that everybody will be comfortable with. That is why authors should specify the language explicitly.

(In reply to Gen Kanai [:gen] from comment #24)
> Can we do what Safari and/or IE does in the same situation?
Things are not so easy.
(In reply to Masatoshi Kimura [:emk] from comment #25)
> (In reply to Gen Kanai [:gen] from comment #24)
> > Can we do what Safari and/or IE does in the same situation?
> Things are not so easy.

Yeah. As jfkthame said in comment 14, we're using the Accept-Language pref values to decide the priority among CJKT languages. Normally, users either download the localized build for their language or, if they don't use a localized build, customize the Accept-Language pref. In that case, the list gives priority to that language. I guess that you're using English Nightly build on non-CJKT Windows and you have never customized the Accept-Language list. If so, we try Japanese, Korean, Chinese Simplified, Chinese HK, Chinese Taiwan, in this order.
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#1014
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #26)
> I guess that you're using English Nightly build on
> non-CJKT Windows and you have never customized the Accept-Language list.
Per comment #18, Accept-Language settings do not have an effect (and I confirmed).
Our fallback order selection may have a bug.
I found that [Regional and Language Options] > [Formats] effectively overrides the Accept-Language settings. Is this intentional?
Even if it's deliberate, I'm not sure it's correct to obtain the default language from the [Formats] settings for the purpose of font selection. Using [Language for non-Unicode programs] would be more appropriate (and is what other browsers do).
If the font-family property does not list a generic family, one is appended. The appended family is taken from the "font.name.variable.<language>" pref.
http://hg.mozilla.org/mozilla-central/file/8822243a8d6c/layout/style/nsRuleNode.cpp#l2749
nsStyleFont::Init would call nsPresContext::GetLanguageFromCharset() to get the language if the language is not specified on the Web page explicitly.
http://hg.mozilla.org/mozilla-central/file/8822243a8d6c/layout/style/nsStyleStruct.cpp#l158
nsPresContext would get the mLanguage from nsLanguageAtomService::GetLocaleLanguage().
http://hg.mozilla.org/mozilla-central/file/8822243a8d6c/layout/base/nsPresContext.cpp#l1129
nsLanguageAtomService::GetLocaleLanguage() would call nsLocaleService::GetApplicationLocale().
http://hg.mozilla.org/mozilla-central/file/8822243a8d6c/intl/locale/src/nsLanguageAtomService.cpp#l113
On Windows, nsLocaleService would get the application locale from GetUserDefaultLCID() which corresponds to [Formats] settings.
http://hg.mozilla.org/mozilla-central/file/8822243a8d6c/intl/locale/src/nsLocaleService.cpp#l150
If the application locale is not CJK, "font.name.*.<lang>" and "font.name-list.*.<lang>" do not contain CJK fonts, hence the expanded generic family does not either.
In that case, Accept-Language settings do have an effect.
If the application locale is Japanese, "MS PGothic" would have precedence over "SimSun" regardless of Accept-Language settings because "font.name.*.ja" contains the former.
If the application locale is Chinese, "SimSun" would have precedence regardless of Accept-Language settings.
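A simplified model of the precedence traced above may make the three cases easier to compare. Everything here is an assumption for illustration: `DEFAULT_FONT_FOR_LANG` only stands in for the "font.name.*.<lang>" prefs with guessed values, and `effective_font_list` is a hypothetical helper, not a Gecko function.

```python
# Hypothetical stand-ins for the "font.name.*.<lang>" prefs mentioned above;
# the real default values may differ.
DEFAULT_FONT_FOR_LANG = {
    "x-western": "Times New Roman",
    "ja": "MS PGothic",
    "zh-CN": "SimSun",
}


def effective_font_list(author_fonts, page_lang, app_locale_lang, accept_langs):
    """Order in which fonts would be tried, under this simplified model."""
    # An explicit lang on the page wins; otherwise the application locale
    # (the [Formats] setting on Windows) decides which generic font is appended.
    lang = page_lang or app_locale_lang
    fonts = list(author_fonts)
    fonts.append(DEFAULT_FONT_FOR_LANG[lang])
    # Accept-Language only influences what comes after the appended generic font.
    for accepted in accept_langs:
        font = DEFAULT_FONT_FOR_LANG.get(accepted)
        if font is not None and font not in fonts:
            fonts.append(font)
    return fonts


# Japanese [Formats]: MS PGothic precedes SimSun despite zh-CN in Accept-Language.
japanese_formats = effective_font_list(["Arial"], None, "ja", ["zh-CN", "en-US"])
# Non-CJK [Formats]: Accept-Language decides which CJK font comes first.
western_formats = effective_font_list(["Arial"], None, "x-western", ["zh-CN", "en-US"])
# Page tagged lang="zh-CN": the application locale is not consulted.
tagged_page = effective_font_list(["Arial"], "zh-CN", "ja", ["en-US"])
```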
Guys, is there a way Firefox could utilize Heiti SC font to display Chinese characters? This font seems to work best, at least for the Chinese characters in tabs...
(In reply to goldyn chyld from comment #31)
> Guys, is there a way Firefox could utilize Heiti SC font to display Chinese
> characters? This font seems to work best, at least for the Chinese
> characters in tabs...
Please do one of the following:
- Set the system locale to Chinese (But I don't know how to change the system locale on Mac OS X).
- If the system locale is non-CJK and you do not want to change it, add Chinese to the "preferred languages" preference of Firefox.
Ok, wow this seems to have done the trick! I moved Chinese/China [zh-cn] language on top of my preferred languages in Firefox and it looks like the Chinese characters appear even now!

Thanks!
Depends on: 729982
It seems to be working fine now... Would still be nice if it worked this well without changing my default language to Chinese (on Firefox), too...
There's no way to determine whether the character is indeed Chinese because Unicode uses the same code-points for Chinese, Japanese, and Korean Ideographs. Chrome and IE just assume that everything is Chinese if language-info is unavailable.
While it may be possible to guess the language from the content (like universalchardet), it is not an easy task.
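A small check with Python's standard `unicodedata` module illustrates the point about shared code points. The choice of U+76F4 as an example of an ideograph whose preferred glyph shape differs between regional fonts is an assumption made for illustration; the character's Unicode name carries no language information either way.

```python
import unicodedata

# U+76F4 is used as an example ideograph shared by Chinese and Japanese text.
ch = "\u76f4"

code_point = f"U+{ord(ch):04X}"
name = unicodedata.name(ch)

# Characters taken from Japanese and Chinese words alike report the same kind
# of name: a unified ideograph, with no language attached.
names = [unicodedata.name(c) for c in "日本語中文"]

print(code_point, name)
print(names)
```

Whether the text was typed by a Chinese or a Japanese user, the renderer receives the same code point, so any language decision has to come from markup, preferences, or heuristics.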
Why can't Firefox also "just assume that everything is Chinese if language-info is unavailable" then?
I see... So for now the only solution to solve the ugly rendering of Chinese characters is to set Firefox "preferred language" to Chinese?
(In reply to goldyn chyld from comment #38)
> I see... So for now the only solution to solve the ugly rendering of Chinese
> characters is to set Firefox "preferred language" to Chinese?
Unfortunately, yes. Sorry for the inconvenience to Chinese users.

I have one question. Do Chinese users usually use the English version of Firefox? The Chinese version sets the "preferred language" to Chinese by default.
It's more of an inconvenience for us "non-Chinese" users, since we generally don't use the Chinese version of Firefox...
Do "non-Chinese" users usually see Chinese texts?

BTW I recalled another work around.
1. Open about:config.
2. Select New > String from the context menu.
3. Enter "font.name-list.serif.x-western" as the preference name (assuming you usually set "preferred language" to English).
4. Enter "Heiti SC" as the value.
5. Restart Firefox.

This workaround will not work on CJK versions of operating systems until bug 729982 is fixed.
Wow, this worked!! (I'm on a Mac) -- thanks so much! :)
Masatoshi Kimura, I've sent you an e-mail... :)
Confirmed here using English Firefox 22 in Linux Mint 15 Cinnamon.
I've tried @Kimura's suggestion of adding font.name-list but the issue remains.
Must "Heiti SC" match a font name on my system? Is there a more general setting?

When I add lang="ja" or lang="zh" to the <html>, the font rendering is okay.
See the attached pngs.

Again, setting preferred language in preference won't work (bug 729982).
Attached image fail case capture
The font name depends on your Operating System.
Please copy the font name from your Chinese settings. (For example, see the value of "font.name-list.serif.zh-CN" in about:config.)
Thanks for the reply.

I cannot find any other "font.name-list" in my settings.
I tried to use "WenQuanYi Micro Hei" (hinted from FontInfo and font conf) but test1 is still ugly :-(.

http://i74.photobucket.com/albums/i261/leesei/screenshot/Screenshotfrom2013-07-06004552.png
Still experiencing this bug...
Chrome and IE11 work fine.
But Edge has the same problem...
jimbo1qaz sent me an e-mail about this bug. For your reference, this is the current situation (as I replied to him/her):

===

The test cases cover the situation where the markup sets the language of the text to English, but the text contains some CJK characters. Here is how Firefox decides which font to use:

1. The default sans-serif font for Latin scripts is "Arial" on Windows. Obviously, Arial doesn't come with CJK characters. http://searchfox.org/mozilla-central/rev/0079c7adf3b329bff579d3bbe6ac7ba2f6218a19/modules/libpref/init/all.js#3292
2. We do not set font.name-list.sans-serif.x-western on Windows (you may want to verify that on your own about:config), so there is no font to check here. It would be set in the same file (all.js) in the source code.
3. Because you are using an English version of Firefox, your preferred language list (found in Options -> Content -> Languages) should not contain any of the CJK locales, so none will be considered ahead of the others. http://searchfox.org/mozilla-central/rev/0079c7adf3b329bff579d3bbe6ac7ba2f6218a19/gfx/thebes/gfxPlatformFontList.cpp#1060
4. You are using an English version of Firefox, so none of the CJK fonts will be used before the others. http://searchfox.org/mozilla-central/rev/0079c7adf3b329bff579d3bbe6ac7ba2f6218a19/gfx/thebes/gfxPlatformFontList.cpp#1095
5. At this point, Firefox has no way to tell which locale it should render the characters in. It falls back to the old gfx (Netscape?) hard-coded list, which is Japanese, Korean, Chinese - China, Chinese - HK, and Chinese - Taiwan. http://searchfox.org/mozilla-central/rev/0079c7adf3b329bff579d3bbe6ac7ba2f6218a19/gfx/thebes/gfxPlatformFontList.cpp#1113-1118 This is why you see MS PGothic here.

As a user, you can fix that by setting the preferred content language preferences (step 3), or by using a Firefox build for a specific locale (step 4).

===
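Steps 3 to 5 above can be sketched as a small ordering function. This is only an approximation under stated assumptions: `cjk_language_order` is a hypothetical name, language tags are matched exactly (real tag matching is looser), and the hard-coded list simply mirrors the order given in step 5.

```python
# Hard-coded fallback order as described in step 5.
HARDCODED_CJK_ORDER = ["ja", "ko", "zh-CN", "zh-HK", "zh-TW"]


def cjk_language_order(accept_languages, ui_locale):
    """Return the order in which CJK language font prefs would be consulted."""
    order = []

    def add(lang):
        # Only CJK languages matter here, and each is listed once.
        if lang in HARDCODED_CJK_ORDER and lang not in order:
            order.append(lang)

    for lang in accept_languages:     # step 3: preferred content languages
        add(lang)
    add(ui_locale)                    # step 4: locale of the Firefox build
    for lang in HARDCODED_CJK_ORDER:  # step 5: hard-coded remainder
        add(lang)
    return order


english_defaults = cjk_language_order(["en-US", "en"], "en-US")
chinese_preferred = cjk_language_order(["zh-CN", "en-US"], "en-US")
taiwan_build = cjk_language_order(["en-US"], "zh-TW")
```

With an English build and no CJK entry in the preferred languages, the result is the hard-coded order, which puts Japanese first and matches the MS PGothic symptom reported in this bug.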

As I browsed the code, I found this, which uses Arial Unicode MS as the last resort in the system font matching.

http://searchfox.org/mozilla-central/rev/0079c7adf3b329bff579d3bbe6ac7ba2f6218a19/gfx/thebes/gfxWindowsPlatform.cpp#863

This is probably a better default than step 5 described above, given that it's locale-generic. The current step 5 results in mixed fonts in the sans-serif/en test case. Masayuki, what do you think?

(The test case comes from an add-on of mine, which tweaks the font.name-list pref: https://addons.mozilla.org/en-US/firefox/addon/chinese-default-font-tweak/ . It will be broken by Fx57 when we disable classic add-ons, so it probably makes more sense to fix that here.)
Flags: needinfo?(masayuki)
Hmm, I'm surprised that some users actually reach #5. Users who read Chinese characters should set accept-language settings because global companies' web sites check accept-language and send localized pages.

Anyway, we don't have any good resolution for this issue if the content language is different from both the UI locale and the OS locale.

Arial Unicode MS isn't a good solution. It seems that it's no longer maintained. So, some glyphs may be outdated and, in any case, it must have the glyph difference issue between CJKT. Additionally, its glyphs are not good. So, if it had higher priority than the current CJKT fonts listed in font.name-list.*, the rendering result on some web sites would become uglier.

I wonder, aren't you guys seeing a different, recent regression from bug 1346674? Because suddenly some users have added comments since yesterday. If so, please wait a couple of days for the regression to be fixed.
Flags: needinfo?(masayuki)
@masayuki: I think this bug has existed since forever, I just only recently commented and sent timdream an email.

1. I want my pages in English, but still want Chinese pages to render correctly.
2. Chrome renders https://timdream.org/zh-font-tweak/test.html and Google Docs Chinese characters correctly, without me setting any special Chinese settings. If I explicitly change to a Chinese font, Firefox uses it.

Google Translate renders correctly on Chrome through `lang="zh-CN, ja..."` tags: <span id="result_box" class="short_text" lang="ja"><span class="" contenteditable="false" tabindex="-1">小さい</span></span>

Google Docs does not have `lang=""` attributes, and renders incorrectly in Firefox. Apparently Chrome renders CJK characters in SimSun and Japanese-specific characters in Yu Gothic. The same behavior is observed in htmledit.squarefree.com.

On Chrome, if I manually add Japanese into the list of languages, CJK characters are rendered in Yu Gothic.

3. Does changing my language preferences fix font rendering? On Firefox, I added "zh-ch" and "zh" to my font list, but CJK characters are still rendering in MS PGothic (my default Japanese font).
I <s>love</s>hate how Han Unification got us into such messes.
(In reply to jimbo1qaz from comment #53)
> 1. I want my pages in English, but still want Chinese pages to render
> correctly.

I don't understand what "my pages" means. If you meant that an HTML file you wrote should be rendered with the proper language's font, use the lang attribute properly. But it seems that isn't what you were trying to say...

> 2. Chrome renders https://timdream.org/zh-font-tweak/test.html and Google
> Docs Chinese characters correctly, without me setting any special Chinese
> settings.

Yeah, it's a very difficult issue to decide the proper language for rendering web pages and web apps which don't have a lang attribute because they are used worldwide.

> If I explicitly change to a Chinese font, Firefox uses it.

Did you mean that you changed en-US's Serif or Sans-serif font to a Chinese font? If so, that is the expected behavior.

> Google Translate renders correctly on Chrome through `lang="zh-CN, ja..."`
> tags: <span id="result_box" class="short_text" lang="ja"><span class=""
> contenteditable="false" tabindex="-1">小さい</span></span>

If the lang attribute is specified "correctly", the text should be rendered correctly. Otherwise, the rendering engine needs to guess the proper language from other hints. That's very difficult; comment 51 explains our current implementation.

> Google Docs does not have `lang=""` attributes, and renders incorrectly in
> Firefox. Apparently Chrome renders CJK characters in SimSun and
> Japanese-specific characters in Yu Gothic. The same behavior is observed in
> htmledit.squarefree.com.

So, it sounds like you unfortunately reached #5 of comment 51. Which locale's Firefox and OS do you use? If one of them is CJKT, that locale should be used to render Chinese characters when no CJKT language is specified with the lang attribute.

> 3. Does changing my language preferences fix font rendering? On Firefox, I
> added "zh-ch" and "zh" to my font list, but CJK characters are still
> rendering in MS PGothic (my default Japanese font).

When you add zh-CN from [Options] -> [Content] -> [Languages] -> [Choose...] -> [Select a language to add], press the [Add] button, and move it above the other CJKT languages, the Simplified Chinese font should be used unless elements are specified as a JKT language. Default fonts for each language can be specified from the [Advanced...] button of [Fonts & Colors] above [Languages].

If you use localized build for Simplified Chinese, the [Languages] settings should have "zh-CN" at its top. (I've never checked the localized build for Simplified Chinese, though.)
(In reply to Masayuki Nakano [:masayuki] from comment #52)
> Hmm, I'm surprised that some users actually reach #5. Users who read Chinese
> characters should set accept-language settings because global companies' web
> sites check accept-language and send localized pages.
> 

Would it make sense to collect telemetry on how many users reached this case? That will help us make an informed decision on whether or not we should spend time on this.

I think this bug is fixed by bug 1581822 and is customisable by bug 1596875.

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
Severity: normal → S3