Closed Bug 343454 Opened 14 years ago Closed 14 years ago

CJK problem on font switching

Categories

(Core :: Graphics, defect)

x86
Windows 2000
defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: masayuki, Assigned: masayuki)

References

Details

(Keywords: intl)

Attachments

(2 files, 5 obsolete files)

Now, |LangGroupFromUnicodeRange| returns nsnull if the unicode range is kRangeSetCJK. Therefore, current font switching is not usable for CJK if the document is not legacy encoding on these locales.

I think that Stuart's comment in GetNextFont of gfxWindowsFonts.cpp is right. But it's very ad-hock hack. Because by the logic, Japanese fonts are always preferred than CK fonts. I think we should honor the language list("intl.accept_languages").
You mean this comment ? :
http://lxr.mozilla.org/seamonkey/source/gfx/thebes/src/gfxWindowsFonts.cpp#10031003 
#if 0 // the old code does this -- i think it is sort of a hack

It seems indeed the clean solution is that LangGroupFromUnicodeRange returns the language of the current locale in CJK locale, and the easy good approximation of that is "intl.accept_languages".
Maybe the better solution is to go down the "intl.accept_languages" list until you find a CJK language, it would work well on *my* computer :-)
Blocks: 324560
(In reply to comment #1)
> It seems indeed the clean solution is that LangGroupFromUnicodeRange returns
> the language of the current locale in CJK locale, and the easy good
> approximation of that is "intl.accept_languages".

We cannot change |LangGroupFromUnicodeRange|, because very many Chinese characters are used in all country of CJK. So, we need a decision like human. (e.g, from context, from the meaning...) Of course, it is impossible.

I think that the users who can look CJK page should add the language for the settings.(If not so, they cannot access to some sites.)

I think that non-listed languages in the setting should be added like old gfx code after the listing.

I'll create the patch.
In the old Gfx:Win, this code path was taken only when we ran out of methods to determine langGroup. I assume the same is the case in the new code.

(In reply to comment #2)
> (In reply to comment #1)
> > It seems indeed the clean solution is that LangGroupFromUnicodeRange returns
> > the language of the current locale in CJK locale, and the easy good
> > approximation of that is "intl.accept_languages".

As long as that's the case, this can be done without any worry, can't it?
 

> I think that the users who can look CJK page should add the language for the
> settings.(If not so, they cannot access to some sites.)

  You meant sites in UTF-8 (or other Unicode encodings) without lang/xml:lang explictly specified, didn't you? 
Attached patch Patch rv1.0 (obsolete) — Splinter Review
Thank you, Jungshik. I adding ACP checking like old gfx code.

I'll test this patch soon. If you have a suggestion, please tell me.
Assignee: nobody → masayuki
Status: NEW → ASSIGNED
Comment on attachment 228183 [details] [diff] [review]
Patch rv1.0

please don't add users of nsIPref, it's deprecated
Attachment #228183 - Flags: review-
Comment on attachment 228183 [details] [diff] [review]
Patch rv1.0

>+        } else if (!mTriedLangAttrFonts) {
>+            mTriedLangAttrFonts = PR_TRUE;
>+
>+            nsCAutoString langAttr(mGroup->GetStyle()->langGroup);
>+            if (!langAttr.IsEmpty()) {
>+                if (PR_LOG_TEST(gFontLog, PR_LOG_DEBUG))
>+                    PR_LOG(gFontLog, PR_LOG_DEBUG, ("Trying to find fonts for: %s", langAttr.get()));
>+                AppendPrefFonts(langAttr.get());
>+            }
>+            goto TRY_AGAIN_HOPE_FOR_THE_BEST_2;

This code isn't needed as we already use the langGroup to fill in generic fonts (and we append a generic font if there isn't one in the font list).


>             /* first check with the script properties to see what they think */
>             const SCRIPT_PROPERTIES *sp = ScriptProperties();
>             WORD primaryId = PRIMARYLANGID(sp->langid);
>             WORD subId = SUBLANGID(sp->langid);
>-            if (!sp->fAmbiguousCharSet) {
>+            if (primaryId && !sp->fAmbiguousCharSet) {
>                 const char *langGroup = gScriptToText[primaryId].langCode;
>+                // Note that we should not trust if the langGroup is CJK.
>                 if (langGroup) {
>                     if (PR_LOG_TEST(gFontLog, PR_LOG_DEBUG))
>                         PR_LOG(gFontLog, PR_LOG_DEBUG, ("Trying to find fonts for: %s (%s)", langGroup, gScriptToText[primaryId].value));
>-
>-                    platform->GetPrefFonts(langGroup, fonts);
>-                    if (!fonts.IsEmpty()) {
>-                        const nsACString& lg = (langGroup) ? nsDependentCString(langGroup) : EmptyCString();
>-                        gfxFontGroup::ForEachFont(fonts, lg, UniscribeItem::AddFontCallback, this);
>-                    }
>-                } else if (primaryId != 0) {
>+                    if (IsCJKLangGroup(langGroup))
>+                        AppendCJKPrefFonts();
>+                    else
>+                        AppendPrefFonts(langGroup);
>+                } else {

This code isn't needed either since fAmbiguousCharSet should always be TRUE for CJK strings.


>             mTriedOtherFonts = PR_TRUE;
>-            nsString fonts;
>+
>             gfxWindowsPlatform *platform = gfxWindowsPlatform::GetPlatform();
>+            nsString fonts;
> 
no point in this change.


>+
>+    PRBool IsCJKLangGroup(const char *aLangGroup) {
>+        return !strcmp(aLangGroup, "ja")    || !strcmp(aLangGroup, "ko") ||
>+               !strcmp(aLangGroup, "zh-CN") || !strcmp(aLangGroup, "zh-HK") ||
>+               !strcmp(aLangGroup, "zh-TW");
>+    }
>+
>+    void AppendPrefFonts(const char *aLangGroup) {
>+        NS_ASSERTION(aLangGroup, "aLangGroup is null");
>+        gfxWindowsPlatform *platform = gfxWindowsPlatform::GetPlatform();
>+        nsString fonts;
>+        platform->GetPrefFonts(aLangGroup, fonts);
>+        if (fonts.IsEmpty())
>+            return;
>+        nsCAutoString lg(aLangGroup);
>+        gfxFontGroup::ForEachFont(fonts, lg, UniscribeItem::AddFontCallback, this);
>+        return;
>+   }
>+

nsCAutoString lg(aLangGroup); should just be nsDependentCString(aLangGroup) -- no point in copying


>+   void AppendCJKPrefFonts() {
>+       nsCOMPtr<nsIPref> prefs(do_GetService(NS_PREF_CONTRACTID));
>+       if (!prefs)
....
>+       AppendPrefFonts("zh-CN");
>+       AppendPrefFonts("zh_HK");
>+       AppendPrefFonts("zh_TW");
>+    }

I don't really like this function, but not sure how else to write it...
Attachment #228183 - Flags: review-
Attached patch Patch rv2.0 (obsolete) — Splinter Review
fixing some nits.
Attachment #228183 - Attachment is obsolete: true
(In reply to comment #6)
> (From update of attachment 228183 [details] [diff] [review] [edit])
> >+        } else if (!mTriedLangAttrFonts) {
> >+            mTriedLangAttrFonts = PR_TRUE;
> >+
> >+            nsCAutoString langAttr(mGroup->GetStyle()->langGroup);
> >+            if (!langAttr.IsEmpty()) {
> >+                if (PR_LOG_TEST(gFontLog, PR_LOG_DEBUG))
> >+                    PR_LOG(gFontLog, PR_LOG_DEBUG, ("Trying to find fonts for: %s", langAttr.get()));
> >+                AppendPrefFonts(langAttr.get());
> >+            }
> >+            goto TRY_AGAIN_HOPE_FOR_THE_BEST_2;
> 
> This code isn't needed as we already use the langGroup to fill in generic fonts
> (and we append a generic font if there isn't one in the font list).

Yeah, but you have a wrong. The current code appends only "font.name.". So, we should add to append "font.name-list." in that time. (My latest patch has it.)

> 
> 
> >             /* first check with the script properties to see what they think */
> >             const SCRIPT_PROPERTIES *sp = ScriptProperties();
> >             WORD primaryId = PRIMARYLANGID(sp->langid);
> >             WORD subId = SUBLANGID(sp->langid);
> >-            if (!sp->fAmbiguousCharSet) {
> >+            if (primaryId && !sp->fAmbiguousCharSet) {
> >                 const char *langGroup = gScriptToText[primaryId].langCode;
> >+                // Note that we should not trust if the langGroup is CJK.
> >                 if (langGroup) {
> >                     if (PR_LOG_TEST(gFontLog, PR_LOG_DEBUG))
> >                         PR_LOG(gFontLog, PR_LOG_DEBUG, ("Trying to find fonts for: %s (%s)", langGroup, gScriptToText[primaryId].value));
> >-
> >-                    platform->GetPrefFonts(langGroup, fonts);
> >-                    if (!fonts.IsEmpty()) {
> >-                        const nsACString& lg = (langGroup) ? nsDependentCString(langGroup) : EmptyCString();
> >-                        gfxFontGroup::ForEachFont(fonts, lg, UniscribeItem::AddFontCallback, this);
> >-                    }
> >-                } else if (primaryId != 0) {
> >+                    if (IsCJKLangGroup(langGroup))
> >+                        AppendCJKPrefFonts();
> >+                    else
> >+                        AppendPrefFonts(langGroup);
> >+                } else {
> 
> This code isn't needed either since fAmbiguousCharSet should always be TRUE for
> CJK strings.

But you reset the script to 0 in |DisableShaping|. My latest patch backup the value of the script. But that is always English on Win2k-ja/WinXP-ja...(The primary id returns 9 always.) We cannot trust the script on my environments. Do they return other languages in your enviroments? (or, when the script is complex?)

And I still need to test the patch.
Attached patch Patch rv2.1 (obsolete) — Splinter Review
Let's go this, I cannot find any regressions.

CJK and many languages returns primaryId is English(9) and sp->fAmbiguousCharSet is true, but some languages returns correct primaryId and sp->fAmbiguousCharSet is false. I think that we need to backup the original script that is not reset by DisableShaping.

Please re-review this.
Attachment #228280 - Attachment is obsolete: true
Attachment #228291 - Flags: review?(pavlov)
(In reply to comment #6)
> >+   void AppendCJKPrefFonts() {
> >+       nsCOMPtr<nsIPref> prefs(do_GetService(NS_PREF_CONTRACTID));
> >+       if (!prefs)
> ....
> >+       AppendPrefFonts("zh-CN");
> >+       AppendPrefFonts("zh_HK");
> >+       AppendPrefFonts("zh_TW");
> >+    }
> 
> I don't really like this function, but not sure how else to write it...

I don't like this too. The loop is very low level processing. But if we use nsCStringArray::ParseString, that will be more simple and readable. But I think that it is not good for GFX. (E.g., performance and heap fragments)

And AppendPrefFonts should be called only once per langGroup. But GetNextFont has same problem. We'll need more work for the optimization.
Stuart:

Would you re-review the latest patch?
I need the XP part of the patch in bug 339513.
Attached patch Patch rv2.2 (obsolete) — Splinter Review
updating for latest trunk.

Sturat:

Would you hurry up to review this?
We have serious perf problem on some sites by this bug.
# including Spread Firefox :-(
Attachment #228291 - Attachment is obsolete: true
Attachment #233080 - Flags: review?(pavlov)
Attachment #228291 - Flags: review?(pavlov)
Attached patch v2.5 (obsolete) — Splinter Review
this is basically your patch with a few unneeded pieces removed (mScript), variables ordered in the right way in the constructor and fixed logic so i don't get more than one looking for CJK font messages, shifted the UniscribeItem methods to the bottom.. probably a few other things but nothing really important. r+ from me on this.  r? from you on my changes
Attachment #233080 - Attachment is obsolete: true
Attachment #234120 - Flags: review?
Attachment #233080 - Flags: review?(pavlov)
Comment on attachment 234120 [details] [diff] [review]
v2.5

sorry this has taken so long for me to get to.
Attachment #234120 - Flags: review? → review?(masayuki)
+reordering a variable in UniscribeItem so we can get 4 packedbools in a row for space wins!
Attachment #234120 - Attachment is obsolete: true
Attachment #234121 - Flags: review?
Attachment #234120 - Flags: review?(masayuki)
Attachment #234121 - Flags: review? → review?(masayuki)
Comment on attachment 234121 [details] [diff] [review]
v2.50000000000001

thanks.
Attachment #234121 - Flags: review?(masayuki) → review+
checked-in.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Oops. Sorry. The tree is closed. I was backed-out the patch.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
re-checked-in to trunk.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Wow, sorry. This bug isn't fixed.
We need |mScript| changing for CJK.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Hey, Stuart. We need mScript for CJK font switching. Because we are checking |SCRIPT_PROPERTIES.fAmbiguousCharSet| in |GetNextFont|. But it returns the value of |SCRIPT_UNDEFINED| script that is set by |DisableShaping|.

Current build is not fixed this bug. But I confirmed this patch fixes this bug.
# Sorry for the wrong review.
Attachment #234386 - Flags: review?(pavlov)
Attachment #234386 - Flags: review?(pavlov) → review+
checked-in.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.