Open Bug 793881 Opened 12 years ago Updated 2 years ago

Komi-Permyak language (koi) is confused for Korean (ko)

Categories

(Core :: Layout: Text and Fonts, defect)

x86_64
Linux
defect

People

(Reporter: amir.aharoni, Unassigned)

References

(Depends on 1 open bug)

Attachments

(1 file)

The Komi-Permyak language has the language code "koi". If the lang attribute of an element is set to this code, the element is rendered with a font chosen as if the language were Korean ("ko"). Komi-Permyak is written in the Cyrillic script, so forcing a Korean font on it is weird and wrong.

This doesn't happen in Chromium.

This affects, among other sites, the Wikipedia in that language ( https://koi.wikipedia.org ), and all other Wikipedias that link to it.
I don't understand what is happening here: the equivalent error doesn't seem to occur for other three-letter language codes.
Component: Internationalization → Layout: Text
Hmm, I also can't reproduce if I select an explicit font for "Other languages" in fonts prefs, rather than the default "serif", "sans-serif" etc.
I think this may be a fontconfig issue. On my Linux VM, it looks like fontconfig is defaulting to a Korean font for language codes it doesn't recognize, as well as for some (but not all) languages that would be expected to use Cyrillic.

jkew@jkew-vb:~$ fc-match 
DejaVuSans.ttf: "DejaVu Sans" "Book"

jkew@jkew-vb:~$ fc-match :lang=en
DejaVuSans.ttf: "DejaVu Sans" "Book"

jkew@jkew-vb:~$ fc-match :lang=ja
fonts-japanese-gothic.ttf: "TakaoPGothic" "Regular"

jkew@jkew-vb:~$ fc-match :lang=zh
wqy-microhei.ttc: "文泉驛微米黑" "Regular"

jkew@jkew-vb:~$ fc-match :lang=ko
NanumGothic.ttf: "NanumGothic" "Regular"

jkew@jkew-vb:~$ fc-match :lang=uk
DejaVuSans.ttf: "DejaVu Sans" "Book"    # seems reasonable for Ukrainian

jkew@jkew-vb:~$ fc-match :lang=ru
NanumGothic.ttf: "NanumGothic" "Regular"    # surprising

jkew@jkew-vb:~$ fc-match :lang=koi
NanumGothic.ttf: "NanumGothic" "Regular"    # maybe FC doesn't recognize "koi"?

jkew@jkew-vb:~$ fc-match :lang=xxx
NanumGothic.ttf: "NanumGothic" "Regular"
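
To make the pattern in the output above easier to check on other machines, here is a minimal Python sketch. It is not part of fontconfig or Firefox, and all function names are made up for illustration. It assumes fc-match prints lines of the form file: "family" "style", as in the session above, and it lists the language codes whose match is the same family as a deliberately bogus code such as "xxx".

```python
import re
import subprocess

# Lines copied from the fc-match session above, keyed by the lang that was queried.
SAMPLE = {
    "en": 'DejaVuSans.ttf: "DejaVu Sans" "Book"',
    "ja": 'fonts-japanese-gothic.ttf: "TakaoPGothic" "Regular"',
    "zh": 'wqy-microhei.ttc: "文泉驛微米黑" "Regular"',
    "ko": 'NanumGothic.ttf: "NanumGothic" "Regular"',
    "uk": 'DejaVuSans.ttf: "DejaVu Sans" "Book"',
    "ru": 'NanumGothic.ttf: "NanumGothic" "Regular"',
    "koi": 'NanumGothic.ttf: "NanumGothic" "Regular"',
    "xxx": 'NanumGothic.ttf: "NanumGothic" "Regular"',
}

# Assumed output shape: <file>: "<family>" "<style>"
LINE_RE = re.compile(r'^[^:]+: "([^"]*)" "([^"]*)"')


def family_of(line):
    """Extract the family name from one line of fc-match output."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unexpected fc-match output: {line!r}")
    return match.group(1)


def query_fc_match(lang):
    """Run `fc-match :lang=<lang>` and return its first output line.

    Only works where fontconfig is installed; not called in the demo below.
    """
    out = subprocess.run(
        ["fc-match", f":lang={lang}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0]


def same_as_unknown(results, probe="xxx"):
    """Return the langs (other than probe) that got the same family as probe."""
    fallback = family_of(results[probe])
    return sorted(
        lang for lang, line in results.items()
        if lang != probe and family_of(line) == fallback
    )


print(same_as_unknown(SAMPLE))  # ['ko', 'koi', 'ru']
```

With the sample data from this VM, ru and koi land in the same bucket as the unrecognized code, which is consistent with a default fallback rather than with a "koi" to "ko" truncation. To test a live system, build the dictionary with query_fc_match instead of SAMPLE.
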
Oh BTW, forgot to mention that I'm trying this on Linux. On other systems the behavior may be different.

There is a somewhat similar bug in the Android OS: If you ask to see the interface of your device in the Sakha language (sah), you'll get it in Sanskrit (sa) instead. I guess that in both cases somebody was checking just the first two letters instead of parsing the language code properly.
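
The kind of mistake I have in mind can be sketched as follows. This is a hypothetical illustration in Python, not code taken from Android, Firefox, or fontconfig; KNOWN_LANGS and both function names are invented for the example.

```python
# Hypothetical table of languages the system has data for.
KNOWN_LANGS = {"ko", "sa", "en"}


def naive_lookup(tag):
    """Buggy: looks only at the first two letters of the tag."""
    prefix = tag[:2].lower()
    return prefix if prefix in KNOWN_LANGS else None


def strict_lookup(tag):
    """Compares the whole primary subtag (the part before the first hyphen)."""
    primary = tag.split("-", 1)[0].lower()
    return primary if primary in KNOWN_LANGS else None


for tag in ("koi", "sah", "ko-KR"):
    print(tag, naive_lookup(tag), strict_lookup(tag))
# koi ko None
# sah sa None
# ko-KR ko ko
```

The naive version maps "koi" to Korean and "sah" to Sanskrit, while the strict version reports both as unknown and still resolves "ko-KR" to "ko".
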
(In reply to Amir Aharoni from comment #4)
> I guess that in both cases somebody was checking just
> the first two letters instead of parsing the language code properly.

That was my first assumption too, but from my and Jonathan's test results, it looks as if something different is going on here, and that this is probably a fontconfig bug.
Bug 835074 may fix this, since it assigns Cyrillic script to koi and a bunch of other languages that were previously undefined, though judging by comment 3 the problem may still occur with other languages.
Depends on: 835074
Severity: normal → S3