Closed
Bug 248792
Opened 20 years ago
Closed 19 years ago
pages with bad "lang=" setting in <html> fails to display national language character for that language
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
EXPIRED
People
(Reporter: acrab001, Unassigned)
References
()
Details
Attachments
(4 files)
User-Agent: Mozilla/5.0 (OS/2; U; Warp 4.5; ko-KR; rv:1.7) Gecko/20040617
Build Identifier: Mozilla/5.0 (OS/2; U; Warp 4.5; ko-KR; rv:1.7) Gecko/20040617

If a page has a malformed "lang=" value in its <html> tag, such as "lang=ko " (several blanks after 'ko'), Mozilla fails to display any of that language's characters. A similar problem occurs in the Mozilla Mail & News reader when it displays mail encoded in UTF-8.

I tested http://google.co.kr/, which has no "lang=" in its <html> tag. After downloading the page and adding "lang=ko " to the <html> tag, I got the same result as above.

Reproducible: Always

Steps to Reproduce:
1. Set country=082 in CONFIG.SYS and set the codepage to 949 in OS/2.
2. Set the primary language of Mozilla Navigator to "ko".
3. View http://kldp.net/ with Mozilla 1.7 for OS/2 (a Korean font must be installed).

Actual Results: Mozilla fails to display Korean characters.
Expected Results: Mozilla should display Korean characters correctly.
Comment 1•20 years ago
Since the problems are caused by bad HTML coding, I'd say this is more of a Tech Evangelism issue than a problem with the browser itself. The lang attribute is supposed to have only the two letter ISO code for the desired country/language, so having spaces in the lang attribute is actually bad HTML coding. If you ever see this problem, please contact the webmaster so that he can correct the error. I tried what you suggested on the Google Web site. There _is_ a slight change whether I put "ko" or "ko " in the lang attribute, Gecko seems to change some subtle things, which I can't quite point out, but no symbols are changed. I'm guessing that if Gecko reads a bad lang attribute, it ignores it altogether. I suggest filing this as Tech Evangelism.
Comment 2•20 years ago
I forgot to mention: since I haven't been able to reproduce the bug, it seems that this is another OS/2-specific bug. Tested with:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040625 Firefox/0.9
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616
Reporter | ||
Comment 3•20 years ago
(In reply to comment #1)
> Since the problems are caused by bad HTML coding, I'd say this is more of a Tech
> Evangelism issue than a problem with the browser itself. The lang attribute is
> supposed to have only the two letter ISO code for the desired country/language,
> so having spaces in the lang attribute is actually bad HTML coding. If you ever
> see this problem, please contact the webmaster so that he can correct the error.
>
> I tried what you suggested on the Google Web site. There _is_ a slight change
> whether I put "ko" or "ko " in the lang attribute, Gecko seems to change some
> subtle things, which I can't quite point out, but no symbols are changed. I'm
> guessing that if Gecko reads a bad lang attribute, it ignores it altogether.
>
> I suggest filing this as Tech Evangelism.

I think that is a matter of how "letter" is defined (when you say "only the two letter ISO code for the desired country/language"): whether or not it includes whitespace characters. Also, I don't know what you mean by Tech Evangelism.
Reporter | ||
Comment 4•20 years ago
(In reply to comment #2)
Not displaying national characters with a bad "lang=" is an OS/2-specific bug, but there is still an issue on other OSes too. With a bad "lang=", Mozilla uses two fonts, one for each charset, but with a proper "lang=" it uses only one font (the national language font) for both. (The "Allow documents to use other fonts" option should be off.) So in the first case Mozilla needs a font association function, provided either by the OS or by Mozilla itself, and I don't know which one Mozilla uses. If it is the second (Mozilla's own font association scheme), then I think something can be done.
Reporter | ||
Comment 5•20 years ago
(In reply to comment #1)
By the way, where can I find the spec for this "lang=" attribute? (I mean the two-letter ISO code requirement.)
Comment 6•20 years ago
http://www.w3.org/TR/html4/struct/dirlang.html#h-8.1
The above is the part about the lang attribute in W3C's HTML 4.01 specification. The section right below it (8.1.1) describes the syntax of language codes. It seems that the two-letter code is not the only form allowed, contrary to what I said before. Here's the important part as far as you're concerned, though, in section 6.8 of the Basic HTML data types chapter:
http://www.w3.org/TR/html4/types.html#type-langcode
It explicitly states that "whitespace is not allowed within the language-code".
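For illustration, the rule quoted above can be checked mechanically. This is a minimal Python sketch, assuming the RFC 1766 grammar that HTML 4.01 refers to for language codes (a primary tag of 1 to 8 letters, followed by optional "-"-separated subtags of 1 to 8 letters). The function name `is_valid_lang_code` is made up for this example and is not part of Gecko or any library.

```python
import re

# Assumed grammar (RFC 1766): 1-8 ASCII letters, then zero or more
# "-" separated subtags of 1-8 ASCII letters. No whitespace anywhere.
_LANG_CODE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_valid_lang_code(value: str) -> bool:
    """Return True only if the whole attribute value is a well-formed code."""
    return _LANG_CODE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["ko", "ko   ", "en-US", ""]:
        print(repr(candidate), is_valid_lang_code(candidate))
```

With this check, "ko" and "en-US" pass, while "ko" followed by blanks is rejected because `fullmatch` requires the entire value to fit the grammar.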
Reporter | ||
Comment 7•20 years ago
(In reply to comment #6)
I've got that. So this is a problem with the web site. But I still wonder which font Mozilla uses in this case.
Reporter | ||
Comment 8•20 years ago
When a bad lang= attribute is used, it looks like Mozilla uses lang="en" instead, right?
Reporter | ||
Comment 9•20 years ago
Reporter | ||
Comment 10•20 years ago
If Mozilla uses lang="en" in this case, it would be better to treat the page as if no lang attribute were specified. Is this (possible) solution still a Tech Evangelism matter?
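The proposal above could be sketched roughly as follows. This is not how Gecko resolves the language; it is a hypothetical Python model in which a malformed lang value is treated exactly like a missing one, so the choice falls back to the user's primary language. The names `effective_lang` and `pick_font_language` are invented for this sketch.

```python
from typing import Optional


def effective_lang(raw: Optional[str]) -> Optional[str]:
    """Return a usable language code, or None if the attribute should be ignored."""
    if raw is None:
        return None
    # Assumed well-formedness rule: "-" separated subtags of 1-8 ASCII letters.
    parts = raw.split("-")
    well_formed = all(
        1 <= len(part) <= 8 and part.isascii() and part.isalpha() for part in parts
    )
    return raw.lower() if well_formed else None


def pick_font_language(raw_lang: Optional[str], user_primary_lang: str) -> str:
    """Hypothetical policy: a bad lang behaves like no lang at all."""
    return effective_lang(raw_lang) or user_primary_lang


if __name__ == "__main__":
    print(pick_font_language("ko   ", "ko"))  # malformed value is ignored
    print(pick_font_language(None, "ko"))     # missing value, same result
    print(pick_font_language("ja", "ko"))     # valid value is honoured
```

Under this policy a page with "lang=ko" plus trailing blanks would be rendered the same way as a page with no lang attribute, rather than being treated as a Western page.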
Reporter | ||
Comment 11•20 years ago
Same statement as http://bugzilla.mozilla.org/show_bug.cgi?id=248790#c9
If something is wrong, please correct me.
Reporter | ||
Updated•20 years ago
Component: Browser-General → Internationalization
Comment 12•20 years ago
This is both a tech-evangel and a mozilla 'bug'. Mozilla should cope better with a bad lang specification. Anyway, why don't you write to the KLDP admin. to fix their problem?
Reporter | ||
Comment 13•20 years ago
I have installed the Innotek Font Engine, and now get a better view of http://kldp.net . But some characters are still missing (are they too small, or bold faced?). I can't judge whether the real problem is in OS/2 or in Mozilla, but I think Mozilla should be able to handle this better. I have consulted Ko Myung Hun (who made a patch for FT/2) about this matter. He said that it is an OS/2 problem and that he can make a patch for this specific case, but that the patch would break the OS/2 system's font association scheme. He also said that Mozilla would be able to provide a better solution for this.
Reporter | ||
Comment 14•20 years ago
If I remove fake bold support from FreeType/2, bold-faced characters are displayed in the normal face with the Innotek font engine. Something goes wrong between my fake bold patch and Innotek's font engine. Eeeeh!
Comment 15•20 years ago
What do you have specified as your unicode font? In the case of the bad lang, we treat the page as western, and all characters above FF are displayed using the unicode font in preferences.
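The behaviour described above can be pictured with a small model. This is a simplified Python sketch, not Gecko code: it assumes only what the comment states, namely that on a page treated as Western, characters up to U+00FF use the Western font and everything above uses the Unicode font from preferences. The font labels "western" and "unicode" and the function names are placeholders.

```python
from itertools import groupby
from typing import List, Tuple


def font_for(ch: str, western_font: str, unicode_font: str) -> str:
    """Pick a font for one character under the 'treated as Western' rule."""
    return western_font if ord(ch) <= 0xFF else unicode_font


def font_runs(
    text: str, western_font: str = "western", unicode_font: str = "unicode"
) -> List[Tuple[str, str]]:
    """Split text into consecutive runs that share the same font."""
    keyed = groupby(text, key=lambda ch: font_for(ch, western_font, unicode_font))
    return [(font, "".join(chars)) for font, chars in keyed]


if __name__ == "__main__":
    # Hangul syllables are far above U+00FF, so they land in the Unicode font.
    for font, run in font_runs("abc \ud55c\uae00!"):
        print(font, repr(run))
```

If the configured Unicode font cannot actually supply the Korean glyphs, every run assigned to it comes out blank or broken, which matches the symptom in the original report.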
Reporter | ||
Comment 16•20 years ago
(In reply to comment #15)
Now I have the picture of what caused this. The CJK font fails in this case because FreeType/2 registers a CJK Unicode font with pifi->szGlyphlistName = "PMCHT", "PMJPN" or "PMKOR", not "UNICODE". If I change FT/2 to use "UNICODE" instead of "PMxxx", I can see all the characters correctly.

Problem solved? NO!!! The font association scheme of OS/2 (OS2.INI->PM_SystemFonts->PM_AssociateFont) does not allow an associated font whose encoding differs from the base font's encoding. If I use Helv as the base font and a "UNICODE" (not "PMKOR") Gulim as the associated font, Gulim is displayed broken. So FT/2 has a problem on both sides.

Of course there is the Innotek font engine, which ignores the OS/2 font drivers and uses its own font rendering, but it has problems too. First, as I stated in comment #13, it uses FT/2's font rendering for DBCS bold face, so no characters are shown for DBCS bold face in this case. Second, some Korean characters are missing on some web pages (e.g. http://news.msn.co.kr/service/msnnews/ShellView.asp?ArticleID=2004071311380550004&LinkID=102 ) with Mozilla 1.7 (Moz 1.7a is fine though, strangely). So I couldn't find a flawless solution for this case.

Using the Unicode font for characters above 0xFF in Western pages could also be a problem. For this to work, the Western font must be one with szGlyphlistName="UNICODE", but again, a "UNICODE" font breaks the font association scheme described above; in this case the base font is "UNICODE" and the associated font is "PMxxx". But using "PMUGL" instead of "UNICODE" causes broken Unicode characters in the Western encoding and broken Cyrillic characters in the Cyrillic encoding. So this can be quite a problem, for DBCS users only. (For SBCS users it is not a problem at all, because they don't need the font association scheme.)

As a possible solution, I'm trying to fix FT/2 to register a Unicode TrueType font under both "UNICODE" and "PMxxx", each with a different font family name. But I'm not certain this will work.

P.S. to mkaply: Can you help me find the IFI (Intelligent Font Interface) specification document? I've searched the internet, but have only found the IFI header files in the FT/2 source. I think IBM is the only source.
Comment 17•19 years ago
This is an automated message, with ID "auto-resolve01". This bug has had no comments for a long time. Statistically, we have found that bug reports that have not been confirmed by a second user after three months are highly unlikely to be the source of a fix to the code. While your input is very important to us, our resources are limited and so we are asking for your help in focussing our efforts. If you can still reproduce this problem in the latest version of the product (see below for how to obtain a copy) or, for feature requests, if it's not present in the latest version and you still believe we should implement it, please visit the URL of this bug (given at the top of this mail) and add a comment to that effect, giving more reproduction information if you have it. If it is not a problem any longer, you need take no action. If this bug is not changed in any way in the next two weeks, it will be automatically resolved. Thank you for your help in this matter. The latest beta releases can be obtained from: Firefox: http://www.mozilla.org/projects/firefox/ Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html Seamonkey: http://www.mozilla.org/projects/seamonkey/
Comment 18•19 years ago
This bug has been automatically resolved after a period of inactivity (see above comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED