The URL in the status bar is displayed as garbage if non-ASCII characters are used in both the domain name and the name attribute:
http://<non-ASCII domain name>/index.html#<non-ASCII name attribute>
The non-ASCII characters in the name attribute are displayed correctly, but those in the domain name aren't. Original report in Bugzilla-jp: http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=3513
What should be http://賢明.jp/ent_exam/ent_exam.html#メニュー is displayed as http://莖∽??.jp/ent_exam/ent_exam.html#メニュー
With a debug build, I got a few assertions in xpcconvert.cpp and nsUTF8Utils.h, so there is a conversion problem somewhere. My guess is that the first four bytes of 賢明 (in UTF-8) decode to 莖∽ when read as EUC-JP, and the remaining two bytes don't form a valid character in EUC-JP, so they're turned into question marks. I'll check this out when I'm on Linux. (I can do it now, but it's a little cumbersome.) This probably happens because somewhere we have a URI with the host address in UTF-8 and the path part in EUC-JP. Given this URI, we try to convert it to Unicode (UTF8 or UTF-16) as if the whole URI is in EUC-JP (originCharset).
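The guess above can be checked outside Mozilla with a minimal Python sketch. It is only an illustration, not the code path in question: it assumes Python's `euc_jp` codec maps these bytes the same way Mozilla's converter does, and it shows replacement characters where the status bar showed question marks.

```python
# Illustration only: misread the UTF-8 bytes of the host label as EUC-JP.
host = "賢明"
utf8_bytes = host.encode("utf-8")  # six bytes, three per character

# Bytes that do not form a valid EUC-JP sequence become U+FFFD.
garbled = utf8_bytes.decode("euc_jp", errors="replace")

print(utf8_bytes.hex())
print(garbled)
```

The first four bytes form two valid EUC-JP double-byte characters, and the last two do not, which matches the shape of the garbage reported above.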
OS: Linux → All
Hardware: PC → All
> UTF-8 and the path part in EUC-JP. Given this URI, we try to convert it to
> Unicode (UTF8 or UTF-16) as if the whole URI is in EUC-JP (originCharset)
We only try this conversion when a given URI spec is NOT valid UTF-8. With UTF-8 in the host address part and EUC-JP in the path part, the spec is not valid UTF-8 as a whole, so we assume the whole URI spec is in EUC-JP. Therefore, this problem doesn't occur if the host part is in UTF-8 and the path part is ASCII-only. To see that, try http://bugzilla.mozilla.gr.jp/attachment.cgi?id=1954 (quoted in bug 229546).
Darin, can I assume that the host part of _any_ URI is _always_ in UTF-8? Then, I can fix this in nsISubTextURI (?). However, that wouldn't be pretty.
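The behavior described in this comment can be sketched in a few lines of Python. This is a hedged model, not the Mozilla implementation: `to_display` is a hypothetical name, and the only rule it encodes is "use UTF-8 if the whole spec is valid UTF-8, otherwise decode the whole spec with the origin charset".

```python
def to_display(spec: bytes, origin_charset: str) -> str:
    """Hypothetical model: all-or-nothing charset choice for a URI spec."""
    try:
        return spec.decode("utf-8")
    except UnicodeDecodeError:
        # The whole spec, host included, is treated as origin_charset.
        return spec.decode(origin_charset, errors="replace")

host = "賢明.jp".encode("utf-8")
path = b"/ent_exam/ent_exam.html"
fragment = "#メニュー".encode("euc_jp")

# UTF-8 host + ASCII path: valid UTF-8 as a whole, so the host survives.
ok = to_display(b"http://" + host + path, "euc_jp")

# UTF-8 host + EUC-JP fragment: not valid UTF-8, so the host is misread.
bad = to_display(b"http://" + host + path + fragment, "euc_jp")

print(ok)
print(bad)
```

In the second case the fragment comes out right and the host is garbled, which is the symptom in the original report.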
Assignee: smontagu → jshin
Summary: IDN: URL in status bar is displayed as garbage → IDN: URL in status bar is displayed as garbage if the path part has non-ASCII characters in non-UTF-8 encoding
I cannot reproduce this with 2004050304-trunk/WinXP. WORKSFORME?
Sorry... Reproduced with 2004050304-trunk/Win98, 20040503-trunk(Firefox)/Win98, 20040503-trunk(Firefox)/WinXP.
> Darin, can I assume that the host part of _any_ URI is _always_ in UTF-8? Then,
> I can fix this in nsISubTextURI (?). However, that wouldn't be pretty.
nsIURI::host is always encoded using UTF-8.
Created attachment 171172: patch
This fixes bug 229546 as well and can also be used for fixing bug 200150.
I've got a slightly more robust patch. This should be fixed before 1.8beta.
Created attachment 171508: patch
Asking for review.
Comment on attachment 171508: patch
This would have been easier to review with more context, by the way.
Attachment #171508 - Flags: review?(smontagu) → review+
Created attachment 171515: patch with more context
Thanks for the review, and sorry for too little context. I was just too lazy to get rid of another patch nearby (for bug 244754) and took a shortcut by omitting the '-u' option.
Comment on attachment 171515: patch with more context
>Index: intl/uconv/src/nsTextToSubURI.cpp
>+ nsCOMPtr<nsIURLParser> urlParser;
>+ // should we just use net_GetStdURLParser()?
>+ urlParser = do_GetService(NS_STDURLPARSER_CONTRACTID, &rv);
>+ NS_ENSURE_SUCCESS(rv, rv);
net_GetStdURLParser is an internal Necko method. Since this code is not part of the Necko DLL, it cannot use it. How do you know that this is the correct nsIURLParser instance for the given URI? I don't think you can know that it is. What if the given URI scheme does not support an authority section, but would erroneously be parsed as having one by the STDURLPARSER? I think you should use nsIIOService::newURI instead to construct an nsIURI. Then call GetHost and check that instead.
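For readers following the idea behind the fix, here is a rough Python sketch of decoding the host separately from the rest of the spec. It is an assumption-laden illustration, not the patch: `split_host`, `decode_part`, and `to_display_host_aware` are hypothetical names, and the naive `scheme://host` split is exactly the kind of parsing this review warns against (it ignores userinfo, ports, and schemes without an authority). The real code would get the host from an nsIURI instead.

```python
def split_host(spec: bytes):
    """Hypothetical, naive split of scheme://host/rest into three parts."""
    marker = spec.find(b"://")
    if marker < 0:
        return b"", b"", spec  # no authority section recognized
    start = marker + 3
    ends = [i for i in (spec.find(c, start) for c in (b"/", b"?", b"#")) if i >= 0]
    end = min(ends) if ends else len(spec)
    return spec[:start], spec[start:end], spec[end:]

def decode_part(part: bytes, origin_charset: str) -> str:
    """UTF-8 if valid, otherwise fall back to the origin charset."""
    try:
        return part.decode("utf-8")
    except UnicodeDecodeError:
        return part.decode(origin_charset, errors="replace")

def to_display_host_aware(spec: bytes, origin_charset: str) -> str:
    prefix, host, rest = split_host(spec)
    # Assumption taken from this bug: the host is always UTF-8.
    host_text = host.decode("utf-8", errors="replace")
    return prefix.decode("ascii", errors="replace") + host_text + decode_part(rest, origin_charset)

spec = (b"http://" + "賢明.jp".encode("utf-8")
        + b"/ent_exam/ent_exam.html" + "#メニュー".encode("euc_jp"))
print(to_display_host_aware(spec, "euc_jp"))
```

Because the host never goes through the origin-charset fallback, a non-UTF-8 path or fragment can no longer drag the host into the wrong decoder.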
Attachment #171515 - Flags: superreview?(darin) → superreview-
Created attachment 173054: patch that generates nsIURI
With standard-url.encode-utf8 set to true, this patch is not necessary for most URIs (except for file URLs). However, setting the pref to true (which is the default now) doesn't fix bug 229546, and this patch does.
Attachment #173054 - Flags: review?(smontagu) → review+
It seems to me that these functions should take a nsIURI as their parameter instead of a raw character string.
Jshin: I want to fix this issue myself. Are you working on this? Can I take this?
Target Milestone: mozilla1.8beta1 → ---
Assignee: jshin1987 → masayuki
Status: ASSIGNED → NEW
Target Milestone: --- → mozilla1.9alpha
Masayuki Nakano: 2½ years after you "accepted" this bug, no one has objected. Are you still willing to fix it? And are you still experiencing it? (I'm not sure what to check against what).
(In reply to comment #18)
> Masayuki Nakano: 2½ years after you "accepted" this bug, no one has objected.
> Are you still willing to fix it? And are you still experiencing it? (I'm not
> sure what to check against what).
No, I'm not sure. I'll clean up my bug list after all the Gecko 1.9 work is finished.
It works for me now with the latest trunk. Can you confirm?
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073100 SeaMonkey/2.0a1pre
After studying the bug again somewhat, I'd say it works for me, but not decisively:
- I see the same garbage (plus www) in the Location Bar of the XUL error page as in the input box of the bug's URL: http://www.è3¢æ¤ô¤3¤ò¤μ.jp/ent_exam/ent_exam.html
- In the top URL in comment #1, the blue underlined part stops just before the # sign. Clicking that link gives a XUL error page for http://www.賢明.jp/ent_exam/ent_exam.html
(Apparently the DNS query gives a null result in both cases. I don't know if that is relevant.)
However, these Bugzilla pages are in UTF-8. A link to a non-Bugzilla, non-Unicode page containing a link with non-ASCII characters might be necessary for a "really" valid testcase nowadays.
I'm resetting bugs that are assigned to me but that I'm not working on and have no plans to fix in the near future.
Assignee: masayuki → smontagu
QA Contact: amyy → i18n
Status bar is not supported any longer. So this bug should be closed.
(In reply to Hideo Oshima from comment #23)
> Status bar is not supported any longer.
> So this bug should be closed.
URL preview is still available, even without the status bar. That being said, I can't reproduce this, so I'm inclined to WFM. Anne, what do you think?
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME