Closed Bug 135854 Opened 19 years ago Closed 12 years ago
Japanese in escaped characters in File | Save as dialog
Build: 04-04 trunk build on all platforms.
Steps:
1. Launch Netscape and open Composer.
2. Open a file with a Japanese name.
3. Whether or not you modify the file, go to File | Save As.
Result: A save file dialog appears, and the existing Japanese name is displayed as escaped characters, e.g. %93%FA%96%7B%8C%EA.html.
This is a regression from N6.2.1; Japanese file names display fine on N6.2.2. I also saw it on the 03-06 trunk build, so it has been there for a while.
This is not i18n specific; the same happens with browser Save As. E.g. "test file.html" turns into "test%20file.html".
Hmm, then it's mostly a dup. of bug 130079.
Yuying, since 130079 has been checked in, could you verify if this is a dup of 130079?
No, it is still showing escaped characters on the 04-15 branch build / Mac 10.1.3 and the 04-12 trunk build / WinME-JA (bug 130079 was checked in to both trunk and branch). And the space is still displayed as "%20" in this dialog.
Reassigning to darin. This is a regression directly caused by bug 124042. The URL for a local file is stored in the system charset instead of UTF-8. There are two options to fix this problem:
Option A: convert the file path from the system charset to UTF-8 and store the UTF-8 string in the URL, converting back to the system charset before displaying. (We may display the UCS-2 string directly in the title in the future.)
Option B (hack): if it is the file scheme, assume the URL is in the system charset instead of UTF-8.
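Option A can be sketched roughly as follows. This is only a Python illustration of the round trip, not Mozilla code: the function names `native_to_url_component` and `url_component_to_native` are hypothetical, and Shift_JIS is assumed as the system charset.

```python
from urllib.parse import quote, unquote_to_bytes

SYSTEM_CHARSET = "shift_jis"  # assumption: system charset of a Japanese Windows machine


def native_to_url_component(native_bytes: bytes) -> str:
    """Convert a file name from the system charset to UTF-8, then URL-escape it."""
    text = native_bytes.decode(SYSTEM_CHARSET)
    return quote(text.encode("utf-8"))


def url_component_to_native(escaped: str) -> bytes:
    """Unescape a UTF-8 URL component and convert it back to the system charset."""
    text = unquote_to_bytes(escaped).decode("utf-8")
    return text.encode(SYSTEM_CHARSET)


# Raw system-charset bytes of the file name from the original report.
native = unquote_to_bytes("%93%FA%96%7B%8C%EA.html")

stored = native_to_url_component(native)      # what the URL would hold under option A
restored = url_component_to_native(stored)    # what the dialog would receive for display
```

With this scheme the stored URL is always UTF-8, and the system charset only appears at the two boundaries (reading the path and displaying it).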
Assignee: yokoyama → darin
how critical is this bug? should this be nsbeta1?
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla1.1alpha
Whether it is critical or not, it is a regression.
Yuying, could you attach a screen shot? I think this does not look good, but the file is usually saved under a different name, and it is possible to restore the original name by selecting the original file in the Save As dialog list. So my opinion is that this is not nsbeta1+.
Notice that the original name is in Japanese, and the one inside the Save As dialog is escaped. Yes, this problem happens with "Save As" or "Save charset As" (Composer) but not with "Save". The data is saved properly but the name looks ugly.
The escaped string in the screen shot is Shift_JIS, which is the system charset for Japanese Windows. I think simply unescaping it would solve the display problem. Shanjian, why do you think this is related to UTF-8 URI?
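As a quick check of the Shift_JIS reading, the escaped name from the original report can be unescaped and decoded by hand. This is a minimal Python sketch for illustration only; it is not how the dialog code works.

```python
from urllib.parse import unquote_to_bytes

escaped = "%93%FA%96%7B%8C%EA.html"  # file name from the original report

raw = unquote_to_bytes(escaped)   # undo the %XX escaping, giving raw bytes
name = raw.decode("shift_jis")    # interpret those bytes as Shift_JIS

# name is now "日本語.html"
```

Note that the byte 0x7B is "{" in ASCII, which is why it shows up escaped as %7B in the middle of a two-byte character.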
nhotta: one of the changes that went in w/ support for UTF-8 URI was the switch to make GetFileBaseName not unescape the result before returning it. this was done because the result from GetFileBaseName has to be UTF-8. as a result, the caller must unescape the string if that is what is appropriate. i tried to fixup all of the callers, but i obviously missed this one. if the caller is JS code, then we have a problem because i don't believe that there is any scriptable method to unescape an URL-escaped string... or is there?!
>this was done because the result from GetFileBaseName has to be UTF-8
But looking at the screen shot, it is not UTF-8.
>i don't believe that there is any scriptable method to unescape an URL-escaped
>string... or is there?!
unescape() uses UTF-8 by default (and uses the document charset if available): http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsGlobalWindow.cpp#3188
So when the URI is escaped in a charset other than UTF-8 (here Shift_JIS), calling unescape() would be a problem.
Darin, the other bug I assigned to you today is in the filepicker. I am not sure about this one. In that bug, the window title takes the system charset only, so we need to convert back from UTF-8. (In future, we should try to display the Unicode string directly.)
nhotta: my point was that an URL escaped string is ASCII compatible and can therefore be treated as UTF-8 _unless_ you unescape it. so, the fact is.. GetFileBaseName returns bytes that can be interpreted as UTF-8. however, once unescaped.. there is no guarantee that the text will be UTF-8 or native-charset encoded.
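The distinction can be illustrated with the file name from the original report. This is a small Python sketch, assuming nothing about the actual necko code: the escaped form is plain ASCII and therefore valid UTF-8, while the unescaped bytes are not.

```python
from urllib.parse import unquote_to_bytes

escaped = "%93%FA%96%7B%8C%EA.html"

# The escaped form contains only ASCII characters, so it is trivially valid UTF-8.
escaped_is_utf8 = escaped.encode("ascii").decode("utf-8") == escaped

# After unescaping, the bytes are whatever the original charset produced.
raw = unquote_to_bytes(escaped)
try:
    raw.decode("utf-8")
    unescaped_is_utf8 = True
except UnicodeDecodeError:
    # 0x93 cannot start a UTF-8 sequence, so strict decoding fails here.
    unescaped_is_utf8 = False
```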
If 'originCharset' is set, then the unescaped string can be converted to Unicode using that value. How about adding a function to nsIURI which unescapes and converts to Unicode using originCharset? I think that would be useful for displaying URIs in the UI.
nhotta: given that nsIURI exposes the originCharset attribute, i think it'd be sufficient to write up a helper function (maybe on nsIIOService) that does this conversion. that way the many nsIURI implementations don't have to duplicate that functionality.
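A helper of the kind discussed here might look roughly like the Python sketch below. The name `unescape_for_display` is hypothetical, and the fallback order (origin charset, then UTF-8, then the escaped text unchanged) is an assumption made for illustration; the thread does not specify it.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def unescape_for_display(escaped: str, origin_charset: Optional[str] = None) -> str:
    """Unescape a URL component and convert it to Unicode for display.

    Tries the origin charset first (if given), then UTF-8. If neither
    decodes cleanly, returns the escaped text unchanged (assumed fallback).
    """
    raw = unquote_to_bytes(escaped)
    for charset in (origin_charset, "utf-8"):
        if charset is None:
            continue
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    return escaped


japanese = unescape_for_display("%93%FA%96%7B%8C%EA.html", "shift_jis")
spaced = unescape_for_display("test%20file.html")
unknown = unescape_for_display("%93%FA%96%7B%8C%EA.html")  # no charset hint
```

Keeping the logic in one place, rather than in each URI implementation, matches the suggestion above; returning the escaped text on failure at least leaves the user with what the dialog shows today.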
mass futuring of untargeted bugs
Target Milestone: --- → Future
*** Bug 154469 has been marked as a duplicate of this bug. ***
Still a problem in 11-07-08 build.
Assignee: darin → smontagu
Status: ASSIGNED → NEW
Target Milestone: Future → ---
WORKSFORME on all platforms
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME