Non-ascii filename on filepicker depends on charset of document

Status: NEW
Product: Core
Component: Internationalization
Priority: --
Severity: major
Opened: 15 years ago
Updated: 9 years ago

People

(Reporter: Masaki Katakai, Assigned: nhottanscp)

Version: Trunk

(Reporter)

Description

15 years ago
Bug 163682 and bug 162377 have already been fixed, but I'm still seeing the
problem, and my internal customers want it fixed.

We usually use non-ASCII filenames for text and HTML files that are
stored on local disk.

1. Prepare an HTML file with an iso-8859-1 charset tag and save it under an EUC filename
2. Start Mozilla in an EUC locale
3. Open the file
4. Try File -> Save As...

==> In the filepicker, the filename is garbled

5. Change the charset to EUC-JP from the View menu
6. Try File -> Save As...

==> In the filepicker, the filename is OK

Even if the charset is iso-8859-1, users want to keep the
original filename in EUC-JP. I want to use the platform charset
here.

Any ideas on how to fix this problem? I need a patch ASAP for internal
customers.

Comment 1

Basic problem -- we have to guess what charset is involved.  There is no way to
win this game.

If you just want a patch internally, see
http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#758
(change the charset stuff to get a charset from somewhere else...)

If you want to change the Mozilla trunk, suggest a reasonable solution, please.
(Reporter)

Comment 2

15 years ago
Boris,

Thank you for the comments. I agree that the charset is too difficult to guess,
but how about the following?

      var charset;
      if (!aDocumentURI.host) {
        // get charset of platform
      } else {
        if (aDocument)
          charset = aDocument.characterSet;
        else if (document.commandDispatcher.focusedWindow)
        ...

If !aDocumentURI.host is true, which may mean the browser is loading a
local file, we try the platform character set first. Do you think
this will work?

However, there is one issue: nsIPlatformCharset does not seem
to be defined in IDL.

Boris, Hotta-san,
do you know how to get the platform charset in JavaScript?

Comment 3

Hmm... that sounds like a reasonable approach, yes. ;)  You may want to put a
try/catch around the .host access, since some URIs will throw an exception on
that getter.

I'm not sure how one would get the platform charset in JS...
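
The proposal from comment 2, with the try/catch suggested above, might look roughly like the sketch below. It uses plain stand-in objects rather than real Mozilla URIs, and the names `chooseFilenameCharset` and `getPlatformCharset` are hypothetical: the thread leaves open how the platform charset would actually be obtained from JS, so it is passed in as a callback. Keeping the document charset when `.host` throws is also an assumption.

```javascript
// Hypothetical helper: choose the charset used to unescape the URI filename.
// getPlatformCharset is a caller-supplied callback (assumption: the thread
// does not establish how to read the platform charset from JS).
function chooseFilenameCharset(aDocumentURI, aDocument, getPlatformCharset) {
  var isHostless = false;
  try {
    // Some URIs throw on the .host getter, so guard the access.
    isHostless = !aDocumentURI.host;
  } catch (e) {
    // Assumption: a URI whose .host getter throws is not treated as a
    // local file, so the document charset is used as before.
  }

  if (isHostless) {
    // No host: probably a local file, so prefer the platform charset.
    return getPlatformCharset();
  }
  return aDocument ? aDocument.characterSet : null;
}

// Usage with stand-in objects (not real nsIURI or document instances).
var sampleDoc = { characterSet: "ISO-8859-1" };
var samplePlatform = function () { return "EUC-JP"; };
console.log(chooseFilenameCharset({ host: "" }, sampleDoc, samplePlatform));
console.log(chooseFilenameCharset({ host: "example.org" }, sampleDoc, samplePlatform));
```

Note that a host check alone would not catch every local file, so this only illustrates the control flow being discussed.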

Comment 4

15 years ago
file://localhost/c:/foo.txt is a valid file URL btw.

hmm... seems to me that the file picker should always use the native charset
when preparing filenames to be written out to the local filesystem.  if my
platform is setup to use EUC, then why would i ever want filenames in some other
encoding?  i mean, even if a website suggested a non-ASCII and non-EUC filename
(via a Content-Disposition header or whatever), we should still try to generate
a filename in the EUC charset, right?

katakai: have you tried accessing aDocumentURI.originCharset?

Comment 5

Darin, the charset in question is the one used when unescaping the URI.  The
assumption in those cases is that the (escaped) URI is in the same charset as
the document at the URI. When actually creating the file we use the native
charset, of course.  It's the conversion from the URI bytes to the Unicode
string to show in the filepicker that's problematic.
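
To make the conversion problem concrete, here is a small standalone sketch of how the same escaped URI bytes turn into different strings depending on the charset assumed. It is not Mozilla code: `unescapeToBytes` and `decodeFilename` are hypothetical helpers, and it assumes a runtime whose `TextDecoder` supports legacy encodings such as EUC-JP (for example, Node.js built with full ICU).

```javascript
// Percent-unescape a URI path segment into raw bytes (no charset applied yet).
// Assumption: apart from %XX escapes, the input contains only ASCII.
function unescapeToBytes(escaped) {
  var bytes = [];
  for (var i = 0; i < escaped.length; i++) {
    var hex = escaped.slice(i + 1, i + 3);
    if (escaped[i] === "%" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      bytes.push(escaped.charCodeAt(i) & 0xff);
    }
  }
  return new Uint8Array(bytes);
}

// Interpret the unescaped bytes in the given charset.
function decodeFilename(escaped, charset) {
  return new TextDecoder(charset).decode(unescapeToBytes(escaped));
}

// A filename stored as EUC-JP bytes and then URI-escaped.
var escapedName = "%C6%FC%CB%DC%B8%EC.html";
var asEucJp = decodeFilename(escapedName, "euc-jp");
var asLatin1 = decodeFilename(escapedName, "iso-8859-1");
console.log(asEucJp);   // the intended Japanese filename
console.log(asLatin1);  // accented Latin characters instead
```

The bytes are identical in both calls; only the assumed charset differs, which is why the document charset leaks into the filepicker.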

Comment 6

15 years ago
boris: that makes more sense, thanks... not sure what i was thinking.  anyhow,
so, i'm wondering... can we try to QI aDocumentURI to nsIFileURL?  if that
works, and there is a nsIFile, then can't we just jump straight to that instead
of doing whatever we would do to, say, construct a file name from a http:// URL?

Comment 7

Oh, of course!  That's the way to go.  The filename on the nsIFile will already
be unescaped, so there should be no problems....
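
A rough sketch of that idea follows. It is written against stand-in objects so it can run outside Mozilla: the `nsIFileURL` value here is a placeholder for `Components.interfaces.nsIFileURL`, and `getDefaultFileName` and `unescapeWithCharset` are hypothetical names, not functions from contentAreaUtils.js.

```javascript
// Stand-in for Components.interfaces.nsIFileURL so the sketch runs outside
// Mozilla; chrome code would use the real interface object instead.
var nsIFileURL = { name: "nsIFileURL" };

// Hypothetical helper: pick the default filename for the Save As dialog.
// unescapeWithCharset is a caller-supplied fallback for non-file URIs.
function getDefaultFileName(aDocumentURI, unescapeWithCharset) {
  try {
    // QueryInterface throws when the URI does not implement nsIFileURL.
    var fileURL = aDocumentURI.QueryInterface(nsIFileURL);
    if (fileURL.file) {
      // leafName is already a Unicode string, so no charset guess is needed.
      return fileURL.file.leafName;
    }
  } catch (e) {
    // Not a file URL: fall through to the charset-based path.
  }
  return unescapeWithCharset(aDocumentURI);
}

// Usage with stand-in URIs (not real nsIURI instances).
var sampleFileURI = {
  QueryInterface: function (iid) { return this; },
  file: { leafName: "report.html" }
};
var sampleHttpURI = {
  QueryInterface: function (iid) { throw new Error("NS_NOINTERFACE"); },
  path: "/docs/index.html"
};
var sampleFallback = function (uri) { return uri.path.split("/").pop(); };
console.log(getDefaultFileName(sampleFileURI, sampleFallback));
console.log(getDefaultFileName(sampleHttpURI, sampleFallback));
```

Only non-file URIs reach the fallback, so the charset guess is confined to the cases where it cannot be avoided.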

Comment 8

15 years ago
FYI, 
*Not* the same but related bugs are bug 162765, bug 158006 and bug 198598.

QA Contact: amyy → i18n