Bugzilla

Assignee

Comment 2

•

23 years ago

~ is a perfectly valid character in UNIX filenames. It's only special to the
various shells.

Bill Law

Comment 3

•

23 years ago

I'll have a look.

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

Bill Law

Comment 4

•

23 years ago

Fixing url.

URL: http://http://yummysunkist.homestead.... → http://yummysunkist.homestead.com/fil...

Bas Cancrinus

Comment 5

•

23 years ago

This bug is related to all URL-encoded characters and it also appears in the
0.9.1-milestone Win32-build.
Example: if I want to save a file named "IDon'tCare.mp3" then it is displayed in
the "Save As..." box as "IDon%27tCare".

Bas Cancrinus

Comment 6

•

23 years ago

Ignore previous comment: added to wrong bug#.

Comment 7

•

23 years ago

Is this really file?

Comment 8

•

23 years ago

*** Bug 85780 has been marked as a duplicate of this bug. ***

Bill Law

Comment 9

•

23 years ago

Setting target milestone.  A testcase is at http://www.illsley.org/bug85780.htm.

Target Milestone: --- → mozilla1.0

Comment 10

•

23 years ago

What I want to know is whether we have code that does local filesystem translation,
and if so, how up-to-snuff it is. We have had a couple of problems with ":"
handling in MacOS HFS-style filesystems (where ":" is the path delimiter).

Summary: Filenames containing ~ are incorrectly mangled when file saved from browser → File...Save As is not unescaping filenames

Updated

•

23 years ago

QA Contact: tever → benc

sairuh (rarely reading bugmail)

Comment 11

•

23 years ago

-> file handling (bug & qa).

Testcase in duplicate bug.

Component: Networking: File → File Handling

QA Contact: benc → sairuh

Comment 12

•

23 years ago

->bryner

Assignee: law → bryner

Status: ASSIGNED → NEW

Brian Ryner (not reading)

Updated

•

23 years ago

Status: NEW → ASSIGNED

Target Milestone: mozilla1.0 → mozilla0.9.9

Brian Ryner (not reading)

Comment 13

•

23 years ago

Nominating for nsbeta1 triage.

Keywords: nsbeta1

Adam Kennedy

Comment 14

•

23 years ago

One place this bug bites is when saving mp3 files, which often have lots of
spaces in their names.

A good example of this is:
04%20Transmission%20on%20JJJ%20-%2023Feb02%20-%20Italic,%20pH,%20Chromatic.mp3

which is actually

04 Transmission on JJJ - 23Feb02 - Italic, pH, Chromatic.mp3

There are a series of these of similar length. Now, I know how much bug fixers
hate people bringing up "but in IE it does this", but IE correctly unescapes the
spaces.

Comment 10 mentions having OS-specific "safe" file unescapers, but perhaps in
the meantime, the most common ones could be unescaped (ones that are commonly
known to be safe on all file systems).

Or failing that, do a simple implementation for the set of obvious platforms
(Win32, Unix), and leave a spot for the others?
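
A minimal sketch of the kind of "safe" unescaping suggested above, written in Python rather than the browser's own code. `suggested_filename` is a hypothetical helper, and `UNSAFE_CHARS` is only an assumption about which characters cause trouble on Win32 and Unix, not a complete list.

```python
from urllib.parse import unquote

# Characters assumed to be unsafe in a filename on Win32 or Unix
# (illustrative only, not an exhaustive list).
UNSAFE_CHARS = '/\\:*?"<>|'


def suggested_filename(escaped_name: str) -> str:
    """Unescape a URL filename once, then replace unsafe characters with '_'."""
    name = unquote(escaped_name)
    return "".join("_" if ch in UNSAFE_CHARS else ch for ch in name)


escaped = "04%20Transmission%20on%20JJJ%20-%2023Feb02%20-%20Italic,%20pH,%20Chromatic.mp3"
print(suggested_filename(escaped))
```

Unescaping first and filtering afterwards means an escaped path separator such as %2F cannot slip into the suggested name.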

Peter Trudelle

Comment 15

•

23 years ago

nsbeta1- per ADT triage team

Keywords: nsbeta1 → nsbeta1-

Target Milestone: mozilla0.9.9 → mozilla1.2

Assignee

Comment 16

•

22 years ago

*** Bug 129351 has been marked as a duplicate of this bug. ***

Andrew Lin

Comment 17

•

22 years ago

OS -> all (many of the dupes are windows)

OS: Linux → All

Assignee

Comment 18

•

22 years ago

*** Bug 132127 has been marked as a duplicate of this bug. ***

Assignee

Comment 19

•

22 years ago

*** Bug 133746 has been marked as a duplicate of this bug. ***

R.K.Aa.

Comment 20

•

22 years ago

*** Bug 138915 has been marked as a duplicate of this bug. ***

Updated

•

22 years ago

Hardware: PC → All

R.K.Aa.

Comment 21

•

22 years ago

*** Bug 146724 has been marked as a duplicate of this bug. ***

Assignee

Comment 22

•

22 years ago

*** Bug 153197 has been marked as a duplicate of this bug. ***

Assignee

Comment 23

•

22 years ago

*** Bug 154214 has been marked as a duplicate of this bug. ***

Assignee

Comment 24

•

22 years ago

Brian, are you actually working on this?  Or should I take it?

Assignee

Comment 25

•

22 years ago

*** Bug 77475 has been marked as a duplicate of this bug. ***

Assignee

Comment 26

•

22 years ago

*** Bug 140997 has been marked as a duplicate of this bug. ***

Andreas Otte

Comment 27

•

22 years ago

I think the problem lies in the correct conversion between file-urls and
OS-specific filenames presented in the file-picker. Usage of the ESC_FORCED mask
comes to my mind.

Assignee

Comment 28

•

22 years ago

Andreas, care to clarify that?  What is "ESC_FORCED" and what does one do with it?

All the code in question does is to call GetFileName (gets the "filename" prop)
on the nsIURL and then pass that to the filepicker.  The nsIURL documentation
explicitly says that the return value can have some escaped chars in it...

Andreas Otte

Comment 29

•

22 years ago

Typical escaping of urls is smart in a way that it tries to detect if a char is
already escaped or not, and if it is, it does not escape it again.

Consider a *filename* abc%20ef which contains chars that already look like
escaped chars. Converted into a *fileurl* it would end up like
"file:///path/abc%20ef". Usually file urls get unescaped before being presented
in a directory listing as filenames, so we would get "abc ef" which is not the
original filename.

To prevent this from happening you can use the esc_Forced mask on escaping (for
example when converting from filename to fileurl) when you know you deal with a
filename. This way the fileurl would look like "file:///path/abc%2520ef" which
when unescaped results in a filename "abc%20ef", which is much better.

My guess is that we have something like this here.
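
The difference can be sketched in Python. This is an analogy only, not the Mozilla escaping code: `quote(..., safe="%")` merely leaves every "%" untouched and stands in for the "smart" behaviour, while `quote(..., safe="")` stands in for forced escaping.

```python
from urllib.parse import quote, unquote

filename = "abc%20ef"  # a real filename that merely looks escaped

# Stand-in for "smart" escaping: '%' is assumed to start an existing escape
# and is left alone.
smart_part = quote(filename, safe="%")
# Stand-in for forced escaping: '%' itself is escaped to %25.
forced_part = quote(filename, safe="")

print(smart_part, "->", unquote(smart_part))
print(forced_part, "->", unquote(forced_part))
```

Only the forced variant survives the round trip back to the original filename.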

Assignee

Comment 30

•

22 years ago

Well... Let's put it this way.  There is never a fileurl being explicitly used
in any of this code.

This code takes the URI of the thing we are getting and QIs it to an nsIURL.  It
then gets the .filename of the nsIURL (which is almost always an HTTP url, I
must add).  It takes this string and gives it to the filepicker.  All the code
is in JS.  It never does any escaping or unescaping.

I just read over the bug, and I'm confused.  The initial bug was very definitely
not about URL-escaping issues.  Robin, are you still seeing that problem?

Andreas Otte

Comment 31

•

22 years ago

Okay, but it saves to a local file. Please take into account all the other cases
you marked as duplicates. Sometimes it is about escaping, sometimes about
unescaping, but it always seems to be about saving to a local file.

Assignee

Comment 32

•

22 years ago

The local file is just an nsIFile, there are no fileurls involved.  And the
local file is created _after_ the filepicker stage (the filepicker creates it).

I thought all the bugs I marked dup of this one were about the fact that we show
an escaped version of the filename in the URI when we put up the filepicker.  If
any of the bugs I marked dup were _not_ about this, please reopen and cc me on
them and I will investigate...

Wesha

Updated

•

22 years ago

Summary: File...Save As is not unescaping filenames → File...Save As is not unescaping filenames (%20)

Wesha

Comment 33

•

22 years ago

*** Bug 158575 has been marked as a duplicate of this bug. ***

Wesha

Comment 34

•

22 years ago

*** Bug 137752 has been marked as a duplicate of this bug. ***

Comment 35

•

22 years ago

Perhaps some clarification is needed. (This is how *I'M* reading the problem, I
could be wrong.) --The URL is a very bad example, since it doesn't exist ^_^

(You'll need a Unicode/ISO-2022-JP font for some of these examples.)

When a file contains a space or any other non-alphanumeric character (é, ü, ®,
½, etc) it remains escaped (for these characters, %E9, %FC, %AE, and %BD,
respectively). Likewise, for any upper-level ISO-2022 or Unicode characters (&#28450;,
&#26085;, &#26412;, &#35486;, etc.), the files are saved with the escape codes (for these
characters, in UTF-8 (Unicode), %E6%BC%A2, %E6%97%A5, %E6%9C%AC, and %E8%AA%9E,
respectively). So, if you tried to save a file (theoretically) that was called "
&#26085;&#26412;&#35486;&#33021;&#21147;&#35430;&#39443;.lha" from a website, you would be downloading
%E6%97%A5%E6%9C%AC%E8%AA%9E%E8%83%BD%E5%8A%9B%E8%A9%A6%E9%A8%93.lha. Even in the
ISO-8859-1 charset, for non-alphanumeric characters, you would get, for trying
to download the file "México.zip", M%E9xico.zip. See the problem? (It's very
frustrating to have to manually fix these files instead of just having Mozilla
automatically -- and properly -- unescape them.)

Comment 36

•

22 years ago

Um, sorry about that, it appears Bugzilla doesn't like non-ISO-8859-1 characters
and converted those into decimal escaped characters. >.< (You can still see my
ISO-8859-1 example, though. ^_^)

Assignee

Comment 37

•

22 years ago

> automatically -- and properly -- unescape them

This is actually quite difficult to do.  Consider your example:

%E6%97%A5

is that UTF8? Or IS0-8859-1?  All we have there is the bits, not the encoding
they are encoded in....  And URLs never contain the encoding information needed
to properly unescape them.

So at the moment, what's needed is a way to figure out what the "proper"
unescaping is.  Then we can write the code to do it....  I can do the latter,
but I'm stymied on the former; suggestions welcome.
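
To illustrate the ambiguity, here is a small Python sketch (not the code under discussion) that decodes the same escaped bytes under two assumed encodings. Both calls succeed, and nothing in the URL says which result is the intended one.

```python
from urllib.parse import unquote

escaped = "%E6%97%A5"

# Three bytes read as one UTF-8 character...
as_utf8 = unquote(escaped, encoding="utf-8")
# ...or as three separate ISO-8859-1 characters.
as_latin1 = unquote(escaped, encoding="iso-8859-1")

print(len(as_utf8), len(as_latin1))

# The reverse case: a lone %E9 is valid ISO-8859-1 but not valid UTF-8,
# so the default error handling substitutes U+FFFD.
print(unquote("M%E9xico.zip", encoding="iso-8859-1"))
print(unquote("M%E9xico.zip", encoding="utf-8") == "M\ufffdxico.zip")
```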

Comment 38

•

22 years ago

ISO (o, not zero)-8859-1 only has one byte per character. In any other case,
Mozilla knows the encoding of the page when it loads (either through manually
selecting View > Character Encoding, header information (all proper pages have
their encoding information as a Content-Type header), or auto-detection by
Mozilla), and as such, it should be able to unescape correctly. (The lovely
thing about Unicode is that it doesn't matter what language it's in -- the CJK
unified characters are assigned to the WORD/CONCEPT, and not to what the
CHARACTER looks like.) In any case, simple unescaping should be done AT LEAST
for ISO-8859-1 until all character encodings can be implemented (again, quite a
simple affair, with all the selection, auto-detection, and Content-Type
headers).
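
A rough Python sketch of that idea, under the assumption that the filename in the URL was escaped in the same encoding as the page (which is not guaranteed). `charset_from_content_type` and `unescape_with_page_charset` are hypothetical names, and the ISO-8859-1 default is also an assumption.

```python
from email.message import Message
from urllib.parse import quote, unquote


def charset_from_content_type(header_value: str, default: str = "iso-8859-1") -> str:
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


def unescape_with_page_charset(escaped_name: str, content_type: str) -> str:
    """Unescape a filename, assuming it uses the page's declared charset."""
    return unquote(escaped_name, encoding=charset_from_content_type(content_type))


# Build an EUC-JP escaped name for the demonstration instead of hard-coding bytes.
escaped = quote("日本語.lha", encoding="euc-jp")
print(escaped)
print(unescape_with_page_charset(escaped, "text/html; charset=EUC-JP"))
```

Manual selection and auto-detection would have to feed the same charset value into the unescaping step; they are not modelled here.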

Assignee

Comment 39

•

22 years ago

Attached patch Silly patch — Details — Splinter Review

This fixes all the dups that have actual testcases (99% of them are just %20
<--> space).  It's possible this will fail on non-western pages, but I do not
have a testcase offhand; one would be appreciated.

Comment 40

•

22 years ago

bz: I'm ok with the patch (r=brade) if it works in this scenario:
  create a local file and name it "print%25land.html"
     (I often add printing percentages to files I print often)
  (put a shell of html in it?)
  open the local file in the browser and save as to a different dir (same name).

I expect the new file to be named print%25land.html (not print%land.html)

Assignee

Comment 41

•

22 years ago

Yep.  Without the patch the suggested name is "print%2525land.html", with it
it's "print%25land.html"
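
The same test case can be reproduced with Python's `urllib.parse` as a stand-in for the browser code (an illustration, not the patch itself): unescaping the URL's filename exactly once gives back the local name, while unescaping twice would give the name brade did not want.

```python
from urllib.parse import quote, unquote

local_name = "print%25land.html"

# The filename as it appears in the file: URL, with '%' escaped to %25.
url_filename = quote(local_name)
once = unquote(url_filename)
twice = unquote(once)

print(url_filename)
print(once)
print(twice)
```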

Updated

•

22 years ago

Attachment #92685 - Flags: review+

Brian Ryner (not reading)

Comment 42

•

22 years ago

Comment on attachment 92685 [details] [diff] [review]
Silly patch

r=brade (hurray!)

Comment 43

•

22 years ago

Assigning to Boris since he has a patch (no, I don't think I would have gotten
to this in the near future).

Assignee: bryner → bzbarsky

Status: ASSIGNED → NEW

Comment 44

•

22 years ago

http://www.solon.org/cgi-bin/j-e/tty/dosearch?sDict=on&H=PS&L=E&T=japanese&WC=none

Clicking on any of the links will give you a graphic with what the filename
should look like. Unfortunately, my FTP client wasn't cooperating (seems it
doesn't like non-Western encoding), so I couldn't get things uploaded for a
"real" test, and because of the particular way that server works (for
Western-only browsers, which was the only way I could get non-Western images
out of it), it doesn't have any encoding specified. :/

jag (Peter Annema)

Comment 45

•

22 years ago

Comment on attachment 92685 [details] [diff] [review]
Silly patch

sr=jag

Attachment #92685 - Flags: superreview+

Assignee

Comment 46

•

22 years ago

Colin, thanks.  What happens on that page is that unescape() fails (even if I
manually switch the page encoding to EUC-JP, which looks like the right one),
throws an exception, and we fall through to using the link text as the filename.
 Which is better than what we were doing before, I guess.
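
The fall-through can be sketched in Python. This is an analogy for the behaviour described, not the patch; `pick_default_filename` is a hypothetical name, and strict decoding stands in for unescape() throwing.

```python
from urllib.parse import unquote


def pick_default_filename(escaped_name: str, link_text: str,
                          encoding: str = "utf-8") -> str:
    """Try to unescape the URL filename; fall back to the link text on failure."""
    try:
        # errors="strict" makes undecodable byte sequences raise
        # instead of being silently replaced.
        return unquote(escaped_name, encoding=encoding, errors="strict")
    except UnicodeDecodeError:
        return link_text


print(pick_default_filename("unescaped%20name.html", "a link"))
print(pick_default_filename("M%E9xico.zip", "Mexico download"))
```

The second call falls back to the link text because %E9 on its own is not valid UTF-8.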

I think that we should land this at the beginning of the 1.2 cycle and look for
a decent way to unescape/decode non-ISO-8859-1 content....

Summary: File...Save As is not unescaping filenames (%20) → [FIX]File...Save As is not unescaping filenames (%20)

Comment 47

•

22 years ago

Actually, that script looks like it uses Shift_JIS encoding. (Don't quote me on
that, though.)

Assignee

Comment 48

•

22 years ago

Nope.  EUC-JP is the one that shows the right stuff in the browser status bar...
It occurs to me that we should consider doing whatever that does; I'll try to
dig it up.

Comment 49

•

22 years ago

Yeah, I noticed the status bar does correctly display these things. Like I said,
between the manual selection of encoding, the Content-Type header, and
everything else, this should be an easy thing to squash. THAT SAID, there needs
to be conversion for non-ISO-8859-1 files to Unicode (Western Windows) or
whatever the system uses for the non-Western language. (J-Windows uses JIS, bla
bla.)

Comment 50

•

22 years ago

Oh, also, could someone please change status to CONFIRMED?

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 51

•

22 years ago

Yeah, I noticed the status bar does correctly display these things. Like I said,
between the manual selection of encoding, the Content-Type header, and
everything else, this should be an easy thing to squash. THAT SAID, there needs
to be conversion for non-ISO-8859-1 files to Unicode (Western Windows) or
whatever the system uses for the non-Western language. (J-Windows uses JIS, bla
bla.)

Oh, also, could someone please change the status of this bug to CONFIRMED? Thaaanks.

Updated

•

22 years ago

Keywords: nsbeta1- → nsbeta1+

Assignee

Comment 52

•

22 years ago

*** Bug 160977 has been marked as a duplicate of this bug. ***

Assignee

Comment 53

•

22 years ago

Fix checked in.  bug 161242 filed on the remaining intl-related issues.  The
original problem here (%20) is certainly fixed.

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

sairuh (rarely reading bugmail)

Assignee

Comment 54

•

22 years ago

*** Bug 161335 has been marked as a duplicate of this bug. ***

Comment 55

•

22 years ago

url no longer exists. i made a simple test here:

http://hopey.mcom.com/tests/unescaped%20name.html

it has a literal whitespace. but when pasting, in the urlbar or in tab labels,
the whitespace appears as %20.

however, when saving the file (as well as viewing it in an html directory
listing), the whitespace is preserved (ie, %20 is unescaped).

vrfy'd fixed with 2002.09.16.08 comm trunk builds (all platforms).

URL: http://yummysunkist.homestead.com/fil...

Status: RESOLVED → VERIFIED

Jo Hermans

Comment 56

•

22 years ago

*** Bug 171626 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 57

•

22 years ago

*** Bug 172862 has been marked as a duplicate of this bug. ***