Open
Bug 293611
Opened 20 years ago
Updated 6 months ago
'Web page, complete' isn't saved correctly if filename includes non-ASCII characters
Categories
(Firefox :: File Handling, defect)
Tracking
()
NEW
People
(Reporter: da_neil, Unassigned)
Details
(Keywords: intl)
Attachments
(4 files)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050421 Firefox/1.0.4 (MOOX M2)
Build Identifier: Mozilla/5.0 /*any*/ (tried with Firefox aviary & trunks and Mozilla Suite 1.8b)
When a page is saved as 'Web page, complete' to a file whose name contains
non-ASCII characters, it can't be opened properly in any program other than
Mozilla software. Those programs cannot resolve the path to the
<filename>_files directory since they treat '%' characters inside the path as
*regular percent signs* (and they are allowed in directory names!), not as
MIME-encoded strings.
The Mozilla software should write HTML with proper (*not* MIME-encoded)
regional (e.g. CP1251) or Unicode paths to the <filename>_files directory.
Reproducible: Always
Steps to Reproduce:
1. Save a page to a file with non-ASCII characters in its name (e.g. Russian)
2. Open it in any program other than Mozilla
Actual Results:
The file is displayed without CSS and graphics.
Expected Results:
The file should open normally in other programs, not only in Mozilla software.
Test case:
I save a page as 'forum.mozilla.ru _ Результаты поиска.htm' (Russian characters
here).
File name is: 'forum.mozilla.ru _ Результаты поиска.htm'
Directory name: 'forum.mozilla.ru _ Результаты поиска_files'
Actual path to the directory inside HTML:
'forum.mozilla.ru%20_%20%D0%E5%E7%F3%EB%FC%F2%E0%F2%FB%20%EF%EE%E8%F1%EA%E0_files'
The path in the HTML doesn't match the directory name, because
'forum.mozilla.ru%20_%20%D0%E5%E7%F3%EB%FC%F2%E0%F2%FB%20%EF%EE%E8%F1%EA%E0_files'
and 'forum.mozilla.ru _ Результаты поиска_files' are *different* directories.
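The mismatch can be checked with a short Python sketch. It assumes the escaped bytes are the CP1251 encoding of the Russian name, which is what the byte values appear to be; the variable names are illustrative only.

```python
from urllib.parse import unquote

# Path as written into the saved HTML, and the directory name on disk.
saved_path = (
    "forum.mozilla.ru%20_%20%D0%E5%E7%F3%EB%FC%F2%E0%F2%FB"
    "%20%EF%EE%E8%F1%EA%E0_files"
)
directory_name = "forum.mozilla.ru _ Результаты поиска_files"

# Unescape the bytes and interpret them under two different encodings.
as_cp1251 = unquote(saved_path, encoding="cp1251")
as_utf8 = unquote(saved_path, encoding="utf-8", errors="replace")

print(as_cp1251 == directory_name)  # matches only if the consumer assumes CP1251
print(as_utf8 == directory_name)    # a consumer assuming UTF-8 gets a different name
```

A program that does not unescape at all, or that unescapes and then assumes a different encoding, ends up looking for a directory that does not exist.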
Updated•20 years ago
Assignee: dom-to-text → file-handling
Component: DOM to Text Conversion → File Handling
QA Contact: ian
Comment 1•20 years ago
When a page is saved as 'Web page, complete' to a file whose name contains
non-ASCII characters, it can't be opened properly in any program other than
Mozilla software. Those programs cannot resolve the path to the
<filename>_files directory since they treat '%' characters inside the path as
*regular percent signs* (and they are allowed in directory names!), not as
MIME-encoded strings.
The Mozilla software should write HTML with proper (*not* MIME-encoded)
regional (e.g. CP1251) or Unicode paths to the <filename>_files directory.
Component: File Handling → DOM to Text Conversion
Updated•20 years ago
Component: DOM to Text Conversion → File Handling
Will this bug be fixed in 1.1? I have to use IE to save pages since it's the
only browser that does it properly (Opera makes a big mess).
Comment 3•20 years ago
I'm tempted to mark this invalid. These are URIs; using URI-escaping in them
should be perfectly reasonable. Unless unescaping the string gives the wrong
bytes (that is, bytes in an encoding different from the page encoding)?
Comment 4•20 years ago
So the supposed bug is that the directory is (correctly) named 'forum.mozilla.ru
_ Результаты поиска_files', but the HTML contains something like:
<a href="forum.mozilla.ru%20_%20%D0%E5%E7%F3%EB%FC%F2%E0%F2%FB%20%EF%EE%E8%F1%EA%E0_files">?
(In reply to comment #4)
> So the supposed bug is that the directory is (correctly) named 'forum.mozilla.ru
> _ Результаты поиска_files', but the HTML contains something like:
>
> <a href="forum.mozilla.ru%20_%20%D0%E5%E7%F3%EB%FC%F2%E0%F2%FB%20%EF%EE%E8%F1%EA%E0_files">?
Exactly. The only software that 'understands' such (local) URIs is from MoFo.
Comment 6•20 years ago
Er... Anything that works with URIs understands the percent-escaping part. The
only part I can see having issues is if we have an encoding mismatch somewhere...
(In reply to comment #6)
> Er... Anything that works with URIs understands the percent-escaping part. The
> only part I can see having issues is if we have an encoding mismatch somewhere...
That's not true. Such pages don't render properly in IE, Opera, Word, etc. (see
the attached screenshot).
A screenshot of the same page rendered with Trident (IE) and Gecko. The
screenshot was made in Maxthon.
Comment 9•20 years ago
> That's not true.
Er... did you even bother TESTING before making that claim? Create the
following two HTML files in a directory:
test.html:
---------------
<body><a href="%66oo.html">Click this</a></body>
---------------
foo.html:
---------------
<body>This is a test</body>
---------------
('f' is ASCII code 0x66). In my IE 5.5 over here the link in the first file
works just dandy; the second file is loaded. So again, the problem is NOT the
percent-escapes. It's something in the assumptions someone somewhere is making
about what character encoding should be used for the bytes gotten after
unescaping. I'm not saying your problem doesn't exist, just that the
percent-escapes are not the issue with it.
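The point about percent-escapes can be illustrated with a small Python sketch: an escape for an ASCII byte decodes to the same character whatever encoding the consumer assumes, so only escaped non-ASCII bytes depend on the encoding guess. The encodings listed are examples chosen for this sketch, not a claim about what any particular browser uses.

```python
from urllib.parse import unquote

link = "%66oo.html"  # '%66' is the ASCII byte 0x66, i.e. 'f'

# Decode the same link under several assumed encodings.
decoded = {
    enc: unquote(link, encoding=enc)
    for enc in ("utf-8", "cp1251", "latin-1")
}

for enc, name in decoded.items():
    print(enc, name)
```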
Comment 10•19 years ago
Any progress?
Comment 11•19 years ago
Progress would be knowing what assumptions are being made by what software.
Comment 12•19 years ago
I don't know the inner workings of the browsers, but what I see from the user's POV is that pages saved by Mozilla software cannot be rendered in:
- IE 6
- IE 7 Beta 2 Preview
- Opera 8.5
- Opera 9 TP2
Couldn't test on other (minor) browsers.
(PS: Today yet another user complained about this bug in the Ru-board Mozilla support thread (http://forum.ru-board.com/topic.cgi?forum=5&topic=17868&start=720#20).)
Comment 13•19 years ago
> but what I see from the user's POV
Which doesn't help here. What would help would be an idea of what IE and company _think_ they're loading when they see that URI.
Comment 14•19 years ago
It's not good practice to shift bug research work onto users' shoulders; I thought just reporting it was enough.. =/
I haven't found any specification yet on IE/Opera.
Anyway, encoding Russian (*Unicode*) characters into escaped *1-byte ASCII characters* ain't a good idea. It should either encode them in a proper way or not encode them at all (like IE).
Comment 15•19 years ago
I didn't ask _you_ to do the legwork. But please don't add irrelevant comments to the bug if you're not working on it, ok?
Comment 16•18 years ago
(In reply to comment #6)
> Er... Anything that works with URIs understands the percent-escaping part. The
> only part I can see having issues is if we have an encoding mismatch somewhere...
Well, it looks like IE doesn't understand Mozilla's escaping. I'll attach a testcase and screenshots.
Comment 17•18 years ago
It's a slightly modified testcase from comment 9. It's a zip archive with 2 files: foo.html and тест.html (the second file name is written in Cyrillic letters).
foo.html:
--------------------
<body>
<a href="%D1%82%D0%B5%D1%81%D1%82.html">Click me - works in Firefox only</a>
<br>
<a href="тест.html">Click me - works in IE and Firefox</a>
</body>
--------------------
тест.html
--------------------
<body>This is a test</body>
--------------------
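For reference, the escaped link in foo.html corresponds to the UTF-8 bytes of the file name, whereas the path in the original report corresponds to CP1251 bytes. A minimal Python sketch of the two escapings follows; the variable names are illustrative, and nothing here claims to show what IE does internally.

```python
from urllib.parse import quote

name = "тест.html"

# Percent-escape the same file name from two different byte encodings.
utf8_link = quote(name, encoding="utf-8")
cp1251_link = quote(name, encoding="cp1251")

print(utf8_link)    # the form used in the first link of foo.html
print(cp1251_link)  # what a CP1251-based escaping would produce instead
```

The two strings share no escaped bytes, so a consumer that guesses the wrong encoding after unescaping cannot recover the original name.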
Comment 18•18 years ago
When I clicked on the first link in IE 6, it showed a 'cannot display page' warning and showed garbage in the address bar.
Comment 19•18 years ago
When I clicked on the second link in IE 6, it opened the file correctly.
Updated•15 years ago
Assignee: file-handling → nobody
QA Contact: ian → file-handling
Updated•9 years ago
Product: Core → Firefox
Version: Trunk → unspecified
Updated•2 years ago
Severity: normal → S3
Comment 20•2 years ago
The severity field for this bug is relatively low, S3. However, the bug has 20 votes.
:Gijs, could you consider increasing the bug severity?
For more information, please visit auto_nag documentation.
Flags: needinfo?(gijskruitbosch+bugs)
Comment 21•2 years ago
The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.
Flags: needinfo?(gijskruitbosch+bugs)
Comment 22•6 months ago
Do we still need a change? Now IE is defunct and percent-encoded links work with Chrome and Edge.