Closed Bug 190821 Opened 22 years ago Closed 18 years ago

Mozilla incorrectly deals with backslash (\) in local A links

Categories

(Core :: Networking: File, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bugzilla, Assigned: dougt)

Details

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.3a) Gecko/20021212
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.3a) Gecko/20021212
Because Mozilla has been programmed *not* to support locally referenced paths for files (eg. C:\website\index.htm), Mozilla will automatically convert such a referenced path to a URI when it is entered in the address bar. However, when a file is referenced locally in an A link tag, Mozilla neither displays an error message nor properly converts it to a URI. Instead what it does is convert it to a URI but escapes the '\' characters to '%5C'. This results in a couple of problems.
1) Inconsistency: it works fine when entering in the address bar but not when in an A tag.
2) Totally weird behaviour: a URI with the '\' characters escaped to '%5C' should technically be totally invalid. The '/' character is the URI delimiter, not '\'. However, Mozilla *will* still load the page despite the '\' characters being escaped, and technically being opaque characters... yet it will not load the images on the page! These come up as broken images.
Either the A reference should be fully properly converted to the equivalent URI ('\' changed to '/', not escaped) or Mozilla should display an error message, as the referenced URI is invalid.
Reproducible: Always
Steps to Reproduce:
1. Create an A tag in an HTML page locally referencing a file in Windows.
2. Click on the A tag.
Actual Results: The referenced page loads, but the delimiters in the converted URI are '%5C'(!) and the images are all broken.
Expected Results: Displayed an error message or fully properly converted the reference to a URI.
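For readers who want to see the two behaviours side by side, here is a minimal Python sketch. It is not Mozilla's code: `escape_href` and `fixup_typed_path` are hypothetical names, and Python's `urllib.parse.quote` is only assumed to behave like the escaping described in the report.

```python
from urllib.parse import quote


def escape_href(href: str) -> str:
    """Treat the href as URL text: '\\' is an ordinary character and is escaped.

    Hypothetical stand-in for what happens to an A link in the report.
    """
    return quote(href, safe="/:")


def fixup_typed_path(path: str) -> str:
    """Treat the input as a Windows path typed into the address bar.

    Hypothetical stand-in for the address-bar fixup: '\\' becomes '/'
    and a file:/// prefix is added.
    """
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")


link = "c:\\website\\index.htm"
print(escape_href(link))       # backslashes come out as %5C
print(fixup_typed_path(link))  # backslashes come out as /
```

The same input string produces two different URLs depending on which route it takes, which is the inconsistency being reported.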
I believe Mozilla escapes '\' characters because '\' is a valid filename character in every filesystem but DOS/Windows. ->Networking:file This bug sprang from the discussion in bug 190785
Assignee: asa → dougt
Component: Browser-General → Networking: File
QA Contact: asa → benc
Please read the comments to Bug 190785. As to the summary of this bug: Mozilla handles the paths _correctly_. You want it to handle them incorrectly. Please also read sections 2.2 and 2.4.3 of http://www.ietf.org/rfc/rfc2396.txt , showing that using the backslash within a URI may lead to severe problems.
Component: Networking: File → Browser-General
QA Contact: benc → asa
What! The Bugzilla mail showed me that I changed the QA contact and the Component, though I never did that! Funny...
->Networking:file
Component: Browser-General → Networking: File
QA Contact: asa → benc
Sorry Chris, I just read your comment I had a midair collision with and did not notice you changed the component... Sorry for the spam.
Well, I think this one is a dupe of bug 32895. Somebody check it.
With regards to comment #2: The RFC says (about characters including '\'): Other characters are excluded because gateways and other transport agents are known to sometimes modify such characters, or they are used as ***delimiters***. It mentions that they are used as delimiters. And if Mozilla is going to escape them and treat the converted string as a URI, surely it should *not* recognise something like "c:%5Cwebsite%5Cindex.htm" as valid? Unescaped, that reads as "c:\website\index.htm" and is not a valid URI. However Mozilla will indeed load such a URL, and recognise the '%5C' characters as delimiters.
With regards to comment #6: No, this is not quite a dupe of bug 32895. I accept that Mozilla will not now convert '\' to '/' when you type in a remote URL/URI. However, what it will do is: 1) Convert '\' to '/' when you type in a *local* path into the address bar, as part of the URI-conversion process. 2) Convert '\' to '%5C' when you click on a link that references a *local* path, but still display the page (with broken images!) despite '\' or '%5C' not being URI delimiters.
I don't understand your arguments about local URLs in an HTML file. Mozilla can convert \ to / in the URL bar on Windows, because Mozilla knows that the user is on Windows, and converting these URLs for local addresses does not violate the RFC (Mozilla can do anything with this string, but the result should be RFC compliant). If you use a local path in a file, there is no guarantee that the file lives on a Windows system, so this URL SHOULD be passed on to the hosting OS, and if the hosting OS is any UNIX, then converting \ to / is not allowed -- it would be a violation of the RFC in this case. And even if Mozilla runs under Windows, there is absolutely no guarantee that the host computer is Windows.
Ruslan: This is a fair argument. However, if you are to argue this, then a link in an HTML page that references a file using '\' delimiters *should not* be recognised as a valid URI. They should be escaped to '%5C' (as they are), but then Mozilla should come up with an error message stating that it cannot find this address. Instead, it still treats the '%5C' characters as valid delimiters, and displays the local webpage, but with broken images! This is inconsistent behaviour.
The URL bar converts the characters in situations where it can figure out that the user entered a Windows file path. This makes sense since users might cut and paste in a variety of situations. Once you publish content in HTML, the URLs need to be correct, because you do not know what platform the file will be read on. I was going to try to explain the rest of the behavior, but the problem is not entirely clear to me (perhaps Jeremy could attach a sample HTML file that shows what he thinks is broken). Meanwhile, +cc:andreas, who can often explain these things on the first try.
Summary: Mozilla handles locally referenced paths with backslash (\) incorrectly → URL: paths with backslash (\) incorrectly
Agreed there is an inconsistency here ... on some occasions when a url goes through the docshell, urifixup kicks in even if you click on a link. Then, when loading images in those documents, docshell is not used and urifixup does not happen, which is as it should be. I have seen that myself on some occasions, but was unable to pinpoint the exact location (not of the fixup, but of what triggers it). Sometimes it seems to happen on error as a fallback mechanism. The only code that does this kind of conversion is located in docshell. I think we all agree that links in a document are *urls* and should be correct. In that context using '\' as delimiter is wrong; you should use '/' and put a 'file:///' or 'file://' before the path to reference local files. Everything else is just plain wrong. In URI context '\' is just a normal character as part of filenames, which should not be used in urls and gets escaped in the process. The OS-dependent conversion from file url to OS paths takes care of the right delimiters. If you can bring a testcase that makes this behavior reproducible we can move this bug over to docshell where it belongs.
I've added a test case attachment; unzipped on a Windows platform keeping directories intact, it should illustrate the bug in Mozilla. Interestingly in my Mozilla build it doesn't actually display the image as a broken image now, but instead displays the ALT text as normal text!
The default behavior of broken images is to show only <ALT> and no broken image if <ALT> is provided.
Component: Networking: File → Embedding: Docshell
Has there been any progress in resolving this bug?
Can someone with a windows build verify that mozilla actually loads the page (nextpage.htm) from the testcase and not (as on linux) pops up a message saying the page does not exist?
Also see bug 192816 for a discussion about the general direction of urifixup. Also see bug 122270 for the total removal of \ to / conversion which would also "fix" this bug.
I get this too. The help files for Timbuktu will not display on XP build 2003060304. file:///c:%5Cprogram%20files%5Ctimbuktu%20pro%5CHelp%5C1.htm So, the \ are escaped, but the file is not found!
Status: UNCONFIRMED → NEW
Ever confirmed: true
James, the backslash is not a valid path separator in any kind of url, not remote not local. Instead \ is just a normal character which should not be used in urls and therefore gets escaped, as suggested by RFC 2396. / is the character that must be used as path separator.
Ah, but it works in IE.... Those of us that try to make Mozilla our only browser are continually caught by non-functional programs that may do non-standard things that are accepted by IE. You can say it is a proselytization issue, but it makes life very hard for us users.
Andreas: 2 things... 1) That argument is fine for a hyperlink with '\' in it (the webpage designer should know better), but we all know that if a user types '\' in the address bar as part of a URL, it's far, far more likely that they meant '/' than that they wanted it escaped. 2) Bear in mind that, in Windows, local directories _are_ separated by '\'. If you're writing the path of a local file in the address bar (ie. c:\blah\test.htm), that is 100% valid as a *Windows* path. It's got nothing to do with RFC 2396, really, and Mozilla should recognise it as a Windows path and escape \ to / accordingly (or not need to escape...)
Let me clarify that because I'm talking rubbish :-) Mozilla DOES correctly fix up the URI if the user types '\' in the address bar. However, if '\' is in a hyperlink, Mozilla will escape it to %5C, then show the webpage (if found) with broken images. This is inconsistent. Either don't interpret %5C as a directory separator and display an error, or interpret it properly and display the complete page being pointed to by the hyperlink.
I would not call that inconsistent; I just think that pages are held to a different level of quality than something that is typed or dragged into the urlbar. What is inconsistent is that urifixup is sometimes called when it shouldn't be (replacement of \ with / happens when it shouldn't). That the example page works in IE violates several RFCs, and it is easy to create a website that conforms to those RFCs and is not accessible to IE. On the other hand, the use of \ as part of a file or directory name is an edge case; it is more likely that the user mistakenly used \ as a path separator. So a good temporary solution might be to replace all \ with / in links when the path contains only \ and no /. A better end solution might be code that first tries \ as a normal character and, if that fails, tries replacing it with /. That would conform to the RFCs on the first try and fix up URIs on the second (or later) attempt.
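The "better end solution" proposed above could be sketched roughly as follows. This is only an illustration in Python, not a patch: `resolve_with_fallback` is a hypothetical name, and `exists` is an assumed callable standing in for whatever check decides that a path can be loaded.

```python
from typing import Callable, Optional


def resolve_with_fallback(path: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Try the RFC-conforming reading first, then fall back to fixup.

    First attempt: '\\' is an ordinary filename character.
    Second attempt: assume the author meant '/' and retry.
    Returns None when neither reading resolves.
    """
    if exists(path):
        return path
    fixed = path.replace("\\", "/")
    if fixed != path and exists(fixed):
        return fixed
    return None


# Usage with a made-up set of existing paths (purely illustrative).
known = {"/docs/a\\b.htm", "/site/pages/index.htm"}
print(resolve_with_fallback("/site\\pages\\index.htm", known.__contains__))
```

A file whose name really contains a backslash still wins on the first attempt, so the fixup only applies when the literal reading fails.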
"I just think that pages have another level of quality than something that is typed or dragged into into the urlbar." I have no idea what you mean by this, or what it has to do with the point I was making. When you click on a link, you don't touch the URLbar. If the '\'s are escaped to %5c, either the entire webpage should load (with working images), or an error message should result if the file is not found.
I agree with jez. To display a file reference in the browser you should be able to put something like file:///c:%5Cprogram%20files%5Ctimbuktu%20pro%5CHelp%5C1.htm into the URL bar and have it work. It is escaped properly and does not display the file on WinXP.
No, it's not. The given url file:///c:%5Cprogram%20files%5Ctimbuktu%20pro%5CHelp%5C1.htm points to a *file* named c:\program files\timbuktu pro\Help\1.htm, which must be located in the root directory. It is a single file name because it does not contain / as a directory separator. If you put file:///c:%5Cprogram%20files%5Ctimbuktu%20pro%5CHelp%5C1.htm into the urlbar you should get an error message unless you really have a file with that name on your computer. If you put file:///c:\program files\timbuktu pro\Help\1.htm into the urlbar, mozilla will try to convert all \ to / and fix the url for you, assuming you meant / instead of \. HTML documents, which can contain urls, are a different matter. For those urls there are certain standards, and those RFCs make it very clear that / is the character that separates directories, not \. \ is just a normal char that should not be used (and gets escaped in the process). If someone is able to create an HTML file, we assume that he/she or the program he/she uses knows about the relevant RFCs and follows them. For that reason we do not (at least in most cases) do any url fixup like converting \ to / inside links. We don't do this because it is possible to have a file that contains a \ as a normal character, and it would not load if the \ were converted to /. This is the different level of quality (having a document versus typing into the urlbar) I was talking about. It is totally irrelevant that windows uses \ as file separator on *local* paths. File urls (which are platform independent!) use / instead, and it is the responsibility of mozilla to convert / to \ when converting a file url to a local windows path, and it does just that. \ or %5C is just a normal char in file urls; you cannot expect it to be interpreted as a directory separator. Use / instead, as you should. The Timbuktu help files must be fixed ...
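The point about the escaped url naming a single file can be checked with a short Python sketch. It uses only the standard `urllib.parse` module; `path_segments` is a hypothetical helper, and nothing here is taken from Mozilla's own parser.

```python
from typing import List
from urllib.parse import unquote, urlsplit


def path_segments(url: str) -> List[str]:
    """Split a URL path on '/' only, then unescape each segment.

    '%5C' decodes to a backslash *inside* a segment; it never
    produces a new segment.
    """
    path = urlsplit(url).path
    return [unquote(segment) for segment in path.split("/") if segment]


escaped = "file:///c:%5Cprogram%20files%5Ctimbuktu%20pro%5CHelp%5C1.htm"
proper = "file:///c:/program%20files/timbuktu%20pro/Help/1.htm"

print(path_segments(escaped))  # one segment containing backslashes
print(path_segments(proper))   # five segments: drive, two dirs, Help, file
```

Read this way, the escaped form is one oddly named file under the root, while the form with / has the directory structure the author intended.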
OK, 3 things: 1) What are the Timbuktu help files? 2) I agree with what you say re: %5C *NOT* being a directory separator. If that's the case, why does my A HREF link "gamepoint\sites\gemsgames\index.htm", which gets converted in the URLbar to "file:///c:/jeremy/gamepoint%5Csites%5Cgemsgames%5Cindex.htm", result in Mozilla loading an HTML page with broken images, and not just displaying a 'not found' error? 3) Just out of interest, why does Mozilla want file:/// prefixed onto a file path? Why is the triple slash needed, when '/' on *nix and 'x:\' on Windows signifies a local root?
See comment #19 for what the Timbuktu help files are. Just another example of badly coded HTML. The reason those *broken* href links (no url-scheme, using \ as directory separator) work sometimes (but not always) is an inconsistency error in mozilla. They should not work at all, but sometimes the docshell urifixup kicks in when it shouldn't. Nobody currently knows why, but it should only kick in when you type into the urlbar, not when you click on a link, even when that link opens a new window. You are not allowed to put a local filepath into an HTML document referencing other files. You are supposed to put *urls* (absolute or relative) into it; see any definition of the <A> tag in HTML as a guideline. Those urls (and file urls are one of them) have a very clear syntax, and it is *not* the local filepath syntax.
But it doesn't look like urifixup is kicking in, because the '\' in the link are actually being escaped, and appear escaped in the URLbar, but Mozilla STILL loads the page! Isn't urifixup just meant to change '\' to '/'?
I suspect that Timbuktu called mozilla on the command line with the file reference. I do not know if it was called with "\" or "%5c".
Hmmm ... it seems we need to find out more about how Timbuktu calls Mozilla. I think we escape a given string coming over the commandline; that would account for the escaped \. Then urifixup kicks in and prepends file:///, but no conversion from \ to / happens because the \ are escaped. When converting the fileurl back to a local filepath we unescape and end up with a valid filepath (more or less by accident!). Maybe it makes sense to do a local filepath to file-url conversion instead of just escaping, but I'm not sure what happens. If links in the loaded document are also only local filepaths they will not load in most cases, because we don't usually do urifixup on them, since the page author should know better.
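The suspected round trip can be illustrated with a small Python sketch. This is a guess at the sequence described above, not a trace of what Mozilla does: both function names are hypothetical, and `urllib.parse.quote`/`unquote` are assumed stand-ins for the escaping and unescaping steps.

```python
from urllib.parse import quote, unquote

PREFIX = "file:///"


def escape_command_line_arg(arg: str) -> str:
    """Assumed step 1 and 2: escape the raw argument, then prepend file:///.

    No '\\' to '/' conversion happens because the backslashes are
    already escaped by the time the prefix is added.
    """
    return PREFIX + quote(arg, safe=":")


def file_url_to_local_path(url: str) -> str:
    """Assumed step 3: strip the prefix and unescape.

    A real conversion would also map '/' to the platform separator;
    this sketch only unescapes, which is the part that matters here.
    """
    return unquote(url[len(PREFIX):])


arg = "c:\\program files\\timbuktu pro\\Help\\1.htm"
url = escape_command_line_arg(arg)
print(url)                          # the %5C form seen in the address bar
print(file_url_to_local_path(url))  # the original Windows path again
```

Under these assumptions the page itself loads because unescaping restores a valid Windows path, while relative references inside the page are resolved against a url that has no / separated directories.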
I rechecked Timbuktu by turning off Mozilla first, and it definitely transfers the request on the command line because Mozilla started up.
Summary: URL: paths with backslash (\) incorrectly → Mozilla deals with backslash (\) in local A links incorrectly
Summary: Mozilla deals with backslash (\) in local A links incorrectly → Mozilla incorrectly deals with backslash (\) in local A links
I've verified that this bug is still happening with the latest nightly build of Mozilla (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5b) Gecko/20030726). Not good.
I have confirmed the behavior in comment #23 in Mozilla 1.6 MacOS X. [[Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.6) Gecko/20040113]]. If we need before and after screenshots, let me know and I will provide them. Actually we should change the name of this bug as well, because the hyperlink problems are not restricted to local links, or the Windows environment.
Isn't this a dupe of Bug 122270?
I've refiled this bug as bug #270202 which I think is more accurate and to the point than this bug. It's really a very simple inconsistency I would like fixed. Please could the people associated with this bug CC themselves to the new bug, or do something to fix it? It annoys me, it's like a small itch in the recess of your back that you can't quite reach.
I understand that %5C should be converted to / on Windows; the unfortunate thing is that it isn't if you use the open command on Windows, so the page will load, but images and stylesheets won't unless you replace %5C with / manually in the address bar. I personally think this is a major bug, as it makes any local document containing img, link, or script elements with relative path references unviewable in Firefox using the default open handler (additionally I have a problem with bug #263570, so I have to open every file in Internet Explorer, copy the filename and replace \ with /, which destroys any use of Firefox for local files on my PC).
Just tried the testcase with Firefox 1.5.0.3 and something seems to have been fixed. Where a link uses \ instead of / as the dir separator, it no longer converts the \ to %5C; it now converts it to '/' in the address bar. This means the images and other referenced content on the next page now work. Could anyone tell me what happened? I take it the policy that's been decided upon is to simply convert \ to / in paths, and advise that people shouldn't use \ to mean anything other than a dir separator? Breaks an RFC, but is more pragmatic? If this is the case, this bug should be resolved FIXED.
Component: Embedding: Docshell → Networking: File
A slight amendment to the previous comment, sorry... Where a link uses \ instead of / as the dir separator, it no longer converts the \ to %5C; it now converts it to '/' in the address bar _as well as_ for referenced content on the page, making that referenced content now display correctly. In short, the images are no longer broken, for whatever reason. Does this apply in Mozilla suite / Seamonkey / Camino, as well?
If this has been fixed, #270202 should be resolved fixed too.
Fixed by bug 249282.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED