Closed
Bug 32895
Opened 25 years ago
Closed 23 years ago
Converting \ to / in urls on windows only (was: RFC 2396 $2.4.3 non-compliance?)
Categories
(Core :: Networking, defect, P3)
Tracking
()
VERIFIED
FIXED
Future
People
(Reporter: jacoby, Assigned: gagan)
References
()
Details
(Keywords: platform-parity)
Attachments
(1 file, 1 obsolete file)
2.27 KB, patch | dougt: review+ | darin.moz: superreview+
From Bugzilla Helper:
User-Agent: Mozilla/4.72 [en] (X11; U; Linux 2.2.12-20 i686)
BuildID: M14
By the BNF of 3.2.1 in RFC 2068, the path is broken into segments specifically
by "/", which means that the hypothetical page "back\slash.html" is not
equivalent to "back/slash.html".
Reproducible: Always
Steps to Reproduce:
1. go to http://www.undergrad.math.uwaterloo.ca/~dj3vande/ie.html
2. click on link
3. if browser is rfc-compliant, you go one place. If not, you go another
Actual Results: Browser gets
http://www.undergrad.math.uwaterloo.ca/~dj3vande/rfc/compliance.html
instead of
http://www.undergrad.math.uwaterloo.ca/~dj3vande/rfc\compliance.html
Expected Results: display
http://www.undergrad.math.uwaterloo.ca/~dj3vande/rfc\compliance.html
Comment 1•25 years ago
|
||
Hmmm... On my linux moz build 2000042809 it replaces the '\' with %5C, on my
windows 95 moz build 2000042708 it replaces the '\' with a '/', giving me two
different documents. IMHO a URI should I the same R, no matter what platform
you're on.
RFC 2068 has been obsoleted by RFC 2616, which points at RFC 2396 for allowed
URI. Section 2.4.3 explicitly marks '\' as unwise and therefore not a valid
character in URI as defined in Appendix A. jacoby@ecn.purdue.edu's URL should
therefore officially not work.
Of course mozilla is allowed to DWIM (do what I mean) and convert "illegal" URIs
to correct ones. The question is, what kind of conversion should be used? IMHO,
for all schemes '\' and other "illegal" characters should be escaped. On OSen
with '\' as file path separator it would make sense to convert '\' to '/' for
file:// (and related?) schemes.
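
A minimal sketch of the rule proposed above, in Python rather than Mozilla's own C++. The names `escape_unwise` and `normalize_path` are hypothetical and are not taken from the Mozilla source; the character set is the "unwise" list from RFC 2396 section 2.4.3.

```python
# Characters that RFC 2396 section 2.4.3 lists as "unwise".
UNWISE = "{}|\\^[]`"


def escape_unwise(path: str) -> str:
    """Percent-encode unwise characters and leave everything else untouched."""
    return "".join(f"%{ord(c):02X}" if c in UNWISE else c for c in path)


def normalize_path(scheme: str, path: str, native_sep: str = "/") -> str:
    """Hypothetical helper: treat '\\' as a separator only for file: paths
    on platforms whose native separator is '\\'; escape it everywhere else."""
    if scheme == "file" and native_sep == "\\":
        path = path.replace("\\", "/")
    return escape_unwise(path)


print(normalize_path("http", r"/~dj3vande/rfc\compliance.html", native_sep="\\"))
# /~dj3vande/rfc%5Ccompliance.html
print(normalize_path("file", r"/C:\dir\page.html", native_sep="\\"))
# /C:/dir/page.html
```

With this rule an http: URL resolves to the same resource on every platform, and only file: paths on a '\' platform get the separator conversion.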
Can someone with more knowledge / experience in this field please comment on
this?
Changing the summary, changing the OS (the linux build seems to do the right
thing, the win build doesn't), marking confirmed new.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Linux → Windows 95
Summary: RFC 2068 3.2.1 non-compliance → RFC 2616 $2.4.3 non-compliance?
Comment 2•25 years ago
|
||
Putting myself on the CC.
Comment 3•25 years ago
|
||
Adding timeless to the CC per his request.
Comment 4•25 years ago
|
||
Yes, this was done for all those windows users who can't distinguish a \ from a
/. This happens very often. We even have requests to allow
http:\\server\path\file.htm as a valid url that should do the right thing. We
don't do that, a line has to be drawn somewhere.
But the above thing happens very often ... so whatever we do we will break some
pages. Maybe we could make this configurable, sort of quirks-mode. Making this
protocol dependent is also a very good idea. I will look into that.
->andreas.
Assignee: gagan → andreas.otte
Target Milestone: --- → M18
Comment 6•25 years ago
|
||
I think the problem here is that there aren't enough requests which _beg_ you
not to allow \ as path separator because it only encourages people to not
correct their mistakes and will slowly force other software packages to have to
support the \ in their software. Consider this the first request :-) *beg*
Besides, pages which rely on \ being a path separator should break (heck, they
will on Nav4 under linux) and their writer should be gently notified of the
existence of RFCs, standards, and why it's a good thing to adhere to them.
I know computers are supposed to make things easier for humans, and not the
other way around, but the moment humans start making life difficult for other
humans with their silly requests, a line should be drawn.
Comment 7•25 years ago
|
||
There are at least two places where we convert \ to / inside urlpaths in
mozilla. One is inside the urlparser. The other is inside the docshell where we
try to fix a string (for XP_PC) to a valid url when the first try to parse the
url fails.
Also there are some converter-functions which convert from a native path to an
url path and the other way around. Normally access from a file to an url should
go through these converter routines, but I'm not sure this is always true.
I don't like this conversion inside the urlparser either. I will do some tests
without it and see what breaks.
Status: NEW → ASSIGNED
Comment 8•25 years ago
|
||
I'd have to second the beg that we convert \ to %5C, and not /. Allowing \ only
encourages it, and what's worse, pages that use it will break on other OSes.
Adding pp.
Keywords: pp
Comment 9•25 years ago
|
||
Yes, it's a real problem. NC 4.x supports \ and its conversion to /, IE does it
too. It's used in some pages. If we want to reach platform parity quickly it will
go the other way around, we will convert \ to / for UNIX/Linux/Mac too.
In the long run, it makes sense to remove that support and replace it with a
fallback support. First try it without conversion, if that worked all is okay,
if not and the string contains \ convert them to / and try again. But that is a
massive undertaking since we have to change every place where a relative or
absolute URI string is used to create an URI and wrap it into the fallback code.
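
The fallback described above could be sketched roughly as follows. This is an illustration only: `load_with_fallback` and the `fetch` callback are hypothetical names, and a real implementation would have to wrap every place where a URI is created from a string, which is the massive part.

```python
from typing import Callable, Optional


def load_with_fallback(
    url: str, fetch: Callable[[str], Optional[bytes]]
) -> Optional[bytes]:
    """Try the URL exactly as written; retry with '\\' -> '/' only on failure."""
    result = fetch(url)
    if result is not None or "\\" not in url:
        return result
    # Quirks path: assume the author meant '/' and try once more.
    return fetch(url.replace("\\", "/"))


# Stand-in for a server: only the '/' variant of the page exists.
pages = {"http://example.org/rfc/compliance.html": b"slash page"}
print(load_with_fallback("http://example.org/rfc\\compliance.html", pages.get))
# b'slash page'
```

Because the literal URL is always tried first, a server that really has a file with \ in its name still gets the request it expects. The blanket `replace` is crude (it also touches any query string), so treat this as a sketch of the control flow, not of the parsing.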
Comment 10•25 years ago
|
||
Andreas suggests that all platforms convert \ to / now, and that it be phased out
later. If we can't do it now, during the complete rewrite, it will never be
done. More web page authors will start using \. Breaking the RFC will become the
standard that all web browsers have to implement.
Comment 11•25 years ago
|
||
Gagan voted against removing the \-conversion code now although he agrees that
it is the right thing to do, including implementing some sort of "quirks" mode,
maybe like that one I described above.
Comment 12•25 years ago
|
||
Another problem is that while windows users are used to \ as a file separator,
unix users are used to \ being a shell escape. So in file: urls on unix, you
should be able to use "one\ name" to reference a directory or file with a space
in the middle. Currently you have to use %20. Not saying that \ should be
implemented this way, but I'm just giving another example of the problems with
this situation.
Comment 13•25 years ago
|
||
Well, I don't think Unix users ought to be able to do that. If we assume shell
escapes are resolved first, you still have an unescaped space character in the
URL, which is invalid per the RFC. Of course, this is no greater non-compliance
than converting '\' to '/' in the first place.
Comment 14•25 years ago
|
||
back to gagan for reassignment, my time schedule is getting worse, I have to
stop sitting on this bug.
Assignee: andreas.otte → gagan
Status: ASSIGNED → NEW
Comment 16•24 years ago
|
||
Works on Linux... is this still broken in windows?
Comment 18•24 years ago
|
||
The parsercode for conversion of \ to / is still in there (nsURLHelper.cpp).
Comment 19•24 years ago
|
||
Yes, while Netscape 4.78 gets the "Your browser is compliant with RFC 2068."
page in the testcase! Both on WindowsME (Mozilla trunk CVS build on 20010804).
Comment 20•24 years ago
|
||
Compliance with RFC2396 2.4.3 means in this case: Make the urlparser(!)
completely ignorant of \ as a separator for directory structures. Treat it as
what it is: An unwise character which should and will be escaped.
This involves:
- nsURLHelper/CoaleseDirs will no longer convert from \ to / on XP_PC.
- nsURLHelper/CoaleseDirs will ignore \ as a directory separator (all platforms).
- nsNoAuthUrlparser will no longer recognize \ as a possible directory separator
on XP_PC used for drive detection.
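
To illustrate what "ignore \ as a directory separator" means for directory coalescing, here is a simplified sketch. It is not the nsURLHelper code: `coalesce_dirs` is a hypothetical Python function that assumes an absolute path and resolves "." and ".." segments by splitting on "/" only.

```python
def coalesce_dirs(path: str) -> str:
    """Resolve '.' and '..' in an absolute URL path, splitting on '/' only."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        last = i == len(segments) - 1
        if seg == ".":
            if last:
                out.append("")  # keep the trailing slash
            continue
        if seg == "..":
            if len(out) > 1:  # never pop the empty root segment
                out.pop()
            if last:
                out.append("")
            continue
        out.append(seg)
    return "/".join(out)


print(coalesce_dirs("/a/b/../c"))    # /a/c
print(coalesce_dirs(r"/a\..\b/c"))   # /a\..\b/c  (the '\' is just a character)
```

In the second call the segment `a\..\b` is left alone, so the later escaping step would turn it into `a%5C..%5Cb` instead of silently rewriting the path.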
Consequences:
A URL embedded inside an HTML document is always treated as a URL, using \ as
directory separator in it will no longer work! Maybe docshell can do some
urifixup if the first try to load the url fails.
"Doing the right thing" (TM) lies only in the hand of the file system specific
conversion routines from and to file-urls! This means:
- correct conversion/escaping of filesystem specific characters which collide with
url syntax. For example: having filenames with /, : or something similar in it
needs to trigger special escaping of these chars before making it part of a file
url.
- use nsIFile and its conversion routines everywhere and use them correctly.
- Do the right thing with UNC filepaths.
- Do escaping/unescaping in the right order and number.
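
A rough sketch of what such filesystem-specific conversion routines might look like, written in Python for brevity. Both function names are hypothetical; the sketch only covers drive-letter Windows paths and absolute POSIX paths, and deliberately leaves UNC paths out.

```python
from urllib.parse import quote


def windows_path_to_file_url(path: str) -> str:
    """Drive-letter Windows path -> file URL. Here '\\' IS the separator."""
    drive, *rest = path.split("\\")
    # Escape each segment exactly once, after splitting on the native separator.
    return "file:///" + drive + "/" + "/".join(quote(s, safe="") for s in rest)


def posix_path_to_file_url(path: str) -> str:
    """Absolute POSIX path -> file URL. Here '\\' is an ordinary character."""
    return "file://" + "/".join(quote(s, safe="") for s in path.split("/"))


print(windows_path_to_file_url(r"C:\docs\my file.html"))
# file:///C:/docs/my%20file.html
print(posix_path_to_file_url("/home/u/back\\slash.html"))
# file:///home/u/back%5Cslash.html
```

The point of the split is that the same byte means different things on each side: the Windows routine consumes \ as a separator, while the POSIX routine escapes it to %5C, and the url parser itself never has to guess.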
Other stuff:
GetFile and SetFile from nsStdURL should be moved into the file system specific
parts of the implementation. Having it ifdefed inside nsStdURL is really bad style!
Summary: RFC 2616 $2.4.3 non-compliance? → RFC 2396 $2.4.3 non-compliance?
Comment 21•24 years ago
|
||
Comment 22•24 years ago
|
||
*** Bug 34239 has been marked as a duplicate of this bug. ***
Comment 23•24 years ago
|
||
Gagan, Judson? Anyone want to do a review?
Updated•24 years ago
|
Summary: RFC 2396 $2.4.3 non-compliance? → Converting \ to / in urls on windows only (was: RFC 2396 $2.4.3 non-compliance?)
Comment 24•24 years ago
|
||
to try to clarify the desired behavior:
RFC 1738 lists "\" as an unsafe character, then says:
" All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding."
So we should encode the offending backslash, on all platforms.
I had originally thought that it was a "reserve"-able character, which would
make it URL scheme specific, and possibly legal in file URL's. I was wrong.
In regards to how this affects windows users w/ file paths, I think that what we
need is entrypoints (like the command line parser and "Open URL" dialogs) to be
path friendly, and do conversion to file URLs automatically. We might also want
to change the displays of file URL's to a local path in the local formatting,
but I think our internal handling of slashes should be compliant.
Comment 25•24 years ago
|
||
Yes, that's exactly right. We have functions that convert from local filepaths
to urls and the other way around. These functions are aware of the filesystem
specific delimiters and special characters and convert them. If these functions
are used and are correct, all is fine. We end up with a valid url or a valid
filepath.
What will get broken when this goes active are websites which have *urls* in
their documents which use \ not as a normal char but as the path delimiter. This
is clearly wrong and a case for tech evangelism. But you always have to deal
with the argument: IE (windows) can handle this, why can't you ...
Comment 26•24 years ago
|
||
Comment 27•24 years ago
|
||
I suggest we get this in early in the 0.9.5 cycle to get early response ...
Keywords: review
Comment 28•24 years ago
|
||
So what's the final fix? Convert typed in URLs with \ to /? What about href
URLs, will clicking a link on MS vs Mac go to the same place?
Comment 29•24 years ago
|
||
No, with the patch we won't do that conversion on any platform anymore. A \ that
is part of an *url* will be just an unwise char which will get escaped to %5C.
On the other hand if we get a \ as part of a *filepath* on windows/os2 we will
convert it to / when doing a conversion from filepath to url. But that is part
of the local filepath handling (and some uri fixup in docshell) and has nothing
to do anymore with the url parser.
Updated•24 years ago
|
Attachment #46676 -
Attachment is obsolete: true
Updated•23 years ago
|
Attachment #48700 -
Flags: review+
Comment 30•23 years ago
|
||
Comment on attachment 48700 [details] [diff] [review]
new diff to prevent bitrott
sr=darin ... who reviewed this?
Attachment #48700 -
Flags: superreview+
Comment 31•23 years ago
|
||
if you click on the "View Bug Activity" you will see that I did. :-)
Comment 32•23 years ago
|
||
fix checked in. Can be verified by using the test url
http://www.undergrad.math.uwaterloo.ca/~dj3vande/rfc\compliance.html
It will now verify compliance for mozilla windows too. On the other hand we will
probably see some more reports about non-working links which use \ instead of /.
All these bugs can go straight to tech evangelism.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 33•23 years ago
|
||
*** Bug 119457 has been marked as a duplicate of this bug. ***
Comment 34•23 years ago
|
||
*** Bug 150475 has been marked as a duplicate of this bug. ***
Comment 36•21 years ago
|
||
another thread on same topic:
http://forums.mozillazine.org/viewtopic.php?p=629335#629335
Firefox should be tolerant as well, or at least provide a means to enable
tolerancy (i.e. give users a choice between strict mode and tolerant mode).
This strictness can only lead to bad press in comparisons. Even the open source
Apache web server is tolerant (or can be configured to be tolerant) for URL and
spelling mistakes (mod_speling). There's no reason Firefox can't be this way too.
Broken page:
http://www.nywatertaxi.com/about.php
One bad press example due to strictness...
http://computergripes.com/firefox.html
mod_speling:
http://httpd.apache.org/docs-2.0/mod/mod_speling.html
Note: one way to work around Firefox and Mozilla's strictness might be to use a
proxy that can rewrite the URLs, replacing the bad '\' with a good '/'. However
I still think this should be part of a robust browser tolerant of
mistakes...perhaps like how bad javascript is handled (put a warning sign
somewhere that the script isn't quite right, but still make a best-effort at
rendering it correctly.)
Comment 37•21 years ago
|
||
(In reply to comment #36)
> Firefox should be tolerant as well, or at least provide a means to enable
> tolerancy
why should firefox assume that someone meant to type / when he typed \? those
are different characters, and can well be different files on the web server.
Comment 38•21 years ago
|
||
I agree with biesi here. I'm all for tolerance, but if we break an RFC in the
process a line is crossed. This would happen here, we would break all those
pages that have a \ as part of a file- or directory name. There are not very
many, but those pages exist.
Comment 39•20 years ago
|
||
*** Bug 176918 has been marked as a duplicate of this bug. ***