User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021217 Phoenix/0.5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021217 Phoenix/0.5
"//" should be treated like "/" - so if we see "//../" in a URL, this should
be treated the same as "/../" - but mozilla is stripping the ".." out and we
look at the wrong directory.
Should be the same as:
But is instead treated as:
(and yes, this is an actual image, but it shouldn't be accessed by the given url).
file: is also affected.
Steps to Reproduce:
1. Load: http://www.google.com/images//../logo.gif
Went to the wrong location [http://www.google.com/images/logo.gif]
Gone to: http://www.google.com/images/../logo.gif
(And see the appropriate 401 from google)
*** Bug 187844 has been marked as a duplicate of this bug. ***
I've been searching for this but the only information I can find on a double
slash is to designate a network device when it's at the beginning of the URI.
(RFC 1808 & 2396) Also, I believe that an absolute path should be passed to the
According to RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt), // cannot occur
in a URI other than as the scheme delimiter. It definitely can never be treated
as /. If // appears other than as a scheme delimiter, the following rules apply:
path_segments = segment *( "/" segment )
segment = *pchar *( ";" param )
So // could be treated as /[empty char]/ or have a specific meaning (such as
"index"). It cannot be parsed as /.
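
To illustrate the point about the grammar, here is a minimal sketch (Python, with a hypothetical helper name path_segments that is not taken from any Mozilla code) of what splitting the path from this report on "/" yields when empty segments are kept rather than dropped:

```python
def path_segments(path):
    """Split a path on "/" without discarding empty segments."""
    # Drop only the leading "/" that marks the path as absolute.
    if path.startswith("/"):
        path = path[1:]
    return path.split("/")


segments = path_segments("/images//../logo.gif")
print(segments)  # ['images', '', '..', 'logo.gif']
```

The second element is the empty segment that sits between the two slashes; whether it should then be ignored is exactly what is being argued here.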
I don't see why considering // to be /[empty char]/ would preclude it from
acting like /./, as it does on UNIX and most other OSes, as well as in most
browsers and web servers that I know of.
Because the RFC strictly describes the process of URI parsing.
First, keep in mind that
URI that are hierarchical in nature use the slash "/" character for
separating hierarchical components. For some file systems, a "/"
character (used to denote the hierarchical structure of a URI) is the
delimiter used to construct a file name hierarchy, and thus the URI
path will look similar to a file pathname. This does NOT imply that
the resource is a file or that the URI maps to an actual filesystem
Within a relative-path reference, the complete path segments "." and
".." have special meanings: "the current hierarchy level" and "the
level above this hierarchy level", respectively. Although this is
very similar to their use within Unix-based filesystems to indicate
directory levels, these path components are only considered special
when resolving a relative-path reference to its absolute form
Second, // inside a path is invalid:
The path may consist of a sequence of path segments separated by a
single slash "/" character. Within a path segment, the characters
"/", ";", "=", and "?" are reserved. Each path segment may include a
sequence of parameters, indicated by the semicolon ";" character.
The parameters are not significant to the parsing of relative
e) All occurrences of "<segment>/../", where <segment> is a
complete path segment not equal to "..", are removed from the
buffer string. Removal of these path segments is performed
iteratively, removing the leftmost matching pattern on each
iteration, until no matching pattern remains.
So "first/second/..//" should be parsed as "first/", not as "first".
> Second, // inside path is invalid:
> the characters "/", ";", "=", and "?" are reserved...
Right - so this implies that the path segment between the '/' is ''
which is valid according to the RFC:
segment = *pchar *( ";" param )
The question is, what to do with a null segment in general, and
what to do with a null segment followed by "/../" (arguably this
should be consistent).
> e) All occurrences of "<segment>/../", where <segment> is a
> complete path segment not equal to "..", are removed from the
So the question I have here is, what is a "complete" path segment?
(Does a "null segment" qualify?)
Perhaps there should be an additional rule before rule e) that states
that any '//' should be converted to '/' - though I know this isn't currently
in the RFC.
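
To make the three possible readings concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the function names are made up, collapse_slashes is the hypothetical extra rule suggested above (it is not in RFC 2396), and remove_parent_refs is a literal reading of rule e) with a flag for whether a null segment counts as a "complete" segment. None of this is Mozilla's actual code.

```python
import re


def collapse_slashes(buffer):
    """Hypothetical extra rule (not in RFC 2396): turn runs of "/" into one "/"."""
    return re.sub(r"/{2,}", "/", buffer)


def remove_parent_refs(buffer, null_is_segment):
    """Literal reading of rule e): drop the leftmost "<segment>/../" until none is left."""
    parts = buffer.split("/")
    changed = True
    while changed:
        changed = False
        # "<segment>/../" needs a "/" after the "..", so ".." is never the last element.
        for i in range(len(parts) - 2):
            seg = parts[i]
            if parts[i + 1] != ".." or seg == "..":
                continue
            # An empty string at index 0 only marks the leading "/"; it is not a segment.
            if seg == "" and (i == 0 or not null_is_segment):
                continue
            del parts[i:i + 2]
            changed = True
            break  # restart so the leftmost match is always removed first
    return "/".join(parts)


path = "/images//../logo.gif"
print(remove_parent_refs(path, null_is_segment=True))
print(remove_parent_refs(path, null_is_segment=False))
print(remove_parent_refs(collapse_slashes(path), null_is_segment=True))
```

With the null segment counted as complete, the ".." cancels it and the result is /images/logo.gif, which is what this bug reports. With it not counted, rule e) matches nothing and the path is left alone. Only with the extra collapsing rule applied first does the result become /logo.gif.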
I will take a look to see if this can easily be done. The empty segment argument
makes sense to me. However, Windows does not take // kindly when it ends up as
part of an absolute path (and is not a UNC path).
After looking into this further, I think the current behavior (interpreting // as
an empty segment and not as /) is the correct one in conformance with RFC 2396.
Interpreting // as / is a Unix-filesystem-only feature which has no real meaning
for hierarchical URL paths. The same thing goes for ... which has special meaning
on Win98 but nowhere else. Recommend INVALID.
Resolving invalid per the discussion.