Closed Bug 187845 Opened 22 years ago Closed 21 years ago

URL: "//../" should be treated same as "/../"

Categories

(Core :: Networking, defect)

x86
All
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: bugzilla.mozilla.org, Assigned: dougt)

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021217 Phoenix/0.5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021217 Phoenix/0.5

"//" should be treated like "/" - so if we see "//../" in a URL, this should
be treated the same as "/../" - but Mozilla strips the ".." out, so we end up
looking at the wrong directory.

For example:

http://www.google.com/images//../logo.gif

Should be the same as:

http://www.google.com/images/../logo.gif

But is instead treated as:

http://www.google.com/images/logo.gif

(and yes, this is an actual image, but it shouldn't be accessed by the given url).

file: URLs are also affected.


Reproducible: Always

Steps to Reproduce:
1. Load: http://www.google.com/images//../logo.gif
Actual Results:  
Went to the wrong location [http://www.google.com/images/logo.gif]

Expected Results:  
Gone to:  http://www.google.com/images/../logo.gif

(And see the appropriate 401 from google)
*** Bug 187844 has been marked as a duplicate of this bug. ***
I've been searching for this but the only information I can find on a double
slash is to designate a network device when it's at the beginning of the URI.
(RFC 1808 & 2396) Also, I believe that an absolute path should be passed to the
server exactly.

Changing OS=all
OS: Linux → All
According to RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt), // cannot occur
in a URI other than as the scheme delimiter. It definitely could never be treated
as /. If // appears somewhere other than as a scheme delimiter, the following
rules should apply:

    path_segments = segment *( "/" segment )
    segment       = *pchar *( ";" param )

So // could be treated as /[empty char]/ or have specific meanings (such as
"index"). It could not be parsed as /.
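As a rough check of the grammar quoted above, here is a minimal Python sketch. The PCHAR pattern is my own transcription of the RFC 2396 pchar rule, and path_segments is a hypothetical helper, not anything from Mozilla. It splits the path from this bug into segments and confirms that each one, including the empty one, matches the segment rule.

```python
import re

# pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
# (transcribed by hand from RFC 2396; treat as an assumption)
PCHAR = r"(?:[A-Za-z0-9\-_.!~*'():@&=+$,]|%[0-9A-Fa-f]{2})"

# segment = *pchar *( ";" param ), where param = *pchar
SEGMENT = re.compile(PCHAR + "*(?:;" + PCHAR + "*)*")


def path_segments(abs_path):
    """Split an absolute path into the segments after the leading slash."""
    if not abs_path.startswith("/"):
        raise ValueError("expected an absolute path")
    return abs_path[1:].split("/")


segments = path_segments("/images//../logo.gif")
print(segments)
print(all(SEGMENT.fullmatch(s) is not None for s in segments))
```

The second segment comes out as the empty string, and the grammar accepts it because both repetitions in the rule may match zero times.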
I don't see why considering // to be /[empty char]/ would preclude it from
acting like /./, as it does on UNIX and most other operating systems, as well
as in most browsers and web servers that I know of.
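For reference, the Unix-style behavior described here can be seen with Python's posixpath.normpath, which works purely on the string (it does not touch the filesystem, so symlinks are ignored). This is only an illustration of filesystem path conventions, not of URL parsing.

```python
import posixpath

# An empty segment is dropped, just like ".", so ".." climbs out of "images".
print(posixpath.normpath("/images//../logo.gif"))
print(posixpath.normpath("/images/./../logo.gif"))
```

Both calls give the same result, which is the outcome the reporter expected for the URL.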
Because the RFC strictly describes the process of URI parsing.
First, keep in mind that

 URI that are hierarchical in nature use the slash "/" character for
   separating hierarchical components.  For some file systems, a "/"
   character (used to denote the hierarchical structure of a URI) is the
   delimiter used to construct a file name hierarchy, and thus the URI
   path will look similar to a file pathname.  This does NOT imply that
   the resource is a file or that the URI maps to an actual filesystem
   pathname.

and 

   Within a relative-path reference, the complete path segments "." and
   ".." have special meanings: "the current hierarchy level" and "the
   level above this hierarchy level", respectively.  Although this is
   very similar to their use within Unix-based filesystems to indicate
   directory levels, these path components are only considered special
   when resolving a relative-path reference to its absolute form
   (Section 5.2).

Second, // inside path is invalid:

   The path may consist of a sequence of path segments separated by a
   single slash "/" character.  Within a path segment, the characters
   "/", ";", "=", and "?" are reserved.  Each path segment may include a
   sequence of parameters, indicated by the semicolon ";" character.
   The parameters are not significant to the parsing of relative
   references.

Section 5.2 also states:
      e) All occurrences of "<segment>/../", where <segment> is a
         complete path segment not equal to "..", are removed from the
         buffer string.  Removal of these path segments is performed
         iteratively, removing the leftmost matching pattern on each
         iteration, until no matching pattern remains.

So "first/second/..//" should be parsed as "first/", not as "first".
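To make rule e) concrete, here is a minimal Python sketch of it. It assumes that an empty segment counts as a "complete path segment" and that a leading slash marks the root rather than an extra segment; remove_parent_refs is a hypothetical name, and the other steps of Section 5.2 (such as handling a trailing "<segment>/..") are left out.

```python
def remove_parent_refs(path):
    """Iteratively remove the leftmost "<segment>/../" where <segment> != ".."."""
    absolute = path.startswith("/")
    segments = path[1:].split("/") if absolute else path.split("/")
    changed = True
    while changed:
        changed = False
        # The ".." must be followed by "/", so it cannot be the last element.
        for i in range(len(segments) - 2):
            if segments[i] != ".." and segments[i + 1] == "..":
                del segments[i:i + 2]  # drop "<segment>" and ".."
                changed = True
                break  # restart so the leftmost match is always taken
    return ("/" if absolute else "") + "/".join(segments)


print(remove_parent_refs("/images/../logo.gif"))
print(remove_parent_refs("/images//../logo.gif"))
```

Under these assumptions the second call removes the empty segment together with the "..", leaving /images/logo.gif, which is the location the reporter saw Mozilla load.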
> Second, // inside path is invalid:  

I disagree:

> the characters "/", ";", "=", and "?" are reserved...

Right - so this implies that the path segment between the '/' is ''
which is valid according to the RFC:

    segment       = *pchar *( ";" param ) 

The question is, what to do with a null segment in general, and
what to do with a null segment followed by "/../" (arguably this
should be consistent).

> e) All occurrences of "<segment>/../", where <segment> is a 
>    complete path segment not equal to "..", are removed from the

So the question I have here is, what is a "complete" path segment?
(Does a "null segment" qualify?)

Perhaps there should be an additional rule before rule e) that states
that any '//' should be converted to '/' - though I know this isn't currently
in the RFC.
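A minimal sketch of what such an extra rule could look like, in Python. This is purely hypothetical, is not part of RFC 2396, and must only be applied to the path component (running it over a whole URL would also damage the "//" after the scheme).

```python
import re


def collapse_slashes(path):
    """Replace every run of two or more slashes in a path with a single slash."""
    return re.sub(r"/{2,}", "/", path)


print(collapse_slashes("/images//../logo.gif"))
```

With this step applied first, the path becomes /images/../logo.gif before rule e) ever sees it.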
I will take a look to see whether this can easily be done. The empty segment
argument makes sense to me. However, Windows does not take // kindly when it
ends up as part of an absolute path (and is not a UNC path).
After looking into this further, I think the current behavior (interpreting // as
an empty segment and not as /) is the correct one, in conformance with RFC 2396.
Interpreting // as / is a Unix-filesystem-only feature which has no real meaning
for hierarchical URL paths. The same thing goes for "...", which has special
meaning on Win98 but nowhere else. Recommend INVALID.
Resolving invalid per the discussion.
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → INVALID