Closed Bug 207298 Opened 21 years ago Closed 21 years ago

FTP directory problems w/ URL parsing when URL root is not filesystem root

Categories

(Core Graveyard :: Networking: FTP, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: benc, Assigned: dougt)

References

Details

(Whiteboard: checkwin, checklinux)

Attachments

(1 file, 4 obsolete files)

I've noticed a couple minor problems with how FTP URLs are generated and work. I
think some of the causes of some problems are related, so I'm going to describe
the various problems here, and we can break out the items as specific bugs as we go.

I think most problems are related to the not-so-standard usage of URL syntax for
FTP URLs. FTP URLs use "ftp://server/" as the URL top level, but the FTP server
can map that to any point in the filesystem, as well as chroot the server process.

In most anonoumous systems, "ftp://server/" points to the top level of a
anonymous ftp folder, which is chrooted for security. Our implementation is
really geared for this environment, and pretty much works w/o problem.

In multi-user environments, logging in as a specific user usually causes the ftp
server to map "ftp://user@server/" to the home directory of the server, but
leaves the filesystem root as "/", since the user has access to the entire
filesystem.

If you want to get to the root of the filesystem "/", you use "ftp://user@server//"

The directory->html stream converter seems to be build on an assumption that the
URL "/" and filesystem "/" would be the same, and that assupmption creates
several problems:

1- ftp://user@server/ does not contain a "Up to higher level directory" link.
2- The directory for the filesystem's top displays a "Up to higher level
directory". Ironically, ftp://user@server//.. is parsed as "ftp://user@server/",
the user home directory.
3- Any subdirectory above the home direcctory (ftp://user@server/../) will cause
 URLs to be constructed that do not work, because of bug 206990). For example:
If the home directory is /users/user, then ftp://user@server/../ will display
"/user/", but the directory "/users/user" will have a link of
ftp://user@server/../user, which is parsed as "ftp://user@server/user",
resulting in a "not found" error.

The way this happened is that we were concerned about fixing ftp URLs that went
to files w/ an absolute path on systems that have home directories. After making
that work, nobody had noticed that since the html/directories are based on a
stream converter, the filename -> link mapping needed to be re-purposed to
handle the assumptions in the world.
While working on bug 206990, I realized that everything I described is about the
html/dir behavior, and that this probably depends on bug 206990.
Depends on: 206990
Summary: FTP problems w/ URL parsing when URL root is not filesystem root → FTP directory problems w/ URL parsing when URL root is not filesystem root
We have a couple of bugs on this. I think we wontfixed them due to conflicting
semantics.

At one stage we kept going back and forward between 'correct' behaviour. The RFC
wasn't any help, and conflicted, and.....
Whiteboard: DUPEME?
There are some people, like squid, who think ftp://foo.example// should 
be written as ftp://foo.example/%2f . Squid also generates up links for 
any ftp url. I guess it's relying on the server to know it's own 
directory structure. I don't see why mozilla shouldn't do the same. 
There really is no way for a client to know how the server's configured 
so why bother.
Point 2 and 3 are indeed related to bug 206990, this can be fixed when we move
to a better tinderbox-system which does not relay on the behaviour that too much
.. are dropped and not retained in the url. So far we are blocked by our own
systems :(

Point 1 may need a special html converter for ftp that always displays a "Up to
higher level" directory unless the ftp url looks like ftp://host// or
ftp://host/%2F. When accessing through an user-home-directory we do not know the
servers filesystem structur I think, but I may be wrong here. Have to check.

thenthumbs: Mozilla should work with ftp://foo.example// as well as
ftp://foo.example/%2f. The later one is correct by the RFC, the first one is for
conveniance, here we violate the RFC, a little compromise.
Looking at the results of an anonymous ftp session vs. a user-login session I
noticed:

(using ftp -d)

(anonymous)
---> PWD
257 "/" is current directory.
got remotepwd as `/'

(user=composer)
---> PWD
257 "/users/composer" is current directory.

So, thankfully, I think we can figure out where we are dumped into the
filesystem. If we couldn't we'd be pretty doomed.
> Mozilla should work with ftp://foo.example// 

I'm not a standards purist so I don't mind if mozilla cheats but I'll 
bet someone will put a file with a "/" in its name on a non-unix server. 
Will that work?


> So, thankfully, I think we can figure out where we are dumped into the
> filesystem. If we couldn't we'd be pretty doomed.

There are all sorts of odd ftp servers out there. Have you checked that 
most of them respond the way you want?
The RFC regards // as / in ftp URLs, so technically ftp://user@host//path should
also point to the users home directory. But most time it is falsely used to
point to the root of the ftp server, mozilla goes here with the croud ...  

A / as part of a filename needs to be escaped (or double escaped to be sure?) or
otherwise converted as part of the server-specific-conversion from filesystem to
ftp-url. If there is another odd ftp server out there we do not already handle
and it would use a / inside a filename, we would have problems, but also would
others, for sure.

 
ben, at that point in the source (in nsIndexedToHTML.cpp) we have no PWD at our
disposal. Also it is not guaranteed for all ftp servers to understand PWD. No
help here ...

Instead I would propose the following fix:

1. fix bug 122133 when we move to tinderbox 3, you can test the patch at that
bug now, but don't expect tinderbox to work correctly afterwards.
2. change nsIndexedToHTML.cpp for ftp urls so that it uses the normal resolve
mechanism with some special casing for // and /%2F.

This will have the following result:

- to much .. will be retained in ftp urls (and all other urls)
- the index for ftp-urls will always show a "Up to higher level directory"
unless the urlpath  is // or /%2F
- with user-home-ftp urls "Up to higher level directory" will always be shown
even if there is no more higher level, but the ftp-index does not know that.
Fortunally going up will not work when the highest level is reached.
Making also dependend on bug 122133
Depends on: 122133
Attached patch patch to fix the problem (obsolete) — Splinter Review
The attached patch fixes the problem I think. I integrated the special ftp urls
into the normal resolver for relative urls and used that in
nsIndexedToHTML.cpp. However this patch includes the one for bug 122133 (which
breaks tinderbox until tinberbox gets fixed), it also may need some better
string fu, but it is a start.
Attachment #125217 - Flags: review?(dougt)
Attachment #125217 - Flags: superreview?(darin)
It just occured to me that now that we have the url-scheme inside CoalesceDir we
can limit the new "too much .."-handling to ftp for now, thus avoiding the
tinderbox problem.
Attachment #125217 - Flags: superreview?(darin)
Attachment #125217 - Flags: review?(dougt)
Attached patch patch v2.0 (obsolete) — Splinter Review
This updated version of the patch limits the new ..-handling to ftp urls. That
limitation can be easily removed when tinderbox gets fixed or replaced.
Attachment #125217 - Attachment is obsolete: true
Attachment #126670 - Flags: superreview?(darin)
Attachment #126670 - Flags: review?(dougt)
Comment on attachment 126670 [details] [diff] [review]
patch v2.0

>Index: mozilla/netwerk/base/src/nsURLHelper.h

>+/* handle .. in dirs while resolving URLs */
>+void net_CoalesceDirs(const char* i_Scheme, char* io_Path);

i don't like the idea of this function taking a scheme parameter.
i think it would make more sense to have a flags parameter instead
and please add ample documentation in the header file to explain
the meaning of the various flags.


>Index: mozilla/netwerk/base/src/nsURLHelper.cpp

>+    /* Remember if this url is a special ftp one */
>+    if (strcasecmp(i_Scheme,"ftp") == 0) {
>+        if (strncasecmp(io_Path,"/%2F",4) == 0) {
>+            special_ftp = 4;
>+        } else if (strncasecmp(io_Path,"//",2) == 0 ) {
>+            special_ftp = 2;
>+        } 

strcasecmp and strncasecmp are not portable.  e.g., WIN32 builds will fail.

>+                // special handling for ftp urls
>+                if (special_ftp > 3 && urlPtr == io_Path) {
>+                    ++urlPtr;
>+                    ++urlPtr;
>+                    ++urlPtr;
>+                }

what's this all about?	please add comments in the code.


>+                // special handling for special ftp urls
>+                if (special_ftp > 3 && urlPtr == io_Path+special_ftp-1) {
>+                    ++urlPtr;
>+                } else {
>+                    *urlPtr++ = *fwdPtr;
>+                }

that comment doesn't help me understand what this code is doing.
i can clearly see that somethign special is being done for FTP, but
what is it that is being done? ;-)


>Index: mozilla/netwerk/base/src/nsStandardURL.cpp

>@@ -503,10 +503,15 @@
>     char *buf = (char *) nsMemory::Alloc(approxLen + 32);
>     if (!buf)
>         return NS_ERROR_OUT_OF_MEMORY;
>+    char *scheme = (char *) nsMemory::Alloc(mScheme.mLen + 1);
>+    if (!scheme)
>+        return NS_ERROR_OUT_OF_MEMORY;
>     PRUint32 i = 0;
>+    PRUint32 j = 0;
> 
>     if (mScheme.mLen > 0) {
>         i = AppendSegmentToBuf(buf, i, spec, mScheme);
>+        j = AppendSegmentToBuf(scheme, j, spec, mScheme);

why do you need to copy the scheme into a heap allocated buffer like this?
i think if you simply did your check for FTP here that you could avoid this
string copy.  also, are you sure we don't want to expose this "flag" via 
nsIStandardURL?  there is the urlType parameter to nsIStandardURL::Init,
which could be extended to be a more generic flags parameter.
Attachment #126670 - Flags: superreview?(darin) → superreview-
Attached patch patch v3.0 (obsolete) — Splinter Review
next version, incorporating darin's comments
Attachment #126670 - Attachment is obsolete: true
Attachment #126670 - Flags: review?(dougt)
Attachment #127105 - Flags: superreview?(darin)
Attachment #127105 - Flags: review?(dougt)
Comment on attachment 127105 [details] [diff] [review]
patch v3.0

looks fine.
Attachment #127105 - Flags: review?(dougt) → review+
Comment on attachment 127105 [details] [diff] [review]
patch v3.0

>Index: mozilla/netwerk/base/src/nsURLHelper.h

>+const PRUint32 URI_STD = 0;
>+
>+/* URI_IS_FTP needs to be set when you want special handling of
>+   FTP-URLs when doing CoalesceDirs: 
>+
>+   - retains .. that reach above dir root
>+   - recognizes /%2F and // as markers for the ftp servers root directory
>+     and handles them properly
>+*/
>+
>+const PRUint32 URI_IS_FTP = (1<<0);

nit: use an enum instead.  something like this perhaps:

enum netCoalesceFlags {
  NET_COALESCE_NORMAL  = 0,
  NET_COALESCE_FOR_FTP = 1
};

void net_CoalesceDirs(netCoalesceFlags flags, char *path);

however, as i said before, i think it is wrong to mention FTP at this level.
instead you should really define two separate flags, one for the reaching
above dir root and another for recognizing double slash markers as the root
dir.

enum netCoalesceFlags
{
  NET_COALESCE_NORMAL = 0,

  /**
   * retains /../ that reach above dir root (useful for FTP
   * servers in which the root of the FTP URL is not necessarily
   * the root of the FTP filesystem).
   */
  NET_COALESCE_ALLOW_RELATIVE_ROOT = 1<<0,

  /**
   * recognizes /%2F and // as markers for the root directory
   * and handles them properly.
   */
  NET_COALESCE_DOUBLE_SLASH_IS_ROOT = 1<<1
};

the point of separating these out is that i can easily imagine a protocol
having behavior similar to FTP.
Attachment #127105 - Flags: superreview?(darin) → superreview-
darin, I agree, its better so separate the stuff, why haven't I thought about
this :( I will work on this when I return from a business trip in two days.
thanks andreas!
Attached patch patch v4.0 (obsolete) — Splinter Review
new version, again using darins ideas
Attachment #127105 - Attachment is obsolete: true
Comment on attachment 127384 [details] [diff] [review]
patch v4.0

doug, want to carry over your review?
Attachment #127384 - Flags: superreview?(darin)
Attachment #127384 - Flags: review?(dougt)
Comment on attachment 127384 [details] [diff] [review]
patch v4.0

>Index: mozilla/netwerk/base/src/nsURLHelper.h

>+void net_CoalesceDirs(PRUint32 i_CoalesceFlags, char* io_Path);

i'm really not a big fan of this variable naming convention... and the
rest of the file doesn't follow this convention, so please keep things
consistent.

  void net_CoalesceDirs(netCoalesceFlags flags, char *path);

using |netCoalesceFlags| as the variable type is a good idea.


>Index: mozilla/netwerk/base/src/nsURLHelper.cpp

>+                } else {
>+                    *urlPtr++ = *fwdPtr;
>+                }
...
>         }
>         else
>         {

nit: bracket style is all over the place in this function.  maybe you
could make it consistent ;-)


>Index: mozilla/netwerk/base/src/nsStandardURL.h

>+    void     CoalescePath(PRUint32 CoalesceFlag, char *path);

  s/CoalesceFlag/coalesceFlag/


>Index: mozilla/netwerk/base/src/nsStandardURL.cpp

>+    if (mDirectory.mLen > 1) {
>+        PRUint32 CoalesceFlag = NET_COALESCE_NORMAL;

  s/CoalesceFlag/coalesceFlag/


>+        if ((mScheme.mLen == 3) && 
>+           !nsCRT::strncasecmp(buf + mScheme.mPos, "ftp", 3)) {

at this point, |buf| is the normalized URL string, so you can just
do this instead:

  strncmp(buf, "ftp:", 4)


>+    PRUint32 CoalesceFlag = NET_COALESCE_NORMAL;

  s/CoalesceFlag/coalesceFlag/


>+        // add some flags to CoalesceFlag if it is an ftp-url
>+        // need this later on when coalescing the resulting URL
>+        if ((scheme.mLen == 3) && 
>+            !nsCRT::strncasecmp(relpath + scheme.mPos, "ftp", 3)) {

hmm... maybe it would be good to create a helper function for this.
something like SegmentIs, but one that operates on a |const char*|
buffer instead of mSpec.


>+                nsCAutoString filename(Filename()); 
>+                if (filename.EqualsIgnoreCase("%2F")) {

avoid buffer copy...

   if (Filename().Equals(NS_LITERAL_CSTRING("%2F"),
			 nsCaseInsensitiveCStringComparator())) {
Attachment #127384 - Flags: superreview?(darin) → superreview-
Attached patch patch v5.0Splinter Review
darin, I hope this finally finds your blessing. Sorry to bother you this much
with all this little stuff. For the record: I did not like the usage of
netCoalesceFlags as variable type that much because I had to use some casts to
make it work. I changed the variable naming and made the bracket style
consistent. I also created another version of SegmentIs as helper function.
Attachment #127384 - Attachment is obsolete: true
Attachment #127384 - Flags: review?(dougt)
Attachment #127476 - Flags: superreview?(darin)
Attachment #127476 - Flags: review?(dougt)
Comment on attachment 127476 [details] [diff] [review]
patch v5.0

sr=darin :)
Attachment #127476 - Flags: superreview?(darin) → superreview+
Comment on attachment 127476 [details] [diff] [review]
patch v5.0

What kind of testing have you done?  What should we instruct QA to do to verify
these changes?

I am going to run these for a few days before marking reviewed.  I am doing
this cause I did have some local changes in my tree (yours, mine, and another
patch) and i saw some regressions.
So far I have seen no regressions, testing with various ftp sites. What problems
do you see?

What can be seen:

When you have a ftp url that starts with // or /%2F and you are at that level
the "Up one level" is missing from the directory listing.

When you have a ftp url that is relative to your login point (typical ftp-url),
you now have a "Up one level" entry in your login directory and you can move up
beyond your login point, in fact you have an "Up one level" every time, even if
you reach the root of the servers filesystem, it just does not do anything if
you want to go beyond that point.

All the three points in the initial report should be fixed.
Attachment #127476 - Flags: review?(dougt) → review+
fix checked in

Checking in mozilla/netwerk/base/src/nsURLHelper.h;
/cvsroot/mozilla/netwerk/base/src/nsURLHelper.h,v  <--  nsURLHelper.h
new revision: 1.25; previous revision: 1.24
done
Checking in mozilla/netwerk/base/src/nsURLHelper.cpp;
/cvsroot/mozilla/netwerk/base/src/nsURLHelper.cpp,v  <--  nsURLHelper.cpp
new revision: 1.52; previous revision: 1.51
done
Checking in mozilla/netwerk/base/src/nsStandardURL.h;
/cvsroot/mozilla/netwerk/base/src/nsStandardURL.h,v  <--  nsStandardURL.h
new revision: 1.15; previous revision: 1.14
done
Checking in mozilla/netwerk/base/src/nsStandardURL.cpp;
/cvsroot/mozilla/netwerk/base/src/nsStandardURL.cpp,v  <--  nsStandardURL.cpp
new revision: 1.55; previous revision: 1.54
done
Checking in mozilla/netwerk/streamconv/converters/nsIndexedToHTML.cpp;
/cvsroot/mozilla/netwerk/streamconv/converters/nsIndexedToHTML.cpp,v  <-- 
nsIndexedToHTML.cpp
new revision: 1.47; previous revision: 1.46
done
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
I'm very behind on some of these issues, so could someone help me understand if
bug 229550 is related to this fix?
VERIFIED:fixed
Mac OS X 1.6b 20040111.
Status: RESOLVED → VERIFIED
Whiteboard: checkwin, checklinux
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: