Closed
Bug 84242
Opened 23 years ago
Closed 23 years ago
FTP URL parsing broken
Categories
(Core Graveyard :: Networking: FTP, defect)
Core Graveyard
Networking: FTP
Tracking
(Not tracked)
RESOLVED
FIXED
Future
People
(Reporter: thanny, Assigned: andreas.otte)
References
Details
(Keywords: testcase)
Attachments
(1 file, 5 obsolete files)
7.30 KB,
patch
|
bbaetz
:
review+
darin.moz
:
superreview+
|
Details | Diff | Splinter Review |
Here are two URL's that Mozilla 0.9 fails to properly parse: ftp://server/%2fF%2finet%2fapache%2fwebpages%2frepository/beach_bit_foggy.JPG ftp://thanny:[password]@server/%2fh%2fdownload/hubble5_hst.jpg Here's what Mozilla should have ended up with, according to RFC1738 and RFC2396: RETR /F/inet/apache/webpages/repository/beach_bit_foggy.JPG RETR /h/download/hubble5_hst.jpg Here's what Mozilla actually did: RETR //F/inet/apache/webpages/repository/beach_bit_foggy.JPG RETR //h/download/hubble5_hst.jpg The key here is that the slash separating the host information from the path information is *not* supposed to be part of the path in FTP URLs. Mozilla ignores this rule, as does all previous versions of Netscape that I have at my disposal. I suspect this bug has gone unnoticed because most Unix servers don't blink at an extra directory slash at the front. It does, however, break other FTP servers, which quite properly report an error at the invalid path information provided. Nixing the initial encoded slash makes Mozilla work, but at the expense of providing an incorrect URL that an RFC-compliant client will not be able to parse properly. Imagine, if you will, a client which follows the FTP scheme defined in RFC1738 (or is there a superceding document? - I can't find one). Such a client would do a series of CWD's for each slash-delimited path component. That is, *before* %-encoded slashes are decoded. Here's what a strictly RFC-compliant FTP client would do given the following URLs: ----- ftp://server/%2fF%2finet%2fapache%2fwebpages%2frepository/beach_bit_foggy.JPG ... CWD /F/inet/apache/webpages/repository ... RETR beach_bit_foggy.JPG ----- ----- ftp://server/%2fF%2finet/apache/webpages/repository/beach_bit_foggy.JPG ... CWD /F/inet CWD apache CWD webpages CWD repository ... RETR beach_bit_foggy.JPG ----- If one "fixes" the URL to be compatible with Mozilla, by removing the (encoded) leading directory slash, the RFC-compliant client would (for the second URL immediately above) do the following: ... CWD F/inet CWD apache CWD webpages CWD repository ... RETR beach_bit_foggy.JPG Given a login directory other than the immediate parent of the tree "F/inet", the result will be failure. This isn't an encoding-triggered problem, either. Consider this: ftp://server/repository/beach_bit_foggy.JPG Assume that the login-directory is the immediate parent of "repository". The RFC-compliant client will do this: ... CWD repository ... RETR beach_bit_foggy.JPG Mozilla does this: ... RETR /repository/beach_bit_foggy.JPG ... This happens to work with most FTP servers (ones on Unix, and those emulating Unix FTP servers). But any server would be behaving perfectly properly by refusing the above request on the grounds that there is not such directory as "/repository". Strictly speaking, since path delimiters are not part of the FTP spec, Mozilla should not be attempting retrievals in this fashion at all. It should instead be performing CWD's on each slash-delimited path component (which may or may not path delimiters, encoded if necessary). An example of where Mozilla fails is with a non-Unix-emulating OS/2 FTP server. This is a valid URL, which allows an RFC-compliant FTP client to retrieve the file: ftp://thanny:[password]@localhost:2222/h%3a%5cdownload/hubble5_hst.jpg Mozilla ends up sending this: RETR /h:\download/hubble5_hst.jpg The OS/2 FTP server quite properly chokes on this completely invalid pathname. The RFC-compliant client, however, got to the file in this manner: ... CWD h:\download ... RETR hubble5_hst.jpg This what a client is supposed to do, according to the only remotely standardized information available on FTP URL schemes. It would be silly for Mozilla to remain broken. Obviously, the program should attempt a transfer based on the incorrect URL parsing, since countless HTML authors put bodged FTP URLs into their documents, if correct parsing fails to obtain a favorable result (there's definitely room for the problem of getting the incorrect file here, but it's unlikely).
Comment 1•23 years ago
|
||
Can you please give real URLs for these rather than made up URLs?
Reporter | ||
Comment 2•23 years ago
|
||
Those *are* real URLs. They point to files on my internal LAN. I used an IP trace to figure out exactly what Mozilla was doing (which revealed the odd fact that it's chopping network packets up into absurdly small pieces, incidentally). If you're explicitly looking for a URL that points to a server on the Internet which won't be able to handle Mozilla's improper request, then I can't help you. I don't have a list of FTP servers which would let me find troublesome ones. But if you've got an OS/2 machine, just run the built-in FTP server (properly configured, of course), and create a URL for any file on that machine. The complete inability of Mozilla to retrieve a file from the built-in OS/2 FTP server that does not reside on the login drive will become apparent. If you're simply looking for a URL which will show Mozilla improperly parsing, virtually any will do - just put "%2F" after the host-path delimiter for a Unix server. That won't prevent a successful transfer, but it will show how Mozilla sends an improper request. But I've already explained in copious detail what Mozilla is doing, and in equally copious detail what it should be doing.
-> New. I'm buying this b/c we need to get on the fasttrack if it's going to be fixed for rtm.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•23 years ago
|
Target Milestone: --- → mozilla0.9.3
Updated•23 years ago
|
Target Milestone: mozilla0.9.3 → mozilla1.0
Comment 4•23 years ago
|
||
what is milestone "mozilla1.0" anyway? Moving to future.
Target Milestone: mozilla1.0 → Future
Comment 5•23 years ago
|
||
I'm taking this out of the OS/2 queue since nothing about it is OS/2 related
OS: OS/2 → All
Hardware: PC → All
Comment 6•23 years ago
|
||
This is invalid, I think RFC1738 is updated by RFC2396, which states in its changes section (G.2): " RFC 1738 specified that the path was separated from the authority portion of a URI by a slash. RFC 1808 followed suit, but with a fudge of carrying around the separator as a "prefix" in order to describe the parsing algorithm. RFC 1630 never had this problem, since it considered the slash to be part of the path. In writing this specification, it was found to be impossible to accurately describe and retain the difference between the two URI <foo:/bar> and <foo:bar> without either considering the slash to be part of the path (as corresponds to actual practice) or creating a separate component just to hold that slash. We chose the former. " Thus the / is part of the path. (Also, the ; syntax has been obsoleted) Note the difference between ftp://server/, which goes to the root dir of the server, and ftp://server, which goes to the login directory. (There is an unconfirmed bug that this is currently broken in mozilla, though) Regarding sending a CWD separately, the problem with that is that, as RFC1738 section 3.2.5 says, is that the control connection would have to be recreated each time the user changes directories. That would suck. Does any client do that? dougt?
Reporter | ||
Comment 7•23 years ago
|
||
>This is invalid, I think RFC1738 is updated by RFC2396, which states in its >changes section (G.2): > >" RFC 1738 specified that the path was separated from the authority > portion of a URI by a slash. RFC 1808 followed suit, but with a > fudge of carrying around the separator as a "prefix" in order to > describe the parsing algorithm. RFC 1630 never had this problem, > since it considered the slash to be part of the path. In writing > this specification, it was found to be impossible to accurately > describe and retain the difference between the two URI > <foo:/bar> and <foo:bar> > without either considering the slash to be part of the path (as > corresponds to actual practice) or creating a separate component just > to hold that slash. We chose the former. >" > >Thus the / is part of the path. > >(Also, the ; syntax has been obsoleted) I just read through RFC2396 briefly, and it's quite a mess, at least with regards to the path component. There's no way anyone could possibly write a parser based on that document. It doesn't once say how the host component is to be separated from the path component, so one has to return to RFC1738. Outside of the above, it does not address how the host-path delimiter is to be treated. Finally, the above is, quite frankly, gibberish. The tokens <foo:/bar> and <foo:bar> are meaningless. Unless... >Note the difference between ftp://server/, which goes to the root dir of the >server, and ftp://server, which goes to the login directory. (There is an >unconfirmed bug that this is currently broken in mozilla, though) If those tokens are supposed to represent your example here, then their comments become the part that's gibberish. It is entirely possible to determine the difference between the two URLs above, if one simply follows the description laid out by RFC1738. The simple fact is, that by any sensible standard, "ftp://server" and "ftp://server/" do in fact point to the exact same resource. The alternative (with the latter indicating the root directory) makes URLs completely functionless for retrieving files relative to the login directory. For any server which regularly changes the underlying file system structure of the login directories, a sensibly parsed (i.e. according to RFC1738) URL would make the changes transparent (provided, of course, they were relative to the login directory). The approach that includes the delimiter in the path would make URLs invalid whenever a change was made. It would makes it impossible to retrieve resources located on machines which do not use the forward slash character as a path delimiter. In other words, what RFC2396 suggest is simply daft. The fact is that doing it the RFC1738 way will work on just about every server already in operation. At the very least, after attempting the full-path-including-delimiter approach, and failing, Mozilla should parse the URL sensibly. To do otherwise is to go the road of "it works on most systems", when a relatively simple change would make it work on all. >Regarding sending a CWD separately, the problem with that is that, as RFC1738 >section 3.2.5 says, is that the control connection would have to be recreated >each time the user changes directories. That would suck. Does any client do >that? No. That section addresses a change of URL where the server and login information remain the same, not a change of directory. Because all path information (when treated sensibly) is relative to the login directory, a completely new URL will not necessarily work from the current directory. The somewhat simple (and perfectly reliable) solution to that problem is to perform a PWD upon login, and after each directory change. Store and tie the results to the displayed page, and change back to the login directory before processing any directory changes from a new URL. Note that this wouldn't apply at all to downloading files in the current working directory. Mozilla knows that these files are in the current directory, and treating their retrieval as the processing of a new URL is nonsensical.
Comment 9•23 years ago
|
||
Yes, RFC2396 has lots of problems. You cannot use RFC1738 to work out what to do, since RFC2396 explicitly says that its changing the behaviour. "Store and tie the results to the displayed page". The ftp viewer doesn't have state associated with it, and really can't. Does any web browser do what you expect? cc andreas for comment
Reporter | ||
Comment 10•23 years ago
|
||
>Yes, RFC2396 has lots of problems. >You cannot use RFC1738 to work out what to do, since RFC2396 explicitly >says that its changing the behaviour. It says that it's updating RFC1738, but it does not in fact define any changed behavior for FTP URL's. A few indecipherable sentences hardly qualify as a standing change. >"Store and tie the results to the displayed page". The ftp viewer doesn't >have state associated with it, and really can't. It's really not necessary, either. Any clickable links are going to be created by Mozilla, so the possibility of clicking on a new URL for that session which has the same host and user information, but not an absolute path, won't really be an issue. Care just needs to be taken that any non-absolute paths are handled by a new control connection, and that all Mozilla-generated URL's in the FTP viewer are absolute. I'm sure there will be snags here and there, but overall, it's workable. >Does any web browser do what you expect? The only ones I've tested are Mozilla (in NS 2.x, 4.x, and the betas) and IBM WebExplorer. I try to avoid IE whenever possible. WebEx doesn't do CWD's with each path component, but it does properly exclude the host/path separator slash. This makes it possible to use relative Unix paths, but since it doesn't do a CWD for each path component, it's still not ideal.
Comment 11•23 years ago
|
||
> It says that it's updating RFC1738, but it does not in fact define any > changed behavior for FTP URL's. It defines changed behaviour for all urls, and ftp is included in that set. > A few indecipherable sentences hardly > qualify as a standing change. Unfortunately it does in this case. Yes, the URL RFCs are ambiguous in lots of places - see the bugs + discussions on what <a href="?foo"> should do. (summary - RFC2396 has an error in some bits) dougt? andreas? comments? Ignoring the leading / will allow us to support the home directory stuff a little easier. However, it will break almost every ftp link in existance. I'm really against this unless you can point me to an ftp server + link to it where we fail but other web browsers succeed.
Assignee | ||
Comment 12•23 years ago
|
||
This has nothing to do with the core url parsing, this is ftp specific. If I remember correctly there is a whole set of *hardcoded* identifiers somewhere inside necko that deals with different ftp servers and their different behaviour when it comes to handle directorys. Judson might know something about it.
Comment 13•23 years ago
|
||
andreas: this is to do with a link in an <a> tag, not the direcotry parsing stuff.
Assignee | ||
Comment 14•23 years ago
|
||
Hmmm ... I can't find it at the moment but I have the distinct memory about some code that dealt with creating the right commands from ftp urls based on the ftp server that was used. Maybe this code needs to be adapted. One other thing I noticed: The url ftp://thanny:[password]@localhost:2222/h%3a%5cdownload/hubble5_hst.jpg maybe valid (because of it's encoding) but don't expect it to work because interpreting %5c as a path identifier (seems to be intended) is clearly wrong. The url should look like: ftp://thanny:[password]@localhost:2222/h|/download/hubble5_hst.jpg or ftp://thanny:[password]@localhost:2222/h%3a/download/hubble5_hst.jpg.
Comment 15•23 years ago
|
||
I'll confess that I haven't read RFC 2396 carefully, but I don't see any ftp: URL specific BNF that postdates RFC 1738 and RFC 1630. In that light, section "3.2.2. FTP url-path" of RFC 1738 becomes very relevant, especially the comments here: Within a name or CWD component, the characters "/" and ";" are reserved and must be encoded. The components are decoded prior to their use in the FTP protocol. In particular, if the appropriate FTP sequence to access a particular file requires supplying a string containing a "/" as an argument to a CWD or RETR command, it is necessary to encode each "/". For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is interpreted by FTP-ing to "host.dom", logging in as "myname" (prompting for a password if it is asked for), and then executing "CWD /etc" and then "RETR motd". This has a different meaning from <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then "RETR motd"; the initial "CWD" might be executed relative to the default directory for "myname". On the other hand, <URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null argument, then "CWD etc", and then "RETR motd". Additionally: "3.2.4 Hierarchy For some file systems, the "/" used to denote the hierarchical structure of the URL corresponds to the delimiter used to construct a file name hierarchy, and thus, the filename will look similar to the URL path. This does NOT mean that the URL is a Unix filename. " thanny: thanks for writing this bug. I was just about to try to create a similar test case, but I don't think my examples would have been as illustrative as yours. Two general points about URL's I've learned recently: 1- For accesssing hierarcies, they messed up by not describing "/" path as "/<VOID>" (which is what it is in a lot of cases), and then explaining where it goes in each URL scheme. For example, in HTTP, this "3rd slash is not part of the path" principle is a not true (you ask for "/" when you mean "/". So right out the door, general theory did not match specific practice. In FTP, it is implicitly easy to figure out where "/", b/c the FTP server presents a filespace from a mount point on local disk (not always a real "/" either...) In file, you can kind of deduce that it should be the top of the filesystem, so it's "/" in UNIX (otherwise file:/// != "/"), but that wasn't so bright because I don't think every filesystem allows every file-accessing user transparent visibility to the top of the file system. (In fact, whoever made "file:/// -> "C:\" in Windows has argued otherwise via the current Mozilla Win32 file implementation). 2- URL's have two fundamental structures: Stateless systems get connection data and then an object reference (HTTP is basically a socket + the target of "GET"). ftpurl = "ftp://" login [ "/" fpath [ ";type=" ftptype ]] fpath = fsegment *[ "/" fsegment ] Connection based systems get a single-line script. (FTP is a socket, loging and password data, then a sequence of CWD commands, ending w/ RETR, with some optional commands along the way). You can't just map the fpath to a single CWD unless you KNOW "/" is the only filesystem delimiter. I need to look at the FTP RFC more, but I'm reasonably sure that RFC 1738 provides a VMS example that would probably break. If you take RFC 1738, you can write URL's that have the Mozilla behavior, by encoding the filesystem delimiters "/", and having only one fpath. You also have a syntax that is robustly cross-file system compatible because it "walks" down the file system, rather than taking a presumptive short-cut. I'll continue to look for newer RFC's that clarify this issue.
Keywords: testcase
Comment 16•23 years ago
|
||
benc: please read my comments.
Assignee | ||
Comment 17•23 years ago
|
||
Assignee | ||
Comment 18•23 years ago
|
||
The above patch removes the first / from the path and if the result is empty replaces it with a . Seems to work fine for me so far. Anyone want to give it a testdrive?
Assignee | ||
Comment 19•23 years ago
|
||
Sometimes navigating through a directory tree with ftp results in "file not found" the first time, clicking again on the same item works. Don't know if it is related to the patch. Anyone else sees this?
Assignee | ||
Comment 20•23 years ago
|
||
Okay it's related. We just can't navigate through the tree with the whole path. When the navigation fails we do a reconnect and the it works again. RFC1630 suggests a reconnect should be done with every CWD command. This would also make sense in this case since it would reset to the home directory again.
Comment 21•23 years ago
|
||
Yes, and theres an XXX comment on us not disconnectiong the control connection for a file not found. I know that the RFC says to do this, and I consider it unacceptable from a perf point of view. What I suggest (and what we used to do, although I still can't find code which did so) was that when we log on with no directory, then we get teh current directory, and update the url bar (and thus the relative stuff) to match. This would almost match the 4.x behaviour. 4.x differentiated between ftp://host and ftp://host/, and only did this for the first of these. We used to do the same, but I suspect that one of the urlparser rewrites lost this difference, and then one of the ftp rewrites lost the ftp code to handle the difference. See bug 95277 - I was planning on digging through the ftp code from March this weekend to see how we handled this then. I strongly suspect that this bug would be fixed by fixing that bug.
Assignee | ||
Comment 22•23 years ago
|
||
There is no longer a difference between ftp://host and ftp://host/. The urlparser will move ftp://host to ftp://host/ before the protocol has a chance, and I see no reason to change this general behaviour that for this bug. Is there a way to figure out if a list item is a file or a directory? If so there is a way to deal with this. I have a version that operates always from the login point but currently does sometimes CWDs on files.
Comment 23•23 years ago
|
||
I know that there is no longer a difference. However, thats how ns4 does it - again, see the other bug. We figure this out by trying to CWD into the url, and if that fails, then we try to RETR it (assuming that its a file). Thats the only way, so attempting to CWD into files is correct, as long as we don't get confused afterwards. I'm interested to see a patch which doesn't mean that we have to get a new control connection each time, and which updates the urlbar to the 'real' path....
Assignee | ||
Comment 24•23 years ago
|
||
Assignee | ||
Comment 25•23 years ago
|
||
Is it wise to disclose the directory structure from the root? Usually I start from a users directory and relative to it. No need to show where it is located on the server.
Comment 26•23 years ago
|
||
That patch breaks ns4 compatability. I know that that is what the rfc says, but see my comments (based on the later rfc) for why we shouldn't do this. We cannot break every ftp link in existance - consider an ftp mirror with a home directory for the anonymous user of /pub. With your patch, ftp://foo.com/pub/mirrors/file.txt would load ~/pub/mirrors/file.txt -> /pub/pub/mirrors/file.txt, which is wrong. You also need to update the urlbar, since from the client side there is nothing stopping them from going up a directory level, and this would probably stuff up the caching we do. Again, ns4 does that (although it doesn't do caching, IIRC) Besides, the ~ notation is not standard, AFAIK. You cannot rely on it.
Reporter | ||
Comment 27•23 years ago
|
||
A few notes: RFC2396 is useless as a guide to handling FTP URLs. It provides no coherent information on them at all. So, RFC1738 is where we have to look for behavior. It just so happens that RFC1738 is both coherent and very explicit. Contrary to what has been posted, properly parsing URLs (issuing iterative CWDs) would not break every public FTP link in existence. In fact, I've yet to see a single one that was broken. The URL fetcher program I wrote does URL parsing according to RFC1738, and it has thus far never failed to retrieve a file in a public FTP path. An example, which reflects all public FTP sites I've seen: ftp://hobbes.nmsu.edu/pub/os2/apps/internet/ftp/client/00global.txt After logging in as "anonymous", the client should issue the following commands (at a minimum): CWD pub CWD os2 CWD apps CWD internet CWD ftp CWD client PORT [add,port] RETR 00global.txt This correctly retrieves the file pointed to. Every /pub/* URL I've seen works in this fashion. If the server weren't Unix, it'd still work, because no path delimiter is assumed by properly parsing the URL. To address some earlier comments: >One other thing I noticed: The url >ftp://thanny:[password]@localhost:2222/h%3a%5cdownload/hubble5_hst.jpg maybe >valid (because of it's encoding) but don't expect it to work because >interpreting %5c as a path identifier (seems to be intended) is clearly wrong. The URL above is not clearly wrong. In fact, it is clearly correct. There's a transport, host, and path segment - all properly delimited. The path segment has two parts - a CWD argument, and a file to retrieve. It does not in fact matter what purpose the %5c character has, since the FTP client is entirely ignorant of it. It's task is to send the path component string, decoded, as an argument to CWD, period. It just so happens that in the above example, %5c ("\") is, in fact, a path delimiter. Passing the entire string as the argument to a CWD works, because it's a completely proper path for the FTP server in question.
Assignee | ||
Comment 28•23 years ago
|
||
I agree mostly with Mike. RFC 2396 is useless regarding ftp urls. I would propose the following: 1) get the path 2) remove leading slashes 3) unescape it This would work on all given examples from RFC 1738. Then we can work as usual, just do one CWD/RETR with the given path. The result should be the same as doing incremental cwds. After showing a listing or showing/downloading a file we work with the cached connection for the next access. That is dangerous for the reasons given in RFC 1738. When there is no reliable way to get back to the login point then we have to login again to force it. Instead just password and servertype should be cached and the control connection not be reused, again logging in and be at the right place (user home) for the next access. I dont't think that is such an performance hit. I also have to agree with Mike on his comments about my comments about ftp://thanny:[password]@localhost:2222/h%3a%5cdownload/hubble5_hst.jpg He is right, there is no problem.
Assignee | ||
Comment 29•23 years ago
|
||
Assignee | ||
Comment 30•23 years ago
|
||
Running with the patch above I was able to access and navigate every site so far.
Comment 31•23 years ago
|
||
No. We cache control connections for a reason. It was really annoying on ns4 to try to ftp to a busy site, and have every click give you another "too many users" error. Also see bug 94354 for another parsing issue. I looked through the mozclassic code. What it does is: If the path is empty, (ie ftp://ftp.netscape.com) _or_ the path starts with /./ (bug 94354), then we send PWD, and tack that onto the front of the request. So to retrieve the file "foo" from your home directory, you use ftp://server/./foo I will also note that (While its broken now due to bug 103514), mkaply had os2 ftp servers working. The problem with your argument about disconnecting the control connection each time is this: Imagine that the initial url is a directory listing. We get a list of files/dirs within that directory. Now we need to print out this list in html. Our code (all over the place) will translate that into an absolute url before it gets to ftp. So now we have a brand new ftp url. There is no difference between doing CWD /this_new_url, and recreating the control connection and then doing that. So I agree that in some extreme edge cases, an ftp server may not work with mozilla, ns4, ns6, and lynx (at least). But recreating the control connection won't help that. Getting back to the original position can be down by issuing a PWD first. Then we can CWD back to that initial directory if we had to, but since the base url would be changed by prepending that, we won't need it. CWD `PWD at login` must take you back to the original login position. Again, your way does not allow me to ftp to /etc on my machine, when my home directory is /home/bbaetz. In the patch itsself, you should test *fwdPtr=='\0', rather than PL_strlen. But I'm still convinced that it is the wrong approach. I think I'll have to put aside some time this weekend to work on a patch for my approach.
Assignee | ||
Comment 32•23 years ago
|
||
You can go to /etc exactly as specified by RFC 1738, just do ftp://user@host/%2Fetc. I will take a look at the PWD stuff.
Assignee | ||
Comment 33•23 years ago
|
||
Assignee | ||
Comment 34•23 years ago
|
||
This patch uses pwd to find and store the login path after login. Then it uses it to create the correct path for CWD/RETR. The connection is again cached. Also still compliant with RFC 1738 regarding the url syntax although it does no subsequent CWDs to get to the right point.
Comment 35•23 years ago
|
||
OK, I like this patch a lot more. What do you think about only removing the first / ? I know that this isn't what the rfc says, but it would seem to better match current browser behaviour. I still think that we will get complaints about this being the incorrect thing to do, but I think that that is ok. What does ie do? I'd still like some way to update the url bar, though. darin: Theres an nsIHttpEventSink which takes an nsIHttpChannel as the first paramater to OnRedirect. Could we make this more generic, somehow? The other concern is taht we don't update the cache entry key for the correct url. This worries me, although I can't think of a particular case where we'd get teh wrong data, . Chaning which user you log in as could do it, though.
Assignee | ||
Comment 36•23 years ago
|
||
Personally I think, we can go with only removing the first slash. It might be a good compromise between what 4.x does and what the rfc says, although it is very clear that we violate rfc 1738 with it. Mike, what's your opinion on this? Also I like it very much that we currently hide the login path from the user in the ftp url. I see no gain in exposing it to the user. It would irritate we very much when the ftp url changes that much after the login.
Reporter | ||
Comment 37•23 years ago
|
||
>------- Additional Comments From Andreas Otte 2001-10-12 03:43 ------- >Personally I think, we can go with only removing the first slash. It might be >a good compromise between what 4.x does and what the rfc says, although it is >very clear that we violate rfc 1738 with it. Mike, what's your opinion on >this? Well, it would be an improvement. However, it still leaves open the possibility of problems for any FTP server running on an operating system which doesn't use the '/' character to deliminate path elements, and doesn't explicitly emulate Unix functionality. The VMS example in RFC1738 serves well to illustrate the potential problem. Unless the FTP server in use acted like it was a Unix server, translating all '/'-based paths to VMS paths, simply removing the first slash would solve nothing for such a system. Of course, I can only speculate that such FTP servers exist. I don't have a VAX laying around to test with <g>. Even the basic OS/2 FTP server will recognize a '/' as being analogous to '\', and work with such a system. So, it will almost certainly work with the vast majority of systems. It's still easy, of course, to imagine a setup where that would fail, for any number of reasons, where doing iterative CWD's would work. It would be my nature to do things completely properly, but I would not be opposed to adopting the simplest possible solution, and dealing with the problems if and when they arise in the future. >Also I like it very much that we currently hide the login path from the user >in the ftp url. I see no gain in exposing it to the user. It would irritate we >very much when the ftp url changes that much after the login. I don't see why it'd be necessary to update the URL. Just do what was already stated previously - store the result of a PWD upon login, and use it as the parameter to a CWD before issuing any CWD's or RETR's for links created by the FTP browsing code. On a related note, re-establishing the control connection should only be done when necessary. I haven't given it a great deal of thought, but comparing the accessed URL with the established control connection should allow a correct decision about whether or not to reconnect (or make a new connection - how many open control connections to maintain is another issue entirely). If the user/host and first path component are the same, clearly the existing control connection is sufficient - simply CWD to the login PWD before continuing. If the first path component is different, however, a new control connection is the only way to guarantee that the correct resource is accessed. For a Unix system, if the first character of the first path component (after decoding, of course) is a slash, then the path is absolute, and a new control connection is not required. Any FTP server with a SYST response that has "UNIX" (case-insensitive, I'd think) as the first word would allow such an assumption to be safe. Otherwise, trying to determine whether or not a path is absolute is inappropriate (unless customized for other reported operating system types). With a relative first path component that doesn't match the first path component of the new URL, a new control connection is necessary. This situation should never occur with any links generated by the FTP browser, of course, so it will be fairly rare. Finally, on a tangentially related note, I notice in passing that the program WGET already discards the first slash, and I find it unlikely that such behavior would remain if it caused problems in retrieving URLs.
Comment 38•23 years ago
|
||
the mozillaclassic code has every line dealing with paths wrapped in an: if (foo->server_Type == VMS) { ... } else { ... } Since we don't parse vms directory listings, I think that spending time working out the correct behaviour isn't worth it. We should update the url because: a) 4.x did it b) Its clearer exactly what path you are using Note that the <title> of the page could be made correct reasonably easily - just look at where the 300: line is printed out for directory listings, and change that. Its probably not worth the confusion though.
Assignee | ||
Comment 39•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #52429 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Attachment #52465 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Attachment #52997 -
Attachment is obsolete: true
Assignee | ||
Comment 40•23 years ago
|
||
Maybe this can go in first and then see about updating the urlbar with the whole path in a second step. The last patch only removes the first leading slash not all of them, no need to mask a slash any more.
Assignee | ||
Comment 41•23 years ago
|
||
Bradley: Any decision yet on this? Shall we go with what we have for now (latest patch, which fixes FTP url parsing) and future the update of the urlbar or wait for that?
Comment 42•23 years ago
|
||
Comment on attachment 53486 [details] [diff] [review] path that only removes the first slash >+FTP_STATE >+nsFtpState::R_pwd() { >+ if (mResponseCode/100 != 2) >+ return FTP_ERROR; >+ nsCAutoString respStr(mResponseMsg); >+ PRInt32 pos = respStr.Find("\""); >+ if (pos > -1) { >+ respStr.Cut(0,pos+1); >+ pos = respStr.Find("\""); You should change all those Find calls to FindChar. >@@ -1800,7 +1851,20 @@ > rv = mURL->GetPath(&path); > > if (NS_FAILED(rv)) return rv; >- mPath.Adopt(nsUnescape(path)); >+ // Skip leading slash >+ char* fwdPtr= (char*) path; >+ if (fwdPtr && (*fwdPtr != '\0') && (*fwdPtr == '/')) >+ fwdPtr++; You don't need the != '\0' if you're checking for == '/', also, change the cast to char* to an NS_CONST_CAST of path.get(). r=bbaetz if you do that. This is going to mean a change in behaviour, though, and I'm nervous :) If someone comes up with a case which the old code could handle (or is in common use), but this code breaks, and isn't easy to change to, then we'll have to reconsider.
Attachment #53486 -
Flags: review+
Assignee | ||
Comment 43•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #53165 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Attachment #53486 -
Attachment is obsolete: true
Comment 44•23 years ago
|
||
Comment on attachment 54417 [details] [diff] [review] patch incorporating Bradleys comments sr=darin
Attachment #54417 -
Flags: superreview+
Comment 45•23 years ago
|
||
Comment on attachment 54417 [details] [diff] [review] patch incorporating Bradleys comments remarking r=bbaetz on new patch
Attachment #54417 -
Flags: review+
Comment 46•23 years ago
|
||
->patch author. Andreas, do you want me to check this in?
Assignee: bbaetz → andreas.otte
Assignee | ||
Comment 47•23 years ago
|
||
I will check it in, but that will happen on the weekend since I'm away for three days. If you want to have it in prior to that please step in, but I can't go on the hook during those three days.
Assignee | ||
Comment 48•23 years ago
|
||
fix checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 49•23 years ago
|
||
At least once I have seen a crash when navigating through the ftp-tree that might be related to this. I was unsuccesfull in recreating it, maybe an uninitialized variable that is now exposed, I don't know. It would be nice if someone else could check this out and have a backtrace if the crash happens too. Currently I see this when trying gdb (after updating to SuSE 7.3) mit mozilla: [New Thread 1024 (LWP 9414)] Assertion failure: 0.5 <= maxAlpha && maxAlpha < 1 && 0 <= minAlpha, at pldhash.c:227 Program received signal SIGABRT, Aborted. [Switching to Thread 1024 (LWP 9414)] 0x40601861 in kill () from /lib/libc.so.6
Comment 50•21 years ago
|
||
Still problematic in version 1.2.1 ( mozilla-1.2.1-26 on Red Hat Linux 9 ) There is a bootload of failing ( legal ) links listed at http://rcum.uni- mb.si/~uel003r2a/ftp_url_test.html ( there are HTML comments for almost every failure , what and how went wrong ) As I can not reopen this bug, somebody else should or I will open a new one.
Comment 51•21 years ago
|
||
Hmm, the link I posted was broken in two. Here it is again : http://rcum.uni-mb.si/~uel003r2a/ftp_url_test.html
Comment 52•21 years ago
|
||
David: please file a new bug. thx!
Updated•3 months ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•