Closed Bug 212852 Opened 21 years ago Closed 18 years ago

Expand FTP-protocol guessing to ftp1.example.com, etc

Categories

(Core :: DOM: Navigation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: jari.j.laaksonen, Assigned: mkmelin)

References

()

Details

Attachments

(1 file, 3 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

Write ftpsearch.net in address field. Browser uses ftp protocol and shows
ftp://ftpsearch.net/ when the desired address is http://ftpsearch.net/

These are totally different services. There may be lots of similar addresses
where ftp and http server in same host provides totally different results.

Reproducible: Always

Steps to Reproduce:



Expected Results:  
Assume only http protocol if URL is written without protocol in address field.
Users expect a browser to show web pages, and if they are educated enough they
know to use ftp:// for ftp services.
very similar to bug 202196 (gopher, but comments mention ftp should be taken
care of).
re-assigning based on component of bug 202196.

is it worth it to run try HTTP first, then fallback to FTP for hosts like this?

relevant code at:
http://lxr.mozilla.org/seamonkey/source/docshell/base/nsDefaultURIFixup.cpp#274
Assignee: general → hewitt
Status: UNCONFIRMED → NEW
Component: Browser-General → Location Bar
Ever confirmed: true
QA Contact: general → claudius
OS: Windows 2000 → All
Hardware: PC → All
> is it worth it to run try HTTP first, then fallback to FTP for hosts like this?

No.

imo, we should assume ftp if the start is "ftp." or "ftpN" where N is a digit.

In any case, both this and bug 202196 are in the wrong component (though that
one is assigned to someone working on it, so that's ok).
Assignee: hewitt → adamlock
Component: Location Bar → Embedding: Docshell
QA Contact: claudius → adamlock
I sympathize, but in the absence of a properly formed URL, is it more likely
that ftpfoo.something refers to an ftp site or an http site?

In this instance there is an ftp://ftpsearch.net so which choice is correct?
I'd prefer what louis bennett said earlier: try HTTP first, then fallback to FTP
for hosts like this.

Software should not assume anything based on e.g. name of resource. Software
should act according to predefined pattern if multiple choices must be tried.

It is _user_ who assumes software to do something. As I said in the bug
description: Users expect a browser to show web pages. Also with current
versions of web browsers, users are accustomed to write URLs without scheme, and
usually they mean http scheme.
imho, 'ftp' in the host portion of a url where there's no schme specified should
defintely attempt http first. the current behavior is not intuitive.
>>I sympathize, but in the absence of a properly formed URL, is it more 
>>likely that ftpfoo.something refers to an ftp site or an http site?
>>In this instance there is an ftp://ftpsearch.net so which choice is 
>>correct?
Yes, it is an unfortunate situation.  I just happened to CC'ing myself to some
ftp bugs this evening and just happen to notice about bug #158148 with one of
the comments there that contain a link to
http://www.mozilla.org/projects/ui/communicator/browser/search/ ....  From the
look of it, it seem to be a correct behavior that way but you be the judge of it.

I don't like any of this assume the "hostname that is a scheme is the scheme of
the URL" logic. Like most network-based fixup logic, it was clever in 1996 and
now  outdated. Domain Guessing is a good example of this problem.

In this case, we should have been string matching "ftp.". I'm not sure I would
go as far as bz and support ftpN, because to me, that is why we have DNS.

We certainly should not be matching simply "ftp".

Also, it should be clear, there is no idea of fallback. The URL is fixed and
then a connection attempt is made. This is ideal, clever fallback logic is
usually bad (again, I will mention Domain Guessing).

One last comment, I think there is a typo in the comments:

275     //   ftp4.no-scheme,com

There shouldn't be a comma there, should there? "ftp4.no-scheme.com"
QA Contact: adamlock → benc
*** Bug 280435 has been marked as a duplicate of this bug. ***
Would it be possible to fix this silly longstanding bug before FF 1.1 ?

This is a browser for the love of god, not an FTP client. All URLs without a
protocol should be considered HTTP by default!

Another example of this: ftprush.com. both FTP and HTTP exist but FF goes to the
FTP one by default which is silly.
*** Bug 295117 has been marked as a duplicate of this bug. ***
Basically, after chatting on IRC with timeless, gavin, hannibal, and bsmedberg,
we came to the conclusion that ftp* should not be considered as ftp protocol.

Instead, URLs that are more likely to be FTPs, such as ftp5.blah.com.

Basically, a regex like this:
ftp\d*\.(?:.*\.)+.*
Gavin: 

From looking at the code, I'm not even sure if that code calls the URL parser
correctly. Then it does what I think is a string compare of the left side...

if (hostSpec.EqualsIgnoreCase("ftp", 3)) 
             uriString.Assign(NS_LITERAL_CSTRING("ftp://") + uriString);

It would be much simpler to remove the entire feature. I know how to code that
patch. :)
*** Bug 316221 has been marked as a duplicate of this bug. ***
Attached patch proposed fix (obsolete) — Splinter Review
Only assume ftp if the host name starts with "ftp."  or "ftpN.".
Assignee: adamlock → mkmelin+mozilla
Status: NEW → ASSIGNED
Attachment #246538 - Flags: review?(cbiesinger)
Comment on attachment 246538 [details] [diff] [review]
proposed fix

+          ++iter; ++iter; ++iter; // move past the "ftp" part

you could use .advance(3)

why do you have "start"? you never use it, only to initialize "iter".
Attachment #246538 - Flags: review?(cbiesinger) → review-
Attached patch proposed fix, v2 (obsolete) — Splinter Review
Addressing review comments.
I also added made it not assume ftp for the ftp\d*\.tld edge cases.
Attachment #246538 - Attachment is obsolete: true
Attachment #246602 - Flags: review?(cbiesinger)
Attachment #246602 - Flags: review?(cbiesinger) → review+
Attachment #246602 - Flags: superreview?(darin.moz)
Comment on attachment 246602 [details] [diff] [review]
proposed fix, v2

I think it'd be better if this code could be written without iterating over characters.  Perhaps you could make use of the FindChar function?  Also, the complexity of this code addition suggests that it should be broken out into a new subroutine.
Attached patch proposed fix, v3 (obsolete) — Splinter Review
I don't see how I'd get it done with FindChar... Broke it out to a method of it's own though. How does this look?
Attachment #246602 - Attachment is obsolete: true
Attachment #247202 - Flags: superreview?(darin.moz)
Attachment #246602 - Flags: superreview?(darin.moz)
Comment on attachment 247202 [details] [diff] [review]
proposed fix, v3

I think the parameter to IsLikelyFTP should be a |const nsCString&|, then you will not need to construct a nsCAutoString to use EqualsIgnoreCase
Attached patch proposed fix, v4Splinter Review
Addressing review comment.
(BTW, the r is still valid, no?)
Attachment #247202 - Attachment is obsolete: true
Attachment #247268 - Flags: superreview?(darin.moz)
Attachment #247202 - Flags: superreview?(darin.moz)
Attachment #247268 - Flags: superreview?(darin.moz) → superreview+
Whiteboard: [checkin needed]
mozilla/docshell/base/nsDefaultURIFixup.cpp 	1.51
mozilla/docshell/base/nsDefaultURIFixup.h 	1.11
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Whiteboard: [checkin needed]
Target Milestone: --- → mozilla1.9alpha
(I'm still unconvinced about ftp + some digit. Can someone give some real-world examples? Just for the record...)

(In reply to comment #24)
> (I'm still unconvinced about ftp + some digit. Can someone give some real-world
> examples? Just for the record...)

ftp://ftp2.de.debian.org/, ftp://ftp2.pl.freebsd.org/, ftp://ftp3.itu.ch/, ftp://ftp3.itu.int/ and ftp://ftp3.ds.pg.gda.pl/ are some examples that I found after a little bit of Googling. I think it's fairly common.
Why would we want to do this? Make this at least an option in "about:config".

There is several URLs, which are "ftp.example.com" but available both in http and ftp, one of examples is "ftp.debian.org". Mostly they are mirrors of each other, but the fact is that http is usually served better (more connections available to general public, more robust etc) than ftp these days (yeah, ftp is an old protocol, http is newer and more developed for distribution of files across the Internet - which is what a browser basically does - downloads files and shows them to the user. So there would be more logic if we don't try to "guess" ftp from the domain name but instead default to http. A browser is mostly used to browse pages with the http protocol use, ftp client is mostly a nice addition - why defaulting to ftp then?
In bug 1174254, we might disable FTP protocol guessing entirely.
Summary: Browser assumes ftp protocol for URL's beginning with 'ftp' and written without protocol in address field → Expand FTP-protocol guessing to ftp1.example.com, etc
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: