Closed Bug 32966 (relative-http) Opened 24 years ago Closed 22 years ago

URL: http:/ (one slash) treated as http:// rather than /

Categories

(Core :: Networking, defect, P3)

VERIFIED FIXED
mozilla1.0.2

People

(Reporter: jkane, Assigned: andreas.otte)

Details

(Keywords: compat, testcase, topembed)

Attachments

(1 file, 1 obsolete file)

specifically, <form action="http:/dirone/target"> (I know, a dumb way to do
it, but it works in communicator and ie).  Not only does it fail, it actually
gets all the way to looking up www.dirone.com

Expected result is of course that "http:/dirone/target" is functionally
identical to "/dirone/target".  (I'm tracking down the uses of this silly syntax
on my site, but I'm sure there are more of them out there).
Eric, I am sure that this is actually your issue.
Assignee: rods → pollmann
I wonder if <A HREF="http:/dirone/target"> would have the same effect?  If so
perhaps this should go to netlib.
Keywords: 4xp
Summary: Common (but out-of-spec?) form action url's break the beast → http:/ (one slash) treated as http:// rather than /
After a bit more poking around, it looks like this is the standard behavior all
over including href links and explicitly typing a URL into the browser box.  You
can see the parsing in the browser box, it goes from http:/sub/target to
http://sub/target immediately, then tries to figure out how to handle "sub" as a
domain.
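
The rewriting described above can be sketched in a few lines. This is only an illustration under the assumption that the fix-up turns a single slash after the scheme into a double slash; the function name `fixup_single_slash` is hypothetical and this is not Mozilla's actual code:

```python
from urllib.parse import urlsplit


def fixup_single_slash(url: str) -> str:
    """Hypothetical fix-up: treat 'scheme:/x' as a mistyped 'scheme://x'."""
    scheme, sep, rest = url.partition(":")
    # Only rewrite when exactly one slash follows the colon.
    if sep and rest.startswith("/") and not rest.startswith("//"):
        return f"{scheme}://{rest[1:]}"
    return url


fixed = fixup_single_slash("http:/sub/target")
print(fixed)                     # http://sub/target
print(urlsplit(fixed).hostname)  # sub
```

Once the URL has been rewritten this way, the first path segment ("sub") is parsed as the host, which is why a domain lookup follows.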
Thanks - this sounds like a general URL resolution issue.  I'll defer to Warren
on this as he knows more about it than I do.
Assignee: pollmann → warren
Component: HTML Form Controls → Networking
Personally, I'd bet that more people who did this:
<FORM ACTION="http:/mozilla.org">

Want the browser to go to http://mozilla.org than /mozilla.org.  Just my $0.02
=> andreas
Assignee: warren → andreas.otte
This is from rfc 2396:

      If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

By rfc 2396 this type of relative url is clearly invalid, there is a similar
type of relative url that uses this loophole in rfc1630, see bug 22251 for that
discussion.

It was decided not to fix that problem. Instead we assume a malformed absolute
URI and try to correct it. I suggest fixing the documents according to RFC 2396
by removing the leading http:.

Marking: Wontfix!
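
For reference, the backwards-compatibility workaround that the quoted RFC 2396 paragraph permits (remove the scheme if it matches the base URI's scheme, then continue resolving as a relative reference) might look like the sketch below. It is a minimal illustration, not the patch discussed in this bug; the name `resolve_lenient` and the set `HIERARCHICAL_SCHEMES` are assumptions made for the example.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: schemes treated as "known to always use the <hier_part> syntax".
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}


def resolve_lenient(base: str, ref: str) -> str:
    """Resolve ref against base, tolerating 'http:/path' and 'http:path'."""
    base_scheme = urlsplit(base).scheme.lower()
    scheme, sep, rest = ref.partition(":")
    same_scheme = bool(sep) and scheme.lower() == base_scheme
    if (same_scheme and base_scheme in HIERARCHICAL_SCHEMES
            and not rest.startswith("//")):
        # Drop the redundant scheme and continue as a relative reference.
        ref = rest
    return urljoin(base, ref)


base = "http://www.example.com/docs/index.html"
print(resolve_lenient(base, "http:/dirone/target"))
# http://www.example.com/dirone/target
print(resolve_lenient(base, "http:target"))
# http://www.example.com/docs/target
print(resolve_lenient(base, "http://mozilla.org/"))
# http://mozilla.org/
```

References with a different scheme, or with "//" after the colon, are left alone, so ordinary absolute URLs resolve exactly as before.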
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → WONTFIX
verified Wontfix
Status: RESOLVED → VERIFIED
*** Bug 51055 has been marked as a duplicate of this bug. ***
I think there's a difference between entering it in the URL bar versus finding
it on an HTML page. In the URL bar, I expect it to treat http:/ as http://
(that's just kind of a nice feature). In an HTML page, I think it would be nice
if we made it an option to either use the "broken" behaviour as described below
or treat the URL as http://. Perhaps prompt the user as to what to do or make it
an "advanced tweak".

In either case, this is a candidate for the release notes. 
*** Bug 51971 has been marked as a duplicate of this bug. ***
*** Bug 58537 has been marked as a duplicate of this bug. ***
*** Bug 61128 has been marked as a duplicate of this bug. ***
*** Bug 61890 has been marked as a duplicate of this bug. ***
*** Bug 61582 has been marked as a duplicate of this bug. ***
*** Bug 56426 has been marked as a duplicate of this bug. ***
*** Bug 58522 has been marked as a duplicate of this bug. ***
*** Bug 57146 has been marked as a duplicate of this bug. ***
*** Bug 65474 has been marked as a duplicate of this bug. ***
*** Bug 68528 has been marked as a duplicate of this bug. ***
*** Bug 68977 has been marked as a duplicate of this bug. ***
*** Bug 71441 has been marked as a duplicate of this bug. ***
Adding mostfreq because this comes up so often.

I wonder if we should revisit the decision to handle these links differently
than IE and ns4.x do.  Are we gaining anything for our toeing the line of RFC
2396, especially if no other browser does?

I've begun to wonder if we should have a "full compliance" mode that defaults to
Off in Advanced prefs. (Or a "compatibility with other browsers" mode that
defaults to On.)  You could tie quirks to this as well (after all, that's what
quirks.css really is, right?)
Keywords: mostfreq
*** Bug 79369 has been marked as a duplicate of this bug. ***
*** Bug 84456 has been marked as a duplicate of this bug. ***
*** Bug 84507 has been marked as a duplicate of this bug. ***
Today alone there have been three bugs filed about this. Bug 84456, bug 84474, bug
84507. Seems they are normally handed over to evangelism and don't land here.
*** Bug 88356 has been marked as a duplicate of this bug. ***
*** Bug 84474 has been marked as a duplicate of this bug. ***
torturing the evang component owner.  At this point bugs filed about 
specific sites which are determined to have the problem here should not be 
resolved as dupes of this bug, but instead be given to the Evangelism component 
for evangelism.
If you look at the comments for BUGID 56426, you'll notice that all Cisco
devices with inbuilt web servers use this syntax of URL.

If this doesn't get changed, no-one with a Cisco device is going to be able to
use Mozilla. (And there are a lot of people with Cisco)

I think having this as a wontfix is a bit like shooting yourself in the foot to
spite your face.

Please can you reconsider ?

Just my 2 pennies worth.
As far as I can tell from the relevant RFCs, the single slash syntax is valid.
I repeat my comment from 2000-03-24:

From chapter 5.2 of rfc2396:

   3) If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

I think that makes it clear. This kind of relative URL is deprecated; parsers
*may* resolve such URLs for backwards compatibility. We decided not to do so
back then, but I will start to dig up my patch for bug 22251 and see how it
behaves today.
*** Bug 91180 has been marked as a duplicate of this bug. ***
*** Bug 105898 has been marked as a duplicate of this bug. ***
I'd also like to suggest changing this one from WONTFIX.
cnn.com has this problem all over their site.
It's really hard to sell other people on using moz if they can't browse CNN.
(I'm assuming the back bug will get fixed at some point)

Example From:
http://www.cnn.com/2001/US/10/25/inv.investigation.facts/index.html


How will the expansion of law-enforcement powers affect Americans' civil
liberties? <a href="http:/2001/US/09/25/inv.civil.liberties/index.html"
class="text1">Click here for more.</a>

*** Bug 107061 has been marked as a duplicate of this bug. ***
Adding compat keyword. Based on mostfreq bug reports and old browser behavior, I
believe this should be reopened and fixed. Failing to support management of
Cisco routers (Bug 56426) as well as many websites is a problem.

Keywords: compat
I too am hoping it will be reopened.  I have one-way cable modem service.  I 
can't dialup my modem with Mozilla because the built-in web server uses this 
syntax.  It's not just an NT bug either, as I am running Linux and have probs...
OS: Windows NT → All
Hardware: PC → All
*** Bug 123176 has been marked as a duplicate of this bug. ***
I can't help hoping that http:/ will be handled the way other browsers do.
Unfortunately, this isn't an ideal world, and Moz isn't at the moment in a 
position to dictate standards, especially if they break compatibility for many 
users. If their page/router diag page etc works fine in IE/Netscape, they will 
assume Mozilla is to blame. 

We gain nothing by having an ultimate standards-compliant browser if no-one is 
using it!

Please reconsider at least an option somewhere to allow this behaviour - ie 
IE/netscape compatibility mode, rather than just pointing the finger at the 
webmaster. Our 'Evangelism' department is unlikely to be able to sway every 
site.. I can't blame webmasters for saying 'It's a lot of work, and only 1-2% 
of our users are using Mozilla'.

David
*** Bug 135549 has been marked as a duplicate of this bug. ***
Also from RFC 2396:
   However, a subset of URI do share a common syntax for
   representing hierarchical relationships within the namespace.  This
   "generic URI" syntax consists of a sequence of four main components:

      <scheme>://<authority><path>?<query>

   each of which, except <scheme>, may be absent from a particular URI.
   [...]
      absoluteURI   = scheme ":" ( hier_part | opaque_part )

      hier_part     = ( net_path | abs_path ) [ "?" query ]

      net_path      = "//" authority [ abs_path ]

      abs_path      = "/"  path_segments

This seems to me as requiring the <net_path> to start with two slashes. One
slash after <scheme> ":" indicates <abs_path>. So unless I'm misreading the BNF,
current Mozilla behaviour is incorrect according to RFC 2396.
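
The reading of the grammar above can be made concrete with a small classifier. This is a simplified sketch that only looks at what follows `scheme ":"`; the function name `classify` and the returned labels are assumptions for illustration, and it does not validate the rest of the URI against the full BNF.

```python
import re

# Simplified scheme pattern: a letter followed by letters, digits, "+", "-" or ".".
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$", re.DOTALL)


def classify(uri: str) -> str:
    """Say which RFC 2396 production the part after 'scheme:' starts like."""
    match = _SCHEME.match(uri)
    if match is None:
        return "no scheme"
    rest = match.group(2)
    if rest.startswith("//"):
        return "net_path"      # "//" authority [ abs_path ]
    if rest.startswith("/"):
        return "abs_path"      # "/" path_segments
    if rest:
        return "opaque_part"   # anything not starting with "/"
    return "empty"


for example in ("http://dirone/target", "http:/dirone/target", "mailto:jkane"):
    print(example, "->", classify(example))
```

Under this reading, only a double slash introduces an authority; a single slash starts an `abs_path`.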
But this is also from RFC 2396, indicating it differently:

      If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

According to this a present scheme always indicates an absolute URL.
The above section also suggests a technique for backwards compatibility, since 
links that don't conform exactly to the spec are common.  If a particular type 
of non-conforming link is so common that an RFC goes out of its way to mention 
it, it seems to me like it's worth supporting...
 
It is implemented, just not in the tree; see the latest patch on bug 40670. It's
a political thing: it was decided not to support this deprecated stuff, not even
for backwards compatibility. This decision may be changed in the future, or not.
The paragraph from RFC 2396 that Andreas now quoted for the third time in this
thread is from Section 5.2, which starts by saying: "This section describes an
example algorithm for resolving URI references that might be relative to a given
URI." So, firstly, I think the BNF defines the official syntax and should be the
guiding principle that Mozilla's implementation should try to follow.

Secondly, I think Andreas is misunderstanding what section 5.2 says by "if the
scheme component is defined [...] then the reference is an absolute URI". Look
at my comment above: in Section 3 of the RFC, absolute URI is defined as having
a <hier_part> following the <scheme> and the colon. But the <hier_part> consists
of <net_path> *iff* there are double slashes following the colon. If there is a
single slash, then it's an <abs_path>, and the host name is inherited from base
URI.

I'm sorry, but I don't see what's the political decision to be made here, unless
I'm completely misreading the grammar definition.
As I understand it, an absolute URL does not have a base URL; only relative URLs
do, based on (RFC 2396):

1.4. Hierarchical URI and Relative Forms

   An absolute identifier refers to a resource independent of the
   context in which the identifier is used.  In contrast, a relative
   identifier refers to a resource by describing the difference within a
   hierarchical namespace between the current context and an absolute
   identifier of the resource.

So, it's very clear we have a contradiction in RFC 2396. The definitions in the
text do not fit the BNF. This is not the first contradiction, one might remember
the query-problem which was only resolved by an email from one of the authors.
It seems to me the BNF was either not looked upon closely (copy-paste) or it was
made with the backwards compatible implementation in mind.

By definition 1.4 "http:/path" cannot be a valid absolute URL. There are three
options: take it as a malformed absolute URL and try to "fix" it, report a
malformed URL, or implement backwards compatibility and take it as a relative
URL. Currently we do option one.
The issue really is this.  Rather than arguing about BNF or RFCs...

This type of link DOES exist. Regardless of what the BNF or the RFCs say, it is
there nonetheless.  It isn't going to go away because we say that it's invalid.
CNN.com etc don't care about RFCs, they care that people can view their pages.

IE and Netscape + Konqueror resolve these http:/ urls 'correctly' as in that
they use it as a relative url.  I am unable to use Mozilla for some applications
because of its handling of URLs. Regardless of the technicalities about who is
correct, if IE/Netscape/Konq do it correctly, and Mozilla doesn't, people will
blame Mozilla.

Please reconsider the WONTFIX, and consider at least giving an option. Such as:

Support IE/Netscape features   [ ]
This is strictly incorrect, but can help compatibility...

David
I've r='d the patch in bug 40670. Yes, it's deprecated, but the arguments for not
supporting it are very weak, IMHO.

That bug (and bug 22251) should then be marked as dupes of this one.

andreas: Do you want to mail darin for sr of that patch?

reopening for reconsideration.
Status: VERIFIED → UNCONFIRMED
Resolution: WONTFIX → ---
The patch from bug 40670 is already under consideration. Parties involved are
gagan and evangelism. I haven't heard anything yet. There was a manager decision
not to be backwards compatible back then, and a large amount of evangelism has
already gone into this.

In my opinion it doesn't make sense to sr this stuff until gagan (for
networking) and evangelism both give a green light. cc-ing gagan.
*** This bug has been confirmed by popular vote. ***
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 139326 has been marked as a duplicate of this bug. ***
*** Bug 139330 has been marked as a duplicate of this bug. ***
*** Bug 140826 has been marked as a duplicate of this bug. ***
*** Bug 141408 has been marked as a duplicate of this bug. ***
*** Bug 145045 has been marked as a duplicate of this bug. ***
*** Bug 149370 has been marked as a duplicate of this bug. ***
*** Bug 150163 has been marked as a duplicate of this bug. ***
*** Bug 152579 has been marked as a duplicate of this bug. ***
*** Bug 151302 has been marked as a duplicate of this bug. ***
RFC2396 says:
   absoluteURI   = scheme ":" ( hier_part | opaque_part )
   hier_part     = ( net_path | abs_path ) [ "?" query ]
   net_path      = "//" authority [ abs_path ]
   abs_path      = "/"  path_segments

"http:/dirone/target" is a valid absolute URI with "http" as the scheme and
"/dirone/target" as the absolute path.
Since the scheme part is present, section 5 of RFC2396 is not relevant and it is
not a relative URI.

See also bug 154195 .
*** Bug 157337 has been marked as a duplicate of this bug. ***
Attached patch: patch freshed up
Since it seems we are doing MARQUEE and document.all in the near future, why not
support these deprecated relative URLs too? Seems appropriate.
Attachment #42222 - Attachment is obsolete: true
Comment on attachment 92767
patch freshed up

Untested, but looks fine to me.

r=bbaetz
Attachment #92767 - Flags: review+
we are not going to do document.all btw :)

This does affect quite a few european topsites and can help topsite compatibility.

Who would sr this?
Keywords: topembed
I'm glad that does not happen!

I asked darin a few days ago for sr= ... no answer yet.
*** Bug 147701 has been marked as a duplicate of this bug. ***
Darin, this patch wants your sr= (I guess you've lost Andreas' mail ;)
Comment on attachment 92767
patch freshed up

sr=darin
Attachment #92767 - Flags: superreview+
what gives, why are we all of a sudden compromising on standards?
shear number of duplicates... i don't think we can "fix" the world when it comes
to this bug.  too many websites break.
that tracker has 14 bugs and doesn't count the dups in here - this should be of
_way_ more use than marquee, maybe the doc.all would. Also this breaks many
pages, which the "typical" current mozilla users (Students, IT people) use.
darin: so will you sign up to get a crummy document.all impl committed to cvs
and to get <img alt="tooltip"> to do what people want instead of what the w3
dictated?

sheEr number of duplicates is not necessarily a good way to set policy.
those things mentioned aren't in my area of expertise.  i feel confident with the
change andreas has coded up here, but those other ones are much bigger changes
in my opinion, and they should be reviewed by folks working in that area (which
does not include me).  in short, talk to jst :-)
but i don't want those changes :-), just like i don't want this change.

what i'd like is a clear, logical, sane policy to handling mass hysteria (fiat
does not satisfy this or me).  What I see is anything but that.  Mozilla has set
a track record for eventually caving (MPL => GPL, TLS, <style type=text/plain>,
Marquee, and now http:/ - i'm sure there are more, but i think that's enough for
now)
The problem with this stuff is that there is no definitive answer. The issue
came up in December 1999. I made a first patch in February 2000. It was then
decided by Warren Harris to not support these deprecated relative urls. I
supported that. Then the bugs came in. By now it is a very impressive list. Part
of it is about hardware, not easy to evangelize!

I and others brought the topic up again, rethinking that previous decision. A
long time ago I made a change of our policy dependent on a word from the
Evangelism and Necko managers. No answer yet. So I'm currently thinking about
reversing the approach: get review and super-review (got it!) and then schedule
a checkin in, say, two or three weeks unless there is a veto from the Evangelism
and/or Necko managers. Maybe we get an answer this way.

So if you don't want this to happen make yourself a voice.
I also don't want this. We have enough crap in Mozilla. My "Mozilla world" is
destroyed in a few days.
first marquee and now this and i think they will implement document.all and/or
document.layer in one year.
It's not easy for Mozilla because we have no 90% market share but many people start 
to use mozilla (and also many web developers) we should not go back to NS4.7x days.
We are going back instead of forward :-(

BTW: Thanks Andreas for all your work !

my 0.02€ (but 99% of the developers doesn't care about the community)
Timeless

We are not going against the standards. These types of URLs were described in
RFC1630, and are also part of a standard, RFC1808. See
http://www.freesoft.org/CIE/RFC/1808/4.htm

Most browsers support these kind of "URLs". Including NNav 4.x

Then RFC2396 said

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

This last sentence does not apply to Mozilla because Mozilla is not a
"validating parser".

By applying this patch we are implementing the workaround described in RFC2396
for BACKWARD COMPATIBILITY. 
i've actually worked with an employee at cisco to get some of their products
changed. i haven't seen a long list of broken products, but i'm willing to write
letters to hardware vendors about this and speaking http on :1080.

in short i'm opposed to this and will work to improve the world.

can someone please cull out a list of products whose latest versions are known
to be broken?
irs.gov is broken, at least, plus somewhere on my uni's intranet had this at one
stage, IIRC. I've also seen it in various places around the web.

The point is that this behaviour _is_ part of the standard. True, it's
deprecated, and it's only there because of a bug in the previous RFC, but it's
still there. If and when we start warning for page problems (bug
whatever-it-is), we can add this there. It also will not break any
non-deprecated url, so it _doesn't_ break existing pages.

http-on-1080 is a separate issue, and I disagree with what we're doing there anyway.
this issue should not be compared to other compromises.  by supporting this, i'm
not saying mozilla should kowtow to broken, inconsistent standards.  instead,
i'm saying let's do backwards compatibility when backwards compatibility makes
sense.   in this case it makes sense since it won't break anybody.  sure we've
tried to evangelize websites to encourage strict conformance with the spec, but
in this case there really isn't much value in being so strict.  for better or
for worse these kinds of relative URLs have become ubiquitous.

if we are to follow the RFCs recommended implementations to the letter then we'd
be a pretty lame browser, unable to visit most sites.  take our http
implementation for example.  so many compromises are required in order to make
the thing work.  ever heard of "pragma: no-cache" used as a response header? 
sure you have, but guess what?  it's invalid HTTP.  according to the spec it's
not meant to be used as a response header, but ignore it, and you suddenly break
all these websites.  should we evangelize them too?
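
The `pragma: no-cache` compromise mentioned above could look roughly like the following. This is a minimal sketch with a hypothetical helper name and a plain dict of headers; it is not Necko's actual logic, and it only considers the `no-cache` directive.

```python
def response_forbids_reuse(headers: dict) -> bool:
    """Return True if a response asks not to be reused without revalidation."""
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    directives = [d.strip() for d in normalized.get("cache-control", "").split(",")]
    if "no-cache" in directives:
        return True
    # Lenient step: honour "Pragma: no-cache" on a response as well, because
    # many sites send it that way even though it is not specified for responses.
    pragma = [d.strip() for d in normalized.get("pragma", "").split(",")]
    return "no-cache" in pragma


print(response_forbids_reuse({"Pragma": "no-cache"}))           # True
print(response_forbids_reuse({"Cache-Control": "max-age=60"}))  # False
```

A strict client would skip the lenient step and cache such responses, which is the breakage being described.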

or how about pipelining?  a webserver says it supports HTTP/1.1, but that
doesn't mean it won't barf on a pipelined request.  but the spec says it should
handle pipelined requests if it says it speaks HTTP/1.1 -- but many many servers
and transparent proxies get this wrong!!  enabling pipelining by default for
mozilla won't force all of those websites to fix or replace their broken
webservers... it'll just make people not want to use mozilla or mozilla-based
products.

frankly, it just seems like the benefits of this patch greatly outweigh the
advantages of the current implementation.
Does the patch in this bug also fix bug 22251, which deals with just the
protocol and colon, with no slash afterwards?
Yes, of course, it's the whole package ....
Blocks: 22251
Answer from aruner:

> As far as I'm concerned, Darin's comment 
> (http://bugzilla.mozilla.org/show_bug.cgi?id=32966#c85) makes sense to me and 
> voices how I stand on this subject.  So I'm not against the patch, but I'd like 
> to CC other people just in case they can think of bad things -- mainly bclary .
If we are not prevented by the standard from fixing this we should. This is a
pain in the ...
Okay, I think Darin's argument is very convincing and evangelism has nothing
against it ... expect a checkin within the next days ...
fix is in ... now who wants to visit all those dups ...
Status: NEW → RESOLVED
Closed: 24 years ago → 22 years ago
Resolution: --- → FIXED
*** Bug 107061 has been marked as a duplicate of this bug. ***
the patch seems to do only half of its work - http:/path still does NOT work.
only http:file works now.
*** Bug 163352 has been marked as a duplicate of this bug. ***
It works sort of, the problem is, it works only with links in pages this way. It
currently does not work with redirections, because:

1. The redirection code was not made specifically with relative urls in mind.
2. The code that is used does its own checking for relative urls, which it
shouldn't, it should leave that to ::Resolve.

I filed bug 163225 on the problem and attached a patch.
Alias: relative
Alias: relative → relative-http
Keywords: mozilla1.0.1
Comment on attachment 92767
patch freshed up

a=rjesup@wgate.com for 1.0 branch

Change mozilla1.0.2+ to fixed1.0.2 when checked in
Attachment #92767 - Flags: approval+
Target Milestone: --- → mozilla1.0.2
Keywords: testcase
QA Contact: ckritzer → benc
Summary: http:/ (one slash) treated as http:// rather than / → URL: http:/ (one slash) treated as http:// rather than /
Please verify the bug. Once verified, change the keyword fixed1.0.2 to
verified1.0.2 
I'll handle the dupes.
VERIFIED:
This did not work in 1.1, but as I understand it, it shouldn't because it wasn't
checked into branch 1.1

Mozilla 1.2a, allplats
Status: RESOLVED → VERIFIED
VERIFIED/BRANCH:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0.2) Gecko/20020924
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20020924
Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0.2) Gecko/20020924
*** Bug 190275 has been marked as a duplicate of this bug. ***