Bug 34648 - Netscape 6/Mozilla does not handle relative urls right
Status: VERIFIED INVALID
Product: Core
Classification: Components
Component: Networking
Version: Trunk
Hardware: x86 Windows 98
Importance: P3 blocker
Target Milestone: M16
Assigned To: Gagan
QA Contact: Tom Everingham
Triage Owner: Patrick McManus [:mcmanus]
URL: http://www.imandi.com/
Duplicates: 71727
Reported: 2000-04-05 12:49 PDT by michaelj
Modified: 2001-03-12 13:55 PST

Description michaelj 2000-04-05 12:49:55 PDT
We are a Digital City partner site (1.2M hits last month) and this bug renders
our site useless. Here are notes on this from our VP of Development:

review standard at:
http://sunsite.cnlab-switch.ch/ftp/doc/standard/rfc/18xx/1808

If you go look at the BNF (section 2.2) it shows that our URLs are properly
formed. Therefore this is a mozilla bug.

We are encoding our URLs as:
http:/page or http:/go/run.dll?Command

Netscape 6/Mozilla is not expecting this format; it wants:
http://host/page. We are leaving out //host, implying that it's an absolute path
relative to the current host. Netscape 6/Mozilla will apparently handle
relative URLs, but not when they are preceded by a protocol tag (http:).

To prove we are right, look at the BNF:

URL = {absoluteURL}
absoluteURL = scheme ":" relativeURL
relativeURL = abs_path
abs_path = "/" rel_path
rel_path = path

We are following standards. This problem does not exist with Netscape 4.x or
any version of IE. This has been tested and confirmed on Win98, WinNT, and
MacOS 8/9.
Comment 1 Asa Dotzler [:asa] 2000-04-05 14:03:26 PDT
-> Networking
Comment 2 Andreas Otte 2000-04-06 13:49:15 PDT
I think rfc2396 marks this usage as deprecated, and we have currently decided
not to support this kind of relative URL any longer. Look at bug 22251 for some
details.
Comment 3 michaelj 2000-04-06 17:08:50 PDT
I am mystified why you have chosen to stop supporting this form of relative
URL. Yes, I did read the notes on 22251 and found nothing solid about the
decision. Being deprecated is not a good reason; if it were, you would need
to break most of the sites on the web with code changes. This works just fine
in prior versions of Netscape, IE, Opera, and iCab. So we would have to do a
major code base change (along with many others using this form of URL).
Comment 4 Andreas Otte 2000-04-06 23:56:55 PDT
Not supporting it was not my decision, in fact as you might have noticed on bug 
22251, I had a solution for this problem which was saved on a branch. I'm CCing 
Warren on this.
Comment 5 Warren Harris 2000-04-07 08:44:05 PDT
michaelj: From the snippet of BNF you gave, I don't see how your URLs fit into
the relative syntax. They look absolute to me (because they have a scheme), and
they don't specify a host, which is why they don't work. The proper relative form
would be just 'page' or 'go/run.dll?Command'. I think what you're experiencing
is a bug in the interpretation of the spec in our earlier products that was
duplicated in IE. I guess you could say this is now a de facto standard, but we'd
like to eliminate it since it leads to numerous ambiguities that make other
types of URLs very difficult to parse. Any chance you can fix the relative links
on your site?
Comment 6 Andreas Otte 2000-04-07 09:11:20 PDT
This comes from rfc1808:

   URL         = ( absoluteURL | relativeURL ) [ "#" fragment ]

   absoluteURL = generic-RL | ( scheme ":" *( uchar | reserved ) )

   generic-RL  = scheme ":" relativeURL

   relativeURL = net_path | abs_path | rel_path

   net_path    = "//" net_loc [ abs_path ]
   abs_path    = "/"  rel_path
   rel_path    = [ path ] [ ";" params ] [ "?" query ]

I think the problem was the generic-RL which combined the scheme with a relative 
URL. This stuff was misused. The relative URL is clearly defined as 

   relativeURL = net_path | abs_path | rel_path

which says nothing about schemes. Please take a look at this snippet from rfc 
1808, especially Step 2b):

   Step 2: Both the base and embedded URLs are parsed into their
           component parts as described in Section 2.4.

           a) If the embedded URL is entirely empty, it inherits the
              entire base URL (i.e., is set equal to the base URL)
              and we are done.

           b) If the embedded URL starts with a scheme name, it is
              interpreted as an absolute URL and we are done.

           c) Otherwise, the embedded URL inherits the scheme of
              the base URL.


If the URL starts with a scheme, it's absolute and we are done ...


It really comes down to the question of how many sites break if we do not
support this.
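
Steps 2a to 2c quoted above can be sketched in a few lines of Python. This is only an illustration of the rule as quoted, not Mozilla's actual parser; the function name classify_embedded_url and the returned labels are made up for this example, and the scheme pattern (alphanumerics plus "+", "." and "-" before the first ":") is my reading of the scheme parsing described in RFC 1808 Section 2.4.

```python
import re
from typing import Tuple

# Assumed scheme syntax: alphanumerics, "+", "." or "-" followed by ":".
SCHEME_RE = re.compile(r"^([A-Za-z0-9+.-]+):")


def classify_embedded_url(embedded: str, base_scheme: str) -> Tuple[str, str]:
    """Apply only steps 2a-2c: decide how the embedded URL is treated.

    Returns a (kind, scheme) pair; the kind labels are hypothetical.
    """
    # Step 2a: an empty embedded URL inherits the entire base URL.
    if embedded == "":
        return ("inherits-base", base_scheme)
    # Step 2b: a leading scheme name means "absolute, and we are done".
    match = SCHEME_RE.match(embedded)
    if match:
        return ("absolute", match.group(1).lower())
    # Step 2c: otherwise the embedded URL inherits the base scheme.
    return ("relative", base_scheme)


if __name__ == "__main__":
    for url in ("http:/page", "/page", "go/run.dll?Command", ""):
        print(repr(url), classify_embedded_url(url, "http"))
```

Under this reading, "http:/page" stops at step 2b and never reaches the steps that would copy the host from the base URL, while "/page" falls through to step 2c.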
Comment 7 michaelj 2000-04-07 10:56:14 PDT
We can fix it; it just takes time and effort. My concern is that we cannot be
the only people using this style of link. Given the related bugs that have
been posted, it would seem to be something worth fixing. My other concern is the
ill will this type of break can create. From a marketing standpoint, is it a
good move? By the way, I want to compliment the responsiveness on this issue. As
a fellow developer I respect your effort. You can imagine our surprise at
loading Netscape 6 and seeing links on our site take us to go.com. It might be
worthwhile for you to re-evaluate this issue and look at the related bugs.
Comment 8 ejohnson 2000-04-10 19:51:24 PDT
Hi guys, I've been working with Michael on this one from our end.

First, to give you a bit of background on why we use this form, we can assign 
tertiary domain names to different servers within our cluster (for example 
www1.imandi.com, www2.imandi.com) and at times we prefer that a user stay on 
the same server from page to page. If we construct the URLs 
as: "http:/file.htm" then we can use the exact same html on each server without 
modification, and the user will "stick" on whichever server they are initially 
served to. We may be able to remove the http: without ill effect, but I'm not
sure; it seems like we originally added that to appease some minor browser, but
maybe I'm hallucinating. The guy that would know definitively is on vacation
right now...

Anyway, as discussed, others out there somewhere will probably have similar
issues.

To the point at hand, I looked at rfc2396 and am nonplussed. I acknowledge the 
section of text that you referred to, but the BNF included in 2396 still goes 
out of its way to support exactly the form that we use: 

>>>   URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]

>>>   absoluteURI   = scheme ":" ( hier_part | opaque_part )
      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

>>>   hier_part     = ( net_path | abs_path ) [ "?" query ]
      opaque_part   = uric_no_slash *uric

      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","

      net_path      = "//" authority [ abs_path ]
>>>   abs_path      = "/"  path_segments
      rel_path      = rel_segment [ abs_path ]

As you can see, "http:/go/run.dll?foo" could be parsed as follows:

scheme : {hier_part} ==>
scheme : {abs_path} ? {query} ==>
scheme : / {path_segments} ? {query}

The interpretation I make of this is that when the RFC refers to "absolute URI" 
they mean absolute from the host, not absolute with respect to the entire 
internet...?
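
For anyone who wants to check the derivation above mechanically, here is a small Python sketch. It uses the component-splitting regular expression given in Appendix B of rfc2396; the helper name split_uri and the dictionary keys are just names chosen for this example. It shows that the string does match the generic syntax, but with no authority (host) component at all.

```python
import re

# Regular expression from rfc2396 Appendix B for breaking a URI reference
# into its components. Every part is optional, so it always matches.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into scheme, authority, path, query, fragment."""
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),  # None when there is no "//host" part
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    print(split_uri("http:/go/run.dll?foo"))
    print(split_uri("/go/run.dll?foo"))
```

Both inputs yield the path "/go/run.dll" and the query "foo" with no authority; the only difference is that the first one carries a scheme, which is what makes a resolver treat it as absolute.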
Comment 9 Warren Harris 2000-04-10 20:04:00 PDT
Well, technically, you're right that this does make an absolute url, but it
makes one without a host. And since it's absolute, we don't try to resolve it
relative to the current page; consequently we never get a host. (Our rule says
that if it has a scheme, then it's absolute -- and that looks like what the
syntax says too.)

I think this form of absolute url syntax (without a host) should only be used 
for file: urls and others that don't need a host to be fetched. 
Comment 10 Andreas Otte 2000-04-10 23:59:30 PDT
Why not use /file.htm instead of http:/file.htm? That would do the same job, 
that would resolve to the same server as the base page. Granted, you are not 
able to change protocol staying on the same server, and that's the only reason I 
can think of to use this style of "relative" URI.
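
A quick way to see that /file.htm keeps the user on whichever server served the page is to resolve it against two different base URLs. The sketch below uses Python's standard urllib.parse.urljoin purely as a stand-in resolver; the base page path /some/dir/page.htm is invented for the example, and the host names are the ones mentioned in comment 8.

```python
from urllib.parse import urljoin


def resolve_on(host, reference):
    """Resolve a reference against a hypothetical page on the given host."""
    base = "http://" + host + "/some/dir/page.htm"  # made-up base page
    return urljoin(base, reference)


if __name__ == "__main__":
    for host in ("www1.imandi.com", "www2.imandi.com"):
        # The absolute path replaces the base path but keeps scheme and host.
        print(resolve_on(host, "/file.htm"))
```

The same HTML can therefore be deployed unchanged on every server in the cluster, which is the property the http:/file.htm form was meant to provide.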
Comment 11 Gagan 2000-04-12 02:37:02 PDT
I agree with Andreas. It should be possible to use a /foo syntax to fix all your
http:/foo type URLs and retain your requirement to allow these pages to work on
multiple websites. And it should be a reasonably simple search and replace on
your website. Let us know why this solution would not work for you.

Otherwise this bug is going to be marked INVALID.
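
As a rough idea of what the search and replace could look like, here is a minimal Python sketch. It assumes the links appear in quoted href, src, or action attributes, which is a guess about the site's markup; the function name fix_links is made up for this example. Real pages may need a proper HTML parser instead of a regular expression.

```python
import re

# Match a quoted href/src/action value that starts with "http:/" but not
# "http://", i.e. the host-less form discussed in this bug.
HOSTLESS_HTTP = re.compile(
    r"""(\b(?:href|src|action)\s*=\s*["'])http:/(?!/)""",
    re.IGNORECASE,
)


def fix_links(html):
    """Rewrite http:/path links to /path, leaving http://host/path alone."""
    return HOSTLESS_HTTP.sub(r"\1/", html)


if __name__ == "__main__":
    print(fix_links('<a href="http:/go/run.dll?Command">run</a>'))
    print(fix_links('<a href="http://www.imandi.com/">home</a>'))
```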
Comment 12 Gagan 2000-04-14 19:23:01 PDT
Haven't heard from the reporter; marking as INVALID.
Comment 13 Tom Everingham 2000-07-05 14:23:45 PDT
verified INVALID
Comment 14 Peter ``jag'' Annema 2001-03-12 13:55:28 PST
*** Bug 71727 has been marked as a duplicate of this bug. ***
