Netscape 6/Mozilla does not handle relative urls right

VERIFIED INVALID

Status

Core
Networking
P3
blocker
VERIFIED INVALID
18 years ago
17 years ago

People

(Reporter: michaelj, Assigned: Gagan)

Tracking

Trunk
x86
Windows 98
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: 0d, URL)

(Reporter)

Description

18 years ago
We are a Digital City partner site (1.2M hits last month) and this bug renders
our site useless. Here are notes on this from our VP of Development:

review standard at:
http://sunsite.cnlab-switch.ch/ftp/doc/standard/rfc/18xx/1808

If you go look at the BNF (section 2.2) it shows that our URLs are properly
formed. Therefore this is a mozilla bug.

We are encoding our URLs as:
http:/page or http:/go/run.dll?Command

Netscape 6/Mozilla is not expecting this format; it wants
http://host/page. We are leaving out //host, implying that it's an absolute URL
relative to the current host. Netscape 6/Mozilla will apparently handle
relative URLs, but not when they are preceded by a protocol tag (http:) (?)

To prove we are right, look at the BNF:

URL = {absoluteURL}
absoluteURL = scheme ":" relativeURL
relativeURL = abs_path
abs_path = "/" rel_path
rel_path = path

We are following standards. This problem does not exist with Netscape 4.x or
any version of IE. This has been tested and confirmed on Win98, WinNT, and
MacOS 8/9.

Comment 1

18 years ago
-> Networking
Assignee: asadotzler → gagan
Status: UNCONFIRMED → NEW
Component: Browser-General → Networking
Ever confirmed: true
QA Contact: jelwell → tever

Comment 2

18 years ago
I think rfc2396 marks this usage as deprecated, and we have currently decided
not to support this kind of relative URL any longer. Look at bug 22251 for some
details.

Updated

18 years ago
Target Milestone: --- → M16
(Reporter)

Comment 3

18 years ago
I am mystified as to why you have chosen to stop supporting this form of relative 
URL. Yes, I did read the notes on 22251, and I find nothing solid about the 
decision. Being deprecated is not a good reason; if it were, you would need 
to break most of the sites on the web with code changes. This works just fine 
in prior versions of Netscape, IE, Opera, and iCab. So we would have to do a 
major code base change (along with many others using this form of URL).

Comment 4

18 years ago
Not supporting it was not my decision, in fact as you might have noticed on bug 
22251, I had a solution for this problem which was saved on a branch. I'm CCing 
Warren on this.

Comment 5

18 years ago
michaelj: From the snippet of BNF you gave, I don't see how your URLs fit into 
the relative syntax. They look absolute to me (because they have a scheme), and 
they don't specify a host, which is why they don't work. The proper relative form 
would be just 'page' or 'go/run.dll?Command'. I think what you're experiencing 
is a bug in the interpretation of the spec in our earlier products that was 
duplicated in IE. I guess you could say this is now a de facto standard, but we'd 
like to eliminate it since it leads to numerous ambiguities that make other 
types of URLs very difficult to parse. Any chance you can fix the relative links 
on your site?

Comment 6

18 years ago
This comes from rfc1808:

   URL         = ( absoluteURL | relativeURL ) [ "#" fragment ]

   absoluteURL = generic-RL | ( scheme ":" *( uchar | reserved ) )

   generic-RL  = scheme ":" relativeURL

   relativeURL = net_path | abs_path | rel_path

   net_path    = "//" net_loc [ abs_path ]
   abs_path    = "/"  rel_path
   rel_path    = [ path ] [ ";" params ] [ "?" query ]

I think the problem was the generic-RL which combined the scheme with a relative 
URL. This stuff was misused. The relative URL is clearly defined as 

   relativeURL = net_path | abs_path | rel_path

which says nothing about schemes. Please take a look at this snippet from rfc 
1808, especially Step 2b):

   Step 2: Both the base and embedded URLs are parsed into their
           component parts as described in Section 2.4.

           a) If the embedded URL is entirely empty, it inherits the
              entire base URL (i.e., is set equal to the base URL)
              and we are done.

           b) If the embedded URL starts with a scheme name, it is
              interpreted as an absolute URL and we are done.

           c) Otherwise, the embedded URL inherits the scheme of
              the base URL.


If the URL starts with a scheme, it's absolute and we are done ...


It really comes down to the question of how many sites break if we do not support 
this.
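
For illustration only, the Step 2 rules quoted above can be sketched in a few lines of Python. This is a minimal sketch, not Mozilla's parser: the function name `classify_embedded_url` and the base page path are hypothetical, and the scheme pattern follows the character set given in RFC 1808 section 2.4.2.

```python
import re

# RFC 1808 section 2.4.2: a scheme is a run of alphanumerics, "+", "." or "-"
# that is terminated by the first ":".
SCHEME_RE = re.compile(r"^([A-Za-z0-9+.\-]+):")


def classify_embedded_url(base_url, embedded_url):
    """Apply steps 2a-2c of RFC 1808 section 4 (illustrative only)."""
    if embedded_url == "":
        # Step 2a: an empty embedded URL inherits the entire base URL.
        return ("inherits-base", base_url)
    if SCHEME_RE.match(embedded_url):
        # Step 2b: a leading scheme name means absolute; resolution stops,
        # so no host is ever borrowed from the base URL.
        return ("absolute", embedded_url)
    # Step 2c: otherwise the embedded URL inherits the base scheme; the
    # later steps of the RFC would then fill in net_loc and path.
    base_scheme = SCHEME_RE.match(base_url).group(1)
    return ("relative", base_scheme)


# Hypothetical base page on one of the reporter's servers.
BASE = "http://www1.imandi.com/dir/page.htm"
print(classify_embedded_url(BASE, "http:/go/run.dll?Command"))
print(classify_embedded_url(BASE, "/go/run.dll?Command"))
```

Under these rules the first call is classified as absolute and is returned unchanged, without a host, while the second inherits the scheme of the base page.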
(Reporter)

Comment 7

18 years ago
We can fix it; it just takes time and effort. My concern is that we cannot be 
the only people using this style of link. Given the related bugs that have 
been posted, it would seem to be something worth fixing. My other concern is the 
ill will this type of break can create. From a marketing standpoint, is it a 
good move? By the way, I want to compliment the responsiveness on this issue. As 
a fellow developer I respect your effort. You can imagine our surprise at 
loading 6 and seeing links on our site take us to go.com. It might be 
worthwhile for you to re-evaluate this issue and look at the related bugs.

Comment 8

17 years ago
Hi guys, I've been working with Michael on this one from our end.

First, to give you a bit of background on why we use this form, we can assign 
tertiary domain names to different servers within our cluster (for example 
www1.imandi.com, www2.imandi.com) and at times we prefer that a user stay on 
the same server from page to page. If we construct the URLs 
as: "http:/file.htm" then we can use the exact same html on each server without 
modification, and the user will "stick" on whichever server they are initially 
served to. We may be able to remove the http: without ill effect, but I'm not 
sure; it seems like we originally added that to appease some minor browser, but 
maybe I'm hallucinating. The guy who would know definitively is on vacation right 
now...

Anyway, as discussed, others out there somewhere will probably have similar 
issues.

To the point at hand, I looked at rfc2396 and am nonplussed. I acknowledge the 
section of text that you referred to, but the BNF included in 2396 still goes 
out of its way to support exactly the form that we use: 

>>>   URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]

>>>   absoluteURI   = scheme ":" ( hier_part | opaque_part )
      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

>>>   hier_part     = ( net_path | abs_path ) [ "?" query ]
      opaque_part   = uric_no_slash *uric

      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","

      net_path      = "//" authority [ abs_path ]
>>>   abs_path      = "/"  path_segments
      rel_path      = rel_segment [ abs_path ]

As you can see, "http:/go/run.dll?foo" could be parsed as follows:

scheme : {hier_part} ==>
scheme : {abs_path} ? {query} ==>
scheme : / {path_segments} ? {query}

The interpretation I make of this is that when the RFC refers to "absolute URI" 
they mean absolute from the host, not absolute with respect to the entire 
internet...?
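
As a quick check of the component split described above, here is a minimal sketch using Python's standard `urllib.parse.urlsplit`. Note the assumption: that function follows the later generic URI syntax rather than rfc2396 itself, so it only illustrates how a generic parser splits this string and is not evidence of what any browser does.

```python
from urllib.parse import urlsplit

# The hostless form from this report: scheme, then a single slash.
parts = urlsplit("http:/go/run.dll?foo")

print("scheme:", repr(parts.scheme))  # 'http'
print("host:  ", repr(parts.netloc))  # '' (no "//" means no authority component)
print("path:  ", repr(parts.path))    # '/go/run.dll'
print("query: ", repr(parts.query))   # 'foo'
```

The string does parse as scheme, abs_path, and query, but the authority (host) component comes back empty, and nothing in the parse itself supplies one.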

Comment 9

17 years ago
Well, technically, you're right that this does make an absolute url, but it 
makes one without a host. And since it's absolute, we don't try to resolve it 
relative to the current page; consequently we never get a host. (Our rule says 
that if it has a scheme, then it's absolute -- and that looks like what the 
syntax says too.)

I think this form of absolute url syntax (without a host) should only be used 
for file: urls and others that don't need a host to be fetched. 

Comment 10

17 years ago
Why not use /file.htm instead of http:/file.htm? That would do the same job: 
it would resolve to the same server as the base page. Granted, you are not 
able to change protocol while staying on the same server, and that's the only 
reason I can think of to use this style of "relative" URI.
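
A minimal sketch of that suggestion, using Python's standard `urllib.parse.urljoin` as a stand-in resolver. The host names come from the reporter's comment; the page path `/some/dir/page.htm` is a made-up example.

```python
from urllib.parse import urljoin

servers = ["www1.imandi.com", "www2.imandi.com"]

# The same link text, "/file.htm", is resolved against a page on each server.
resolved = [
    urljoin(f"http://{host}/some/dir/page.htm", "/file.htm")  # hypothetical page path
    for host in servers
]

for url in resolved:
    print(url)
# http://www1.imandi.com/file.htm
# http://www2.imandi.com/file.htm
```

The identical HTML keeps the user on whichever server delivered the page, which is the "sticky" behavior the http:/file.htm form was meant to provide.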
(Assignee)

Comment 11

17 years ago
I agree with Andreas. It should be possible to use a /foo syntax to fix all your
http:/foo type URLs and retain your requirement to allow these pages to work on
multiple websites. And it should be a reasonably simple search and replace on
your website. Let us know why this solution would not work for you.

Otherwise this bug is going to be marked INVALID.
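
One possible shape for that search and replace, as a rough sketch only. The attribute list (href, src, action) and the helper name `drop_hostless_scheme` are assumptions; a real site would need to review which attributes and templates actually contain these links.

```python
import re

# Matches href="http:/x", src='http:/x', etc. where exactly one slash follows
# "http:". The (?!/) lookahead leaves ordinary http://host/... links untouched.
HOSTLESS_LINK = re.compile(
    r"""((?:href|src|action)\s*=\s*["'])http:(/(?!/))""",
    re.IGNORECASE,
)


def drop_hostless_scheme(html):
    """Rewrite http:/path links to /path (hypothetical helper)."""
    return HOSTLESS_LINK.sub(r"\1\2", html)


before = '<a href="http:/go/run.dll?Command">run</a> <a href="http://host/page">abs</a>'
print(drop_hostless_scheme(before))
# <a href="/go/run.dll?Command">run</a> <a href="http://host/page">abs</a>
```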
(Assignee)

Updated

17 years ago
Whiteboard: 0d
(Assignee)

Comment 12

17 years ago
haven't heard from the reporter, marking as INVALID.
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → INVALID

Comment 13

17 years ago
verified INVALID
Status: RESOLVED → VERIFIED

Comment 14

17 years ago
*** Bug 71727 has been marked as a duplicate of this bug. ***