Closed Bug 34648 Opened 25 years ago Closed 25 years ago

Netscape 6/Mozilla does not handle relative urls right

Categories

(Core :: Networking, defect, P3)

x86
Windows 98
defect

Tracking

()

VERIFIED INVALID

People

(Reporter: michaelj, Assigned: gagan)

Details

(Whiteboard: 0d)

We are a Digital City partner site (1.2M hits last month) and this bug renders our site useless. Here are notes on this from our VP of Development:

Review the standard at: http://sunsite.cnlab-switch.ch/ftp/doc/standard/rfc/18xx/1808

If you look at the BNF (section 2.2), it shows that our URLs are properly formed. Therefore this is a Mozilla bug.

We are encoding our URLs as:

  http:/page
  or
  http:/go/run.dll?Command

Netscape 6/Mozilla is not expecting this format; it wants http://host/page. We are leaving out //host, implying that it's an absolute URL relative to the current host. Netscape 6/Mozilla will apparently handle relative URLs, but not when preceded by a protocol tag (http:) (?)

To prove we are right, look at the BNF:

  URL = {absoluteURL}
  absoluteURL = scheme ":" relativeURL
  relativeURL = abs_path
  abs_path = "/" rel_path
  rel_path = path

We are following standards. This problem does not exist with Netscape 4.x or any version of IE. This has been tested and confirmed on Win98, WinNT, and MacOS 8/9.
-> Networking
Assignee: asadotzler → gagan
Status: UNCONFIRMED → NEW
Component: Browser-General → Networking
Ever confirmed: true
QA Contact: jelwell → tever
I think RFC 2396 marks this usage as deprecated, and we have currently decided to no longer support this kind of relative URL. Look at bug 22251 for some details.
Target Milestone: --- → M16
I am mystified why you have chosen to stop supporting this form of relative URL. Yes, I did read the notes on 22251 and found nothing solid about the decision. Being deprecated is not a good reason; if it were, you would need to break most of the sites on the web with code changes. This works just fine in prior versions of Netscape, IE, Opera, and iCab. So we would have to do a major code base change (along with many others using this form of URL).
Not supporting it was not my decision, in fact as you might have noticed on bug 22251, I had a solution for this problem which was saved on a branch. I'm CCing Warren on this.
michaelj: From the snippet of BNF you gave, I don't see how your URLs fit into the relative syntax. They look absolute to me (because they have a scheme), and they don't specify a host, which is why they don't work. The proper relative form would be just 'page' or 'go/run.dll?Command'. I think what you're experiencing is a bug in the interpretation of the spec in our earlier products that was duplicated in IE. I guess you could say this is now a de facto standard, but we'd like to eliminate it since it leads to numerous ambiguities that make other types of URLs very difficult to parse. Any chance you can fix the relative links on your site?
This comes from RFC 1808:

  URL         = ( absoluteURL | relativeURL ) [ "#" fragment ]
  absoluteURL = generic-RL | ( scheme ":" *( uchar | reserved ) )
  generic-RL  = scheme ":" relativeURL
  relativeURL = net_path | abs_path | rel_path
  net_path    = "//" net_loc [ abs_path ]
  abs_path    = "/" rel_path
  rel_path    = [ path ] [ ";" params ] [ "?" query ]

I think the problem was the generic-RL, which combined the scheme with a relative URL. This stuff was misused. The relative URL is clearly defined as

  relativeURL = net_path | abs_path | rel_path

which says nothing about schemes. Please take a look at this snippet from RFC 1808, especially Step 2b):

  Step 2: Both the base and embedded URLs are parsed into their component parts as described in Section 2.4.
    a) If the embedded URL is entirely empty, it inherits the entire base URL (i.e., is set equal to the base URL) and we are done.
    b) If the embedded URL starts with a scheme name, it is interpreted as an absolute URL and we are done.
    c) Otherwise, the embedded URL inherits the scheme of the base URL.

If the URL starts with a scheme, it's absolute and we are done ...

It really comes down to the question of how many sites break if we do not support this.
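To make Step 2 concrete, here is a minimal Python sketch of the two behaviors being argued about. It is not Mozilla's code: the function name `resolve` and the `lenient` flag are made up for illustration, and the lenient branch is only an assumption about what the older browsers mentioned in this bug roughly did. Relative paths are handed to the standard library's `urljoin` rather than reimplemented.

```python
from urllib.parse import urljoin, urlsplit


def resolve(base: str, embedded: str, lenient: bool = False) -> str:
    """Resolve `embedded` against `base`, following RFC 1808 Step 2a-2c.

    `lenient` is a hypothetical switch: when True, a same-scheme URL
    without a host (e.g. "http:/file.htm") is treated as relative.
    """
    # Step 2a: an empty embedded URL inherits the entire base URL.
    if embedded == "":
        return base

    emb = urlsplit(embedded)
    if emb.scheme:
        same_scheme = emb.scheme == urlsplit(base).scheme
        if lenient and same_scheme and not emb.netloc:
            # Assumed legacy behavior: drop "scheme:" and resolve the rest.
            return urljoin(base, embedded[len(emb.scheme) + 1:])
        # Step 2b: it starts with a scheme, so it is absolute and we are done.
        return embedded

    # Step 2c onward: no scheme, so resolve relative to the base URL.
    return urljoin(base, embedded)


base = "http://www1.imandi.com/dir/page.htm"
print(resolve(base, "http:/file.htm"))                # strict: left as-is, no host
print(resolve(base, "http:/file.htm", lenient=True))  # lenient: host taken from base
print(resolve(base, "/file.htm"))                     # plain abs_path always works
```

Under the strict reading the first call returns the URL unchanged, which is why no host is ever filled in.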
We can fix it; it just takes time and effort. My concern is that we cannot be the only people using this style of link. Given the related bugs that have been posted, it would seem to be something worth fixing. My other concern is the ill will this type of break can create. From a marketing standpoint, is it a good move? By the way, I want to compliment the responsiveness on this issue. As a fellow developer, I respect your effort. You can imagine our surprise at loading 6 and seeing links on our site take us to go.com. It might be worthwhile for you to re-evaluate this issue and look at the related bugs.
Hi guys, I've been working with Michael on this one from our end.

First, to give you a bit of background on why we use this form: we can assign tertiary domain names to different servers within our cluster (for example www1.imandi.com, www2.imandi.com), and at times we prefer that a user stay on the same server from page to page. If we construct the URLs as "http:/file.htm", then we can use the exact same HTML on each server without modification, and the user will "stick" on whichever server they are initially served to. We may be able to remove the http: without ill effect. I'm not sure; it seems like we originally added that to appease some minor browser, but maybe I'm hallucinating. The guy that would know definitively is on vacation right now... Anyway, as discussed, others out there somewhere will probably have similar issues.

To the point at hand, I looked at RFC 2396 and am nonplussed. I acknowledge the section of text that you referred to, but the BNF included in 2396 still goes out of its way to support exactly the form that we use:

  >>> URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
  >>> absoluteURI = scheme ":" ( hier_part | opaque_part )
      relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
  >>> hier_part = ( net_path | abs_path ) [ "?" query ]
      opaque_part = uric_no_slash *uric
      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
      net_path = "//" authority [ abs_path ]
  >>> abs_path = "/" path_segments
      rel_path = rel_segment [ abs_path ]

As you can see, "http:/go/run.dll?foo" could be parsed as follows:

  scheme : {hier_part}
  ==> scheme : {abs_path} ? {query}
  ==> scheme : / {path_segments} ? {query}

The interpretation I make of this is that when the RFC refers to "absolute URI" they mean absolute from the host, not absolute with respect to the entire internet...?
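The derivation above can be checked against a generic URL splitter. The sketch below uses Python's standard `urlsplit`, which is only a stand-in for illustration and not the parser of any browser discussed here; the helper name `describe` is made up. It shows that the string does split into scheme, abs_path, and query, but with an empty authority component.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL and report the components relevant to this bug."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # text before the first ":"
        "netloc": parts.netloc,   # authority after "//", empty if absent
        "path": parts.path,
        "query": parts.query,
    }


# The hostless form used on the reporter's site: a scheme, but no authority.
print(describe("http:/go/run.dll?foo"))
# The same path written as a net_path with an explicit host.
print(describe("http://www1.imandi.com/go/run.dll?foo"))
```

The first result has `scheme` set to `http` and an empty `netloc`, so the parse succeeds but there is no host to connect to.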
Well, technically, you're right that this does make an absolute URL, but it makes one without a host. And since it's absolute, we don't try to resolve it relative to the current page; consequently we never get a host. (Our rule says that if it has a scheme, then it's absolute -- and that looks like what the syntax says too.) I think this form of absolute URL syntax (without a host) should only be used for file: URLs and others that don't need a host to be fetched.
Why not use /file.htm instead of http:/file.htm? That would do the same job, since it resolves to the same server as the base page. Granted, you are not able to change protocol while staying on the same server, and that's the only reason I can think of to use this style of "relative" URI.
I agree with Andreas. It should be possible to use a /foo syntax to fix all your http:/foo type URLs and retain your requirement to allow these pages to work on multiple websites. And it should be a reasonably simple search and replace on your website. Let us know why this solution would not work for you. Otherwise this bug is going to be marked INVALID.
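As a rough idea of what that search and replace could look like, here is a small Python sketch. It is an assumption about how one might do it, not something taken from the reporter's site: the names `HOSTLESS_HTTP` and `strip_hostless_scheme` are made up, and the pattern rewrites every occurrence in the text, not only those inside href/src attributes, so the output should be reviewed before deploying.

```python
import re

# "http:/" NOT followed by a second slash, so "http://host/..." is left alone.
# The \b keeps the pattern from matching inside a longer word.
HOSTLESS_HTTP = re.compile(r"\bhttp:/(?!/)", re.IGNORECASE)


def strip_hostless_scheme(html: str) -> str:
    """Rewrite hostless 'http:/path' links to the plain '/path' form."""
    return HOSTLESS_HTTP.sub("/", html)


page = (
    '<a href="http:/file.htm">one</a> '
    '<a href="http:/go/run.dll?Command">two</a> '
    '<a href="http://www1.imandi.com/file.htm">three</a>'
)
print(strip_hostless_scheme(page))
```

The first two links become `/file.htm` and `/go/run.dll?Command`; the third, which already names a host, is untouched.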
Whiteboard: 0d
haven't heard from the reporter, marking as INVALID.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
verified INVALID
Status: RESOLVED → VERIFIED
*** Bug 71727 has been marked as a duplicate of this bug. ***