Closed Bug 22251 Opened 25 years ago Closed 22 years ago

chin.buffnet.net - Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute

Categories

(Tech Evangelism Graveyard :: English US, defect, P2)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: jcarpenter0524, Unassigned)

References

()

Details

(Keywords: helpwanted, Whiteboard: [SYNTAX-URL][aok])

Attachments

(3 obsolete files)

Overview Description:
This page uses <base href="http://www.chin.buffnet.net/"> for its internal links, but they don't work. I attempted to recreate this problem locally, but it worked fine on all examples...?

Steps to Reproduce:
- Go to the URL: http://www.chin.buffnet.net/
- Move down the page to a section called "Other ChinNet information"
- Click on any of the links in this section (except the last one, which goes outside the site)
- Notice in the URL bar that, for example, "Cage Designs" goes to the link "http://cages.html/" when instead it should go to "http://www.chin.buffnet.net/cages.html"

Actual Results:
The internal link attempts to take you to the .html file without using the base tag href: http://cages.html/

Expected Results:
Expect the base href to be inserted before the .html file name: http://www.chin.buffnet.net/cages.html

Build Date & Platform Bug Found:
1999-12-20-12 Win98

Additional Builds and Platforms Tested On:
1999-12-20-08 Linux
1999-12-20-08 Mac

Additional Information:
Priority: P3 → P2
QA Contact: gerardok → janc
changing qa assignment to janc
<html>
<head><base href="http://www.chin.buffnet.net/"></head>
<body>
<A href="http:cages.html">Cage designs.</A>
</body>
</html>
Notice the 'http:cages.html' ...
Whiteboard: [TESTCASE]
Assignee: vidur → warren
Component: DOM Level 1 → Networking
The code that resolves relative URLs (NS_MakeAbsoluteURI) thinks that the value of the href attribute of the anchor ("http:cages.html") is already an absolute URL and doesn't do any further resolution. The attribute isn't really well-formed, but I'll let Warren figure out whether we want to be backward compatible in this case.
Vidur is right: RFC 2396 says

  If the scheme component is defined, indicating that the reference starts with a scheme name, then the reference is interpreted as an absolute URI and we are done. Otherwise, the reference URI's scheme is inherited from the base URI's scheme component.

  Due to a loophole in prior specifications [RFC1630], some parsers allow the scheme name to be present in a relative URI if it is the same as the base URI scheme. Unfortunately, this can conflict with the correct parsing of non-hierarchical URI. For backwards compatibility, an implementation may work around such references by removing the scheme if it matches that of the base URI and the scheme is known to always use the <hier_part> syntax. The parser can then continue with the steps below for the remainder of the reference components. Validating parsers should mark such a misformed relative reference as an error.

Adding this backwards compatibility will be a real pain, but I will take a look.
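The backwards-compatibility workaround in the RFC 2396 passage quoted above can be sketched roughly as follows. This is a minimal Python illustration, not the Mozilla code: the function name resolve_with_compat and the HIERARCHICAL_SCHEMES set are assumptions made for the example, and urllib.parse.urljoin stands in for the ordinary relative-resolution steps.

```python
from urllib.parse import urljoin

# Assumption: schemes treated here as always using the <hier_part> syntax.
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}


def resolve_with_compat(base, ref):
    """Resolve ref against base, tolerating a redundant same-scheme prefix."""
    base_scheme = base.split(":", 1)[0].lower()
    scheme, sep, rest = ref.partition(":")
    if (
        sep
        and scheme.isalpha()
        and scheme.lower() == base_scheme
        and base_scheme in HIERARCHICAL_SCHEMES
    ):
        # Drop the scheme, then continue as for an ordinary relative reference.
        ref = rest
    return urljoin(base, ref)


print(resolve_with_compat("http://www.chin.buffnet.net/", "http:cages.html"))
print(resolve_with_compat("http://www.chin.buffnet.net/", "rdf:bookmarks"))
```

Under these assumptions the first call yields http://www.chin.buffnet.net/cages.html, while rdf:bookmarks is returned unchanged because its scheme does not match the base.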
*** Bug 22894 has been marked as a duplicate of this bug. ***
Summary: base href tag not working on this URL → Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute
As seen in bug 22894, the problem with having schemes on relative URLs happens on pages without a <BASE> element. The base URL should be interpolated if the page is loaded from its "home" server according to the rules in section 5.2 of RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax", <URL:http://www.ietf.org/rfc/rfc2396.txt>, whether or not there is a <BASE> element. Changing summary from "base href tag not working on this URL" to "Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute"
The problem here is to make the distinction between http:page.html, which is relative, and rdf:bookmarks, which is absolute. I tried to make nsStdURL::Resolve smart enough to resolve http:page.html correctly, and it works, but NS_MakeAbsoluteURI (which calls Resolve) from nsNetUtil.h wants to be even smarter and tries to detect on its own whether the URI is absolute. If that check is removed from NS_MakeAbsoluteURI, the rdf-URIs are no longer treated as absolute, and mozilla looks a bit ugly afterwards. Is it possible to make rdf-URIs of the type nsSimpleURI? nsSimpleURI::Resolve could return the string passed in instead of asserting. Warren?
Blocks: 13449
Target Milestone: M14
I think the criterion for relative vs. absolute should be whether there's a // (or //<something>/) or not. E.g. given a base URL of http://foo.com/bar/baz.html and a relative URL of http:page.html, http:/page.html, or http://page.html (with no directory portion), these should all resolve to http://foo.com/bar/page.html. However, if the relative URL is http://a/page.html, this should resolve to http://a/page.html (where 'a' is interpreted as the hostname).
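The "//" criterion proposed in this comment could look something like the sketch below. It is written in Python purely for illustration; resolve_by_slash_rule is a hypothetical name, and the code follows only the four examples given in the comment rather than any actual implementation.

```python
from urllib.parse import urljoin


def resolve_by_slash_rule(base, ref):
    """Treat scheme:... as absolute only when it has the form scheme://host/..."""
    scheme, sep, rest = ref.partition(":")
    if not sep or not scheme.isalpha():
        # No scheme prefix at all: ordinary relative resolution.
        return urljoin(base, ref)
    if rest.startswith("//") and "/" in rest[2:]:
        # "//<something>/" is present, so the reference names its own host.
        return ref
    # Otherwise drop the scheme and any leading slashes, then resolve
    # against the base document's directory.
    return urljoin(base, rest.lstrip("/"))


base = "http://foo.com/bar/baz.html"
for ref in ("http:page.html", "http:/page.html", "http://page.html", "http://a/page.html"):
    print(ref, "->", resolve_by_slash_rule(base, ref))
```

Note that this rule never looks at which scheme is involved, so a reference such as rdf:bookmarks would also be rewritten relative to the base.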
But this does not help with http:page.html versus rdf:bookmarks. rdf:bookmarks is absolute. NS_MakeAbsoluteURI detects this by searching for a : in the spec, so http:page.html is also absolute for this function.
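To make the point concrete, here is a small Python sketch of a colon-based absoluteness check. The function looks_absolute_naive is hypothetical and only imitates the behaviour described in this comment; it is not the NS_MakeAbsoluteURI source.

```python
def looks_absolute_naive(spec):
    """Report a spec as absolute if a ':' appears before any '/', '?' or '#'."""
    colon = spec.find(":")
    if colon <= 0:
        return False
    # A ':' can only terminate a scheme if no path, query or fragment
    # delimiter comes before it.
    return not any(ch in "/?#" for ch in spec[:colon])


for spec in ("rdf:bookmarks", "http:page.html", "page.html"):
    print(spec, looks_absolute_naive(spec))
```

Both rdf:bookmarks and http:page.html pass this check, so a test of this kind alone cannot tell the opaque URI from the scheme-prefixed relative one.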
rdf: should be using nsSimpleURI, not nsStdURL.
From looking at the sources, there seems to be a list of URIs that is read in on startup. These can be all sorts of URIs, hierarchical or not (nsStdURL or nsSimpleURI), files, chrome, http, whatever ... rdf: is not the only problem; urn: comes to mind, same problem ... There is no simple fix to this. I think we need a kind of registration service here which registers schemes mozilla knows about with some additional information (hierarchical or not, host/authentication or not). The parser/resolver then has to act on that additional information.
I think you're getting sidetracked here. Remember that the protocol dictates the URL parsing scheme. For rdf, it decides to use nsSimpleURI and doesn't have this problem. But for http (and other protocols that use nsStdURL), nsStdURL needs to treat things like "http:page.html" as a relative spec, i.e. MakeAbsolute/Resolve need to not treat the string as if it were absolute.
You are right, it should do that, but I don't think there is such a thing as an rdf protocol-handler or a urn protocol-handler. Regardless of that, the base URL is in most of these cases a chrome-URI, and "rdf:bookmarks" is a simple string given to MakeAbsolute. If we detect a scheme in the string given to Resolve, we could query the protocol-handler (if one exists) for the kind of URI it would create and then act on that.
Bulk moving [testcase] code to new testcase keyword. Sorry for the spam!
Keywords: testcase
Keywords: beta1
Whiteboard: [TESTCASE] → [TESTCASE][PDT+]
Andreas, Is this fixed now?
Assignee: warren → andreas.otte
Warren: No, I think we still need to think about this. Have you read my last comments on this? We need to have a way to distinguish simple from normal uris.
Keywords: helpwanted
Okay, I implemented a GetUritype method on nsIProtocolHandler which gives some information on the type of url that is used. This gives resolve the means to distinguish between http:page.html (relative) and rdf:bookmarks.html (absolute). So far it works fine, but I'm still testing ...
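The idea of asking the protocol handler what kind of URI a scheme produces might be sketched like this. Everything here is an assumption made for illustration: UriType, HANDLER_URI_TYPES, get_uri_type and resolve are hypothetical Python stand-ins, not the nsIProtocolHandler interface or the code on the branch.

```python
import re
from enum import Enum
from urllib.parse import urljoin


class UriType(Enum):
    STANDARD = "standard"  # hierarchical, in the spirit of nsStdURL
    SIMPLE = "simple"      # opaque, in the spirit of nsSimpleURI


# Hypothetical registry standing in for the per-scheme protocol handlers.
HANDLER_URI_TYPES = {
    "http": UriType.STANDARD,
    "ftp": UriType.STANDARD,
    "rdf": UriType.SIMPLE,
    "urn": UriType.SIMPLE,
}

SCHEME_RE = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):(.*)", re.DOTALL)


def get_uri_type(scheme):
    """Return the registered URI type for a scheme, or None if unknown."""
    return HANDLER_URI_TYPES.get(scheme.lower())


def resolve(base, spec):
    match = SCHEME_RE.match(spec)
    if match is None:
        return urljoin(base, spec)  # no scheme: plain relative reference
    scheme, rest = match.group(1), match.group(2)
    same_scheme = scheme.lower() == base.split(":", 1)[0].lower()
    if same_scheme and get_uri_type(scheme) is UriType.STANDARD:
        return urljoin(base, rest)  # http:page.html style: treat as relative
    return spec  # simple or unknown scheme: already absolute


print(resolve("http://www.chin.buffnet.net/", "http:page.html"))
print(resolve("http://www.chin.buffnet.net/", "rdf:bookmarks"))
```

The design choice in this sketch is that the decision depends on per-scheme information rather than on the shape of the string, which is what lets a bare string like rdf:bookmarks stay absolute.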
Status: NEW → ASSIGNED
I'm getting confused here. The Resolve method should just behave differently for nsStdURL vs. nsSimpleURI. Why do we need a global way to ask if something is relative or absolute?
Because there are places in the code where MakeAbsoluteURI/Resolve is called on strings from a list of uris (with a std-baseurl) which can be both Simple or Std. If it is a simple-uri to resolve (which we don't know because we only have a string, not a uri) it is absolute, otherwise it may be relative or absolute.
Marking this bug WONTFIX per Warren's request, but I will save my set of fixes for this problem in case it pops up again and no one has a better solution.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → WONTFIX
Andreas, Please either attach your final patches for this to this bug report, or put a note in to say what your base and branch tags are. Thanks.
I will do that immediately when I have made the branch.
The branch is named BUG22251_BRANCH. You also have to unzip the latest attachment in mozilla/rdf to add the missing protocolhandlers. Mac project files are missing.
*** Bug 29012 has been marked as a duplicate of this bug. ***
*** Bug 30707 has been marked as a duplicate of this bug. ***
Is someone working on this bug? It is marked won't fix, but then there are fixes attached, and other bugs marked as duplicates of this one... Should it be reopened?
By the way, here's one (likely rare) reason to reopen this bug. Go to http://www.ehvert.com/ That URL does a redirect to http:/Ehvert/default.asp
<meta HTTP-EQUIV="REFRESH" CONTENT="0;URL=http:/Ehvert/default.asp">
and that sets off an _infinite_ loop of redirects.
Error loading URL http://ehvert/default.asp
Document: Done (0.184 secs)
Document http://keyword.netscape.com/keyword/ehvert loaded successfully
Document: Done (1.241 secs)
Error loading URL http://ehvert/default.asp
Document: Done (0.23 secs)
Document http://keyword.netscape.com/keyword/ehvert loaded successfully
Document: Done (1.241 secs)
(I'm embarrassed to admit that friends of mine run the company :-\).
The fixes for this bug are on the BUG22251_BRANCH, combined with attachment 5806. There are concerns about the way the fix is done (not from me, btw), so if you have a better idea, feel free to submit it. For now nobody has a better idea and so it stays as WONTFIX.
I don't think that fixing this bug would also fix the above redirected URL: http:/Ehvert/default.asp It might, if the URL were changed to http:Ehvert/default.asp, but in the above case it simply thinks this is a malformed URI and tries a correction to http://Ehvert/default.asp.
The url syntax causing this bug was deemed a heinous abuse of the spec, and leads to numerous other inconsistencies, so we decided not to fix it. Perpetrators should adjust their pages.
I realize that I didn't look closely enough at the URL http:/Ehvert/default.asp and that it is a different thing than the http:page.html kind of problem. Apologies. However, this URL does lead to an infinite loop of redirects (yes, this is in error, and is probably extremely rare). Do you want me to file a separate bug for this? (perhaps an upper bound on the number of redirects?)
John, have a look at bug 26438 - jst@netscape.com just nailed a problem with infinite reloads with bad syntax in a <meta HTTP-EQUIV="REFRESH" ...> element, and it's possible that fix will solve the problem at http://www.ehvert.com/

A hard limit on either the number of refreshes that will be done between non-refresh page loads, or the number that will be done within one minute, makes sense as a fallback for the unknown panoply of ways to get an infinite refresh sequence started - some of which may not be explainable as bugs or unfollowed specs.
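A hard limit of the first kind suggested here (refreshes between non-refresh page loads) could be sketched as below. RefreshGuard is a hypothetical Python class for illustration only, and the default limit of 20 is an arbitrary assumption, not a value taken from this bug or from bug 26438.

```python
class RefreshGuard:
    """Cap the number of automatic refreshes between ordinary page loads."""

    def __init__(self, max_consecutive=20):
        self.max_consecutive = max_consecutive
        self.count = 0

    def on_normal_load(self):
        # A non-refresh page load resets the counter.
        self.count = 0

    def allow_refresh(self):
        # Refuse once the cap is reached; otherwise count this refresh.
        if self.count >= self.max_consecutive:
            return False
        self.count += 1
        return True


guard = RefreshGuard(max_consecutive=3)
print([guard.allow_refresh() for _ in range(5)])
```

With the cap set to 3, the first three refreshes are allowed and later ones are refused until on_normal_load is called.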
Ok, marking verified
Status: RESOLVED → VERIFIED
*** Bug 32803 has been marked as a duplicate of this bug. ***
*** Bug 36821 has been marked as a duplicate of this bug. ***
*** Bug 48014 has been marked as a duplicate of this bug. ***
As submitter of bug 48014, two comments:
1) I agree with warren@netscape.com that http:foo.html is ugly, although andreas's quote from rfc2396 indicates that it's allowed to be interpreted as local in hierarchical urls (like http://)
2) Given that both NS (<6) and IE interpret it as relative and there are probably millions of pages out there that use that fact, I expect that you'll be very, very sorry to take a hard-line interpretation of the spec on this one. (For example, consider the 4 times this has been submitted as a bug.)
Because of (2), I suggest you consider reopening this bug.
Just wanted to point out that the O'Reilly book by Musciano, "HTML: The Definitive Guide" (1996), gives this example on page 177: ftp:special/README.txt I know it is an old book, but there are a lot of people still using it, so maybe support for it is needed.
*** Bug 62624 has been marked as a duplicate of this bug. ***
*** Bug 58204 has been marked as a duplicate of this bug. ***
*** Bug 60308 has been marked as a duplicate of this bug. ***
*** Bug 50629 has been marked as a duplicate of this bug. ***
*** Bug 69722 has been marked as a duplicate of this bug. ***
*** Bug 70886 has been marked as a duplicate of this bug. ***
*** Bug 75807 has been marked as a duplicate of this bug. ***
*** Bug 82365 has been marked as a duplicate of this bug. ***
*** Bug 83269 has been marked as a duplicate of this bug. ***
For the last 6 months we have been getting at least one dup per month (see also bug 84450). This feature works fine on MSIE, NS<6, Opera, Konqueror and Lynx. Are we sure we won't fix this? Are we doing the right thing and all these other browsers are wrong?
Maybe it is time to take another look at the fixes for this bug on the BUG22251_BRANCH. Of course they are totally bit rotted by now. But the dup bug count on this stuff is very high. I'm really surprised how much these deprecated relative urls are still being used.
Some dups of this bug are hidden because they are classified as component "Evangelism" (see bug 65650 and bug 84450). Bug 32966 is *very* related to this bug and also has a long list of duplicates. It won't be easy to "evangelize" all these people.
I know the main issue here seems to be http: (which I think is silly). But what about the issue of having an href="?query"? It seems a lot of sites do this and expect that a browser will put the current url in there... but mozilla is nice enough to leave off the page. So say you were at http://www.blah.com/somedir/index5.php and there is an href="?blah=value"; then mozilla goes to http://www.blah.com/somedir/?blah=value. I did not test to see what it does with #, but I think it is agreed that this is not correct. I do think it is bad coding on these people's part to write code like this, but at the same time so darn many people do it that it becomes annoying when browsing sites.
Which build do you use? The query handling has been changed after 0.9.3 back to the pre 0.9.2 behaviour.
Mozilla 0.9.3 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010801
Okay, then it's clear. Just wait for 0.9.4 and all will be well again. However, it is sad that we have two milestones with the different query behaviour, including Netscape 6.1. But changing it was okay according to RFC 2396 (Opera also does it this way), and we only found out later that the authors of RFC 2396 did not mean what they had written :), so we changed it back ...
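The two query-resolution behaviours discussed in these comments can be compared with a short Python sketch. Python's urllib.parse.urljoin keeps the current document when the reference is only a query; resolve_drop_document is a hypothetical helper written just to reproduce the 0.9.3 result reported above, not code from Mozilla.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve_keep_document(base, ref):
    """Query-only reference keeps the base document (the restored behaviour)."""
    return urljoin(base, ref)


def resolve_drop_document(base, ref):
    """Imitate the reported 0.9.3 result: the query is applied to the directory."""
    parts = urlsplit(base)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, ref.lstrip("?"), ""))


base = "http://www.blah.com/somedir/index5.php"
print(resolve_keep_document(base, "?blah=value"))  # .../somedir/index5.php?blah=value
print(resolve_drop_document(base, "?blah=value"))  # .../somedir/?blah=value
```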
Sometimes things have to go back and forth a few times before the right solution is found. Thanks for spending some of your valuable time on this for me. I will be looking forward to all that 0.9.4 has to offer :)
*** Bug 99263 has been marked as a duplicate of this bug. ***
sending this old thing to Tech Evang
Status: VERIFIED → REOPENED
Component: Networking → English: US
Product: Browser → Tech Evangelism
Resolution: WONTFIX → ---
Target Milestone: M14 → ---
Version: other → unspecified
reassigning
Assignee: andreas.otte → bclary
Status: REOPENED → NEW
QA Contact: janc → zach
Summary: Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute → chin.buffnet.net - Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute
sent an email to webmaster@chinnet.net
Assignee: bclary → _basic
Keywords: helpwanted, testcase
Whiteboard: [TESTCASE][PDT+] → [REL-URL]
No longer blocks: 13449
Whiteboard: [REL-URL] → [SYNTAX-URL]
changing to assigned since email was sent
Status: NEW → ASSIGNED
Whiteboard: [SYNTAX-URL] → [SYNTAX-URL][aok]
*** Bug 121350 has been marked as a duplicate of this bug. ***
*** Bug 139872 has been marked as a duplicate of this bug. ***
assigning to nobody
Assignee: basic → nobody
Status: ASSIGNED → NEW
Keywords: helpwanted
The fix for bug 32966 also fixes this issue.
Depends on: relative-http
Attachment #5209 - Attachment is obsolete: true
Attachment #5806 - Attachment is obsolete: true
Attachment #42332 - Attachment is obsolete: true
This is now fixed with the checkin for bug 32966.
Status: NEW → RESOLVED
Closed: 25 years ago → 22 years ago
Resolution: --- → FIXED
v
Status: RESOLVED → VERIFIED
Product: Tech Evangelism → Tech Evangelism Graveyard