Closed Bug 22251 Opened 25 years ago Closed 22 years ago

chin.buffnet.net - Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute

Categories

(Tech Evangelism Graveyard :: English US, defect, P2)

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: jcarpenter0524, Unassigned)

Details

(Keywords: helpwanted, Whiteboard: [SYNTAX-URL][aok])

Attachments

(3 obsolete files)

Overview Description:
This page uses <base href="http://www.chin.buffnet.net/"> for its internal
links, but they don't work.  Attempted to recreate this problem locally, but it
worked fine on all examples...?

Steps to Reproduce:
- Go to the URL: http://www.chin.buffnet.net/
- Move down the page to a section called "Other ChinNet information"
- Click on any of the links in this section (except the last one which goes
outside the site)
- Notice in the URL bar that, for example, "Cage Designs" goes to the link:
    "http://cages.html/"
  when instead it should go to:
    "http://www.chin.buffnet.net/cages.html"

Actual Results:
Internal link attempts to take you to the .html file without using the base tag
ref.
http://cages.html/

Expected Results:
expect the base tag to be inserted before the .html file name
http://www.chin.buffnet.net/cages.html


Build Date & Platform Bug Found:
1999-12-20-12 Win98

Additional Builds and Platforms Tested On:
1999-12-20-08 Linux
1999-12-20-08 Mac

Additional Information:
Priority: P3 → P2
QA Contact: gerardok → janc
changing qa assignment to janc
<html>
<head><base href="http://www.chin.buffnet.net/"></head>
<body>
   <A href="http:cages.html">Cage designs.</A>
</body></html>

Notice the 'http:cages.html' ...
Whiteboard: [TESTCASE]
Assignee: vidur → warren
Component: DOM Level 1 → Networking
The code that resolves relative URLs (NS_MakeAbsoluteURI) thinks that the value
of the href attribute of the anchor ("http:cages.html") is already an absolute
URL and doesn't do any further resolution. The attribute isn't really
well-formed, but I'll let Warren figure out whether we want to be backward
compatible in this case.
Vidur is right: RFC 2396 says

      If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

Adding this backwards compatibility will be a real pain, but I will take a look.
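For illustration only, the workaround the RFC permits can be sketched in a few lines of Python. This is not the Mozilla code: the function name resolve_lenient and the HIERARCHICAL_SCHEMES set are assumptions made up for this sketch, and the set of schemes "known to always use the <hier_part> syntax" is a guess.

```python
import re
from urllib.parse import urljoin

# Assumption: an illustrative set of schemes that always use <hier_part> syntax.
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}

# A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def resolve_lenient(base, ref):
    """Resolve ref against base, tolerating a redundant same-scheme prefix."""
    base_match = _SCHEME.match(base)
    ref_match = _SCHEME.match(ref)
    if base_match and ref_match:
        base_scheme = base_match.group(1).lower()
        same_scheme = ref_match.group(1).lower() == base_scheme
        if same_scheme and base_scheme in HIERARCHICAL_SCHEMES:
            # Drop "http:" and continue with the remainder as a plain
            # relative reference, as the RFC's backwards-compatibility
            # paragraph allows.
            ref = ref[ref_match.end():]
    return urljoin(base, ref)


print(resolve_lenient("http://www.chin.buffnet.net/", "http:cages.html"))
print(resolve_lenient("http://www.chin.buffnet.net/", "rdf:bookmarks"))
```

A reference such as http://a/page.html still resolves to itself under this rule, because what remains after stripping the scheme (//a/page.html) carries its own host.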
*** Bug 22894 has been marked as a duplicate of this bug. ***
Summary: base href tag not working on this URL → Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute
As seen in bug 22894, the problem with having schemes on relative URLs
happens on pages without a <BASE> element. The base URL should be interpolated
if the page is loaded from its "home" server according to the rules in
section 5.2 of RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax",
<URL:http://www.ietf.org/rfc/rfc2396.txt>, whether or not there is a <BASE>
element.

Changing summary from "base href tag not working on this URL" to
"Relative URLs with scheme (e.g., http:page.html) not loading - treated as
absolute"
The problem here is to make the distinction between http:page.html, which is
relative, and rdf:bookmarks, which is absolute. I tried to make nsStdURL::Resolve
smart enough to resolve http:page.html correctly, and it works, but
NS_MakeAbsoluteURI (which calls Resolve) from nsNetUtil.h wants to be even
smarter and tries to detect on its own whether it has an absolute URI. With that
check removed from NS_MakeAbsoluteURI, the rdf-URIs are no longer treated as
absolute, and mozilla looks a bit ugly afterwards.

Is it possible to make rdf-URIs of the type nsSimpleURI? nsSimpleURI::Resolve
could return the string put in instead of an Assertion. Warren?
Blocks: 13449
Target Milestone: M14
I think the criterion for relative vs. absolute should be whether there's a
// (or //<something>/) or not. E.g. given a base URL of
http://foo.com/bar/baz.html and a relative URL of http:page.html,
http:/page.html, or http://page.html (with no directory portion), these should
all resolve to http://foo.com/bar/page.html. However, if the relative URL is
http://a/page.html, this should resolve to http://a/page.html (where 'a' is
interpreted as the hostname).
But this does not help with http:page.html versus rdf:bookmarks. rdf:bookmarks
is absolute. NS_MakeAbsoluteURI detects this by searching for a : in the spec,
so http:page.html is also absolute for this function.
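A tiny Python sketch of the check described above (an illustration of the idea only, not the actual NS_MakeAbsoluteURI code; the function name looks_absolute is made up) shows why a colon test alone cannot separate the two cases:

```python
def looks_absolute(spec):
    # Naive rule: any spec containing ":" is assumed to start with a scheme
    # and is therefore treated as already absolute.
    return ":" in spec


for spec in ("http:page.html", "rdf:bookmarks", "cages.html"):
    print(spec, looks_absolute(spec))
```

Both http:page.html and rdf:bookmarks pass the test, so any fix has to bring in extra knowledge about the scheme.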
rdf: should be using nsSimpleURI, not nsStdURL.
From looking at the sources, there seems to be a list of URIs that is read in on
startup. These can be all sorts of URIs, hierarchical or not (nsStdURL or
nsSimpleURI), files, chrome, http, whatever ...

rdf: is not the only problem, urn: comes to my mind, same problem ...

There is no simple fix to this. I think we need a kind of registration service
here which registers schemes mozilla knows about with some additional
information (hierarchical or not, host/authentication or not). The
parser/resolver then has to react to that additional information.
I think you're getting sidetracked here. Remember that the protocol dictates
the URL parsing scheme. For rdf, it decides to use nsSimpleURI and doesn't have
this problem. But for http (and other protocols that use nsStdURL), nsStdURL
needs to treat things like "http:page.html" as a relative spec, i.e.
MakeAbsolute/Resolve need to not treat the string as if it were absolute.
You are right, it should do that, but I don't think there is such a thing as a
rdf protocol-handler or a urn protocol-handler. Regardless of that, the base URL
is in most of these cases a chrome-URI, "rdf:bookmarks" is a simple string given
to MakeAbsolute.

If we detect a scheme in the string given to Resolve we could query the
protocol-handler (if existing) for the kind of URI it would create and then act
on that.
Bulk moving [testcase] code to new testcase keyword. Sorry for the spam!
Keywords: testcase
Keywords: beta1
Whiteboard: [TESTCASE] → [TESTCASE][PDT+]
Andreas, Is this fixed now?
Assignee: warren → andreas.otte
Warren: No, I think we still need to think about this. Have you read my last
comments on this? We need to have a way to distinguish simple from normal uris.
Keywords: helpwanted
Okay, I implemented a GetUritype method on nsIProtocolHandler which gives some
information on the type of url that is used. This gives resolve the means to
distinguish between http:page.html (relative) and rdf:bookmarks.html (absolute).
So far it works fine, but I'm still testing ...
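The idea can be sketched roughly as follows, in Python rather than the real C++ interfaces. Everything here is hypothetical: the ProtocolHandler class, the HANDLERS table and the "standard"/"simple" labels only stand in for whatever GetUritype actually returns, and treating unknown schemes as absolute is an assumption of this sketch.

```python
import re
from urllib.parse import urljoin

_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


class ProtocolHandler:
    """Hypothetical stand-in for a protocol handler exposing its URI type."""

    def __init__(self, uri_type):
        self._uri_type = uri_type  # "standard" (hierarchical) or "simple"

    def get_uri_type(self):
        return self._uri_type


# Hypothetical registry of handlers, keyed by scheme.
HANDLERS = {
    "http": ProtocolHandler("standard"),
    "ftp": ProtocolHandler("standard"),
    "rdf": ProtocolHandler("simple"),
    "urn": ProtocolHandler("simple"),
}


def resolve(base, spec):
    """Resolve spec against base, asking the handler what kind of URI it is."""
    spec_match = _SCHEME.match(spec)
    if not spec_match:
        return urljoin(base, spec)  # no scheme at all: ordinary relative spec
    scheme = spec_match.group(1).lower()
    handler = HANDLERS.get(scheme)
    base_match = _SCHEME.match(base)
    base_scheme = base_match.group(1).lower() if base_match else None
    if handler and handler.get_uri_type() == "standard" and scheme == base_scheme:
        # Same hierarchical scheme as the base: treat the rest as relative.
        return urljoin(base, spec[spec_match.end():])
    # Simple URIs and unknown schemes are left untouched (assumed absolute).
    return spec


print(resolve("http://www.chin.buffnet.net/", "http:cages.html"))
print(resolve("http://www.chin.buffnet.net/", "rdf:bookmarks"))
```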
Status: NEW → ASSIGNED
I'm getting confused here. The Resolve method should just behave differently 
for nsStdURL vs. nsSimpleURI. Why do we need a global way to ask if something 
is relative or absolute?
Because there are places in the code where MakeAbsoluteURI/Resolve is called on
strings from a list of uris (with a std-baseurl) which can be both Simple or
Std. If it is a simple-uri to resolve (which we don't know because we only have
a string, not a uri) it is absolute, otherwise it may be relative or absolute.
Marking this bug WONTFIX per Warren's request, but I will save my set of fixes
for this problem in case it pops up again and no one has a better solution.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → WONTFIX
Andreas, Please either attach your final patches for this to this bug report, or 
put a note in to say what your base and branch tags are. Thanks.
I will do that immediately once I have made the branch.
The branch is named BUG22251_BRANCH. You also have to unzip the latest
attachment in mozilla/rdf to add the missing protocolhandlers. Mac project
files are missing.
*** Bug 29012 has been marked as a duplicate of this bug. ***
*** Bug 30707 has been marked as a duplicate of this bug. ***
Is someone working on this bug?  It is marked won't fix, but then there are 
fixes attached, and other bugs marked as duplicates of this one...  Should it be 
reopened?
By the way, here's one (likely rare) reason to reopen this bug. 
Go to http://www.ehvert.com/ That URL does a redirect to 
http:/Ehvert/default.asp

   <meta HTTP-EQUIV="REFRESH" CONTENT="0;URL=http:/Ehvert/default.asp">

and that sets off an _infinite_ loop of redirects.

  Error loading URL http://ehvert/default.asp 
  Document: Done (0.184 secs)
  Document http://keyword.netscape.com/keyword/ehvert loaded successfully
  Document: Done (1.241 secs)
  Error loading URL http://ehvert/default.asp 
  Document: Done (0.23 secs)
  Document http://keyword.netscape.com/keyword/ehvert loaded successfully
  Document: Done (1.241 secs)

(I'm embarrassed to admit that friends of mine run the company :-\). 
The fixes for this bug are on the BUG22251_BRANCH, combined with the attachment
5806. There are concerns about the way the fix is done (not from me btw.), so if
you have a better idea, feel free to submit it. For now nobody has a better idea
and so it stays as WONTFIX. 
I don't think that fixing this bug would also fix the above redirected URL:
http:/Ehvert/default.asp

It might, if the URL were changed to http:Ehvert/default.asp, but in the
above case it simply thinks this is a malformed URI and tries a
correction to http://Ehvert/default.asp.
The url syntax causing this bug was deemed a heinous abuse of the spec, and 
leads to numerous other inconsistencies, so we decided not to fix it. 
Perpetrators should adjust their pages.
I realize that I didn't look closely enough at the URL http:/Ehvert/default.asp
and that it is a different thing than the http:page.html kind of problem. 
Apologies. 

However, this URL does lead to an infinite loop of redirects (yes, this is in 
error, and is probably extremely rare). Do you want me to file a separate bug
for this? (perhaps an upper bound on the number of redirects?)

John, have a look at bug 26438 - jst@netscape.com just nailed a problem with
infinite reloads with bad syntax in a <meta HTTP-EQUIV="REFRESH" ...> element,
and it's possible that fix will solve the problem at http://www.ehvert.com/

A hard limit on either the number of refreshes that will be done between non-
refresh page loads, or the number that will be done within one minute, makes
sense as a fallback for the unknown panoply of ways to get an infinite refresh
sequence started - some of which may not be explainable as bugs or unfollowed
specs.
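A rough Python sketch of the first variant suggested above (a cap on refreshes between non-refresh page loads) might look like this. The class name RefreshGuard and the default limit of 20 are invented for illustration; no particular limit was proposed in this bug.

```python
class RefreshGuard:
    """Counts consecutive refresh-driven loads and blocks them past a limit."""

    def __init__(self, max_consecutive=20):
        # Assumption: 20 is an arbitrary illustrative limit.
        self.max_consecutive = max_consecutive
        self.count = 0

    def on_page_load(self, via_refresh):
        """Return True if the load may proceed, False if it should be blocked."""
        if not via_refresh:
            self.count = 0  # a normal page load resets the counter
            return True
        self.count += 1
        return self.count <= self.max_consecutive


guard = RefreshGuard(max_consecutive=3)
print([guard.on_page_load(True) for _ in range(5)])
```

With a limit of 3, the fourth and later consecutive refreshes are refused until an ordinary page load resets the counter.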
Ok, marking verified
Status: RESOLVED → VERIFIED
*** Bug 32803 has been marked as a duplicate of this bug. ***
*** Bug 36821 has been marked as a duplicate of this bug. ***
*** Bug 48014 has been marked as a duplicate of this bug. ***
as submitter of bug 48014, two comments:

1) I agree with warren@netscape.com that http:foo.html is ugly,
although andreas's quote from rfc2396 indicates that it's allowed
to be interpreted as local in hierarchical urls (like http://)

2) given that both NS (<6) and IE interpret it as relative and there
are probably millions of pages out there that use that fact,
I expect that you'll be very, very sorry to take a hard-line interpretation
on the spec on this one.  (For example, consider the 4 times this has been
submitted as a bug)

because of (2), I suggest you consider reopening this bug.
Just wanted to point out that O'Reilly book by Musciano
 "HTML: The Definitive Guide" 1996 on page 177 gives example:

ftp:special/README.txt

I know it is an old book, but there are a lot of people still using it, so maybe
support for it is needed.
*** Bug 62624 has been marked as a duplicate of this bug. ***
*** Bug 58204 has been marked as a duplicate of this bug. ***
*** Bug 60308 has been marked as a duplicate of this bug. ***
*** Bug 50629 has been marked as a duplicate of this bug. ***
*** Bug 69722 has been marked as a duplicate of this bug. ***
*** Bug 70886 has been marked as a duplicate of this bug. ***
*** Bug 75807 has been marked as a duplicate of this bug. ***
*** Bug 82365 has been marked as a duplicate of this bug. ***
*** Bug 83269 has been marked as a duplicate of this bug. ***
For the last 6 months we have had at least one dup per month (see also bug 84450)
 
This feature works fine on MSIE, NS<6, Opera, Konqueror and Lynx.

Are we sure we won't fix this?
Are we doing the right thing and all these other browsers are wrong?
Maybe it is time to take another look at the fixes to this bug on the
BUG22251_BRANCH. Of course they are totally bit rotted by now. But the dup bug
count on this stuff is very high. I'm really surprised how much these deprecated
relative URLs are still being used.
Some dups of this bug are hidden because they are classified as component 
"Evangelism" ( see bug 65650 and bug 84450 )

Bug 32966 is *very* related to this bug and also has a long list of duplicates.

It won't be easy to "evangelize" all these people.
I know the main issue here seems to be http: (which I think is silly). But what
about the issue of having an href="?query"? It seems a lot of sites do this and
expect that a browser will put the current URL in there... but Mozilla is nice
enough to leave off the page. So say you were at
http://www.blah.com/somedir/index5.php and there is an href="?blah=value"; then
Mozilla goes to http://www.blah.com/somedir/?blah=value. I did not test to see
what it does with #, but I think we can agree that this is not correct.

I do think it is bad coding on these people's part to write code like this, but at
the same time so many people do it that it becomes annoying when browsing sites.
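For comparison, the behaviour these sites expect (a query-only reference keeps the base document's path) is what Python's standard urllib.parse.urljoin produces. This only illustrates the expected result and says nothing about how Mozilla implements it:

```python
from urllib.parse import urljoin

base = "http://www.blah.com/somedir/index5.php"

# A query-only reference keeps the base path, including the file name.
resolved = urljoin(base, "?blah=value")
print(resolved)  # http://www.blah.com/somedir/index5.php?blah=value
```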
Which build do you use? The query handling has been changed after 0.9.3 back to
the pre 0.9.2 behaviour.
Mozilla 0.9.3
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010801
Okay, then it's clear. Just wait for 0.9.4 and all will be well again. However,
it is sad that we have two milestones with the different query behaviour,
including Netscape 6.1. But changing it was okay according to RFC 2396 (Opera
also does it this way), and we only found out later that the authors of RFC 2396
did not mean what they had written :), so we changed it back ...
Sometimes things have to go back and forth a few times before the right solution
is found. Thanks for spending some of your valuable time on this for me. I will
be looking forward to all that 0.9.4 has to offer :)
*** Bug 99263 has been marked as a duplicate of this bug. ***
sending this old thing to Tech Evang
Status: VERIFIED → REOPENED
Component: Networking → English: US
Product: Browser → Tech Evangelism
Resolution: WONTFIX → ---
Target Milestone: M14 → ---
Version: other → unspecified
reassigning
Assignee: andreas.otte → bclary
Status: REOPENED → NEW
QA Contact: janc → zach
Summary: Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute → chin.buffnet.net - Relative URLs with scheme (e.g., http:page.html) not loading - treated as absolute
sent an email to webmaster@chinnet.net
Assignee: bclary → _basic
Keywords: helpwanted, testcase
Whiteboard: [TESTCASE][PDT+] → [REL-URL]
No longer blocks: 13449
Whiteboard: [REL-URL] → [SYNTAX-URL]
changing to assigned since email was sent
Status: NEW → ASSIGNED
Whiteboard: [SYNTAX-URL] → [SYNTAX-URL][aok]
*** Bug 121350 has been marked as a duplicate of this bug. ***
*** Bug 139872 has been marked as a duplicate of this bug. ***
assigning to nobody
Assignee: basic → nobody
Status: ASSIGNED → NEW
Keywords: helpwanted
The fix for bug 32966 also fixes this issue.
Depends on: relative-http
Attachment #5209 - Attachment is obsolete: true
Attachment #5806 - Attachment is obsolete: true
Attachment #42332 - Attachment is obsolete: true
This is now fixed with the checkin for bug 32966.
Status: NEW → RESOLVED
Closed: 25 years ago → 22 years ago
Resolution: --- → FIXED
v
Status: RESOLVED → VERIFIED
Product: Tech Evangelism → Tech Evangelism Graveyard