doesn't follow clicked link when using base url

VERIFIED WONTFIX

Status

()

P3
normal
VERIFIED WONTFIX
19 years ago
19 years ago

People

(Reporter: foo, Assigned: andreas.otte)

Tracking

Trunk
x86
Windows NT
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(5 attachments)

(Reporter)

Description

19 years ago
When the address above is selected and I try to click on a language item in the
navigation bar the browser does nothing. Probably it has to do with the lengthy
base URL address and also the lengthy relative URL specified behind the
corresponding image.

I used the latest build (2000-03-13) with ID: 2000031306 on Windows NT 4 SP5, enUS

Cheers
 Marcel

Comment 1

19 years ago
-> Networking
Component: Browser-General → Networking

Comment 2

19 years ago
Reassign to gagan.
Assignee: cbegle → gagan
QA Contact: asadotzler → tever

Comment 3

19 years ago
Confirmed on Linux build 2000.03.13.03
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 4

19 years ago
When I right click on a link and chose "Open Link in New Window" I get this in
the console:

JavaScript Error: uncaught exception: [Exception... "Component returned failure
code: 0x804b000a [nsIBrowserInstance.loadUrl]"  nsresult: "0x804b000a
(<unknown>)"  location: "JS frame :: chrome://navigator/content/navigator.js ::
onLoadViaOpenDialog :: line 243"  data: no]

Comment 5

19 years ago
andreas could you take this?
Assignee: gagan → andreas.otte
(Assignee)

Comment 6

19 years ago
To quote Judson Valeski: "There should be laws against this kind of URLS!" The
problem here is the unescaped colon in the relative URL! The current version of
MakeAbsoluteURI/Resolve thinks this is an absolute URL because of it. I'm pretty
sure this URLs would work with the fixes I proposed for bug 22251 which were
able to detect a valid protocol and if not valid try it as relative URL.

Since the current status of bug 22251 is WONTFIX, I'm afraid this is what will
happen to this one too. Alternativly the author of this page could escape the :
as %3A which works fine.
 
(Assignee)

Comment 7

19 years ago
Gagan, Warren, what do you think about this? The fix for this would be to check
if the chars before the colon are a valid protocol by asking for a
protocolhandler with that scheme.
This would involve a subset of the fixes for bug 22251 and definitly the three
dummy-protocolhandlers for rdf, urn and NC protocols.

I see no other way to do this in an extensible form. If in doubt if the string
is an absolute URL or not try the chars before the colon as a protocol scheme.
If it is, it's an absolute URL, if not, it's relative. This will not fix bug
22251, but would fix this one.
Status: NEW → ASSIGNED
(Assignee)

Updated

19 years ago
Target Milestone: M15

Comment 8

19 years ago
I see a bunch of links that say HREF="X2=b64:Qmxvd2Zpc2hTY3Jh... I assume these 
are the ones that don't work, but I've never seen this form before. What does it 
mean? b64 is base-64 encoding, but what is x2= ?
(Assignee)

Comment 9

19 years ago
Yes, these links are the problem and the colon is the killer. The current
implementation simply assumes that because of the colon the URL is absolute. It
does not check if the chars before the colon are a valid protocol.

I also haven't seen this type uf URLs before.
(Assignee)

Comment 10

19 years ago
This is from rfc 2396:

   Authors should be aware that a path segment which contains a colon
   character cannot be used as the first segment of a relative URI path
   (e.g., "this:that"), because it would be mistaken for a scheme name.
   It is therefore necessary to precede such segments with other
   segments (e.g., "./this:that") in order for them to be referenced as
   a relative path.

The relative urls in question in this site violate that rule. However I don't
think we should be that strict. 

Also currently we do not recognize ./this:that version because of the very
strict code in mozilla/netwerk/base/public/nsNetUtil.h that only searches for a
colon. The code in nsStdURL::Resolve as well as SetRelativePath can handle this.

That code in nsNetUtil.h should be removed anyway.

I think it makes more sense to deal with the chars before the colon and figure
out if it is one of the registered protocols. If it is then this should be
handled as an absolute URL otherwise as an relative one.
(Assignee)

Comment 11

19 years ago
Created attachment 6948 [details]
zip file of the three missing protocol handlers, extract under mozilla/rdf
(Assignee)

Comment 12

19 years ago
Created attachment 6949 [details] [diff] [review]
patch to get Resolve/SetRelativePath working with these kind of relative urls

Comment 13

19 years ago
I don't like this. It means that Mozilla needs to "know" how to interpret a URI 
in order to refer to it in RDF. What if somebody wants to send us an RDF/XML 
that talks about "tv:" or "zip:" or "z9350:", none of which we support now? 
Certainly it's still valid to talk about this as meta-data.

I'd prefer doing something like this: RDF can *ask* Necko if it has a protocol 
handler for some prefix, and if so, will use Necko to resolve an absolute URI. 
If not, I won't send you the URL.
(Assignee)

Comment 14

19 years ago
Okay, I haven't seen the whole picture. Am I getting this right? You can invent 
any form of url scheme in XML/RDF meta descriptions? If this is the case you are 
certainly right, rdf should then use another resolver.

Comment 15

19 years ago
Well, it's not so much that you can "invent" any URI scheme as that you can
*use* any URI scheme. Mozilla should be able to parse RDF/XML that talks about a
"z3950:" resource, even if it doesn't have a protocol handler that could be used
to *display* it.

I would like to continue to use the Necko resolver to "make absolute" with
protocols that Mozilla *does* know about. How would you suggest I do that? All
of the calls should be funneled through this code:

http://lxr.mozilla.org/seamonkey/source/rdf/base/src/rdfutil.cpp#92
http://lxr.mozilla.org/seamonkey/source/rdf/base/src/rdfutil.cpp#118
http://lxr.mozilla.org/seamonkey/source/rdf/base/src/rdfutil.cpp#138

So I can re-implement these to do whatever needs to be done.
(Assignee)

Comment 16

19 years ago
Yes, I found that code too and I'm on it. What I want to try now is to insert
that code that I removed from nsNetUtil.h into rdf_MakeAbsolute. This makes two
versions of MakeAbsolute, one (rdf-version) that acts on a colon and declares
the URI in that case as absolute, and the necko version that trys to look for
the protocolhandler if necessary. Of course the rdf-version will call the
necko-version if its code to identify an absolute URI fails.
(Assignee)

Comment 17

19 years ago
Created attachment 7187 [details] [diff] [review]
New version that works without the additional protocolhandlers

Comment 18

19 years ago
Ugh, I didn't realize how horrible these rdf_MakeAbsoluteURI() routines are. I 
can't even remeber what they're supposed to do (sigh, I suck). IIRC, the first 
argument is always a "base URL", and the second argument is the "relative URL" 
we want to resolve wrt. the base. The second argument may already be absolute, 
in which case we'll just leave it alone. (I will fix that up to be clearer in 
the code.)

After staring at these for a bit, I think the logic is basically sound, 
although a bit squirrely :-). I'd suggest we write it like this -- same up to 
the second "if (NS_FAILED(rv))":

    PRUint32 startPos, endPos;
    NS_WITH_SERVICE(nsIIOService, serv, kIOServiceCID, &rv);
    if (NS_FAILED(rv)) return rv;
    char* i_Uri = aURI.ToNewCString();
    rv = serv->ExtractScheme(i_Uri, &startPos, &endPos, nsnull);
    CRTFREEIF(i_Uri);
    if (NS_FAILED(rv)) {
        // there is no scheme in the URI, use Necko to resolve
        // as absolute.
        rv = NS_MakeAbsoluteURI(result, aURI, base);
        if (NS_SUCCEEDED(rv))
            aURI = result;
    }
    // any failure leaves aURI unchanged
    return NS_OK;

Comments:

1. Does ExtractScheme() really throw an exception if there is no scheme? Seems 
like it should probably just return negative indexes?

2. Do we want to use ToNewUTF8String() here? I can't ever remember what the 
right thing to do is.
(Assignee)

Comment 19

19 years ago
Yes, ExtractScheme throws an NS_ERROR_MALFORMED_URI and of course, thinking 
about it, ToNewUTF8String() should be used. 

Comment 20

19 years ago
Ok: if you're up for posting one more round o' diffs, I think we've got this in
the bag. Thanks for all your help Andreas!
(Assignee)

Comment 21

19 years ago
Created attachment 7226 [details] [diff] [review]
Next round of patches
(Assignee)

Comment 22

19 years ago
This will not make it for M15.
Target Milestone: M15 → M16

Comment 23

19 years ago
Thanks andreas. The RDF changes look great, and that part is r=waterson. Not
sure about the impact of the other stuff. I'll run with these patches in my tree
and let you know if I have any problems.

Comment 24

19 years ago
Ack, I tried applying the patch and I'm crashing while building nsStdURL.cpp,
NS_ERROR_UNSUPPORTED_PROTOCOL not defined. Am I missing something?
(Assignee)

Comment 25

19 years ago
Yikes !!!! I'm an idiot! Don't do more than one bugfix per file. Next round
follows!
(Assignee)

Comment 26

19 years ago
Created attachment 7234 [details] [diff] [review]
This time without some unrelated other changes

Comment 27

19 years ago
Moving to M17 which is now considered part of beta2.
Target Milestone: M16 → M17
(Assignee)

Comment 28

19 years ago
After thinking about this some more time, I think, I will put this one
also in the WONTFIX category as bug 22251 (and others) before.

RFC 2396 is very clear about this:

   Scheme names consist of a sequence of characters beginning with a
   lower case letter and followed by any combination of lower case
   letters, digits, plus ("+"), period ("."), or hyphen ("-").  For
   resiliency, programs interpreting URI should treat upper case letters
   as equivalent to lower case in scheme names (e.g., allow "HTTP" as
   well as "http").

      scheme        = alpha *( alpha | digit | "+" | "-" | "." )

   Relative URI references are distinguished from absolute URI in that
   they do not begin with a scheme name.  Instead, the scheme is
   inherited from the base URI, as described in Section 5.2.

    ...

   Authors should be aware that a path segment which contains a colon
   character cannot be used as the first segment of a relative URI path
   (e.g., "this:that"), because it would be mistaken for a scheme name.
   It is therefore necessary to precede such segments with other
   segments (e.g., "./this:that") in order for them to be referenced as
   a relative path.

   ...

5.2. Resolving Relative References to Absolute Form

    ...

   3) If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

The relative URLs in that bug are in clear violation of that RFC and
only backwards compatibility and 4.x parity can be arguments in favour
of fixing this. But since we decided to not fix bug 22251 and all
related bugs, the same applys here.

I now view the proposed fixes attached to the bugs as a test of concept.
We can fix this if we want, but we don't want to. The author of the web page can
always fix the problem with escaping the colon.

Status: ASSIGNED → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → WONTFIX

Comment 29

19 years ago
verified Wontfix
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.