msu.edu - URL: resolution of protocol:/path

VERIFIED FIXED

Status

Product: Tech Evangelism Graveyard
Component: English US
Priority: P3
Severity: major
Status: VERIFIED FIXED
Reported: 18 years ago
Modified: 3 years ago

People

(Reporter: Gerd Kortemeyer, Assigned: Doron Rosenberg (IBM))

Tracking

Details

(Whiteboard: [SYNTAX-URL], URL)

Attachments

(8 obsolete attachments)

(Reporter)

Description

18 years ago
The (weird but correct) URL

 http:/cgi-bin/lecture.pl

gets interpreted as

 http://cgi-bin/lecture.pl

instead of

 http://(current host)/cgi-bin/lecture.pl

Both Netscape and Internet Exploder interpret the URL correctly.

Comment 1

18 years ago
->andreas
Assignee: gagan → andreas.otte

Comment 2

18 years ago
Another one of these usages of a deprecated relative URL form. This kind of
relative URL is no longer supported (see RFC 2396) and we decided to drop the
support. Marking WONTFIX. See bug 32966 for more details.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → WONTFIX
(Reporter)

Comment 3

18 years ago
I guess I don't quite understand what's so wrong with the workaround suggested
in RFC 2396 - does that create any problems and result in more than one line of
code? I admit that in this case it is my own web application which creates the
deprecated URL, so I can pick up the advice in bug 32966 and "fix the documents
according to rfc" (it's just a pain, that's all). But I wonder how many other 
pages and web applications have such a URL which was at least correct at one 
point in time and happily works with Netscape and Exploder, and which I cannot 
fix. In this decision, interpretation of a clearly malformed URL is given 
priority over the interpretation of an "only" deprecated URL. Also: I think I 
had put in the awkward "http:" for some reason three years ago, which I have 
forgotten now - with an ancient Apache server and an ancient Netscape or Mosaic 
on a VAX there was some problem if I did NOT have the "http:", maybe because 
"cgi-bin" is a script-alias (note that bug 32966 also deals with a form action, 
which probably goes to some script-aliased directory). Don't know anymore.

Please (pretty pleeaaase) re-consider the workaround: " For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax. "

Whatever, I'll get off my soapbox, and I learned something. Thanks for your work 
on Mozilla - great project! 
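
The RFC 2396 workaround quoted above can be sketched roughly as follows. This is only an illustration in Python, not the Mozilla patch: the function name `resolve_lenient` and the `HIERARCHICAL_SCHEMES` set are assumptions made up for this example, and the final resolution is delegated to the standard library's `urljoin`.

```python
from urllib.parse import urljoin, urlsplit

# Assumption for this sketch: schemes "known to always use the <hier_part> syntax".
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}


def resolve_lenient(base: str, ref: str) -> str:
    """Resolve ref against base, tolerating deprecated 'scheme:path' references."""
    base_scheme = urlsplit(base).scheme.lower()
    scheme, sep, rest = ref.partition(":")
    same_scheme = bool(sep) and scheme.lower() == base_scheme
    if same_scheme and base_scheme in HIERARCHICAL_SCHEMES and not rest.startswith("//"):
        # Drop the redundant scheme and resolve the remainder as a relative reference.
        return urljoin(base, rest)
    # Everything else (true absolute URLs, ordinary relative references) is left to urljoin.
    return urljoin(base, ref)


print(resolve_lenient("http://lectureonline.cl.msu.edu/index.html",
                      "http:/cgi-bin/lecture.pl"))
```

With the base and reference from this report, the sketch yields the "current host" form the reporter expected. A reference that already starts with `http://` keeps its own host.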

Comment 4

18 years ago
It's a little bit more complicated than one line of code. I once made a proposal
(bug 22251) for handling URLs of this shape: http:page.html. The problem is that
we are also expected to handle malformed URLs correctly, like
http:/www.mozilla.org which should be translated to http://www.mozilla.org/. 
(Reporter)

Comment 5

18 years ago
> The problem is that we are also expected to handle
> malformed URLs correctly, like http:/www.mozilla.org
> which should be translated to http://www.mozilla.org/. 

I don't know if the following compromise is possible, but I think it's a good 
one:

* attempt to repair URLs like http:/www.mozilla.org/ as "malformed absolute 
URLs" if they are typed into the URL field of the browser

* attempt to honor URLs like http:/dir/file and http:file as "deprecated 
relative URLs" if they are links on a web page

This sounds (and is) inconsistent, but I think it makes sense:

* People usually don't type relative URLs into the browser's URL field, so with 
a very high probability http:/stuff and http:stuff were meant to be http://stuff

* Web pages are highly unlikely to have malformed links such as http:/server 
and http:server, since both Netscape and Exploder do NOT attempt to fix that, so 
a malformation like that would not go unnoticed by the author. Come to think of 
it, repairing links like http:/server might give Mozilla-using web authors a 
VERY wrong sense of security

Well, I don't want to bother you anymore, but please consider the compromise 
if your code structure allows for it. Thanks!!!
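
The compromise proposed in this comment could look something like the sketch below. It is a hypothetical illustration, not code from any patch on this bug: the function `interpret`, its `typed_in_url_bar` parameter, and the `HIERARCHICAL_SCHEMES` set are all invented here to show the two code paths.

```python
from urllib.parse import urljoin, urlsplit

# Assumption for this sketch: schemes that are always hierarchical.
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}


def interpret(ref, base=None, typed_in_url_bar=False):
    """Apply the context-dependent reading of 'scheme:/x' and 'scheme:x'."""
    scheme, sep, rest = ref.partition(":")
    odd_form = bool(sep) and scheme.lower() in HIERARCHICAL_SCHEMES \
        and not rest.startswith("//")
    if not odd_form:
        # Well-formed absolute URL or plain relative reference.
        return urljoin(base, ref) if base else ref
    if typed_in_url_bar:
        # URL field: repair as a malformed absolute URL.
        return f"{scheme}://{rest.lstrip('/')}"
    if base and urlsplit(base).scheme.lower() == scheme.lower():
        # Link on a page: honor as a deprecated relative URL.
        return urljoin(base, rest)
    return ref


print(interpret("http:/www.mozilla.org/", typed_in_url_bar=True))
print(interpret("http:/dir/file", base="http://example.org/a/b.html"))
```

The same string is read two ways depending on where it came from, which is exactly the inconsistency the comment acknowledges.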

Comment 6

18 years ago
verified Wontfix
Status: RESOLVED → VERIFIED

Comment 7

17 years ago
*** Bug 84291 has been marked as a duplicate of this bug. ***

Comment 8

17 years ago
Reopening and sending to evangelism
Status: VERIFIED → UNCONFIRMED
Component: Networking → Evangelism
Resolution: WONTFIX → ---

Comment 9

17 years ago
received an unhappy message about the history of this bug. The author is aware
of the problem and is a mozilla supporter. marking assigned to note that the
author has been contacted.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Comment 10

17 years ago
Created attachment 42333 [details] [diff] [review]
patch to deal with this deprecated relative urls

Comment 11

17 years ago
I'm providing this patch, although I think it should not be applied lightly.
After all, these URLs are deprecated and should not be used. But, if someone
wants to try it ...

To get this working we have to know something about the scheme we are working
on:

For backwards compatibility, an implementation may work around such references
by removing the scheme if it matches that of the base URI and the scheme is
known to always use the <hier_part> syntax.

If we don't want to hardcode "is known to always use the <hier_part> syntax"
into the URL parser, we have to store this information at the protocol level,
aka the protocol handler. This is what this patch does: it adds a new attribute
to nsIProtocolHandler and queries it when necessary to help in parsing the URL.
This comes with a small performance penalty.

Comment 12

17 years ago
I'm currently thinking about changing the addition in nsIProtocolHandler.idl to:

    /* standard url, hierarchical (http, ftp, file, ...) */
    /* this is the default, the following masks are deviations */
    const short uri_hierachical    = 0;

    /* non-hierarchical (about, rdf, javascript, finger, ...) */
    const short uri_nonhierachical = (1<<0);

    /* hierarchical, but no authority component (file, ...) */
    const short uri_noauth         = (1<<1);

    /* more to come as needed */

    readonly attribute short uritype;

I also found another protocol handler in extensions/inspector, and two JS
protocol handlers in extensions/irc and extensions/xmlterm.

Comment 13

17 years ago
Created attachment 43551 [details] [diff] [review]
updated patch

Comment 14

17 years ago
Created attachment 43696 [details] [diff] [review]
Another version of the patch this time without the changes to nsStdURL.cpp

Comment 15

17 years ago
If you're trying to catch all the protocol handlers with this patch, I'd suggest
searching lxr for nsIProtocolHandler, as there are still more in the tree (eg
LDAP, though it's disabled by default).

Comment 16

17 years ago
Got that one, in fact I think I got all that are currently in the tree. What is
now of interest to me are external developers (like protozilla) or possibly some
developments in the embedding area to give them a heads up.

Comment 17

16 years ago
a couple nits:

1) uritype should be URIType.
2) the uri_xxx constants should be uppercase.

not doing so breaks the mozilla/necko code standards.

also, why do the uri_xxx constants need to be defined as bit masks?  aren't they
mutually exclusive?  If so, then they should simply be enumerated (ie. 0, 1, 2).
writing them using bit mask notation confuses the meaning of these flags.

Comment 18

16 years ago
Created attachment 44290 [details] [diff] [review]
revised patch

Comment 19

16 years ago
changed uritype to URIType, also uppercased the constants. I've gone back to
having a URI_STD constant with 0 as default and have the deviations from that
default scheme as bitmasks, which are no longer mutually exclusive.

Doug or Darin, can you r=
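
To illustrate why bit masks rather than an enumeration fit here, the following Python sketch models the idea of a zero default plus combinable deviations. Only URI_STD is named in this comment; the other constant names are uppercase guesses based on the earlier IDL proposal, and the per-scheme table and helper functions are hypothetical stand-ins for a protocol-handler attribute.

```python
# Hypothetical flag values modeled on the constants discussed in this bug.
URI_STD = 0                    # default: standard hierarchical URL (http, ftp, ...)
URI_NONHIERARCHICAL = 1 << 0   # deviation: no hierarchy (about, javascript, ...)
URI_NOAUTH = 1 << 1            # deviation: no authority component (file, ...)

# Hypothetical lookup table standing in for what each protocol handler reports.
SCHEME_URI_TYPE = {
    "http": URI_STD,
    "ftp": URI_STD,
    "file": URI_NOAUTH,
    # Deviations can be combined, which is why they are masks and not 0, 1, 2.
    "about": URI_NONHIERARCHICAL | URI_NOAUTH,
    "javascript": URI_NONHIERARCHICAL | URI_NOAUTH,
}


def is_hierarchical(scheme: str) -> bool:
    """True only for known schemes without the non-hierarchical deviation."""
    flags = SCHEME_URI_TYPE.get(scheme.lower())
    return flags is not None and not (flags & URI_NONHIERARCHICAL)


def has_authority(scheme: str) -> bool:
    """True only for known schemes without the no-authority deviation."""
    flags = SCHEME_URI_TYPE.get(scheme.lower())
    return flags is not None and not (flags & URI_NOAUTH)


print(is_hierarchical("http"), has_authority("file"), is_hierarchical("about"))
```

A URL parser could consult something like `is_hierarchical` before applying the scheme-stripping workaround, and treat unknown schemes conservatively.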

Comment 20

16 years ago
Created attachment 44596 [details] [diff] [review]
updated version for the trunk

Comment 21

16 years ago
there seem to be some indentation problems in nsIIOService.idl and
nsIProtocolHandler.idl...  fix these and r/sr=darin.

Comment 22

16 years ago
Created attachment 44658 [details] [diff] [review]
fixed indentation problems

Comment 23

16 years ago
This is only the first part of the fix for this bug. With this addition to
nsIProtocolHandler.idl and its implementations, we will later be able to deal
with these deprecated relative URLs if we ever decide to do so. This first
patch does nothing more than add the needed information to the protocol
handlers.
Whiteboard: have r=darin, seeking sr=

Comment 24

16 years ago
qa to me.

-> networking - This is a URL handling issue that needs to be documented.
Evangelism has enough to do w/ other issues now.

+mostfreq
Component: Evangelism → Networking
Keywords: mostfreq
QA Contact: tever → benc
Summary: URL resolution of protocol:/path → URL: resolution of protocol:/path

Comment 25

16 years ago
hey andreas,

sorry it took me so long to get back to you - i just got back from vacation :-)

i think the patch looks fine...  

my only question is whether GetURIType(...) should take a scheme (just like 
AllowPort does)?  i don't feel strongly either way - you could certainly argue 
that *all* protocols supported by a given protocol handler should have the same 
URI type information...

sr=rpotts

Comment 26

16 years ago
... and I would argue that way. In my opinion it would not make sense to support
multiple schemes in one protocol handler that behave differently with regard to
their URI structure.

I checked it in, but it is only the groundwork. Now we are at least able to
handle these deprecated relative urls if we ever decide to do so. It's now
reduced to a fix to nsStdURL::Resolve. 

What now should be done is to figure out how big the performance penalty is in
doing this additional check in Resolve. I will attach the remaining patch for
nsStdURL.cpp for anyone to try.
 

Comment 27

16 years ago
Created attachment 44977 [details] [diff] [review]
remaining patch to nsStdURL::Resolve to deal with this type of relative urls

Comment 28

16 years ago
*** Bug 68528 has been marked as a duplicate of this bug. ***

Comment 29

16 years ago
can someone please provide a quick reference of behaviors for the following:

ftp://ftp.mozilla.org/pub/mozilla/test-40670/index.html
<a href="http:/pub/">test me</a>
<a href="http:../readme">test me</a>
and the same links if served via http.

And can someone please privately (to my username, not my bugmail account) 
explain _why_ we're implementing support for something that has been deprecated 
for nearly 3 years (since at least August 1998)?

bclary and the rest of the evang people have been willing to speak w/ 
individual webmasters as well as corporations about their pages / products 
which are using this broken feature.  From my experience we have been 
successful in evangelizing against this usage.  I will even volunteer to write 
to publishers if people feel that there are any books likely to be published or 
reprinted on the subject where it would be important to discourage these 
usages.

For comparison, XUL attributes which were deprecated before 1.0 are no longer 
supported by today's mozilla browser.  If we don't encourage people to 
deprecate this usage then I think we should have someone propose a new rfc 
admitting that this feature will live forever, because by continuing to support 
it we encourage its usage.

npm.netlib is currently discussing some similar issues 
in news://news.mozilla.org/3B6BD914.5080206@debitel.net but i don't see a 
recent discussion of this one.

there was a chance to discuss it in the tail of 
news://news.mozilla.org/3B6C29F4.7A8D0CA2@clarence.de

I'm currently 4000 netlib messages behind, but i do expect based on short 
discussions w/ gagan and others that any major changes (and this qualifies) to 
netwerk behavior would be discussed in netlib before they are implemented - so 
that when i do get around to reading them I will be able to find out what the 
issues are, who the players are, what their views are, and where the code 
changing bugs are.

wrt bugzilla, there are quite a few bugs on this specific topic, and most of 
them are *WONT or Evang, if you are going to make any CVS-Trunk code changes 
based on one of these bugs, please update all the bugs that you know of about 
this issue to link (reso dupe is ok) to this one so that people won't 
mistakenly find one and conclude ~no one is foolishly making this change~.

fwiw wget doesn't like this stuff either.
$P> wget http://www.wtop.com:80/index.html
--19:09:51--  http://www.wtop.com:80/index.html
           => `index.html'
Connecting to www.wtop.com:80... connected!
HTTP request sent, awaiting response... 302 Found
Location: http:./index.jhtml [following]
--19:09:57--  ftp://http:21/index.jhtml
           => `index.jhtml'
Connecting to http:21...
http: Host not found

In the past, we took the view that
"deprecated means that pages should stop using it,
 because one day clients will"
and the day to stop supporting it had come.

anyways, I hope to read npm.netlib to see the discussion.

Comment 30

16 years ago
As far as I'm concerned that's it with this bug. We now can implement this
support easily (see latest attached patch to nsStdURL.cpp) if we want to. The
latest patch is for documentation only. Or we can do some kind of quirks mode
for urls.

Current thinking still stands that we don't want to. This usage is deprecated
and should go away, but judging from the number of bugs it is still in wide use.

Back to Evangelism ... trying to kill all usage of it ...
Assignee: andreas.otte → bclary
Status: ASSIGNED → NEW
Component: Networking → Evangelism
QA Contact: benc → zach
Whiteboard: have r=darin, seeking sr=

Comment 31

16 years ago
All Evangelism Bugs are now in the Product Tech Evangelism. See bug 86997 for
details.
Component: Evangelism → US English
Product: Browser → Tech Evangelism
Version: other → unspecified

Updated

16 years ago
Summary: URL: resolution of protocol:/path → msu.edu - URL: resolution of protocol:/path

Comment 32

16 years ago
If they asked people to stop using this kind of relative path in RFC2396, why is
it still valid in the BNF in RFC2396?

(Section 5.2.7 defines it as valid, too.)

Appendix A shows:

 URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
 absoluteURI   = scheme ":" ( hier_part | opaque_part )
 relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

 hier_part     = ( net_path | abs_path ) [ "?" query ]
 opaque_part   = uric_no_slash *uric

 net_path      = "//" authority [ abs_path ]
 abs_path      = "/"  path_segments
 rel_path      = rel_segment [ abs_path ]


In fact, appendix A says that:

//host/path

is a valid URL too.
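
As a rough check of how the Appendix A grammar quoted above splits references, here is a simplified Python classifier. It is an assumption-laden sketch: the function name `classify` is invented, only the top-level productions are modeled, and character-level rules such as uric are not validated.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." ), followed by ":"
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$", re.DOTALL)


def classify(reference: str) -> str:
    """Name the top-level RFC 2396 production a reference would fall under."""
    ref = reference.split("#", 1)[0]  # the fragment is not part of the URI itself
    match = _SCHEME.match(ref)
    if match:
        rest = match.group(2)
        if rest.startswith("//"):
            return "absoluteURI (net_path)"
        if rest.startswith("/"):
            return "absoluteURI (abs_path)"
        if rest:
            return "absoluteURI (opaque_part)"
        return "invalid"
    if ref.startswith("//"):
        return "relativeURI (net_path)"
    if ref.startswith("/"):
        return "relativeURI (abs_path)"
    return "relativeURI (rel_path)" if ref else "empty"


for example in ("http:/cgi-bin/lecture.pl", "//host/path", "http:../readme"):
    print(example, "->", classify(example))
```

Under this reading, anything that begins with a scheme and a colon falls under absoluteURI, so `http:/cgi-bin/lecture.pl` is syntactically an absolute URI with an abs_path and no authority, while `//host/path` is a relative reference.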

Comment 33

16 years ago
If I read that grammar correctly, that says that 'scheme' followed by '":"' is
only allowed in absolute URIs....  Relative URIs do not contain a scheme or a
":" per that grammar.

Comment 34

16 years ago
SPAM.  You may filter these bugs by querying the string "ReadingRitingRithmetic"

Mass Moving Bugs on Educational institutions in the United States to the new US
Edu component and assigning to default owners.
Assignee: bclary → doronr
Component: US General → US Edu
QA Contact: zach → caillon

Updated

16 years ago
Whiteboard: [SYNTAX-URL]

Comment 35

16 years ago
Created attachment 76645 [details] [diff] [review]
updated patch to support these kind of deprecated relative urls

moved the latest patch into the world of nsStandardURL.cpp.
Attachment #42333 - Attachment is obsolete: true
Attachment #43551 - Attachment is obsolete: true
Attachment #43696 - Attachment is obsolete: true
Attachment #44290 - Attachment is obsolete: true
Attachment #44596 - Attachment is obsolete: true
Attachment #44658 - Attachment is obsolete: true
Attachment #44977 - Attachment is obsolete: true

Comment 36

16 years ago
Andreas, this is an evangelism bug.  Evangelism has been getting people to use
correct URL format for quite some time now -- why the sudden change?  This will
make us look bad if we tell people that we won't support something and then
change our minds and support it -- people will then start expecting us to
support other non-standard things, such as layers, etc.  We should remain firm
with the stance we've taken.

I vote to leave the bug as evangelism.  I for one don't want to see our behavior
change.  There really aren't that many sites that have this format anyway
AFAICT.  We get a wide range of problems and this is probably one of the lesser
reported "bugs" -- sure they are out there, but let's concentrate on fixing those.

Doron, ping?  http://lectureonline.cl.msu.edu/ and
http://lectureonline.cl.msu.edu/cgi-bin/lecture.pl are the pages in question --
there is a meta refresh, a gif and a form which uses the bad behavior. 
Interestingly enough, the meta refresh is broken but the 'Continue' link is not.
Contact is possibly Gerd Kortemeyer <korte@lite.msu.edu> (author)

Comment 37

16 years ago
No change, not from me anyway. I just updated a patch on this bug that was a
year old. Although you should know that there lately was some discussion about
this because Mike Shaver was hit by this and questioned the current status. I
took that occasion and updated the patch. Just searched for a place to store the
patch :) and thought it was okay here because it already contained an older
version of the patch. The patch just shows: We can do this if we want, but
currently we don't want ...

Comment 38

Andreas, ah okay.  Sorry I misunderstood the intent of the patch.  Actually, on
looking at this bug again, I see the reporter of this bug is in fact Gerd
Kortemeyer.  :)

Gerd, we haven't heard from you on this bug in a while -- is there any status
update on this?  It would be great if the pages at MSU would use proper URIs.

Comment 39

Comment on attachment 76645 [details] [diff] [review]
updated patch to support these kind of deprecated relative urls

r=bbaetz on this patch. I never did understand why we didn't support this.
(although this really should be a separate bug, rather than an evangelism one)

Then we can close bug 22251, too.
Blocks: 142280

Comment 40

Christopher Aillon said on comment 36
> .. this is probably one of the lesser reported "bugs"

lesser reported bugs ???

Bug 22251, Bug 22894, Bug 29012, Bug 30707, Bug 32803, Bug 32966,
Bug 36821, Bug 40670, Bug 48014, Bug 50629, Bug 51055, Bug 51971,
Bug 56426, Bug 57146, Bug 58204, Bug 58522, Bug 58537, Bug 60308,
Bug 61128, Bug 61582, Bug 61890, Bug 62624, Bug 65474, Bug 68528,
Bug 65650, Bug 68977, Bug 69722, Bug 70886, Bug 71441, Bug 75807,
Bug 79369, Bug 82365, Bug 83269, Bug 84450, Bug 84456, Bug 84474,
Bug 84507, Bug 88356, Bug 91180, Bug 91364, Bug 99263, Bug 105898,
Bug 107061, Bug 111659, Bug 121350, Bug 112458, Bug 123176, Bug 125010,
Bug 135549, Bug 139326, Bug 139330, Bug 139872, Bug 140826, Bug 141408,
Bug 143469, Bug 145045, Bug 149370, Bug 150163, Bug 152579, Bug 151302,
Bug 154195, Bug 155506, Bug 155880, Bug 157337 and probably more.

These bugs also have a few votes (at least 12)

This bug works fine with Netscape<6, MSIE, Opera, Konqueror, Lynx, and others.

Can we be sensible and reasonable and fix this ?

Or will we continue fighting against the world ?

Updated

15 years ago
Attachment #76645 - Attachment is obsolete: true

Comment 41

15 years ago
The original url from this bug seems not to be fixed by the patch on bug 32966.
Seems to be some other redirection problem, I will investigate ...

Comment 42

15 years ago
The code in nsDocShell::SetupRefreshURIFromHeader does not work with relative
urls in mind, so the code put in to fix bug 32966 is never called. Instead the
code for creating absolute urls is called. Does anyone know what the RFCs say
about relative urls in HTTP-EQUIV headers?
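
For illustration only, a refresh value can be split into its delay and target, with the target resolved against the document's base instead of being treated as absolute. The Python sketch below assumes the common "seconds; URL=target" form; the function name `parse_refresh` is hypothetical and this is not the nsDocShell code.

```python
from urllib.parse import urljoin


def parse_refresh(value: str, base: str):
    """Split a refresh value into (delay_seconds, resolved_target)."""
    delay_part, _, target = value.partition(";")
    delay = int(delay_part.strip() or 0)
    target = target.strip()
    if target.lower().startswith("url="):
        # Strip the "URL=" prefix and any surrounding quotes.
        target = target[4:].strip().strip("'\"")
    # Resolve relative targets against the base; an empty target reloads the page.
    return delay, (urljoin(base, target) if target else base)


print(parse_refresh("5; URL=/cgi-bin/lecture.pl",
                    "http://lectureonline.cl.msu.edu/"))
```

Because the target goes through ordinary relative resolution, any leniency built into the resolver would apply to refresh targets as well.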

Updated

15 years ago
Depends on: 163225

Comment 43

15 years ago
this is now fixed by the checkin for bug 163225.
Status: NEW → RESOLVED
Last Resolved: 18 years ago → 15 years ago
Resolution: --- → FIXED

Comment 44

15 years ago
v
Status: RESOLVED → VERIFIED

Comment 45

15 years ago
tech evang june 2003 reorg
Component: US Edu → English US
Product: Tech Evangelism → Tech Evangelism Graveyard