Closed Bug 96519 Opened 23 years ago Closed 22 years ago

URL: escaping of ":"

Categories

(Core :: Networking, defect)

defect
Not set
normal

RESOLVED FIXED
Future

People

(Reporter: ltskinol, Unassigned)

Details

Linux 2001082208

The listed URL doesn't render in Mozilla at all.  I get a blank display.  Works
fine in NN4.7.

Note the ':' in the URL.  When Mozilla opens the page, this gets changed
to %3A.  Could this be the problem?
evang fodder?
Assignee: asa → neeti
Component: Browser-General → Networking: HTTP
QA Contact: doronr → tever
[confirming]

IE5.5 handles that URL and displays the page, too.

A quick skim of RFC 1738 (section 2.2) confirms (I think) that the colon ":" is
a reserved character in http URLs, and in this case, it's not being used properly.
Moz is doing the Right Thing, IMO, and IE is forgiving a pretty major mistake.

I believe that the http server should probably be able to cope with Moz sending
%3A in place of the colon (since that's an escaped colon) and serve the file anyhow.
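
A minimal sketch of the equivalence being argued here, assuming a server that
percent-decodes the request path before looking up the file. The paths and the
decoded_path helper are hypothetical and are not taken from the excite URL.

```python
from urllib.parse import unquote


def decoded_path(raw_path: str) -> str:
    """Percent-decode a request path, as a server might before file lookup."""
    return unquote(raw_path)


# Hypothetical paths: the same resource requested with a literal colon
# and with the colon escaped as %3A.
literal = "/news/story:123.html"
escaped = "/news/story%3A123.html"

# Both forms decode to the same string, so a server that decodes first
# would serve the same file for either request.
print(decoded_path(literal) == decoded_path(escaped))
```

A server that compares the raw, undecoded path against its file names would
treat the two requests as different, which matches the behaviour reported here.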

-> Tech Evangelism
Status: UNCONFIRMED → NEW
Component: Networking: HTTP → US English
Ever confirmed: true
Product: Browser → Tech Evangelism
Version: other → unspecified
A server MUST be able to cope with any character being escaped, IIRC, so this is
a server bug.

However, : is not reserved in the path. Hmm. Darin? Do we need to modify our
many url escaping routines?

-> default owner
Assignee: neeti → bclary
OS: other → All
QA Contact: tever → zach
Hardware: Other → All
Summary: Page doesn't render → news.excite.com - server does not accept %3A as substitute for :
If the HTTP URL definitions do not reserve this character in the scheme, then we
should not encode it, based on my limited, RFC 1738-only understanding of URLs.

I'm going to take this back until it is clear this escaping is the correct behavior.
Assignee: bclary → neeti
Component: US English → Networking
Product: Tech Evangelism → Browser
QA Contact: zach → benc
Version: unspecified → other
Escaping is not optimal behaviour, but it's not incorrect. I agree that we
shouldn't be doing it, though.
Oops - that was me, using gerv's machine. Sorry.
I believe this is Mozilla's bug. RFC 1738 defines HTTP URL as follows:

; HTTP

httpurl        = "http://" hostport [ "/" hpath [ "?" search ]]
hpath          = hsegment *[ "/" hsegment ]
hsegment       = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search         = *[ uchar | ";" | ":" | "@" | "&" | "=" ]

where 

hostport       = host [ ":" port ]
uchar          = unreserved | escape
unreserved     = alpha | digit | safe | extra
escape         = "%" hex hex
reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="


So according to this RFC, reserved characters such as ":" may be used
in hpath.
And a reserved character cannot be escaped without changing the
semantics of the URL.

I have seen URL forwarding scheme used by some proxy server
take the form of:

http://www.tohoku.ac.jp:8081/=@=:www-student.ulis.ac.jp/html/virtual-library/

Note the use of the reserved character ":" in this proxy forwarding
scheme. This will not work if you escape it. There are probably other
uses of reserved characters that are broken by the
current Mozilla behavior.
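
The hsegment rule quoted above can be checked mechanically. This is a rough
sketch, assuming the RFC 1738 definitions safe = "$" | "-" | "_" | "." | "+"
and extra = "!" | "*" | "'" | "(" | ")" | "," (not quoted in this comment);
the is_valid_hsegment name is made up for illustration.

```python
import re

# unreserved = alpha | digit | safe | extra (RFC 1738)
UNRESERVED = r"A-Za-z0-9$\-_.+!*'(),"

# hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
# uchar    = unreserved | escape, where escape = "%" hex hex
HSEGMENT = re.compile(rf"(?:[{UNRESERVED};:@&=]|%[0-9A-Fa-f]{{2}})*")


def is_valid_hsegment(segment: str) -> bool:
    """Return True if the whole segment matches the hsegment production."""
    return HSEGMENT.fullmatch(segment) is not None


# First path segment of the proxy forwarding URL above.
print(is_valid_hsegment("=@=:www-student.ulis.ac.jp"))
# "/" separates segments, so it cannot appear inside one.
print(is_valid_hsegment("a/b"))
```

Under this grammar the unescaped ":" is accepted in a path segment, so a client
is not required to escape it there.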
Okay. If someone wants to take this up w/ excite, create a new bug and send it
to evangelism.
Yes, but the spec also says that the %xx form is equivalent to the 'normal' form,
when the character is not reserved. So we're not right, but they're wrong :)
Sure. New bug for evangelism. I'm keeping this one b/c it's got all the good
analysis.
andreas: what do you think about this bug?
According to RFC 2396 (which supersedes 1738) the colon is (although listed) not
reserved in the path component. Also, it has no function in parsing the URL
inside the path. For that matter it can count as not reserved and is equivalent to
its hex encoding. Excite should unescape it after parsing the path.

With the current parsing algorithm we have to escape a colon inside the filename
(which is part of the path) because otherwise we would mistake the colon in a
filename as indicator for an absolute url while resolving a relative url. We
definitely don't want that to happen. A possible solution might be to handle the
escaping of relative URLs differently than absolute URLs, but I don't think that
should happen unless absolutely necessary.
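
The ambiguity described above can be reproduced with any generic resolver. The
sketch below uses Python's urllib.parse.urljoin purely as a stand-in (it is not
Mozilla's parser), and the base URL and file names are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical base document.
BASE = "http://example.com/dir/index.html"


def resolve(base: str, reference: str) -> str:
    """Resolve a reference against a base URL with the stdlib resolver."""
    return urljoin(base, reference)


# An unescaped colon in the first segment makes "foo" look like a scheme,
# so the reference is returned unchanged as if it were an absolute URL.
print(resolve(BASE, "foo:bar.html"))

# Escaping the colon keeps the reference relative.
print(resolve(BASE, "foo%3Abar.html"))

# A leading "./" also removes the ambiguity without escaping.
print(resolve(BASE, "./foo:bar.html"))
```

Only the first path segment of a relative reference is affected, since a colon
after a "/" can no longer be mistaken for the end of a scheme name.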
Target Milestone: --- → Future
moving neeti's futured bugs for triaging.
Assignee: neeti → new-network-bugs
Summary: news.excite.com - server does not accept %3A as substitute for : → URL: escaping of ":"
This will be fixed with the patch for bug 193477.
Depends on: 193477
The given URL is no longer valid, so I can't check against it, but this should be
fixed now with the checkin for bug 193477 (for 1.4a). Please reopen if this is
not correct.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED