URL: Whitespace in link href's should be removed rather than escaped as %20

VERIFIED WONTFIX

Status


Core
Networking
16 years ago
15 years ago

People

(Reporter: Phillip M. Jones, C.E.T., Assigned: Darin Fisher)

Tracking

({testcase})

Trunk
testcase

(Reporter)

Description

16 years ago
There are many items on this CEA web page that have URLs with spaces in the link (URL). Communicator follows the correct convention, as 
set up by W3C, that the page should come back as either an error 400 or an error 404.

Mozilla (and N6/7 which is based on Mozilla) automatically inserts %20 in place of the spaces and allows the site to work. According 
to W3C this should not happen.

I have changed the buttons above from Mac OS 9 because I've verified it with a friend who uses a modern Compaq PC. He sees 
the same results as I do on both Communicator and Netscape 7, which is based on Mozilla. I've tested with both Mozilla and Netscape 7. I 
get the same results.
-> Networking:http
Assignee: Matti → darin
Component: Browser-General → Networking: HTTP
QA Contact: imajes-qa → tever
(Assignee)

Comment 2

16 years ago
i don't see anything wrong with mozilla fixing up the URL.  in fact, it seems
like a good thing IMO.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → WONTFIX
(Reporter)

Comment 3

16 years ago
To do as you suggest is equivalent to IE's feature of automatically fixing left-off tags and such. Bad code is bad code. The people 
(developers) at Mozilla have been railing against this for years now. Standards are standards, and W3C says it's wrong.
(Assignee)

Comment 4

16 years ago
but spaces must be eliminated from URLs before they can be used with the HTTP
protocol.  that is a strict requirement of the protocol.  otherwise, the request
would be malformed since HTTP uses space as a delimiter character in the request
line.  so, the only alternative to escaping spaces as we currently do would be
to reject the URL as malformed and throw up an error dialog or a broken image
tag, etc.  i personally don't see the harm in escaping spaces.  RFC 2396 says we
can escape any character we like... even characters that don't need to be escaped.
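
The request-line point above can be illustrated with a minimal Python sketch. The helper `request_line` and the path `/docs/my page.html` are hypothetical, chosen only for illustration; this is not Mozilla's code.

```python
from urllib.parse import quote

def request_line(method: str, target: str, version: str = "HTTP/1.1") -> str:
    """Build an HTTP request line: method SP request-target SP version."""
    return f"{method} {target} {version}"

raw_path = "/docs/my page.html"  # hypothetical href path containing a space

# Sent unescaped, the space inside the path adds a fourth field,
# so a server splitting on spaces cannot tell where the target ends.
broken = request_line("GET", raw_path)
print(broken.split(" "))  # ['GET', '/docs/my', 'page.html', 'HTTP/1.1']

# Percent-encoding the space keeps the request line at three fields.
fixed = request_line("GET", quote(raw_path))
print(fixed)  # GET /docs/my%20page.html HTTP/1.1
```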

can you please point out the spec (and the location in that spec) that we are
violating?

Comment 5

16 years ago
There is some discussion about the grade of uri fixup that should be done in
links versus the urlbar. Some people think there should be absolutely no fixup
on links. We can't do that at the moment because some of the fixup code is
embedded in the urlparser. 

You are correct, a url link in a page should be a valid url with only the
allowed chars used. But this is not an ideal world and there is no ideal web. So
we do some uri fixup like escaping in every case.
  
 
The HTTP spec is very explicit that escaped things must be unescaped before
comparing uris for equality with local files/scripts/whatever.

spaces have to be escaped here, because that's the separator. It's not like the
escaping of ~ which some sites get confused with.
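
As a rough sketch of the unescape-before-compare rule, assuming a hypothetical server-side helper `same_resource` (a simplification: it decodes every escape, including ones such as %2F that a real server would treat more carefully):

```python
from urllib.parse import unquote

def same_resource(path_a: str, path_b: str) -> bool:
    """Compare two URL paths after percent-decoding both (hypothetical helper)."""
    return unquote(path_a) == unquote(path_b)

# An escaped space and a literal space name the same file.
print(same_resource("/my%20page.html", "/my page.html"))  # True

# The same holds for ~, so a site that fails on %7E is not unescaping first.
print(same_resource("/%7Euser/", "/~user/"))  # True
```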

We do do more fixup from the urlbar than from page links. Is this recent?
(Reporter)

Comment 7

16 years ago
In the case I cited in the bug report, the author accidentally put a space between the last section of the URL, which starts 
just after the "-" character (not including the quote marks), and the rest of the URL. Chances are what happened is that he 
wrote the first one accidentally including the space, then, testing it out using Netscape 7, found that it worked. Since he saw it worked, he 
assumed this was a new feature rather than a bug and continued using the wrong method. Then, when many people called him on the 
carpet about it, he stated that since it worked in Internet Explorer and Netscape 7, this must be the correct method. 

Several people on an electronics group using other browsers, including Communicator and older versions of IE, pointed out that it didn't 
work. They shot back: well, it works with the latest version of IE, and Netscape 7, and Mozilla, so it must be right.

And Mozilla has been railing against IE's promotion of sloppy code by fixing things that the web page author should have gotten 
right the first time. Aren't we promoting the same thing if this is not corrected?

Comment 8

16 years ago
Bradley: We do \ to / substitution on Windows only in the urlbar, not on links.
Also there is the www.  .com thing which only happens on the urlbar afaik. Maybe
there is more.
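
A minimal sketch of source-dependent fixup, reading the sentence above as "on Windows, and only for urlbar input". The function `fixup_backslashes` and its flags are hypothetical and are not the actual urlparser code:

```python
def fixup_backslashes(url: str, from_urlbar: bool, on_windows: bool) -> str:
    """Replace backslashes with slashes only for urlbar input on Windows.

    Hypothetical helper; page links are passed through unchanged.
    """
    if from_urlbar and on_windows:
        return url.replace("\\", "/")
    return url

typed = "http://example.com\\docs\\index.html"
print(fixup_backslashes(typed, from_urlbar=True, on_windows=True))
print(fixup_backslashes(typed, from_urlbar=False, on_windows=True))
```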

We have no quirks/strict mode for urls. Maybe we need something like this.
urlbar is always quirks, on links it depends on the mode of the document the
links are embedded in.

Updated

16 years ago
Summary: Should Fail with either a 400 or 404 error because of space in URL Mozilla and N6-7 inserts a %20 for the space allowing the page to work → Whitespace in link href's should be removed rather than escaped as %20

Comment 9

16 years ago
Verified wontfix.
Status: RESOLVED → VERIFIED
QA Contact: tever → junruh

Comment 10

15 years ago
URL RFCs say that whitespace should be removed inside URLs. But the general
interpretation is that spaces in URLs were meant to be there (and someone forgot
to escape them), while other whitespace is nonfunctional and should be removed. I'm
hoping to do a full writeup of these behaviors soon.
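
The interpretation described above might look roughly like this in Python. The function name `fixup_href` is hypothetical, and treating tab, carriage return, and newline as the "other whitespace" is an assumption:

```python
import re

def fixup_href(href: str) -> str:
    """Sketch of the interpretation above (hypothetical helper).

    Tabs, carriage returns, and newlines are treated as nonfunctional and
    dropped; spaces are assumed intentional and escaped as %20.
    """
    cleaned = re.sub(r"[\t\r\n]", "", href)
    return cleaned.replace(" ", "%20")

# A link wrapped across lines in the page source, with a space in the name.
print(fixup_href("http://example.com/my\n page.html"))
# http://example.com/my%20page.html
```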
Component: Networking: HTTP → Networking
Summary: Whitespace in link href's should be removed rather than escaped as %20 → URL: Whitespace in link href's should be removed rather than escaped as %20
*** Bug 203056 has been marked as a duplicate of this bug. ***

Updated

15 years ago
Keywords: testcase

Comment 12

15 years ago
I noticed one instance where a clicked link resulted in a valid working url,
with a %20 at the end.  It has been like that for a long time so apparently it
is working for most people, but with mozilla it fails.  Object not found.

Clearly a careless web designer left a space at the end of the url, and that
causes mozilla to fail.  It even works with netscape 4.77.  

I think it is suicide to stand on a principle here and have fail cases because
of RFCs.  
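
The trailing-space case can be shown with a small sketch. The helper `escape_spaces` and the example URL are hypothetical; the code only contrasts the two possible outcomes and makes no claim about what any browser does internally:

```python
def escape_spaces(href: str) -> str:
    """Escape every space as %20 (hypothetical helper)."""
    return href.replace(" ", "%20")

href = "http://example.com/page.html "  # careless trailing space in the page source

# Escaping as-is asks the server for a different name, "page.html%20".
escaped_as_is = escape_spaces(href)
print(escaped_as_is)  # http://example.com/page.html%20

# Stripping leading/trailing whitespace first requests the intended file.
stripped_first = escape_spaces(href.strip())
print(stripped_first)  # http://example.com/page.html
```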