Closed Bug 7399 Opened 21 years ago Closed 21 years ago

Escaping illegal chars in URLs


(Core :: Networking, defect, P3)

Windows NT





(Reporter: hjtoi-bugzilla, Assigned: gagan)




Recently had spaces in links. They worked in IE and
Opera, but not in Netscape nor Gecko. They later changed the spaces to

In the XML world, at least, the browser should escape ALL illegal characters in URLs
(I just read a mail about that today, but can't remember on which list it was).
So if there are spaces in URLs they should be escaped with %20 automatically by
the browser. Gecko understands escaped URLs, it is just a matter of doing the

The URL has a doc that contains one link that points to a file with a space
in its name. IE handles that fine, NS and Gecko fail.
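
To make the request concrete, here is a minimal sketch of the escaping step in Python. It is not Gecko code. The function name `escape_illegal` and the `SAFE` character set are assumptions for illustration; the set is loosely modeled on the reserved and unreserved characters of RFC 3986, plus "%" so that URLs that are already escaped are not escaped a second time.

```python
from urllib.parse import quote

# Assumption: characters to leave untouched. These are the URL delimiters and
# sub-delimiters, the unreserved punctuation, and "%" (kept so that existing
# escapes such as %20 are not double-encoded).
SAFE = ":/?#[]@!$&'()*+,;=-._~%"


def escape_illegal(url: str) -> str:
    """Percent-encode every character outside letters, digits, and SAFE."""
    return quote(url, safe=SAFE)


print(escape_illegal("http://example.com/my file.html"))
# prints: http://example.com/my%20file.html
```

With this approach a link to a file with a space in its name reaches the server as %20, and a link that was already escaped passes through unchanged.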
There are some problems with this:

1) Different URL RFCs have different ideas of what the illegal characters are.
2) Should the URL, as given in the document, already be legal?  Is it the job of
the browser to correct a URL when the correction might mess up the server?
(What do current browsers do here?)

I think one may end up having to stick to tradition on this, but I'm not really
sure what the URL RFCs say about correction of URLs.

(When the site you mention above had spaces in links, was the whole thing in
quotes?  If not, then the problem was with parsing.)

It took some time to find where I had read that piece about illegal characters
in URIs (note, _URI_). The URLs below should answer your questions.

The discussion happened on XML-DEV. Here is a link to the archive and the
thread you should read:

Here are some extracted relevant URLs from the discussion:

According to those last two links, which point to HTML 4 section B.2.1 and
the last working draft of the W3C Character Model respectively, we should
indeed be escaping URIs.

1) We should probably take the superset. That way all bases are covered.
2) Yes, the URI in the document should indeed be legal. No, I would say that it
   is not our job to correct it. However, we should certainly not be sending
   invalid URIs to servers, so I suggest encoding would be best.
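
A rough sketch of what "take the superset" in point 1 could look like, in Python. The two character sets are illustrative approximations of the "unsafe" list in RFC 1738 and the excluded characters in RFC 2396, not authoritative copies of either spec. "#" and "%" are deliberately left out here because they delimit fragments and existing escapes in a full URL. The name `escape_superset` and the choice of UTF-8 for non-ASCII characters are assumptions as well.

```python
# Illustrative approximations only; check the RFCs before relying on these.
RFC1738_UNSAFE = set(' <>"{}|\\^~[]`')
RFC2396_EXCLUDED = set(' <>"{}|\\^[]`')

# The superset: a character is escaped if either spec rejects it.
ILLEGAL = RFC1738_UNSAFE | RFC2396_EXCLUDED


def escape_superset(url: str) -> str:
    """Percent-encode characters that any of the listed specs treats as illegal."""
    out = []
    for ch in url:
        # Control characters, DEL, and non-ASCII are escaped as well
        # (assumption: non-ASCII is encoded as UTF-8 bytes).
        if ch in ILLEGAL or ord(ch) < 0x20 or ord(ch) >= 0x7F:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


print(escape_superset("http://example.com/a b~c"))
# prints: http://example.com/a%20b%7Ec
```

Escaping the union is the conservative choice: a URI that passes this filter contains nothing that either spec would reject, at the cost of escaping a few characters (such as "~") that one of them allows.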

Currently, we are dropping spaces in URIs altogether (this happens somewhere in
the content sink, see bug 8319). We should certainly not be doing this.

Pushed past necko landing...

Changing all Networking Library/Browser bugs to Networking-Core component for

Occasionally, Bugzilla will burp and cause Verified bugs to reopen when I do
this in a bulk change.  If this happens, I will fix. ;-)

Closed: 21 years ago
Resolution: --- → DUPLICATE

*** This bug has been marked as a duplicate of 10373 ***

Bulk move of all Networking-Core (to be deleted component) bugs to new
Networking component.