Closed Bug 170241 Opened 22 years ago Closed 18 years ago

URL: escaped characters in hostname

Categories

(Core :: Networking, enhancement)

x86
Windows 2000
enhancement
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 309671

People

(Reporter: twb0, Assigned: nhottanscp)

References

()

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2a) Gecko/20020910
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2a) Gecko/20020910

Some HTML E-mails (e.g. from MSN and others) contain links which are not formed
correctly.  For example, one link from MSN:

http://g%2Emsn%2Ecom/0NL37384/2836

The %2E should be a "."

Microsoft IE and Opera are aware of this problem and automatically fix the hex
characters for you.  Obviously the people who create these poorly-written HTML
E-mails should fix their problem, but it would be helpful if the browser
compensated for this common error.


Reproducible: Always

Steps to Reproduce:
1.  Open HTML-based E-mail from MSN
2.  Click on any URL link (for example, http://g%2Emsn%2Ecom/0NL37384/2836)


Actual Results:  
Netscape/Mozilla will complain that "http://g%2Emsn%2Ecom/0NL37384/2836" could
not be found.

Expected Results:  
Mozilla should have expanded the %2E (and any other hex-escaped characters) and
re-written the URL to appear as:

http://g.msn.com/0NL37384/2836

Of course, expanding the hex escapes in

http://g%2Emsn%2Ecom/0NL37384/2836%3Ffoo

would be an error (the %3F expands to "?", a reserved delimiter that would
change how the URL is parsed)....

Over to networking to decide whether we want to unescape just the hostname part
of urls...
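
For illustration only, here is a minimal Python sketch of what "unescape just the hostname part" could look like. It is not Necko code; the function name unescape_host_only is hypothetical, and it relies on the standard urllib.parse module. Bracketed IPv6 literals are deliberately left untouched to keep the sketch short.

```python
from urllib.parse import urlsplit, urlunsplit, unquote


def unescape_host_only(url: str) -> str:
    """Percent-decode only the host of `url` (hypothetical helper).

    Path, query, and fragment are passed through unchanged, so an
    escaped "?" (%3F) in the path stays escaped.
    """
    parts = urlsplit(url)
    # netloc may carry "userinfo@" and ":port"; isolate the host portion.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if hostport.startswith("["):
        # Assumption: IPv6 literals are out of scope for this sketch.
        return url
    host, colon, port = hostport.partition(":")
    netloc = f"{userinfo}{at}{unquote(host)}{colon}{port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(unescape_host_only("http://g%2Emsn%2Ecom/0NL37384/2836"))
    print(unescape_host_only("http://g%2Emsn%2Ecom/0NL37384/2836%3Ffoo"))
```

With the two URLs from this bug, the host becomes g.msn.com while the %3F in the second URL's path is left as-is.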
Assignee: hewitt → new-network-bugs
Component: URL Bar → Networking
QA Contact: claudius → benc
I think the answer is: no. But I know who knows the answer :)
Summary: Browser should fix "malformed" URLs with hex digits → URL: escaped characters in hostname
Don't be so sure ... hostnames and escaping are tricky stuff. I remember we did
this once and it was changed ... it had to do with international characters in
hostnames. CCing darin, who did the IDN stuff.
We decided to do away with unescaping hostname characters because it helped
avoid security bugs and because there are no DNS characters that require
escaping.  At the time, cookies and necko were not using the same path for URL
parsing, but that has since changed, so we might want to revisit this.
*** Bug 170708 has been marked as a duplicate of this bug. ***
confirming while we debate this
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 171172 has been marked as a duplicate of this bug. ***
*** Bug 157019 has been marked as a duplicate of this bug. ***
andreas, thoughts on this?
nhotta, is this a problem?
Assignee: new-network-bugs → nhotta
Some Windows programs use this technique when trying to launch a URL to avoid
'overly-smart shell' problems.  Of course, they probably only test it on IE.

I looked through RFC 2396 quickly and didn't see an express prohibition on
escaping the hostname in the text, but I also can't see a way in the BNF where
it would be allowed.

Is the security concern that an invalid hostname might slip through some
security checks?  Could the unescaping be done early enough in the code that it
could not slip past the checks?
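
As a rough sketch of the "unescape early, then check" idea raised above (an assumption about how it might be done, not how Mozilla does it): decode the host once, before any security check sees it, and reject the result if it contains anything other than letters, digits, hyphens, and dots. The helper name decode_and_check_host and the character whitelist are both hypothetical, and IDN handling is ignored.

```python
import re
from urllib.parse import unquote

# Assumption: only plain ASCII DNS labels separated by dots are accepted.
_HOST_RE = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.?")


def decode_and_check_host(raw_host: str) -> str:
    """Decode `raw_host` once, then validate the decoded form.

    Every later check (cookies, same-origin, and so on) would be given
    the returned value, never the escaped input.
    """
    decoded = unquote(raw_host)
    # A leftover "%" (double escaping) or a decoded "/", "@", ":" etc.
    # fails the whitelist, so it cannot slip through as part of the host.
    if not _HOST_RE.fullmatch(decoded):
        raise ValueError(f"illegal character in hostname: {raw_host!r}")
    return decoded.lower()


if __name__ == "__main__":
    print(decode_and_check_host("g%2Emsn%2Ecom"))
    for bad in ("good.example%2Fevil.example", "g%252Emsn.com"):
        try:
            decode_and_check_host(bad)
        except ValueError as exc:
            print("rejected:", exc)
```

The point of the design is ordering: because validation runs on the decoded string, an escaped delimiter cannot mean one thing to the checker and another to the resolver.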
see Bug 191388 for escaped IDN hostnames
*** Bug 261276 has been marked as a duplicate of this bug. ***
Based on the spoofing problems, I'd like to NOT support unescaping the hostname.

As far as I can tell from the spec, you're not allowed to escape characters in
hostnames anyway. INVALID based on that and the number of comments in this bug
that seem reluctant to change it.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
V/invalid.
Status: RESOLVED → VERIFIED
According to RFC 3986 (the new URI spec that obsoletes RFC 2396), escaping characters in the hostname is now allowed. See bug 309671.
Status: VERIFIED → RESOLVED
Closed: 20 years ago → 18 years ago
Resolution: INVALID → DUPLICATE