Closed Bug 310081 Opened 19 years ago Closed 19 years ago

Firefox modifies the URL after it is manually entered or if a server opens another browser window with the same URL.

Categories

(Firefox :: Address Bar, defect)

x86
Windows 2000
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: STEPHENBKLYN, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6

I believe I have found a problem with the way Firefox parses text in the URL
field.  I tried to find my bug in the subdivision: "Location Bar and
Autocomplete", but did not see any matches.

I also searched using the keywords: url parse location

When I was an intern at IBM Research I developed some JavaScript code for
their bioinformatics site which extracts text from the URL field.  For testing
the code (while at IBM), I used Windows NT 4.0 running both the IE 5.X (I forgot
which version) and the Netscape 4.75 browsers.

The JavaScript code currently works using IE 5.5 with service pack 2 (on a
Windows 2000 Professional, SP4 system).  I do not have Netscape installed on my
current system.

The JavaScript code does NOT work with Firefox 1.0.6 because the browser modifies
the URL.

The javascript code parses the entire URL in the location/url field of the
browser and extracts information which it then transfers to a textarea on the page.

This bug can be easily duplicated if you paste the following URL into your browser.

	http://cbcsrv.watson.ibm.com/Tpa.html?>1_to_5_(5_residues)%abcde

After you press the ENTER key, the first character after the ? (namely the single ">"
character) is changed by Firefox to the three characters "%3E":

	http://cbcsrv.watson.ibm.com/Tpa.html?%3E1_to_5_(5_residues)%abcde

If you compare IE vs Firefox the contents of the TextArea are different.


Reproducible: Always

Steps to Reproduce:
1.Paste URL into browser:
http://cbcsrv.watson.ibm.com/Tpa.html?>1_to_5_(5_residues)%abcde

2.Hit Enter Key

3.URL is changed by firefox

http://cbcsrv.watson.ibm.com/Tpa.html?%3E1_to_5_(5_residues)%abcde

4. The
Actual Results:  
http://cbcsrv.watson.ibm.com/Tpa.html?%3E1_to_5_(5_residues)%abcde

Expected Results:  
The browser should not edit URLs.  The original URL is desired.

http://cbcsrv.watson.ibm.com/Tpa.html?>1_to_5_(5_residues)%abcde
*** Bug 310082 has been marked as a duplicate of this bug. ***
the uris are equivalent, as %3E is interpreted as > in a uri. 

I am not certain why FF changes > manually entered in its address field to its
escaped form %3E, (as far as I know, it doesn't NEED to), so it seems like a bug
to me.

However your application could catch this, and, if you like, consider them as
equivalent, along with any other such character that might be escaped in a uri.
Your program is interpreting uris, so it could follow some common practices for
their interpretation.

Also, in any valid xml doc, > is likely going to be somehow escaped, so it is
difficult for others to, for example, make links to this content in a web page,
if your program expects a literal > in the url (just look at the source of this
very page). I don't know how your program is being used right now, but, if you
need motivation, this situation seems to limit its potential usefulness, aside
from any problems specific to firefox :-) .
According to RFC 2396 (http://www.faqs.org/rfcs/rfc2396.html), > isn't a
valid character for use in URIs. Firefox escapes the string to its equivalent
(%3E) in order to prevent potential internal handling issues (and to pass on a
valid URI). 
In order to handle the escaped string in your application, you should use the
JavaScript function unescape().
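
The advice above can be sketched roughly as follows. This is only an illustration, not the reporter's actual script: the helper names queryOf and decodeQueryLeniently are hypothetical. It uses decodeURIComponent(), the current standard counterpart of the legacy unescape(), and guards against the URIError that decodeURIComponent() throws on malformed escapes such as the "%ab" in the sample URL.

```javascript
// Hypothetical helper: return the text after the first "?" of a URL string.
function queryOf(url) {
  const i = url.indexOf("?");
  return i === -1 ? "" : url.slice(i + 1);
}

// Hypothetical helper: decode each run of %XX escapes. If a run is not valid
// UTF-8 (decodeURIComponent throws URIError), fall back to decoding only the
// ASCII escapes (%00-%7F) in that run and keep the rest verbatim.
function decodeQueryLeniently(query) {
  return query.replace(/(?:%[0-9A-Fa-f]{2})+/g, function (run) {
    try {
      return decodeURIComponent(run);
    } catch (e) {
      return run.replace(/%[0-7][0-9A-Fa-f]/g, function (esc) {
        return String.fromCharCode(parseInt(esc.slice(1), 16));
      });
    }
  });
}

// The URL as typed and the URL as Firefox displays it after ENTER.
const typed = "http://cbcsrv.watson.ibm.com/Tpa.html?>1_to_5_(5_residues)%abcde";
const shown = "http://cbcsrv.watson.ibm.com/Tpa.html?%3E1_to_5_(5_residues)%abcde";

const a = decodeQueryLeniently(queryOf(typed));
const b = decodeQueryLeniently(queryOf(shown));
console.log(a === b); // both forms decode to the same query text
```

With a decoding step like this, a script that copies the query into a textarea would presumably see the same text whether or not the browser escaped the ">".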
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → INVALID
I now know that certain US ASCII characters are not allowed in URIs (see
http://www.faqs.org/rfcs/rfc2396.html).

However, Firefox is NOT consistent with the way it follows this specification.  

For example, the following four ASCII characters are resolved to their escape codes:

SPACE resolves to %20
<     resolves to %3C
>     resolves to %3E
`     resolves to %60

However, I tested the following characters (some of which are NOT allowed per
the specification) and they are NOT resolved.

# does NOT resolve to 	%23
% does NOT resolve to 	%25
{ does NOT resolve to	%7B
} does NOT resolve to 	%7D
| does NOT resolve to 	%7C
\ does NOT resolve to 	%5C
^ does NOT resolve to 	%5E
~ does NOT resolve to 	%7E
[ does NOT resolve to  	%5B
] does NOT resolve to 	%5D
; does NOT resolve to 	%3B
/ does NOT resolve to 	%2F
? does NOT resolve to 	%3F
: does NOT resolve to 	%3A
@ does NOT resolve to 	%40
= does NOT resolve to 	%3D
& does NOT resolve to 	%26
$ does NOT resolve to 	%24
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
(In reply to comment #4)
> # does NOT resolve to 	%23
Section 1.2.3 RFC 3986.

> % does NOT resolve to 	%25
Section 2.4.2 RFC 2396.

> [ does NOT resolve to  	%5B
> ] does NOT resolve to 	%5D
Valid in certain schemes i.e. ldap://[2005:db8::9]/c=US?ou=org

> ^ does NOT resolve to 	%5E
> | does NOT resolve to 	%7C
> { does NOT resolve to	%7B
> } does NOT resolve to 	%7D
These characters can be used in IDN IRIs.

> ~ does NOT resolve to 	%7E
Section 2.3 RFC 2396.

> \ does NOT resolve to 	%5C
Used when accessing content locally i.e. file://C:\
Section 7.3 RFC 3986

> ; does NOT resolve to 	%3B
> / does NOT resolve to 	%2F
> ? does NOT resolve to 	%3F
> : does NOT resolve to 	%3A
> @ does NOT resolve to 	%40
> = does NOT resolve to 	%3D
> & does NOT resolve to 	%26
> $ does NOT resolve to 	%24
Section 2.2 RFC 2396.
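
As a rough cross-check of the list above, the character classes defined in RFC 2396 (reserved in section 2.2, unreserved in 2.3, excluded in 2.4.3) can be written down as a small classifier. This is a sketch of the RFC's grammar only; the function name classify is hypothetical, and it does not model what the Firefox address bar actually escapes or the later changes in RFC 3986.

```javascript
// Character classes as listed in RFC 2396.
const RESERVED = ";/?:@&=+$,";   // section 2.2
const MARK = "-_.!~*'()";        // section 2.3 (unreserved, with alphanumerics)
const DELIMS = "<>#%\"";         // section 2.4.3
const UNWISE = "{}|\\^[]`";      // section 2.4.3

// Hypothetical helper: classify a single character per RFC 2396.
function classify(ch) {
  if (/^[A-Za-z0-9]$/.test(ch) || MARK.includes(ch)) return "unreserved";
  if (RESERVED.includes(ch)) return "reserved";
  if (ch === " ") return "excluded (space)";
  if (DELIMS.includes(ch)) return "excluded (delims)";
  if (UNWISE.includes(ch)) return "excluded (unwise)";
  return "excluded (other)"; // control characters and non-ASCII
}

// The characters tested in comment #4.
for (const ch of " <>`#%{}|\\^~[];/?:@=&$") {
  console.log(JSON.stringify(ch), classify(ch));
}
```

Reserved and unreserved characters may appear literally in a URI, which is consistent with the browser leaving ";", "/", "?", ":", "@", "=", "&", "$" and "~" untouched.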
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → INVALID

It is very clear that the main issue is with your interpreting program. Your
program is receiving a uri, and is obviously interpreting it wrongly: it is a
real honest to goodness bug in your program and should be fixed!

You are relying upon firefox, along with the entire internet, behaving in a
certain way: but you should not rely upon it behaving in this way.

There is no valid reason to expect consistency (as you are defining it) in
firefox dealing with the list of characters you've made, however confounding
this may be. There may be good reasons (url, urn specification, xml
specifications, html specifications, other specifications), or even bad reasons,
why some of those characters are altered, or not, when typed in the address bar:
but this is absolutely immaterial. It is clearly stated you cannot expect them
to be unaltered.