Closed Bug 274633 Opened 20 years ago Closed 20 years ago

URL parameter names incorrectly treated as encoded character entities

Categories

(Firefox :: General, defect)

x86
Windows 2000
defect
Not set
normal

VERIFIED DUPLICATE of bug 232789

People

(Reporter: larryl, Assigned: bugzilla)

Details

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

If the HTML contains a link where any of the query string parameter names happen
to match the name of a character entity, they get decoded into the
corresponding character instead of being left as-is.

For example, the URL on this link incorrectly ends up with literal double-quote,
less-than and greater-than symbols in it, instead of keeping the original query
string parameter names (amp, quot, lt and gt):

   <a href="http://foo?a=b&amp=0&quot=1&gt=2&lt=3">link</a>


Reproducible: Always

Steps to Reproduce:
1. Create an HTML page containing:
    <a href="http://foo?a=b&amp=0&quot=1&gt=2&lt=3">link</a>
2. View it.
3. Mouse over the link. The status bar shows that the query string
   param names have been turned into special characters.
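
The decoding described in these steps can be approximated outside the browser. The following is a minimal sketch, assuming Python's standard html.unescape is an acceptable stand-in for a lenient entity decoder; it is not Gecko's parser, and the variable names are only illustrative.

```python
import html

# Attribute value from the test case, exactly as written in the page source.
href = "http://foo?a=b&amp=0&quot=1&gt=2&lt=3"

# html.unescape recognizes "&amp", "&quot", "&gt" and "&lt" as entity
# references even when the trailing semicolon is missing.
decoded = html.unescape(href)
print(decoded)
```

With this decoder the parameter names amp, quot, gt and lt are replaced by the characters they name, which matches what the status bar shows.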




This is the same bug as bug 232789, "wrong reading url in href", which I believe
was incorrectly marked invalid because of a misunderstanding of what was being
reported.  It looks like whoever marked that bug invalid assumed the intent
was to use the encoded character as a query string parameter name, whereas the
actual intent is to be able to use a param name like "gt" without it being
unexpectedly transformed into a greater-than sign.

No, bug 232789 was correctly invalidated based on the actual nature of the
situation. The only thing that looks off to me is that there wasn't a lecture
about correctly setting severity.

Character entity references in HTML follow SGML rules, which means there are
only four or five people in the world who actually understand them, certainly
not including me. The simplified version is that in almost every part of an
HTML document (except a few odd places like <script> and <xmp>), whenever you
include an ampersand character, you have either started an entity reference or
made a mistake. Things other than a semicolon can mark the end of an entity
reference, including a newline and the start of a tag, so &gt<span> is actually
legal, but relying on those terminators is dangerous.

So the "correct" interpretation of your URL depends on whether the start of a
new entity reference is a legal end delimiter for an SGML entity reference, and
on whether an equals sign is one. The URL contains either the undefined
entities &amp=0, &quot=1, &gt=2, and &lt=3">link, or possibly just one long
undefined entity of &amp=0&quot=1&gt=2&lt=3">link, ending at the start of the
next tag, or possibly exactly what we show in the statusbar.

Doing the right thing, breaking links when people forget to convert their
ampersands to &amp;, would break millions of pages if we started now, so we
can't do it; it's years too late for that. However, rather than completely
silently papering over the author's error as you want us to, we can show one
interpretation of what it might have sort of meant in the statusbar. Someone
looking clear out to the end of a long URL there, and inspecting it for
correctness, is probably the author of the page rather than a casual user who
would be scared at the thought that he might be submitting a query string with
the wrong variables. That is probably why every single browser I can find to
test on right now, together covering probably 99% of the browser market, does
it exactly like that. Thus a bug report saying it's a bug is invalid, and an
enhancement request saying that it should be done differently is quite likely to
be a wontfix.
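
For page authors, the fix implied above is to write each ampersand in the attribute value as &amp;. The following is a minimal sketch of that escaping, assuming Python's standard html module is used to generate the markup; the names url, escaped and anchor are hypothetical.

```python
import html

# The URL the author actually wants the link to point at.
url = "http://foo?a=b&amp=0&quot=1&gt=2&lt=3"

# Escape it for use inside a double-quoted HTML attribute:
# every "&" in the URL becomes "&amp;".
escaped = html.escape(url, quote=True)
anchor = f'<a href="{escaped}">link</a>'
print(anchor)

# Decoding the attribute value gives back the intended URL unchanged.
assert html.unescape(escaped) == url
```

Written this way, the parameter names amp, quot, gt and lt survive decoding, because each "&" seen by the parser is part of an explicit, semicolon-terminated &amp; reference.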

*** This bug has been marked as a duplicate of 232789 ***
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
Status: RESOLVED → VERIFIED