Closed Bug 56529 Opened 24 years ago Closed 24 years ago

Mozilla window.location.search returns unescaped query string

Categories

(Core :: XUL, defect, P1)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: buster, Assigned: jst)

References

()

Details

(Keywords: top100, Whiteboard: [rtm++][HAVE FIX])

Attachments

(1 file)

1. using 2000101008 on WinNT commercial bits, and a debug build from around the
same time
2. open
http://www.intel.com/home/prodserv/pentiumiii/spots/index.htm?iid=feature+image&#
3. click on the "www.blueman.com" link
4. a small new browser window pops up saying "you're leaving intel's website".
That's fine.  Let it time out.
5. Then another browser window pops up with the url:
file:///S:/opt/10-10-00/Netscape%206/chrome/packages/core/redirect_loop.xul.  It
spins forever, throbber throbbing, no content.
the link to click resoves to:
javascript:openSite('http://apps.intel.com/util/serve-url.asp?iid=ihc+bmg_home&url=http://www.blueman.com&');"

openSite is defined like this:

function openSite(siteURL) {
  window.name='theopener';
  theLocation = '/home/feature/leave.htm?jumpTo=' + escape(siteURL) + '&';
  confirmWin = window.open(theLocation, 'confirmationWindow',
'toolbar=0,location=0,directories=0,status=0,menubar=0,scrollbars=0,resizable=0,width=550,height=250');

  // This is important for Netscape 2.0 to enable the opener property
  if (confirmWin.opener == null)
  {
   confirmWin.opener = self;
  }

  // Bring focus to the window in browsers that support focus
  if (window.focus) confirmWin.focus();
}


I think there's a similar bug out already, but I can't find it.
wanted to put on the rtm radar for debate.  This error makes the site behave
badly.  Don't know how many other sites have similar constructs.
Keywords: rtm, top100
I don't think this is an apps bug.  Maybe danm can help here?
Assignee: don → trudelle
Component: XP Apps → XP Toolkit/Widgets
QA Contact: sairuh → jrgm
->danm, will need a compelling reason to work on it with any expectation of PDT
approval.  At this point we need to only be working on very simple/safe fixes
for very horrible end-user experiences.
Assignee: trudelle → danm
So, there may be a compelling reason here, since we are making a big break
with backward Nav4/IE compatibility, but it turns out to be a DOM Level 0 /
Necko problem.

What is happening with the Intel page is is that the JS parsing logic in the
intermediate window gets handed the wrong info by Mozilla, and it winds up
sending the third window off with a URL that redirects back onto itself.

So necko detects the infinite redirect and correctly tries to throw up a
message to that effect.  (It can't find the XUL page with the error message
'redirect_loop.xul', but that's not a big deal -- bug 54241).

The real problem lies in where Mozilla hands back the wrong info. In Nav4 and
IE5, 'window.location.search' returns the "raw" query string; however, Mozilla
is unescaping that string (e.g. "%26" -> "&").

This is a major break in DOM Level 0 compatibility. I think that this must be
fixed for RTM, otherwise we will break any script that uses the search
property of window.location to do client-side URL logic.

As a quick hack, though, I did this:

Index: nsStdURL.h
===================================================================
RCS file: /cvsroot/mozilla/netwerk/base/src/nsStdURL.h,v
retrieving revision 1.19
diff -u -r1.19 nsStdURL.h
--- nsStdURL.h	2000/05/25 08:27:30	1.19
+++ nsStdURL.h	2000/10/14 07:14:50
@@ -165,7 +165,7 @@
 inline NS_METHOD
 nsStdURL::GetQuery(char* *o_Query)
 {
-    return GetString(o_Query, mQuery, UNESCAPED);
+    return GetString(o_Query, mQuery, ESCAPED);
 }

which fixed the problem on the Intel page, passed the pre-checkin tests, I did
this bugzilla submission and some queries sucessfully on Bugzilla, and let me
place a (fake name) order for merchandise from www.blueman.com

But, (a) this obviously needs more testing than that, (b) that hack
bastardizes the meaning of the enum (but happens to do what I needed it to do
in this case), and (c) I'd be concerned that some XUL/JS code has come to
expect this new return value from location.search (maybe even some C++ code).

Passing on to jst for the DOM 0, although the change is in netwerk (but could
be moved higher just for javascript callers). cc: andreas, gagan, pollmann,
ekrock who would be interested parties on this problem.
Assignee: danm → jst
Changing summary from "clicking on link causes extra browser window with chrome
url, and page doesn't load" to "Mozilla window.location.href returns unescaped
query string"
Keywords: 4xp, correctness
Summary: clicking on link causes extra browser window with chrome url, and page doesn't load → Mozilla window.location.href returns unescaped query string
If you want the escaped version of the query string then escape it again, but
*don't* change this internal necko logic. necko always returns unescaped atomic
parts of an url. 
Thanks, Andreas. I wasn't really suggesting that that was the right fix,
and now I know it's not the way to go.

However, re-escaping the query string is not a fix. Say you had this
query string:                      add=salt%26pepper&stir=10
After unescaping, it is:           add=salt&pepper&stir=10
And, then re-escaping it gives:    add%3Dsalt%26pepper%26stir%3D10
which is not the result that you are looking for.

JS/DOM needs a way to get the "raw" value of the query string back from nsIURI
cc: brendan, per his request.
Yes, this should qualify for exception fix status. We're breaking backward 
compatibility with DOM0 JavaScript 1.0 that's been in use since Navigator 2.0. 
This JavaScript is widely used and if not backwardly compatible will likely 
cause problems on many sites besides this one. Setting Priority P1 because it's 
high profile backward compatibility. Of course, we need to have a carefully 
reviewed bulletproof (near) zero-risk patch ASAP to accept it. 
Priority: P3 → P1
Currently we store the different parts of an url as we get them, but we return 
them in a "normalized" form, which is "unescaped" for url atoms and "escaped" 
for bigger parts of an url. This should always work, but maybe we escape to much 
with querys ... On the other hand the previous example seems not to be properly 
escaped in the first place.

Anyway, to not disturb any other access to url parts I suggest having different 
new methods in accessing url parts that return them as stored. For that we have 
to modify/enhance the nsIURI/nsIURL interfaces.
Andreas, I did what you suggested and added a new readonly attribute
escapedQuery to nsIURL and I changed the getters for window.location.search,
HTMLAnchorElement.search and HTMLArenaElement.search to use the new method.
Could you review the patch?

PDT: As Eric pointed out above, this breaks backwards compatibility in a way
that is visible on intel.com so I think we should fix this on the branch and
trunk, the patch doesn't look trivial, but it is a really safe change since all
the patch does is to add a new readonly attribute to an existing interface (and
also adds implementations of the new method to two classes) and there's no code
that uses the new code except for the code I changed in the patch. No changes
were made to the existing nsIURL methods so none of the code that uses nsIURL's
are affected at all by this change. With this change our .search methods are all
compatible with 4.x and IE 5.

Marking [rtm+] per Eric's comment.
Status: NEW → ASSIGNED
Whiteboard: [rtm+][HAVE FIX]
I agree with you, the patch does not look trivial but it is trivial. Looks good
to me. r=andreas.otte@primus-online.de. I hope these interfaces are not
finalized yet ...

But aside from that (and the compatibility problem) I would like to get to the
bottom of this. With urls there really is no "raw" data in my opinion. If a url
is properly escaped there should be no problem with escaping/unescaping. And
whoever gets part of on url or the whole thing should be able to deal with
escaped or unescaped versions. Is necko doing the wrong thing here? Or does the
js-script violate a rfc?
Do we have any way of knowing whether there are other dependencies on the
previous behavior? I'd hate to oscillate on a fix at this point...
Whiteboard: [rtm+][HAVE FIX] → [rtm++][HAVE FIX]
Phil: we aren't going to oscillate with web JS, because only Mozilla open source 
since Necko landed (or after, Andreas knows all) and beta products based on it 
(Netscape 6 PR releases) have shipped the incompatible-since-1995 behavior of 
unescaping the query part when returning location.search.

(For the classic codebase functions that always returned the escape (raw) query 
part, see 
http://lxr.mozilla.org/classic/source/network/main/mkparse.c#880 called from 
http://lxr.mozilla.org/classic/source/lib/libmocha/lm_url.c#172.)

We need to get this fix in for N6.  The incompatibility is likely to break more 
sites than intel.com, and it's wrong on its face to make a.search gets and sets 
not round-trip.  Notice the asymmetry: nsStdURL::GetQuery unescapes, SetQuery 
does not escape (another way to see this symmetry breaking, which should raise a 
big red flag: jst had to add only a readonly attribute to fix this bug).

a=brendan@mozilla.org, let's get this into the trunk.

Are there other DOM level 0 URL properties that don't round-trip?

/be
(Hey, even I can attest to the fact that jst's patch is trivial, since I had
written the same patch :-])

The only code that can be affected by that patch are javascript uses of the
Location.search property. So, I reviewed the .xul and .js files for both the
mozilla and commercial trees, looking for uses of the Location.search.

I found only the following (assuming I didn't miss any, but I don't think I
did). I don't believe any of these present immediate problems (if any), but
it would be best if the owners reviewed their assumptions about what is the
return value for the search property. But I wouldn't hold up a trunk landing
pending that review, and this *must* land on the branch before RTM.

http://lxr.mozilla.org/seamonkey/source/seamonkey/source/extensions/irc/xul/content/static.js#134
  rginda@netscape.com
    (passes on the result which is then parsed; may be a minor issue)

http://lxr.mozilla.org/seamonkey/source/seamonkey/source/extensions/xmlterm/ui/content/XMLTermChrome.js#19
  svn@xmlterm.org
    (looks like this goes nowhere)

http://lxr.mozilla.org/seamonkey/source/seamonkey/source/profile/resources/content/profileSelection.js#40
  ben@netscape.com or sspitzer@netscape.com
    (but this is just passing a hard-coded flag from cpp to js)

http://lxr.mozilla.org/seamonkey/source/seamonkey/source/xpfe/browser/samples/dexparamdialog.html
http://lxr.mozilla.org/seamonkey/source/seamonkey/source/xpfe/browser/samples/dexparamdialog.xul
  danm@netscape.com
    (but > 1 year old sample code which looks like it was expecting
     the "raw" query string.)

Finally, rjc@netscape.com in general, since the Search functionality does
a fair amount of parameter passing through the query string, so rjc
should have a good look at whether there are any hidden assumptions in
that code.
> If a url is properly escaped there should be no problem with 
> escaping/unescaping. 

However, it is extremely common for URLs to contain both the escaped and the
unescaped form for many of the ascii punctuation characters. It's really the
whole web that's violating the RFCs :-]

For example, this bread-and-butter form will generate a URL of (Nav4, Moz, IE5)
  <http://www.mozilla.org/?colors=red%26white&match=blue%3Dgreen>
where '&' and '=' appear in both escaped and unescaped forms.

  <form method="GET" action="http://www.mozilla.org/">
    <input name="colors" value="red&white"> 
    <input name="match"  value="blue=green">
    <input type="submit">
  </form>

(I don't know what necko should do, but aside from the JS/DOM uses that jst's
patch changes to use GetEscapeQuery, the only other callers of GetQuery are
apparently internal uses (nsWebShellWindow and some mailnews URL routines),
so presumably they are in control of the type of query strings that they will
encounter).
OS: Windows NT → All
Hardware: PC → All
Claudius -- you might want to have a look at search functionality in the trunk 
build for today, which will have this change (although any problems noted will
mean "fix search", not "fix this patch").
Fix checked in on both branch and trunk.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
verified fixed -- branch MN6 2000103009 mac/linux/win32; passes back the 
'raw'/escaped form for Location.search and Link.search (A and AREA elements).

(fixing all-this-time-a-typo summary ...   .search not .href).
Keywords: vtrunk
Summary: Mozilla window.location.href returns unescaped query string → Mozilla window.location.search returns unescaped query string
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: