JavaScript construct "location.href = ..." escapes non-ascii incorrectly

RESOLVED FIXED in mozilla0.9.7

Status

()

Core
DOM: Core & HTML
P3
major
RESOLVED FIXED
18 years ago
17 years ago

People

(Reporter: Karl Johan Kleist, Assigned: nhottanscp)

Tracking

({dom0, intl, testcase})

Trunk
mozilla0.9.7
x86
All
dom0, intl, testcase
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: PDT-, URL)

Attachments

(3 attachments, 2 obsolete attachments)

(Reporter)

Description

18 years ago
Build 2000052408

RECIPE:

1. Select some item in the pick-list "Produktgrupper:" to the left.

A new pick-list is rendered, with the default text "Välj undergrupp". Fine.

2. Try to select something from this new pick-list. You can't, but it should
 really contain several items.

Comment 1

18 years ago
worksforme on 052408/Win32
anyone else with Linux able to confirm this?

Comment 2

18 years ago
WFM   2000052408 Linux build.
Svante, if this still doesn't work for you, re-open this listing *exactly* how
you re-create the problem
Status: UNCONFIRMED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → WORKSFORME

Comment 3

18 years ago
Sorry for the spam.  New QA Contact for Browser General.  Thanks for your help
Joseph (good luck with the new job) and welcome aboard Doron Rosenberg
QA Contact: jelwell → doronr

Comment 4

18 years ago
verified.
Status: RESOLVED → VERIFIED
(Reporter)

Comment 5

18 years ago
Still doesn't work in M16 (RedHat 6.2, AMD Athlon).

I was asked by <wdormann@crosswinds.net> to give a more detailed recipe.
The only additional comment I can think of:

The second pick-list is rendered with only one choice ("Välj undergrupp"). 
There should be several choices; e.g. if "BACKUP" was selected in the first 
pick-list,
the second pick-list should have the choices "EXTERNA", "INTERNA", "MEDIA",
and "TILLBEHÖR".

/ svAnte

Status: VERIFIED → UNCONFIRMED
Resolution: WORKSFORME → ---

Comment 6

18 years ago
WORKSFORME - sublist under back=up shows 4 items in 6/18 AM tip linux build

Comment 7

18 years ago
this works on win32 as well.

reporter - please get the newest nightly and try again. thanks! If oyu still see
this, reopen.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 18 years ago18 years ago
Resolution: --- → WORKSFORME
(Reporter)

Comment 8

18 years ago

You'll not get rid of me this easy, you know ;-)

Bad news: I still notice the bug in build 2000061809.

Good news: I've nailed down the bug further:

  Only if the selected item in the first pick-list contains a non-ascii
  character ("BILDSKÄRMAR", "NÄTVERK") the second pick-list will be f----d up.

Can you reproduce now? / Svante in rainy Stockholm
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---

Comment 9

18 years ago
Ok, now I see it. Good work reporter!

http://www.dustin.se/shopping/dustin/border_left.asp?group0=2&text0=BILDSK%C3%
84RMAR&Shopper=PR7UNV3Q1I1H0ZJQDEI3N3RGT2AR9KZS&Guest=True

This is the URL of the bug.  This iseems to be a javascript bug, moving the
javascript engine Could be DOM related though
Assignee: asa → rogerl
Severity: major → normal
Status: UNCONFIRMED → NEW
Component: Browser-General → Javascript Engine
Ever confirmed: true
OS: Linux → All
QA Contact: doronr → pschwartau

Comment 10

18 years ago
As described above, the bug only occurs when we choose a list item 
containing a non-ASCII character. Is this therefore an HTML parser issue?

It is my understanding that the JavaScript Engine deals with characters 
as Unicode unless otherwise specified by the encoding information in the
Web page. It will treat any one Unicode character the same as any other.

Reassigning to Parser group for help on this -

Assignee: rogerl → rickg
Component: Javascript Engine → Parser
QA Contact: pschwartau → janc

Comment 11

18 years ago
Rod --- can you triage this for us? Thanks.
Assignee: rickg → rods

Comment 12

18 years ago
I need a reduced test case for this. Marking as Future until I get one.

This bug has been marked "future" because the original netscape engineer working 
on this is over-burdened. If you feel this is an error, that you or another 
known resource will be working on this bug,or if it blocks your work in some way 
-- please attach your concern to the bug for reconsideration. 
Status: NEW → ASSIGNED
Keywords: testcase
Target Milestone: --- → Future
(Reporter)

Comment 13

18 years ago
Rods,

I think I have narrowed down this one to a problem with the JavaScript construct
"location.href = ..." , which doesn't escape non-URL-safe characters correctly.

To see what I mean, visit "http://www.dustin.se/shopping/dustin/default.asp",
in the pick-list "Produktgrupper:" select "BILDSKÄRMAR". This renders another
pick-list "Välj undergrupp" below the first one, containing only the default
option "Välj undergrupp".

With the mouse pointer over the left yellow frame, select "Open Frame in New
Window". Study the URL. After the first ampersand comes
'text0=BILDSK%C3%84RMAR'; the 'Ä' is incorrectly escaped to '%C3%84',
should be '%C4'.

If you correct the URL, and load the page again the pick-list "Välj undergrupp"
will contain several options.

Study the source code, and the JavaScript function Search0() performs a
"location.href= ..." based on the selected option.

Here is a minimal test case:

<html>
<head>
<title>Minimal testcase for bug 40469</title>
</head>
<body>
<script language='JavaScript'>
function foo( i )
{
  var s = i.options[i.selectedIndex].value;
  alert( escape(s) );
  location.href = 'bar?a=' + s;
}
</script>
<p>
<select name='a' onChange='foo(this)'>
<option value='BACKUP'>BACKUP</option>
<option value='BILDSKÄRMAR'>BILDSKÄRMAR</option>
<option value='CD/DVD'>CD/DVD</option>
</select>
</p>
</body>
</html>

A last thought: is this possibly UTF-8 related?
(Reporter)

Updated

18 years ago
Severity: normal → major
Summary: Only first item displayed in pick-list → JavaScript construct "location.href = ..." escapes non-ascii incorrectly
(Reporter)

Comment 14

18 years ago
Sorry for the spam: Changed component to JavaScript (which is nothing but
an un-educated guess, really; should rather be Networking?)
Assignee: rods → rogerl
Status: ASSIGNED → NEW
Component: Parser → Javascript Engine
QA Contact: janc → pschwartau

Updated

18 years ago
Assignee: rogerl → jst
Component: Javascript Engine → DOM Level 0
QA Contact: pschwartau → desale

Comment 15

18 years ago
The escape() function of the JS Engine is superceded in the browser 
by the escape() function of the DOM. Reassigning to DOM Level 0  -

Comment 16

18 years ago
Created attachment 14742 [details]
Attaching a testcase reduced even more...

Comment 17

18 years ago
Using Mozilla tip builds 2000-09-12 on WinNT, 2000-09-13 on Linux.

If you click on the testcase above, it will bring up an alertbox showing
the result of escape('Ä'). It will also ask you to compare this in the URL bar:
 
                      javascript:alert(escape( 'Ä'))


Here is what is happening for me. I hope this helps to solve the problem -  


In NN 4.7:     alertbox ---> %C4
               URL bar  ---> %C4

In Mozilla:    alertbox ---> %C4
               URL bar  ---> %C3

Comment 18

18 years ago
Is this bug a duplicate of bug 51355 ? 

Updated

18 years ago
Status: NEW → ASSIGNED
(Reporter)

Comment 19

18 years ago
Please reconsider the "Future" target MS for this bug.

It causes major problems for non-us-ascii-users,
we often encounter JavaScript generated URLs
containing "strange" characters.

Keywords: dom0

Updated

17 years ago
Keywords: intl

Updated

17 years ago
Whiteboard: might be bug 51355

Comment 20

17 years ago
Phil's testcase was fixed in bug 51355, but svante's testcase still seems to fail.

Comment 21

17 years ago
Using binary 2001062113 WinNT. When I try my testcase above
(attachment id=14742), I still get the same discrepancy as before: 


NN 4.7:     alertbox ---> %C4
            URL bar  ---> %C4

IE 4.7:     alertbox ---> %C4
            URL bar  ---> %C4

Mozilla:    alertbox ---> %C4
            URL bar  ---> %C3


This isn't right, is it?
Naoki, could you look into this? If not, give this back to me.
Assignee: jst → nhotta
Status: ASSIGNED → NEW
(Assignee)

Comment 23

17 years ago
URL bar may be showing a part of UTF-8.
0xc4 in UTF-8 is two bytes 0xc384.
(Assignee)

Comment 24

17 years ago
In fact part of the patch in bug 51355 fixes this but it was not checked in, so 
I will attach that part to this bug.
(Assignee)

Comment 25

17 years ago
Created attachment 40369 [details] [diff] [review]
Patch, changed JS protocol to expect UTF-8 instead of ASCII

Comment 26

17 years ago
is this a low risk fix. If so, please get r= and sr= and land trunk first.
Target Milestone: Future → mozilla0.9.3
(Assignee)

Updated

17 years ago
Status: NEW → ASSIGNED
(Assignee)

Updated

17 years ago
Target Milestone: mozilla0.9.3 → ---

Comment 27

17 years ago
move to m0.9.4 and reassign to yokoyama
Assignee: nhotta → yokoyama
Status: ASSIGNED → NEW
Target Milestone: --- → mozilla0.9.4

Comment 28

17 years ago
r/sr=vidur

Updated

17 years ago
Status: NEW → ASSIGNED
Whiteboard: might be bug 51355 → might be bug 51355, need /a=

Comment 29

17 years ago
here is the impact:
1. it will impact javascript in URL with non ASCII data. In other word, if the
URL is not JavaScript, it won't get impact by this path. If the URL only
contains ASCII, it won't be impact

What we need to test?
test pages which have "javascript:" with non ascii data in it. 

Updated

17 years ago
Keywords: nsbranch+

Comment 30

17 years ago
Comment on attachment 40369 [details] [diff] [review]
Patch, changed JS protocol to expect UTF-8 instead of ASCII

a=asa on behalf of drivers
Attachment #40369 - Flags: approval+

Comment 31

17 years ago
checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago17 years ago
Resolution: --- → FIXED
Whiteboard: might be bug 51355, need /a= → might be bug 51355

Comment 32

17 years ago
Verified on 2001-09-21-04-0.9.4 branch as well as 2001-09-19-03-trunk. 
Status: RESOLVED → VERIFIED
(Reporter)

Comment 33

17 years ago
Reporter here. Sorry guys, but my original testcase still fails:

1. Visit "http://www.dustin.se/shopping/dustin/default.asp"

2. In the combobox "Produktgrupper", select any item that contains non-ascii
   characters, e.g. "Bildskärmar" or "Övrigt".

   Note that the next combobox "Välj undergrupp" is _not_ populated
   (which it is, if an item containing ascii only is selected).
Status: VERIFIED → REOPENED
Resolution: FIXED → ---

Updated

17 years ago
Assignee: yokoyama → ftang
Status: REOPENED → NEW

Comment 34

17 years ago
assigning to original owner.  However, nhotta is still on sabatical.
Assiging to ftang for now.

Comment 35

17 years ago
jbetak, can you look at this, what happen here ?
Assignee: ftang → jbetak
I will - accepting...
Status: NEW → ASSIGNED

Comment 37

17 years ago
marking PDT- and moving into 0.9.5. Juraj, if there is a very low risk patch in
the next few days, send it for reconsideration (remove the - from PDT). But it
seems it's too late for this one
Whiteboard: might be bug 51355 → might be bug 51355, PDT-
Target Milestone: mozilla0.9.4 → mozilla0.9.5
pushing out to M096 - it's really late for this now.
Target Milestone: mozilla0.9.5 → mozilla0.9.6

Comment 39

17 years ago
URL related issue. give to nhotta.
Assignee: jbetak → nhotta
Status: ASSIGNED → NEW
(Assignee)

Updated

17 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 40

17 years ago
Not sure if this is still related to the URL issue.
Svante, you mentioned about the combobox problem. Could you paste the lines
which is doing that into the bug?
(Assignee)

Updated

17 years ago
Whiteboard: might be bug 51355, PDT- → PDT-
(Reporter)

Comment 41

17 years ago
As I wrote 2001-09-21, the original test case at
< http://www.dustin.se/shopping/dustin/default.asp > still doesn't work
(i just tried with build 2001101503, Windows 2000 / SP2).

However, using the "minimal test case" described by me 2000-09-05
the 'Ä' is now correctly escaped to '%C4'.

So you may indeed be right, nhotta, that this bug is (no longer) URL related.
Form Manger???
(Assignee)

Comment 42

17 years ago
If it does not work only when non ASCII strings then it might be i18n related.
I am not familiar with that script, so need help. Where in the code the combobox
items is supposed to be generated?
(Reporter)

Comment 43

17 years ago
Reporter here. The handling of ComboBoxes in forms seems to be messy in general.
Is this bug possibly related to bug 106408 ?
(Assignee)

Comment 44

17 years ago
Is the list generated by the server? I cannot test this with a local copy.
(Reporter)

Comment 45

17 years ago
Sorry, nhotta (what's your first name btw?),
but I don't quite understand your question.

You mean the list of items in the buggy combo box?

Please try this:

1. In the ComboBox "Produktgrupper:", select an item containing a non-ascii
character, e.g. "Bildskärmar"

The cb below it now contains only _one_ item, the heading "Välj undergrupp".
This is incorrect (try IE or Opera or even NS 4 ... )

2. Select "View Frame Source" on the left frame containing the comboboxes.

The HTML of interest here is:

<select
  CLASS="GROUP_NAME_1"
  onChange =" Search0(this, '5RG0R3J99CSH2G8W00LHR2Q664M2R6LG', 'False' );"
  NAME="GROUP_NAME_1"
  style="font-family: verdana; font-size: 10;">
    <option VALUE="">Välj huvudgrupp</option>
    <option VALUE="1&text0=BACKUP">BACKUP</option>
    <option VALUE="2&text0=BILDSKÄRMAR" SELECTED>BILDSKÄRMAR</option>
    <option VALUE="3&text0=CD/DVD">CD/DVD</option>
    <option VALUE="4&text0=DATORER">DATORER</option>
    <option VALUE="5&text0=DATORKOMPONENTER">DATORKOMPONENTER</option>
    <option VALUE="6&text0=DATORVÄSKOR">DATORVÄSKOR</option>
    <option VALUE="7&text0=DIGITALMINNEN">DIGITALMINNEN</option>
    <option VALUE="8&text0=KABLAGE/ADAPTRAR">KABLAGE/ADAPTRAR</option>
    <option VALUE="9&text0=KAMEROR___VIDEO">KAMEROR & VIDEO</option>
    <option VALUE="10&text0=KVM/DATASWITCH">KVM/DATASWITCH</option>
    <option VALUE="11&text0=LITTERATUR">LITTERATUR</option>
    <option VALUE="12&text0=MJUKVARA">MJUKVARA</option>
    <option VALUE="13&text0=MODEM___INTERNET">MODEM & INTERNET</option>
    <option VALUE="14&text0=MULTIMEDIA">MULTIMEDIA</option>
    <option VALUE="15&text0=NÄTVERK">NÄTVERK</option>
    <option VALUE="16&text0=ÖVRIGT">ÖVRIGT</option>
    <option VALUE="17&text0=PRESENTATION">PRESENTATION</option>
    <option VALUE="18&text0=SÄKERHET">SÄKERHET</option>
    <option VALUE="19&text0=SCANNERS">SCANNERS</option>
    <option VALUE="20&text0=SKRIVARE">SKRIVARE</option>
    <option VALUE="21&text0=TELEFONER">TELEFONER</option>
    <option VALUE="22&text0=UPS">UPS</option>
</select>

<select
  CLASS="GROUP_NAME_2"
  onChange =" Search1(this, 2, 'BILDSKÄRMAR',
'5RG0R3J99CSH2G8W00LHR2Q664M2R6LG', 'False' );"
  NAME="GROUP_NAME_2"
  style="font-family: verdana; font-size: 10;">
    <option VALUE="" SELECTED>Välj undergrupp</option>
</select>

I realize that this just illustrates the _symptom_ of the problem,
obviously the _reason_ for the single "option" element in the second ComboBox,
(i.e. the second "select" element) lies earlier in the chain of events.

I'd suspect the JavaScript function Search0() which is called
when the user selects an item from the first ComboBox:

function Search0(i, sShopper, nGuest)
{
	if (i.selectedIndex == 0)
		return;
	var searchstring = i.options[i.selectedIndex].value;
	location.href="border_left.asp?group0=" + searchstring + "&Shopper=" + sShopper +
"&Guest=" + nGuest;
}

Maybe the URL assembled here becomes incorrect? You'd probably need to
trace the HTTP traffic, using a debugging proxy or something, what do I know...

With little knowledge of Mozilla internals, I doubt that I could be of
more help. But I'd be happy to assist again, if I can.

Finally: Please note that the site "www.dustin.se" is the largest e-commerce
site in Scandinavia for computer stuff. By fixing this bug, you will
buy yourself serious goodwill in the Scandinavian Mozilla community.
(a polite way of telling you that all my colleagues think that Mozilla sucks
because they can't use it for shopping on Dustin... )

And of course: This is not necessarily a Mozilla bug. It may be a feature
in the web server (MS IIS I think)?



(Reporter)

Comment 46

17 years ago
When re-reading my comments I note something interesting in the second "select"
element:

- - -
onChange =" Search1(this, 2, 'BILDSKÄRMAR', '5RG0R3J99CSH2G8W00LHR2Q664M2R6LG',
'False' );"
- - -

The third argument to Search1() is 'BILDSKÄRMAR', looks strange to me.
The unescaped ISO-8859-1 string would be 'BILDSKÄRMAR',
the URL escaped string would be '%C4'. But it seems to be neither.

This may be a lead?


(Assignee)

Comment 47

17 years ago
I think I found a problem.

        var searchstring = i.options[i.selectedIndex].value;
        location.href="border_left.asp?group0=" + searchstring + "&Shopper=" + 
sShopper +
"&Guest=" + nGuest;

The "searchstring" is a raw 8bit (in UTF-8). Raw 8bit string has to be escaped 
before it is sent out as URI. JS escape() converts the data to a document 
charset (ISO-8859-1 in this case) and escape to create "%C4".
See,
http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsGlobalWindow.cpp#2512

Mozilla might be able to do the 8bit check and force to esacpe URI if 8bit data 
is found. Does anybody know where 'location.href' is processed?

Updated

17 years ago
Blocks: 107066

Updated

17 years ago
Keywords: nsbranch+
(Assignee)

Comment 48

17 years ago
Created attachment 56629 [details] [diff] [review]
Changed location.href to force escape 8 bit if the href contains 8 bit.
(Assignee)

Comment 49

17 years ago
For this bug, href is not escaped by the script and raw 8bit is sent to the
server. It used to work on 4.x because the raw data was encoded in a document
charset, now we use UTF-8.
So the change is for compatibily. If the href contains 8 bit then get a document
charset and convert from unicode and URL escape 8 bit data.
(Assignee)

Updated

17 years ago
Depends on: 107610

Updated

17 years ago
No longer depends on: 107610

Updated

17 years ago
Blocks: 107610

Comment 50

17 years ago
Comment on attachment 56629 [details] [diff] [review]
Changed location.href to force escape 8 bit if the href contains 8 bit.

r=ftang
Attachment #56629 - Flags: review+

Updated

17 years ago
No longer blocks: 107610
(Assignee)

Updated

17 years ago
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Do we really want to do the charset conversion in
LocationImpl::EscapeNonAsciiInURI() even if we can't find the charset? And
should LocationImpl::EscapeNonAsciiInURI() be made static, I don't see it
needing |this| anywhere.
(Assignee)

Comment 52

17 years ago
Created attachment 57100 [details] [diff] [review]
Make the function static, no escaping in case of failure of getting a charset.
Attachment #56629 - Attachment is obsolete: true
The code:

+  nsCOMPtr<nsIDOMDocument> domDoc;
+  rv = window->GetDocument(getter_AddRefs(domDoc));
+  NS_ENSURE_SUCCESS(rv, rv);

should be followed by:

  if (!domDoc) {
    return NS_OK;
  }

so that we return early *without* returning an error if there is no document in
the window, no document in a window doesn't mean there's an error.

Also, when you're calling EscapeNonAsciiInURI(), you ignore the return code,
either add a if (NS_FAILED(result)) and return early, ignore the return value
all together (i.e. remove the |result = | in front of the call to
EscapeNonAsciiInURI), pick one.

In

+    result = NS_NewURI(getter_AddRefs(newUri), escapedHref, aBase);

add a .get() on escapedHref to be explicit about which NS_NewURI() function
you're calling.

With that, sr=jst
(Assignee)

Comment 54

17 years ago
Created attachment 57135 [details] [diff] [review]
Updated with jst's comment.
Attachment #57100 - Attachment is obsolete: true
Comment on attachment 57135 [details] [diff] [review]
Updated with jst's comment.

sr=jst
Attachment #57135 - Flags: superreview+
(Assignee)

Comment 56

17 years ago
checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago17 years ago
Resolution: --- → FIXED
(Reporter)

Comment 57

17 years ago
Reporter here. I can confirm that the bug report I filed almost
one and a half years ago now finally is solved.

You have no idea how this fix will improve the reputation of Mozilla
in the Swedish Internet community. "www.dustin.se" is the number one
e-commerce site among us, and now the final show stopper for Mozilla
seems to be delt with.

I've somewhat understood what a dangerous "Hinterland" the issue
about character escaping in URLs, JavaScript, Unicode, you-name-it, really is.
I'm immensly impressed by the expertize available in the Mozilla project.

Thank you all! Let's kick a-- with Mykroslush Internet Exploder!
You need to log in before you can comment on or make changes to this bug.