The window created by window.open() distorts the received/transmitted text in Cyrillic

Status: ASSIGNED
Priority: P3
Severity: normal
Opened: 2 years ago
Updated: 11 months ago

People

(Reporter: danilov.vladimir22, Assigned: wisniewskit)

Tracking

Version: 48 Branch
Keywords: leave-open, regression, testcase

Firefox Tracking Flags

(Not tracked)

Attachments

(1 attachment)

250.29 KB, application/x-zip-compressed
(Reporter)

Description

2 years ago
Created attachment 8815239 [details]
Some screenshots to clarify things

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161123182536

Steps to reproduce:

I'm developing a web application that works with a database. Various parts of the application call a JavaScript function (pic_0.jpg) that creates a new window using window.open(). The URL argument passed to the method contains a string in Cyrillic (cp1251). The newly created form prompts the user to enter a new database record (pic_1.jpg). The new record also contains Cyrillic text (pic_4.jpg). The data from this form is sent to a PHP script (pic_2.jpg) for insertion into the database.


Actual results:

The newly created window garbles the Cyrillic text it receives via a GET parameter (pic_0.jpg -> pic_1.jpg).
Similarly, when the text (pic_4.jpg) is sent with an Ajax request, the PHP script receives garbled text, whether I try to save it to the DB (pic_6.jpg) or return it to the browser (pic_3.jpg -> pic_5.jpg).


Expected results:

In recent versions of Firefox, Cyrillic text should be displayed correctly when working with a form created by window.open(), just as it was in Firefox 35 and below, and as it is in IE and Chrome.
Again, this problem appeared after browser version 35.

Comment 1

2 years ago
Thanks for the report! Could you please attach a reduced testcase that reproduces the issue?
Flags: needinfo?(danilov.vladimir22)
Keywords: testcase-wanted

Updated

2 years ago
Component: Untriaged → DOM: Core & HTML
(Reporter)

Comment 2

2 years ago
first.html
===========
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
  <script type="text/javascript">
   function create_new_window() {
    var get_params = "second.php?txt=" + document.getElementById("textbox").value;
    var win_params = "scrollbars=no,resizable=no,status=no,location=no,toolbar=no,menubar=no,width=1200,height=200,left=10,top=10";
    window.open(get_params, "_blank", win_params);
   }
  </script>
 </head>
 <body>
  <input type="text" id="textbox" size="50" value="некоторый текст на кириллице / some text in Latin">
  <input type="submit" onclick="create_new_window()" value="Передать в новое окно этот текст/ Send in a new window this text">
 </body>
</html>


second.php
==========
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
  <script type="text/javascript">
   function pass_text() {
    IE = (navigator.appName.toLowerCase() == "microsoft internet explorer");
    var xmlhttp=null;
    xmlhttp = (IE) ? new ActiveXObject("Microsoft.XMLHTTP") : new XMLHttpRequest();	
    xmlhttp.open("GET", "third.php?txt=" + document.getElementById("textbox").value,true);
    xmlhttp.onreadystatechange = function() {
     if (xmlhttp.readyState == 4 && xmlhttp.status == 200)
      alert("А это ответ от Ajax / And this is the response from the Ajax:\n" + xmlhttp.responseText);
    };
    if (IE) xmlhttp.send();
    else xmlhttp.send(null);
   }
  </script>
 </head>
 <body>
  Строка, переданная в параметре GET / The string passed as a GET-parameter: 
  <span style="color:red"><?php echo $_GET["txt"];?></span>
  <br><br>
  <input type="text" id="textbox" size="40" value="Еще один пример / Another example">
  <input type="submit" onclick="pass_text()" value="Передать этот текст на обработку / Pass this text to the processing">
 </body>
</html>


third.php
=========
<?php
 echo $_GET["txt"];
?>
Flags: needinfo?(danilov.vladimir22)

Comment 3

2 years ago
Could you convert the PHP parts to pure HTML/JS, please?
For comment 3.
Flags: needinfo?(danilov.vladimir22)
(Reporter)

Comment 5

2 years ago
Maybe like this:
first.html
===========
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
		<script type="text/javascript">
			function create_new_window() {
				var get_params = encodeURI("second.html?txt=" + document.getElementById("textbox").value);
				var win_params = "scrollbars=no,status=yes,location=yes,toolbar=no,menubar=yes,width=1200,height=200,left=10,top=10";
				window.open(get_params, "_blank", win_params);
			}
		</script>
	</head>

	<body>
		<input type="text" id="textbox" size="50" value="некоторый_текст_на_кириллице/some_text_in_Latin">
		<input type="submit" onclick="create_new_window()" value="Send in a new window this text">
	</body>
</html>


second.html
===========
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
		<script type="text/javascript">
			function parse_param() {
				var url = location.search;
				var passed_str = url.substring(url.indexOf('=') + 1);
				document.getElementById("received_str").innerHTML = passed_str;
				var decoded_str = decodeURI(passed_str);
				document.getElementById("decoded_str").innerHTML = decoded_str;
			} 
		</script>
	</head>

	<body onload="parse_param()">
		The string passed as a GET-parameter: 
		<span id="received_str" style="color:red" ></span>
		<br><br>
		It is the same, but decoded: 
		<span id="decoded_str" style="color:green" ></span>
	</body>
</html>


==========
In pure JS, everything looks obvious, although to see readable Cyrillic you still need to do additional Unicode decoding. As for Ajax, my meager knowledge and even more limited experience tell me that server-side processing cannot be emulated in JS.
Please understand that the question is not whether the open() method is at fault. The fact is that the server's PHP module (both 7.0 and 5.5) fails to correctly interpret the Cyrillic text it receives from Firefox, and no decoding helps. I assure you that I have not changed the server settings. While the client was Firefox 35 or below, there were no problems. With Chrome and Internet Explorer everything works too.
Perhaps this is not a Firefox bug. But if this problem is not solved, I will have to stop using a browser I respect so much.
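
The encodeURI/decodeURI round trip used in second.html can also be checked outside the browser. This is a minimal sketch, assuming a plain JavaScript runtime such as Node.js; the variable names are chosen for this illustration. It suggests that encodeURI emits UTF-8 percent escapes whatever charset the page declares, so the query string reaches the new window as pure ASCII.

```javascript
const text = "некоторый_текст_на_кириллице/some_text_in_Latin";

// encodeURI leaves "?", "=" and "/" alone and escapes non-ASCII as UTF-8 bytes.
const url = encodeURI("second.html?txt=" + text);

// Same parsing as parse_param(): take everything after the first "=".
const passedStr = url.substring(url.indexOf("=") + 1);
const decodedStr = decodeURI(passedStr);

console.log(passedStr.slice(0, 6)); // %D0%BD (the UTF-8 bytes of "н")
console.log(decodedStr === text);   // true
```

Because the escapes are always UTF-8 here, this testcase does not exercise the charset-dependent path that the PHP version hits when raw Cyrillic is placed in the URL.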

Comment 6

2 years ago
If you can't build a standalone testcase, maybe you can provide a live testcase on a testing server.
(Reporter)

Comment 7

2 years ago
Sorry for the pause (many things are new to me). A test example is available at 93.116.255.182:8080.
In the process I discovered several important details.
1. window.open() has nothing to do with it.
2. The transmission of the text may work differently on a normal page and in a frame.
3. Results for the sample text in different browsers:
os+browser                 plain-window/frame
---------------------------------------------------
win7(86)+FF35              x/ok
win7(86)+FF50              x/x
win7(86)+Chrome55          ok/ok
win7(86)+IE9               ok/ok
winXP(86)+FF35             x/ok
winXP(86)+FF50             x/x
winXP(86)+Chrome49         ok/ok
winXP(86)+IE8              ok/ok
Ubuntu14(64)+FF50          x/x
Ubuntu14(64)+Chromium53    ok/ok
Android5+FF50              x/x
Android5+Chrome54          ok/ok

Comment 8

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9a61ec48081ec3570bc90ba1b29f28b93f855043&tochange=e6a98e60aa2d93c8bace95af7684997e693da033

Stone Shih — Bug 1254098 - Part 1: Don't use base url encoding to encode url when charset override is absent. r=valentin

stone, is it the expected behavior after your patch?
Blocks: 1254098
Flags: needinfo?(danilov.vladimir22) → needinfo?(sshih)
Keywords: testcase-wanted → regression, testcase
Version: 50 Branch → 48 Branch
Sorry, I might not have much time to analyze this problem this week, but I'll analyze it next week.
I tested the website [1]. After reverting [2], the string shown in the dialog when clicking the button in the main frame is still unexpected (the result of clicking the button in the iframe is as expected).

I assume the expected results are "текст из обычной страницы/ text from a normal page" and "текст из фрейма/ text from the frame" when clicking the button in the main frame and in the iframe, respectively.
[1] http://93.116.255.182:8080/
[2] https://hg.mozilla.org/integration/mozilla-inbound/rev/8923987bb55b

After some analysis, I found:
1. The first special character of the test string is 'т'; in Windows-1251 encoding its hex value is 0xF2.
2. It is encoded as UTF-8 in JS, and the hex values are 0xD1, 0x82.
3. The document specifies 'charset=windows-1251'.
I looked for the relevant spec and found that [3] says we should use the API URL character encoding specified by the script's settings object when the URL comes from a script. Also, [4] says the API URL character encoding is the current character encoding of the document. I wonder whether we should use the document charset in [5] so that the character resolves to %F2.
[3] https://www.w3.org/TR/html5/infrastructure.html#resolving-urls
[4] https://www.w3.org/TR/html5/webappapis.html#script-settings-for-browsing-contexts
[5] http://searchfox.org/mozilla-central/source/dom/xhr/XMLHttpRequestMainThread.cpp#1537
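
The two candidate encodings for 'т' can be compared with a small sketch. `encodeURIComponent` always percent-encodes as UTF-8. `percentEncodeWin1251` is a hypothetical helper written only for this illustration (it is not a browser or Gecko API), and it assumes the input contains nothing but ASCII and the basic Cyrillic letters U+0410 to U+044F, which windows-1251 maps to the bytes 0xC0 to 0xFF.

```javascript
// UTF-8 percent-encoding: what encodeURIComponent always produces,
// regardless of the document's charset.
function percentEncodeUtf8(text) {
  return encodeURIComponent(text);
}

// Hypothetical helper (illustration only): percent-encode as windows-1251.
// Assumption: input is limited to ASCII and U+0410..U+044F (А..я).
function percentEncodeWin1251(text) {
  let out = "";
  for (const ch of text) {
    if (/^[A-Za-z0-9\-_.~]$/.test(ch)) {
      out += ch; // unreserved ASCII stays as-is
      continue;
    }
    const cp = ch.codePointAt(0);
    let byte;
    if (cp < 0x80) {
      byte = cp;
    } else if (cp >= 0x410 && cp <= 0x44f) {
      byte = cp - 0x410 + 0xc0; // А (U+0410) is byte 0xC0 in windows-1251
    } else {
      throw new RangeError("character outside the range handled by this sketch: " + ch);
    }
    out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(percentEncodeUtf8("т"));    // %D1%82
console.log(percentEncodeWin1251("т")); // %F2
```

A server that decodes the query as windows-1251 would need the second form to recover 'т'; the first form hands it two unrelated bytes.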

I tried using the document's charset and got some web platform test failures in [6]:
XMLHttpRequest/open-url-encoding.htm (failed on Edge 38.14393.0.0 and Chrome Canary 57.0.2950.0)
  This test uses windows-1252 encoding and uses XHR to open 'resources/content.py?\u00DF'.
  \u00DF is encoded as 0xC3 0x9F in JS.
  According to the previous observation, it should be encoded with the document's encoding, giving %DF.
  The test expects %C3%9F.

workers/semantics/encodings/003.html (failed on Edge 38.14393.0.0 and passed on Chrome Canary 57.0.2950.0)
  No encoding is specified in this test.
  It uses XHR to open a URL containing the special character 'å'.
  The character is encoded as UTF-8 in JS and its hex values are 0xC3, 0xA5.
  The document uses windows-1252 encoding and encodes the character as %E5.
  (The document encoding is set in [7].)
[6] https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b58223588f21ce3f4c0ca6fc2689433e3d2eab3
[7] http://searchfox.org/mozilla-central/source/dom/html/nsHTMLDocument.cpp#493
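
For comparison, the `URL` constructor percent-encodes non-ASCII query characters as UTF-8, which matches the %C3%9F that open-url-encoding.htm expects. This is a small sketch only: the base `http://example.com/` is a placeholder, and it shows the behavior of the `URL` API in a JavaScript runtime, not of XHR or of Gecko's internal URL parser.

```javascript
// The base URL is a placeholder chosen for this illustration.
const base = "http://example.com/";

// 'ß' (U+00DF) in the query is serialized as its UTF-8 bytes 0xC3 0x9F.
const eszett = new URL("resources/content.py?\u00DF", base);
console.log(eszett.search); // ?%C3%9F

// 'å' (U+00E5) becomes 0xC3 0xA5 rather than the windows-1252 byte 0xE5.
const aring = new URL("resources/content.py?\u00E5", base);
console.log(aring.search); // ?%C3%A5
```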

Hi Baku,
I'm not sure my understanding is correct, so I'd like to ask for your kind help with some feedback. Thanks.
Flags: needinfo?(sshih) → needinfo?(amarchesini)
> I'm not sure my understanding is correct, so I'd like to ask for your
> kind help with some feedback. Thanks.

I think you are right. But it seems to me that our URL parser is able to deal only with UTF8 strings.
Flags: needinfo?(amarchesini) → needinfo?(valentin.gosu)
(Reporter)

Comment 12

2 years ago
Dear developers.
1. Will the problem with non-Unicode web pages be solved?
2. If not, is my server with the test page no longer needed?
Hi Vladimir, sorry for the long delay. I'll take a look today.
Sorry for the long wait. It seems this is indeed a regression from bug 1320925. Interestingly, the tests that bug 1320925 was supposed to fix still pass if I back out that patch. I'll land the backout immediately, and I'm currently investigating the second part of the problem (Comment 7). We should pass the plain-window test case too.
Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu)
Whiteboard: [necko-active]
Keywords: leave-open

Updated

2 years ago
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Hi Thomas,
I've looked at this for a bit, but I'm not sure exactly where the bug is coming from.
I've traced the bug to XMLHttpRequestMainThread::Open where we call
> NS_NewURI(getter_AddRefs(parsedURL), aUrl, nullptr, baseURI);
It seems that for iframes, baseURI has the correct originCharset, but when the page is not in an iframe, the originCharset is UTF-8.

Do you have time to take a look at this?
Flags: needinfo?(wisniewskit)
(Assignee)

Comment 19

2 years ago
I'll try, but I'm at my onboarding this week, so I may not have time until Monday, if that's alright?
Flags: needinfo?(wisniewskit)
The bug has been there for years, so it can wait a few days :) Thanks!
Assignee: valentin.gosu → wisniewskit
Thomas, did you get the chance to look at this?
Flags: needinfo?(wisniewskit)
(Assignee)

Comment 22

a year ago
Unfortunately not yet. I'm hoping to get to it by the end of the quarter though, if that's alright?
Flags: needinfo?(wisniewskit)
Whiteboard: [necko-active]

Updated

11 months ago
Priority: -- → P3
When we get to this, note the potential regression in bug 1387688.