
emitting DOM content (created with Javascript), lossy UTF-16 to ASCII conversion is used

RESOLVED DUPLICATE of bug 335298

Status

Product: Core
Component: DOM
RESOLVED DUPLICATE of bug 335298
Opened: 13 years ago
Updated: 7 months ago

People

(Reporter: gumbas, Assigned: jst)

Tracking

Keywords: intl
Version: Trunk
Firefox Tracking Flags: Not tracked

Details

(URL)

Attachments

(1 attachment)

(Reporter)

Description

13 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

In the page http://y2k-design.net/_cms_data/charset_iframetest.html you can see
a simple JavaScript that creates an iframe and sets its contents.
The problem is that although the iframe has a "meta http-equiv" charset definition
(iso-8859-2 in this case), it's completely ignored.

The script below works in IE but doesn't in Firefox.

Code:
--------------------------
<html>
<head><meta http-equiv="content-type" content="text/html;
charset=iso-8859-2"><title>TEST</title></head>

<body>

Polish chars test: ISO:¶¶¶¶¶¶¶¶¶¶¶¶¶¶ Win:śśśśśśśśś
	
<br>
<br>

<script language="JavaScript1.2" type="text/javascript">
	function getContents()
	 {
		return("<html>\
				<head>\
				<meta http-equiv=\"content-type\" content=\"text/plain; charset=iso-8859-2\">\
				<title>TEST</title></head>\
				<body bgcolor=\"#ffffef\" margin=0 border=0>\
				Polish chars test: ISO:¶¶¶¶¶¶¶¶¶¶¶¶¶¶ Win:śśśśśśśśś\
				</body></html>");
	 };
	
	translatorPadIFrame = document.createElement("iframe");
	translatorPadIFrame.id = "translatorPadIFrame";
	translatorPadIFrame.style.width="200px";
	translatorPadIFrame.style.height="200px";
	translatorPadIFrame.src="javascript:parent.getContents();";
	document.body.appendChild(translatorPadIFrame);
</script>

</body>
</html>


Reproducible: Always

Steps to Reproduce:
1. Visit http://y2k-design.net/_cms_data/charset_iframetest.html and see it for
yourself

Actual Results:  
See the output screen (Firefox 1.0):
http://y2k-design.net/_cms_data/charset_iframetest.png

Expected Results:  
The iframe contents should display chars properly, according to the charset
definition.
(Reporter)

Comment 1

13 years ago
Created attachment 171369 [details]
How Firefox displays iframe contents that have been dynamically created.

Comment 2

13 years ago
Your test page at the given URL is NOT in ISO-8859-2; it mixes two character
encodings, ISO-8859-2 and Windows-1250, in a single HTML file.
'ś' (this is in UTF-8; please use UTF-8 whenever you add a comment with
non-ASCII content, by setting View | Character encoding to UTF-8 BEFORE writing
your comment) is 0xB6 in ISO-8859-2, while it's 0x9C in Windows-1250. Note that
0x9C is invalid (or represents a C1 control character) in ISO-8859-2, so it's
rendered as '?'.

As for 'iframe'... this is an interesting case.. I'm not sure which component
this belongs to...
Keywords: intl
OS: Windows 2000 → All
Hardware: PC → All

Comment 3

13 years ago
I meant to change the product to Core 
Component: General → DOM
Product: Firefox → Core
Version: unspecified → Trunk

Comment 4

13 years ago
What's going on here is:

'ś' (0xB6) in ISO-8859-2 is translated into UTF-16 (0x01 0x5B) and then emitted
as 0x5B by truncating the high byte (UTF-16 -> ASCII conversion).
The same happens to 0x9C, which is translated into 0xFF 0xFD (U+FFFD, the
replacement character) because 0x9C is invalid in ISO-8859-2. When emitted, it
turns into 0xFD (0xFF is truncated by the lossy UTF-16 -> ASCII conversion),
which is 'ý' in ISO-8859-2 (as well as in ISO-8859-1).
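
To make the truncation described above concrete, here is a minimal sketch that models it in JavaScript. The helper name lossyTruncate is hypothetical and only imitates the effect (keeping the low byte of each UTF-16 code unit); it is not the actual Gecko conversion code.

```javascript
// Hypothetical helper: models a lossy UTF-16 -> "ASCII" conversion that
// keeps only the low byte of each UTF-16 code unit and drops the high byte.
function lossyTruncate(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    bytes.push(text.charCodeAt(i) & 0xff);
  }
  return bytes;
}

// Format bytes as two-digit hex strings for display.
const hex = (bytes) =>
  bytes.map((b) => "0x" + b.toString(16).toUpperCase().padStart(2, "0"));

const sAcute = "\u015B";      // 'ś', what 0xB6 decodes to in ISO-8859-2
const replacement = "\uFFFD"; // what the invalid byte 0x9C decodes to

console.log(hex(lossyTruncate(sAcute)));      // [ '0x5B' ]
console.log(hex(lossyTruncate(replacement))); // [ '0xFD' ]
```

Pure ASCII text passes through this conversion unchanged, which is why the problem only shows up with non-ASCII content.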

I don't think MS IE honors the charset in the 'meta' tag (of a dynamically
created iframe); it just emits the DOM content in the charset of the parent
document.


Assignee: firefox → jst
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: iframe document charset ignored when dynamically created using Javascript → emitting DOM content (created with Javascript), lossy UTF-16 to ASCII conversion is used
See the XXX comment at
http://lxr.mozilla.org/seamonkey/source/dom/src/jsurl/nsJSProtocolHandler.cpp#264
 -- that's what's causing this bug.

The idea of having a real UTF-data stream has come up before.  See what
nsDOMParser does, eg, in ConvertWStringToStream.  The JS url code should
probably do something similar, or we should factor the code out into a
NS_NewUTF8InputStream method or something.
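
As an illustration of the idea only (not of nsDOMParser or any Gecko API), the sketch below encodes a string to UTF-8 bytes instead of truncating it, using the standard TextEncoder and TextDecoder web APIs. The function names toUtf8Bytes and fromUtf8Bytes are hypothetical.

```javascript
// Sketch only: standard TextEncoder/TextDecoder, not Gecko internals.
// Hypothetical helper: serialize a JS (UTF-16) string as UTF-8 bytes.
function toUtf8Bytes(text) {
  return new TextEncoder().encode(text);
}

// Hypothetical helper: decode UTF-8 bytes back into a JS string.
function fromUtf8Bytes(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

const original = "Polish: \u015B"; // ends with 'ś'
const bytes = toUtf8Bytes(original);

// 'ś' (U+015B) becomes the two-byte UTF-8 sequence C5 9B.
console.log(Array.from(bytes.slice(-2), (b) => b.toString(16))); // [ 'c5', '9b' ]

// The round trip preserves the non-ASCII character.
console.log(fromUtf8Bytes(bytes) === original); // true
```

A consumer of such a stream would have to treat the data as UTF-8 regardless of any charset declared inside the generated markup.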

Whiteboard: DUPEME
note bug 230440
Depends on: 230440

Updated

12 years ago
Depends on: 335298
Fixed by bug 335298

*** This bug has been marked as a duplicate of 335298 ***
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → DUPLICATE

Updated

4 years ago
Whiteboard: DUPEME