non-ascii html files in iframe are rendered as if they're in ISO-8859-1

Status: NEW
Product: Core
Component: Layout: HTML Frames
Opened: 13 years ago
Updated: 5 years ago
Reporter: Jungshik Shin
Assignee: Jungshik Shin
Keywords: intl
Version: Trunk
Points: ---
Firefox Tracking Flags: Not tracked
URL: (URL)

(Assignee)

Description

13 years ago
This is a bit hard to reproduce (sometimes it just works). However, I'm hitting
this problem rather often.

* How to reproduce
  1. Set the default character encoding to Korean (EUC-KR) so that unlabelled
HTML files are treated as EUC-KR
  2. Go to the web page given in the URL
  3. Iframes embedded in the page (there are a few in the page) show gibberish like
 '¹ÚÂùÈ££¬ºÎÈ° Áß°£ Á¡°Ë'
  4. Sometimes, just reloading the page fixes the problem. Other times, it doesn't
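
The gibberish in step 3 is what EUC-KR bytes look like when decoded as ISO-8859-1. A minimal Python sketch of that effect follows; the Korean sample string and the `repair` helper are illustrative assumptions, not taken from the affected pages.

```python
# Hypothetical sample text; any Korean string shows the same effect.
original = "한국어 테스트"

# What the server sends: EUC-KR bytes with no charset label.
wire_bytes = original.encode("euc_kr")

# What the iframe shows when those bytes are decoded as ISO-8859-1.
garbled = wire_bytes.decode("iso-8859-1")


def repair(mojibake: str) -> str:
    """Undo an ISO-8859-1 misreading of EUC-KR bytes.

    Bytes that are not valid EUC-KR are replaced rather than raising.
    """
    return mojibake.encode("iso-8859-1").decode("euc_kr", errors="replace")


print(garbled)          # accented Latin characters, similar to step 3
print(repair(garbled))  # the original Korean text again

# The sample string quoted in step 3, run through the same repair.
print(repair("¹ÚÂùÈ££¬ºÎÈ° Áß°£ Á¡°Ë"))
```

If the last call prints readable Korean, that supports the reading that the iframe content was EUC-KR decoded as ISO-8859-1.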

* What's expected
   - The HTML files for the problematic iframes are sent with the HTTP header
'Content-Type: text/html' (without a charset parameter)
   - They don't declare a charset in a 'meta' tag either
   - Therefore, they have to be treated either as EUC-KR (the default character
encoding selected) or as the character encoding of the parent document, which is
also EUC-KR
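
The expected fallback can be sketched as a small lookup chain. This is only an illustration of the expectation described above, not Gecko's actual code: the function names are hypothetical, and trying the parent charset before the user default is an assumption (the report accepts either).

```python
import re


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type or "", re.I)
    return match.group(1) if match else None


def charset_from_meta(html_bytes):
    """Return a charset declared in a meta tag near the start, or None."""
    match = re.search(
        rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)',
        html_bytes[:1024],
        re.I,
    )
    return match.group(1).decode("ascii") if match else None


def pick_charset(content_type, html_bytes, parent_charset, default_charset):
    """Hypothetical priority: HTTP header, meta tag, parent, user default."""
    return (
        charset_from_content_type(content_type)
        or charset_from_meta(html_bytes)
        or parent_charset
        or default_charset
    )


# The iframe case from this report: no HTTP charset and no meta tag.
frame = b"<html><body>unlabelled frame</body></html>"
print(pick_charset("text/html", frame, "EUC-KR", "EUC-KR"))
```

Under either ordering of the last two steps, ISO-8859-1 should never be the result for the iframes in this report, since neither the parent document nor the user default is ISO-8859-1.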

A similar problem was reported for mailnews : bug 180163

Other sites with the problem: http://www.hani.co.kr  http://www.ohmynews.com

Comment 1

13 years ago
Hmm.... odd.  Can you reproduce this in a debug build?  If so, stepping through
all the charset mess in nsHTMLDocument is likely to be in order...
(Assignee)

Comment 2

13 years ago
*** Bug 114898 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 3

13 years ago
ftang's hypothesis, copied from bug 114898 (bug 272815 may give a clue about this
bug):

Both http://globe.nikkei.co.jp/ad/control/b2o/b2oPre.html
and
http://globe.nikkei.co.jp/ad/control/b2o/b2oPR.html

have no meta tag.

Neither has an HTTP charset.
http://b2o.nikkei.co.jp/ does have an HTML meta charset.

Here is what I guess happened.

On Windows, more bytes are received in the 1st block.
1. The browser loads http://b2o.nikkei.co.jp/
2. It hits the meta charset, so it switches to Shift_JIS.
3. It hits the iframe code, so it uses Shift_JIS as the default to load that page.
4. It displays correctly.

On Linux, fewer bytes are received in the 1st block.
1. The browser loads http://b2o.nikkei.co.jp/ as ISO-8859-1.
2. The meta tag is not in the first block, so it loads the page as ISO-8859-1.
3. It hits the iframe code, so it uses ISO-8859-1 as the default encoding to load
the page.
4. It loads the iframe as ISO-8859-1 and remembers that in the cache.
5. The 2nd block of the main page arrives, we find out the meta tag says
Shift_JIS, and we now reload it.
6. We hit the iframe code and use Shift_JIS as the default charset, and then we find
ISO-8859-1 as the cached charset.
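
The two scenarios in this guess can be written down as a toy model. The sketch below is not Gecko code; `simulate_load`, `FRAME_URL`, the sample page, and the rule that a cached charset wins on reload are all assumptions made only to mirror the steps listed above.

```python
import re

FALLBACK = "ISO-8859-1"   # assumed charset before any declaration is seen
FRAME_URL = "ad.html"     # hypothetical iframe URL used as the cache key


def find_meta_charset(data: bytes):
    """Return the charset named in the data, or None if none is visible."""
    match = re.search(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', data, re.I)
    return match.group(1).decode("ascii") if match else None


def simulate_load(page: bytes, first_block_size: int, cache: dict):
    """Return (parent_charset, iframe_charset) under the guessed behaviour."""
    # First pass: only the first network block is available.
    parent = find_meta_charset(page[:first_block_size]) or FALLBACK
    # The iframe inherits the parent's charset and the choice is cached.
    frame = cache.setdefault(FRAME_URL, parent)

    # Later blocks arrive; a late meta tag forces a reload of the parent.
    declared = find_meta_charset(page)
    if declared and declared != parent:
        parent = declared
        # On reload the cached charset wins over the new parent charset.
        frame = cache.get(FRAME_URL, parent)
    return parent, frame


page = (
    b'<html><head><title>t</title>'
    b'<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">'
    b'</head><body><iframe src="ad.html"></iframe></body></html>'
)

# Large first block: the meta tag is seen before the iframe is loaded.
print(simulate_load(page, len(page), {}))
# Small first block: the iframe is loaded and cached before the meta tag.
print(simulate_load(page, 20, {}))
```

In this model the second call ends with the parent in Shift_JIS but the iframe stuck in ISO-8859-1, which is the symptom reported. Whether the real cache behaves this way is exactly what the guess leaves open.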

This is purely my GUESS. I'm not sure it is what really happened.

(Assignee)

Comment 4

13 years ago
*** Bug 287818 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 5

13 years ago
I've tried to reproduce it on Linux with a debug build, but I couldn't. I
stepped through nsHTMLDocument and everything seems to be what it's supposed
to be.