Open Bug 278571 Opened 20 years ago Updated 2 years ago

non-ascii html files in iframe are rendered as if they're in ISO-8859-1

Categories

(Core :: DOM: Core & HTML, defect)

People

(Reporter: jshin1987, Unassigned)

Details

(Keywords: intl)

This is a bit hard to reproduce (sometimes it just works). However, I'm hitting
this problem rather often.

* How to reproduce
  1. Set the default character encoding to Korean (EUC-KR) so that unlabelled
html files are treated as EUC-KR.
  2. Go to the web page given in the URL.
  3. Iframes embedded in the page (there are a few) show gibberish like
 '¹ÚÂùÈ££¬ºÎÈ° Áß°£ Á¡°Ë'.
  4. Sometimes just reloading the page fixes the problem. Other times it doesn't.
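
The gibberish in step 3 is what EUC-KR bytes look like when each byte is decoded as ISO-8859-1. A minimal Python sketch (not part of the original report) that reverses the mis-decoding, assuming the quoted string was copied intact:

```python
# The garbled text as it appeared in the iframe (copied from step 3).
garbled = "¹ÚÂùÈ££¬ºÎÈ° Áß°£ Á¡°Ë"

# ISO-8859-1 maps every byte 0x00-0xFF to the code point with the same value,
# so encoding the garbled text as ISO-8859-1 recovers the bytes on the wire.
raw = garbled.encode("iso-8859-1")

# Decoding those bytes with the encoding the page was really written in
# yields the intended Korean text.
recovered = raw.decode("euc-kr")
print(recovered)
```

If the bytes were not valid EUC-KR, the last decode would raise UnicodeDecodeError, so a clean decode supports the mis-decoding diagnosis.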

* What's expected
   - The html files for the problematic iframes are sent with the http header
'Content-Type: text/html' (without a charset parameter).
   - They don't declare a charset in a 'meta' tag either.
   - Therefore, they have to be treated either as EUC-KR (the default character
encoding selected) or as the character encoding of the parent document, which is
also EUC-KR.
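
The expectation above amounts to a fallback chain. The following Python sketch illustrates it. The function names and the exact order (HTTP header, then meta tag, then parent document, then user default) are assumptions made for illustration, not Gecko's actual code. For the pages in this report the last two choices both give EUC-KR, so their relative order does not change the result.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()  # lowercased; None if no charset given


def resolve_charset(content_type: str,
                    meta_charset: Optional[str],
                    parent_charset: Optional[str],
                    default_charset: str) -> str:
    """Hypothetical helper: pick the first charset source that is present."""
    candidates = (
        charset_from_content_type(content_type),  # 1. HTTP header
        meta_charset,                              # 2. meta tag in the document
        parent_charset,                            # 3. parent document (assumed order)
    )
    for candidate in candidates:
        if candidate:
            return candidate.lower()
    return default_charset.lower()                 # 4. user-selected default


# The iframe case from this report: no HTTP charset, no meta tag.
print(resolve_charset("text/html", None, "EUC-KR", "EUC-KR"))
```

Under this model there is no path that yields ISO-8859-1 for the problematic iframes, which is why the observed rendering is unexpected.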

A similar problem was reported for mailnews: bug 180163

Other sites with the problem: http://www.hani.co.kr  http://www.ohmynews.com

Hmm... odd. Can you reproduce this in a debug build? If so, stepping through
all the charset mess in nsHTMLDocument is likely to be in order...
*** Bug 114898 has been marked as a duplicate of this bug. ***
ftang's hypothesis, copied from bug 114898 (bug 272815 may give a clue about this
bug):

Neither http://globe.nikkei.co.jp/ad/control/b2o/b2oPre.html
nor
http://globe.nikkei.co.jp/ad/control/b2o/b2oPR.html
has a meta tag, and neither is served with an http charset.

http://b2o.nikkei.co.jp/ does have an html meta charset.

Here is my guess at what happened.

On Windows, more bytes arrive in the 1st block:
1. The browser loads http://b2o.nikkei.co.jp/
2. It hits the meta charset, so it switches to Shift_JIS
3. It hits the iframe code, so it uses Shift_JIS as the default to load that page
4. The iframe displays correctly

On Linux, fewer bytes arrive in the 1st block:
1. The browser loads http://b2o.nikkei.co.jp/ as ISO-8859-1
2. The meta tag is not in the first block, so it loads the page as ISO-8859-1
3. It hits the iframe code, so it uses ISO-8859-1 as the default encoding to load
the page
4. It loads the iframe as ISO-8859-1 and remembers that in the cache
5. The 2nd block of the main page arrives, we find that the meta tag says
Shift_JIS, and we reload the page
6. We hit the iframe code and use Shift_JIS as the default charset, but then we
find ISO-8859-1 as the cached charset

This is purely my GUESS. I'm not sure it is what really happened.
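
The guessed sequence can be written as a toy model. This Python sketch only restates the hypothesis: the cache dictionary, the function names, and the rule that a cached charset wins over the parent's charset are all assumptions, not observed Gecko behaviour.

```python
FALLBACK = "ISO-8859-1"
PAGE_META_CHARSET = "Shift_JIS"
IFRAME_URL = "http://globe.nikkei.co.jp/ad/control/b2o/b2oPre.html"


def load_iframe(url: str, parent_charset: str, charset_cache: dict) -> str:
    """Hypothesised rule: a charset remembered in the cache beats the parent's."""
    charset = charset_cache.get(url, parent_charset)
    charset_cache[url] = charset
    return charset


def load_page(meta_in_first_block: bool, charset_cache: dict) -> str:
    """Return the charset the iframe ends up with under the hypothesis."""
    if meta_in_first_block:
        # The meta tag is seen before the iframe is loaded, so the iframe
        # inherits Shift_JIS on the first and only pass.
        return load_iframe(IFRAME_URL, PAGE_META_CHARSET, charset_cache)

    # The meta tag arrives late: the first pass runs in the fallback encoding
    # and the iframe's charset is cached as ISO-8859-1.
    load_iframe(IFRAME_URL, FALLBACK, charset_cache)
    # The late meta tag triggers a reload of the parent as Shift_JIS, but the
    # iframe lookup now finds the stale cached charset.
    return load_iframe(IFRAME_URL, PAGE_META_CHARSET, charset_cache)


print(load_page(True, {}))   # large first block
print(load_page(False, {}))  # small first block
```

In this model the outcome depends only on whether the meta tag lands in the first block, which would also fit the report that the problem is intermittent.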

*** Bug 287818 has been marked as a duplicate of this bug. ***
I've tried to reproduce it on Linux with a debug build, but I couldn't. I
stepped through nsHTMLDocument and everything seems to be as it should be.
Product: Core → Core Graveyard
Component: Layout: HTML Frames → Layout: Images
Product: Core Graveyard → Core

The bug assignee didn't login in Bugzilla in the last 7 months.
:emilio, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: jshin1987 → nobody
Flags: needinfo?(emilio)

Doesn't look like layout, but I haven't reproduced the bug so maybe it should just be closed.

Component: Layout: Images, Video, and HTML Frames → DOM: Core & HTML
Flags: needinfo?(emilio)
Severity: normal → S3