Closed Bug 165609 Opened 22 years ago Closed 20 years ago

Child frames don't inherit parent's charset initially

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
normal

RESOLVED FIXED

People

(Reporter: ilya.konstantinov+future, Assigned: tetsuroy)

Details

(Keywords: intl)

Attachments

(1 file)

To demonstrate this bug, go to Debug prefs > Networking and disable Memory Cache
and Disk Cache.

A document contains two frames (either a FRAMESET'ed FRAME or an IFRAME).
1. One frame's SRC is a regular URL. The URL loads. The loaded frame's charset
is the same as the parent's charset.
2. The other frame's SRC is a javascript: URL, used simply to fill the frame
initially with some content. That content is immediately replaced with other
content. The second frame's charset defaults to ISO-8859-1.
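
The setup described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the reporter's actual testcase: the function name `buildParentMarkup`, the file name `child.html`, the `'placeholder'` string, and the sample word are all assumptions made for illustration. The sketch uses IFRAMEs and builds the parent document as a string so that each piece is visible.

```javascript
// Hypothetical reconstruction of the page layout described above.
// The URL, charset, and sample word are illustrative assumptions.
function buildParentMarkup(charset, regularUrl) {
  // The word is emitted as \u escapes so the generated markup stays pure ASCII
  // and does not itself depend on how the parent file is encoded.
  const escapedWord = "\\u05E9\\u05DC\\u05D5\\u05DD";
  return [
    "<html><head>",
    `<meta http-equiv="Content-Type" content="text/html; charset=${charset}">`,
    "<script>",
    "function fillSecondFrame() {",
    "  // Replace the placeholder content of the second frame.",
    "  var doc = window.frames[1].document;",
    "  doc.open();",
    `  doc.writeln("${escapedWord}");`,
    "  doc.close();",
    "}",
    "</script>",
    "</head>",
    '<body onload="fillSecondFrame()">',
    // Frame 1: a regular URL.
    `<iframe src="${regularUrl}"></iframe>`,
    // Frame 2: a javascript: URL that only provides initial content.
    `<iframe src="javascript:'placeholder'"></iframe>`,
    "</body></html>",
  ].join("\n");
}

const markup = buildParentMarkup("ISO-8859-8", "child.html");
console.log(markup);
```

Saving the printed markup next to any small `child.html` and opening it would give a page of the shape described, on which the Frame Info of each frame can be compared.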

You can check the frame's charset by right-clicking on the frame -> This Frame
-> Frame Info.

The "other contents" of the second frame is a Hebrew word. It is there to
demonstrate that this Hebrew word may be rendered differently when writeln'd
in ISO-8859-1 vs. ISO-8859-8. This is the only case in which this bug causes
visible malfunctions on sites.

Shift-Reloading the frame parent will set the second frame's charset to the
parent's encoding.
Reloading (without Shift) the frame parent will set the second frame's charset
to ISO-8859-1 again.
A testcase. Set as URL instead of attachment since it links to an external HTML
file in the first frame to demonstrate the difference between the frames.
I'll also prepare a simplified testcase (without the first frame) for an attachment.
Attached file Simple Testcase
Keywords: intl
QA Contact: ruixu → ylong
Please confirm this. This case is not hypothetical but very real, a result of an
(undeserved) evangelizing of an Israeli bank website (see bug 163659). As I
found out after investigating, the first issue I mailed them about was our bug,
which was pretty embarrassing to admit.

Moreover, I haven't found a possible workaround for the problem yet, so I cannot
really close that evangelizing bug until this is fixed or worked around.
I can hardly read any Hebrew... could you provide a test case that makes use of
perhaps just one non-American character (such as the Euro sign, only available
in ISO-8859-15, not in ISO-8859-1)?
You won't see any difference with a non-BiDi character.
When one frame document.write()s a string to another frame, the string is
inserted as a Unicode string (just as if it were added with each character
encoded as a &#xxxx; Unicode HTML entity), not as a bit stream.
So a wrong charset in the target frame doesn't affect the interpretation of
the text which is document.write()n to it.
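
To make the entity comparison concrete, here is a small sketch that spells out a string as numeric character references. The helper `toNumericEntities` and the sample word are assumptions for illustration and are not part of any browser API; the point is only that a script string is already a sequence of Unicode characters, so nothing about it depends on the target frame's byte encoding.

```javascript
// Illustrative helper (not part of any browser API): spell out each code
// point of a string as a hexadecimal numeric character reference.
function toNumericEntities(text) {
  return Array.from(text) // iterate by code point, not by UTF-16 code unit
    .map((ch) => "&#x" + ch.codePointAt(0).toString(16).toUpperCase() + ";")
    .join("");
}

const word = "\u05E9\u05DC\u05D5\u05DD"; // sample Hebrew word, logical order
const entities = toNumericEntities(word);
console.log(entities); // &#x5E9;&#x5DC;&#x5D5;&#x5DD;
```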
   
So you ask "What's the problem then?".   
The problem is that Bidirectional languages can be displayed in a Visual mode   
or a Logical mode (this affects the order of the characters). The standard   
mode is "Logical" and "Visual" is just a legacy mode (denoted by the legacy   
"ISO-8859-8" and "ISO-8859-6" encodings).   
When the frame is in the "ISO-8859-8" charset, the "visual-mode" flag for
it is set. For all other charsets, it's cleared.
This flag means that even Unicode text, which is normally in standard Logical   
order, is drawn in Visual order within it. 
In the same fashion, if you write a string where Hebrew is ordered in a 
"Visual" manner to an ISO-8859-1 page, it'll be perceived as a "Logical" 
Hebrew string and be drawn incorrectly. 
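
A deliberately simplified sketch of why the letter order flips: for a run consisting only of Hebrew letters, the visual (left-to-right on screen) order is the logical order reversed. The function name `reverseCodePoints` and the sample word are assumptions for illustration; real text with mixed scripts, digits, or punctuation needs the full Unicode Bidirectional Algorithm, which this does not implement.

```javascript
// Simplified illustration only: valid for a run made purely of Hebrew letters.
// Mixed scripts, digits, and punctuation require the full Unicode
// Bidirectional Algorithm, which is not implemented here.
function reverseCodePoints(text) {
  return Array.from(text).reverse().join("");
}

const logical = "\u05E9\u05DC\u05D5\u05DD"; // order in which the word is typed and stored
const visual = reverseCodePoints(logical);  // left-to-right order as it appears on screen

// Treating already-logical text as if it were visual (or the other way round)
// therefore shows the same four letters in the opposite order.
console.log(visual === "\u05DD\u05D5\u05DC\u05E9"); // true
```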
   
Now, I bet this was totally confusing...   
   
The visual effect of my bug is that the order of the letters
changes. You can simply see the 4 Hebrew letters printed into the frame,
remember their shapes, and then Shift-Reload and see the letters in
reversed order (the correct order; the way it should've been in the first
place if frames inherited the parent's charset).
An easier way to confirm the bug is to right-click the frame and choose This 
Frame > Frame Info to see the "Character Coding" of the frame on every step. 
Blocks: 163659
Just checked -- the "Simple testcase" doesn't work but the URL works very well and 
demonstrates the bug on 20030326 build. 
Make sure you disable the disk cache and memory cache in the Debug prefs before you try. In 
newer builds, "Disable disk cache" is called "Enable disk cache" for some reason, but what 
they really mean is "Disable". 
Unintentionally fixed by bug 243034. 
By always having in-memory documents (generated through document.open / 
document.write) in UTF-16, there are no ambiguities anymore. 
In fact, now we work exactly like IE in this regard. 
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED