I tried to pseudo-localize the menu items in navigator.xul into Japanese by setting the encoding in the text declaration to "UTF-8", "ISO-10646-UCS-2", or "Shift_JIS" and writing the menu items as Japanese text in the matching encoding. It failed: the menu items were displayed as garbage with the 3.18.99 M3 Windows build on Japanese NT 4.0.
reassigned to saari as p3 for m5
Gerardo, How did you do this? Did you use <META Content-Type ... charset= ...>? By default XUL should use UTF-8. My understanding is that all our .xul files used in 5.0 will be UTF-8 encoded. But you are correct, a XUL content developer should be able to use other charset encodings.
Ignore my previous comments. Let me start again. Ray (not Gerardo), how did you do this? Did you use XML syntax rather than HTML syntax, i.e. <?xml encoding='IANA-charset-name'?>? [Ray told me he did.] Two things are required: (1) By default, XML assumes the charset encoding is either UTF-16 or UTF-8, depending on whether a Byte Order Mark (BOM) is present. If the data is in UTF-8, we need to convert it to UCS-2. (The UTF-8 to UCS-2 converter is part of M4. Pre-5.0 Communicator has code to check for the BOM.) (2) For other charset encodings, we must be able to parse <?xml encoding='IANA-charset-name'?> and call the appropriate Unicode converter. Part (1) is needed for M4; part (2) could wait for M5. See reference: http://www.w3.org/TR/1998/REC-xml-19980210#charencoding
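The two requirements above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the parser code; the function name `decode_xml` and the regular expression are assumptions made for this illustration. A BOM selects UTF-16, otherwise the declared encoding is used, and UTF-8 is the fallback.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of a leading <?xml ... ?> declaration.
# The pattern is an illustrative assumption, not a full XML grammar.
_ENCODING_RE = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def decode_xml(data: bytes) -> str:
    """Decode raw XML bytes to text (hypothetical helper)."""
    # Part (1): a UTF-16 byte-order mark means the data is UTF-16.
    # Python's "utf-16" codec reads the BOM to pick the byte order
    # and removes it from the result.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")

    # Part (2): otherwise honor the declared charset, if there is one.
    match = _ENCODING_RE.match(data)
    charset = match.group(1).decode("ascii") if match else "utf-8"

    # With no BOM and no declaration, this falls back to UTF-8.
    return data.decode(charset)
```

Scanning the raw bytes for the declaration only works when the declared encoding is ASCII-compatible in its first bytes, which holds for Shift_JIS and UTF-8.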
I have <?xml encoding='UTF-8'?> in the first line of my XUL file. I also have Japanese text encoded in UTF-8 in menu items.
Adding nisheeth to the cc list since he owns the EXPAT integration. Saari, I have no idea why this bug was assigned to you. If you think you should not own this bug, reassign it to nisheeth.
Reassigning to nisheeth
*** This bug has been marked as a duplicate of 4463 ***
Bug 4431 is not a duplicate of 4463. 4463 is for the default encoding, but 4431 is for encodings in general. Currently, if I declare Shift_JIS encoding with Shift_JIS text, it crashes. I tested it with the 4/26 build on Japanese NT 4.0.
Summarizing the new status of this bug: The expat XML parser needs to parse <?xml encoding=...?> and then call the appropriate charset converter.
Setting component to XML and milestone to M6...
Moving non-crasher XML bugs to M7...
I've spoken to Harish and Frank about this. We will implement this using the observer mechanism we already have in place for META tags. Harish implemented the META tag mechanism and is ready to extend that to include observation of XML PIs. I'm re-assigning the bug to him and setting the milestone to M8... Specifically, an observer will register an interest in the <?xml ?> PI and will get notified by the parser when that PI is encountered. The observer can then check for the encoding attribute and tell the webshell to reload the document with a new charset, if necessary.
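The observer flow described above might look roughly like the following sketch. It is in Python for brevity, and every name in it (`Parser`, `register`, `notify_pi`, `CharsetObserver`, the `reload` callback) is hypothetical and does not correspond to the actual parser or webshell interfaces.

```python
from typing import Callable, Dict, List


class Parser:
    """Toy parser that notifies observers about processing instructions."""

    def __init__(self) -> None:
        # Maps a PI target (e.g. "xml") to the observers interested in it.
        self._observers: Dict[str, List[Callable[[Dict[str, str]], None]]] = {}

    def register(self, target: str, observer: Callable[[Dict[str, str]], None]) -> None:
        self._observers.setdefault(target, []).append(observer)

    def notify_pi(self, target: str, attrs: Dict[str, str]) -> None:
        # Called when the parser encounters <?target ...?>.
        for observer in self._observers.get(target, []):
            observer(attrs)


class CharsetObserver:
    """Requests a reload when the declared encoding differs from the current one."""

    def __init__(self, current_charset: str, reload: Callable[[str], None]) -> None:
        self.current_charset = current_charset
        self.reload = reload  # stands in for "tell the webshell to reload"

    def __call__(self, attrs: Dict[str, str]) -> None:
        declared = attrs.get("encoding")
        if declared and declared.lower() != self.current_charset.lower():
            # Remember the new charset so the reloaded document does not
            # trigger a second reload for the same declaration.
            self.current_charset = declared
            self.reload(declared)
```

A caller would register the observer once, for example `parser.register("xml", CharsetObserver("UTF-8", reload_fn))`, and the parser would then invoke it each time it meets the `<?xml ?>` PI.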
Have a fix but will not be checking in until M9. Need time for verification. Setting to M9.
Do you need some test cases? IQA may be able to assist.
Fix is in. Observers can now register for the <?xml ?> PI and will be notified by the parser when that PI is encountered. Marking the bug fixed.
Ray, can you verify that this got fixed for M9?
harishd, I thought you didn't fix this and that we decided to handle it another way. Should we reopen this bug and assign it to me (ftang)?
Reopening this bug and assigning it to ftang.
Marking this M10.
Added code to nsParser.cpp to detect the BOM and to implement Appendix F of XML 1.0 (autodetection of character encodings).
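For readers unfamiliar with Appendix F, the autodetection it describes can be sketched as a lookup on the first bytes of the document. This is a Python illustration of a subset of the byte patterns listed in the appendix, not the code added to nsParser.cpp; the function name `sniff_xml_encoding` and the returned labels are assumptions.

```python
# Leading byte patterns, following a subset of XML 1.0 Appendix F.
# The labels on the right are illustrative names chosen for this sketch.
_PATTERNS = [
    (b"\x00\x00\x00\x3c", "utf-32-be"),         # UCS-4, big-endian, "<"
    (b"\x3c\x00\x00\x00", "utf-32-le"),         # UCS-4, little-endian, "<"
    (b"\xfe\xff", "utf-16-be"),                 # UTF-16 BOM, big-endian
    (b"\xff\xfe", "utf-16-le"),                 # UTF-16 BOM, little-endian
    (b"\x00\x3c\x00\x3f", "utf-16-be"),         # "<?" in UTF-16BE, no BOM
    (b"\x3c\x00\x3f\x00", "utf-16-le"),         # "<?" in UTF-16LE, no BOM
    (b"\x3c\x3f\x78\x6d", "ascii-compatible"),  # "<?xm": read the declaration
    (b"\x4c\x6f\xa7\x94", "ebcdic"),            # "<?xm" in EBCDIC
]


def sniff_xml_encoding(head: bytes) -> str:
    """Guess the encoding family from the first bytes (hypothetical helper)."""
    for prefix, label in _PATTERNS:
        if head.startswith(prefix):
            return label
    # Anything else is treated as UTF-8 without an encoding declaration.
    return "utf-8"
```

The "ascii-compatible" result only narrows the choice: the parser still has to read the encoding declaration to tell UTF-8, Shift_JIS, EUC, and similar encodings apart.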
Used the files at babel/automation/xmlencoding/ to verify this fix.