Closed
Bug 63841
Opened 24 years ago
Closed 24 years ago
[Composer / ISO-2022-JP Charset] Characters are messed up when hankaku katakana is input after kanji.
Categories
(Core :: Internationalization, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.9
People
(Reporter: amyy, Assigned: nhottanscp)
References
Details
(Keywords: intl)
12-27 Mtrunk build: [Composer / ISO-2022-JP Charset] Characters are messed up when hankaku katakana is input after kanji.

Steps to reproduce:
1. Start Composer.
2. Select View | Character Coding | More | East Asian | Japanese (ISO-2022-JP).
3. Type a Japanese kanji (e.g. "hyou") followed by a hankaku katakana (e.g. "a").
4. Save as a file and click the "Browse" icon to bring up the page in a Navigator window.

Result: The characters are messed up, and if you close the file and re-open it, you can see the incorrect characters in Composer as well.

Notes:
1. It occurs on WinME and Mac. I do not know of a hankaku katakana input method on Linux. However, on Linux I can type some kanji and zenkaku katakana; when you browse the page and go to [View] | [Page Source], the characters in the body are shown as reference codes like "ウ" instead of as kanji or katakana.
2. On WinME, after you create a page and save it, sometimes most of the icons on the Composition Toolbar are disabled.
3. After creating a page and bringing up the page source from [View], the hankaku katakana is very often shown as reference codes.
Reporter
Comment 1•24 years ago
Changed QA contact and added keywords.
Comment 2•24 years ago
This is a regression. The related bug is 49262. Changed the component to Internationalization and assigned to nhotta@netscape.com.
Assignee
Comment 3•24 years ago
I can reproduce this with the NS 6 release build (so it is probably not a regression). Reassigning to ftang.
Assignee: nhotta → ftang
Comment 4•24 years ago
I cannot reproduce this on my 2000100908 build. It was probably introduced after that time.
Comment 5•24 years ago
Can someone try Beta3? Both ylong and nhotta showed me this problem in trunk and N6RTM. Notice that "hyou" (you have to hit Return to convert) contains 0x5c in the second byte. Some editor code introduced after Beta3 may be stripping out 0x5c incorrectly. Reassigning this to yokoyama to work on. Yokoyama, please talk to nhotta about this.
Assignee: ftang → yokoyama
Reporter
Comment 6•24 years ago
It might not be caused by 0x5c in the second byte. When I showed this to ftang with the 01-09-06 Win Mtrunk build, it was just a normal kanji followed by hankaku katakana.
Comment 7•24 years ago
This is interesting. If you use a plain text editor to see the raw bytes, it shows

    esc + "$B" + "I=ハ" + esc + "(B"

If I don't put the halfwidth katakana in there, I get

    esc + "$B" + "I=" + esc + "(B"

Notice that esc + "$B" is the escape sequence that shifts in to JIS X 0208 in ISO-2022-JP, esc + "(B" is the escape sequence that shifts out of JIS X 0208 in ISO-2022-JP, and "I=" is the 7-bit JIS X 0208 encoding of the character "hyou". The correct result should be

    esc + "$B" + "I=" + esc + "(B" + "ハ"

I guess the problem code is in mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp:

    /**
     * Initialize the Unicode encoder with our current mCharsetOverride.
     */
    NS_IMETHODIMP
    nsHTMLContentSinkStream::InitEncoders()
    {
      nsresult res;

      // Initialize an entity encoder if we're using the string interface:
      if (mString && (mFlags & nsIDocumentEncoder::OutputEncodeEntities))
        res = nsComponentManager::CreateInstance(kEntityConverterCID, NULL,
                                                 NS_GET_IID(nsIEntityConverter),
                                                 getter_AddRefs(mEntityConverter));

      // Initialize a charset encoder if we're using the stream interface
      if (mStream)
      {
        nsAutoString charsetName; charsetName.Assign(mCharsetOverride);
        NS_WITH_SERVICE(nsICharsetAlias, calias, kCharsetAliasCID, &res);
        if (NS_SUCCEEDED(res) && calias) {
          nsAutoString temp; temp.Assign(mCharsetOverride);
          res = calias->GetPreferred(temp, charsetName);
        }
        if (NS_FAILED(res))
        {
          // failed - unknown alias , fallback to ISO-8859-1
          charsetName.AssignWithConversion("ISO-8859-1");
        }

        res = nsComponentManager::CreateInstance(kSaveAsCharsetCID, NULL,
                                                 NS_GET_IID(nsISaveAsCharset),
                                                 getter_AddRefs(mCharsetEncoder));
        if (NS_FAILED(res))
          return res;
        // SaveAsCharset requires a const char* in its first argument:
        nsCAutoString charsetCString; charsetCString.AssignWithConversion(charsetName);
        // For ISO-8859-1 only, convert to entity first (always generate entites like ).
        res = mCharsetEncoder->Init(charsetCString,
                                    charsetName.EqualsIgnoreCase("ISO-8859-1") ?
                                    nsISaveAsCharset::attr_htmlTextDefault :
                                    nsISaveAsCharset::attr_EntityAfterCharsetConv
                                    + nsISaveAsCharset::attr_FallbackDecimalNCR,
                                    nsIEntityConverter::html32);

I think the 0x5C is not a factor here. I can reproduce this with characters which do not have a 5c in them. I think someone should debug through intl/unicharutil/src/nsSaveAsCharset.cpp, specifically

    nsSaveAsCharset::DoCharsetConversion(const PRUnichar *inString, char **outString)
    nsSaveAsCharset::DoConversionFallBack

and see what happens there. I am reassigning this back to nhotta since he knows that code better. Putting P3 as the priority for now. ji- please try sending it in HTML Mail and see whether it does the same thing there. If so, we should change this to P2, since ISO-2022-JP is much more important in Mail than in web pages.
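To make the byte layout described in this comment easier to check, here is a minimal sketch of a scanner that splits a byte stream on the two escape sequences quoted above (esc + "$B" and esc + "(B") and reports which character set each run belongs to. It is not Mozilla code: the names Charset, Segment, SplitIso2022Jp and IsValidJisX0208Run are hypothetical, and it assumes only those two escape sequences matter for this bug.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The two character sets that the escape sequences in this bug switch between.
enum class Charset { Ascii, JisX0208 };

// One run of payload bytes (escape sequences removed) and its character set.
struct Segment {
    Charset charset;
    std::string bytes;
};

// Hypothetical helper: split a byte stream into runs, one per designation.
// Only ESC $ B and ESC ( B are recognized; any other byte is treated as payload.
std::vector<Segment> SplitIso2022Jp(const std::string& in) {
    std::vector<Segment> out;
    Charset current = Charset::Ascii;  // a stream starts in ASCII
    std::string run;
    auto flush = [&]() {
        if (!run.empty()) {
            out.push_back({current, run});
            run.clear();
        }
    };
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 3, "\x1b$B") == 0) {         // shift in to JIS X 0208
            flush();
            current = Charset::JisX0208;
            i += 3;
        } else if (in.compare(i, 3, "\x1b(B") == 0) {  // shift back to ASCII
            flush();
            current = Charset::Ascii;
            i += 3;
        } else {
            run.push_back(in[i]);
            ++i;
        }
    }
    flush();
    return out;
}

// Hypothetical helper: a 7-bit JIS X 0208 run must consist of byte pairs,
// each byte in the range 0x21..0x7E.
bool IsValidJisX0208Run(const std::string& bytes) {
    if (bytes.size() % 2 != 0) return false;
    for (unsigned char c : bytes) {
        if (c < 0x21 || c > 0x7E) return false;
    }
    return true;
}
```

With the expected layout, SplitIso2022Jp yields a JIS X 0208 run containing "I=" followed by an ASCII run, and IsValidJisX0208Run accepts the first run. With the layout reported here, the extra character stays inside the JIS X 0208 run, so that run has an odd length or an out-of-range byte and the check fails. The exact byte saved for the katakana is not recorded in this report, so any concrete value used to try this out is an assumption.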
Assignee: yokoyama → nhotta
Priority: -- → P3
Assignee
Comment 9•24 years ago
Removing the 'regression' keyword because this is reproducible with RTM.
Keywords: regression
Assignee
Comment 10•24 years ago
I found that nsISaveAsCharset is not used any more (see bug 65324, bug 59679). This is probably a problem in nsDocumentEncoder.cpp; cc'ing jst.
Comment 12•24 years ago
Reassigning to anthonyd who owns the serializer code.
Assignee: jst → anthonyd
Comment 13•24 years ago
I'm not really sure why this is now on my plate, but oh well. If this is any sort of priority, then someone from i18n should take it. Setting to Future. anthonyd
Assignee
Comment 14•24 years ago
Reassign to nhotta.
Assignee
Comment 15•24 years ago
Bug 59679 was fixed, and I cannot reproduce the problem using today's build.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED