Closed
Bug 63841
Opened 25 years ago
Closed 25 years ago
[Composer / ISO-2022-JP Charset]Characters are messed up when input hankaku katakana after kanji.
Categories
(Core :: Internationalization, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.9
People
(Reporter: amyy, Assigned: nhottanscp)
References
Details
(Keywords: intl)
12-27 Mtrunk build:
[Composer / ISO-2022-JP Charset] Characters are garbled when hankaku katakana
is typed after kanji.
Steps to reproduce:
1. Start Composer.
2. View | Character Coding | More | East Asian | Japanese(ISO-2022-JP).
3. Type a Japanese kanji (e.g. "hyou") followed by a hankaku katakana (e.g. "a").
4. Save as a file and click the "Browse" icon to bring up the page in a Navigator window.
Result:
The characters are garbled, and if you close the file and re-open it, you can
see the incorrect characters in Composer as well.
Notes:
1. It occurs on WinME and Mac. I don't know how to use a hankaku katakana input
method on Linux. However, on Linux I can type some kanji and zenkaku katakana;
when you browse the page and go to [View] | [Page Source], the characters in the
body are shown as character references (such as the one for "ウ") instead of as
kanji or katakana.
2. On WinME, after you create a page and save it, sometimes most of the icons on
the Composition Toolbar are disabled.
3. After creating a page and bringing up the page source from [View], the
hankaku katakana are often shown as character references.
Comment 1•25 years ago (Reporter)
Change QA contact and add keywords.
Updated•25 years ago
Comment 2•25 years ago
This is a regression. The related bug is 49262. Changed the component to Internationalization and
assigned to nhotta@netscape.com.
Comment 3•25 years ago (Assignee)
I can reproduce this with the NS 6 release build (so it is probably not a regression).
Reassign to ftang.
Assignee: nhotta → ftang
Comment 4•25 years ago
I cannot reproduce this on my 2000100908 build. It was probably introduced after
that.
Comment 5•25 years ago
Can someone try Beta3? Both ylong and nhotta showed me this problem in the trunk
and N6RTM. Notice that "hyou" (you have to hit Return to convert) contains 5c in
the 2nd byte. It might be that some editor code introduced after Beta3 strips out
0x5c incorrectly.
Reassigning this to yokoyama to work on. Yokoyama, please talk to nhotta about this.
Assignee: ftang → yokoyama
Comment 6•25 years ago (Reporter)
It might not be caused by the 5c in the 2nd byte. When I showed this to ftang with
the 01-09-06 Win Mtrunk build, it was just a normal kanji followed by hankaku katakana.
Comment 7•25 years ago
This is interesting. If you use a plain text editor to look at the raw bytes, it shows
esc + "$B" + "I=ハ" + esc + "(B"
If I don't put the halfwidth katakana in there, I get
esc + "$B" + "I=" + esc + "(B"
Note that esc + "$B" is the escape sequence that shifts into JIS X 0208 in
ISO-2022-JP, esc + "(B" is the escape sequence that shifts out of it, and "I=" is
the 7-bit JIS X 0208 encoding of the character "hyou".
The correct result should be
esc + "$B" + "I=" + esc + "(B" + "ハ"
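For anyone checking the byte sequences quoted above, here is a minimal C++ sketch of how a JIS X 0208 run is framed by the two escape sequences. The helper names WrapJisX0208 and ToHex are hypothetical and are not Mozilla code; "I=" is the byte pair quoted in this comment.

```cpp
#include <string>

// Escape sequences quoted in the comment above.
const std::string kToJisX0208 = "\x1b$B";  // ESC $ B: shift into JIS X 0208
const std::string kToAscii = "\x1b(B";     // ESC ( B: shift back out to ASCII

// Hypothetical helper: frame already-converted 7-bit JIS X 0208 bytes
// so that the run starts and ends in ASCII mode.
std::string WrapJisX0208(const std::string& jisBytes) {
  return kToJisX0208 + jisBytes + kToAscii;
}

// Hypothetical helper: render bytes as space-separated hex for inspection.
std::string ToHex(const std::string& bytes) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  for (unsigned char c : bytes) {
    if (!out.empty()) out += ' ';
    out += kDigits[c >> 4];
    out += kDigits[c & 0x0f];
  }
  return out;
}
```

ToHex(WrapJisX0208("I=")) gives "1b 24 42 49 3d 1b 28 42", which matches the raw bytes seen when no halfwidth katakana follows the kanji.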
I guess the problem code is in
mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp
  /**
   * Initialize the Unicode encoder with our current mCharsetOverride.
   */
  NS_IMETHODIMP
  nsHTMLContentSinkStream::InitEncoders()
  {
    nsresult res;

    // Initialize an entity encoder if we're using the string interface:
    if (mString && (mFlags & nsIDocumentEncoder::OutputEncodeEntities))
      res = nsComponentManager::CreateInstance(kEntityConverterCID, NULL,
                                               NS_GET_IID(nsIEntityConverter),
                                               getter_AddRefs(mEntityConverter));

    // Initialize a charset encoder if we're using the stream interface
    if (mStream)
    {
      nsAutoString charsetName; charsetName.Assign(mCharsetOverride);
      NS_WITH_SERVICE(nsICharsetAlias, calias, kCharsetAliasCID, &res);
      if (NS_SUCCEEDED(res) && calias) {
        nsAutoString temp; temp.Assign(mCharsetOverride);
        res = calias->GetPreferred(temp, charsetName);
      }
      if (NS_FAILED(res))
      {
        // failed - unknown alias , fallback to ISO-8859-1
        charsetName.AssignWithConversion("ISO-8859-1");
      }

      res = nsComponentManager::CreateInstance(kSaveAsCharsetCID, NULL,
                                               NS_GET_IID(nsISaveAsCharset),
                                               getter_AddRefs(mCharsetEncoder));
      if (NS_FAILED(res))
        return res;
      // SaveAsCharset requires a const char* in its first argument:
      nsCAutoString charsetCString;
      charsetCString.AssignWithConversion(charsetName);
      // For ISO-8859-1 only, convert to entity first (always generate entites like &nbsp;).
      res = mCharsetEncoder->Init(charsetCString,
                                  charsetName.EqualsIgnoreCase("ISO-8859-1") ?
                                    nsISaveAsCharset::attr_htmlTextDefault :
                                    nsISaveAsCharset::attr_EntityAfterCharsetConv +
                                    nsISaveAsCharset::attr_FallbackDecimalNCR,
                                  nsIEntityConverter::html32);
I think the 0x5C is not a factor here. I can reproduce it with characters that do
not have a 5c in them.
I think someone should debug through intl/unicharutil/src/nsSaveAsCharset.cpp
nsSaveAsCharset::DoCharsetConversion(const PRUnichar *inString, char **outString)
nsSaveAsCharset::DoConversionFallBack
and see what happens there.
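The raw bytes quoted earlier in this comment are consistent with the unconvertible character being appended while the encoder was still in JIS X 0208 mode. The sketch below only illustrates that ordering concern; it is not the nsSaveAsCharset implementation. ToyIso2022JpEncoder, DecimalNcr, and EncodeWithFallback are hypothetical names, the decimal NCR fallback is assumed from the attr_FallbackDecimalNCR flag in the listing, and U+FF8A (halfwidth katakana HA) is just an example code point.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a stateful ISO-2022-JP encoder (not a Mozilla API).
struct ToyIso2022JpEncoder {
  bool inJisX0208 = false;

  // Append a two-byte JIS X 0208 code, shifting in first if needed.
  void EncodeJis(const std::string& twoBytes, std::string& out) {
    if (!inJisX0208) {
      out += "\x1b$B";
      inJisX0208 = true;
    }
    out += twoBytes;
  }

  // Shift back out to ASCII if a JIS X 0208 run is still open.
  void Finish(std::string& out) {
    if (inJisX0208) {
      out += "\x1b(B";
      inJisX0208 = false;
    }
  }
};

// Decimal numeric character reference for a code point the charset cannot hold.
std::string DecimalNcr(std::uint32_t codePoint) {
  return "&#" + std::to_string(codePoint) + ";";
}

// The fallback text is ASCII, so the open JIS X 0208 run is closed before it
// is appended. Skipping Finish() would leave the fallback inside the run.
std::string EncodeWithFallback(const std::string& jisBytes,
                               std::uint32_t unmappable) {
  ToyIso2022JpEncoder enc;
  std::string out;
  enc.EncodeJis(jisBytes, out);
  enc.Finish(out);
  out += DecimalNcr(unmappable);
  return out;
}
```

With these assumptions, EncodeWithFallback("I=", 0xFF8A) places the fallback text after the esc + "(B" sequence rather than before it.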
I am reassigning this back to nhotta since he knows that code better.
Setting the priority to P3 for now.
ji, please try sending it in HTML Mail and see whether it does the same thing there.
If so, we should change this to P2, since ISO-2022-JP is much more important in
Mail than in web pages.
Assignee: yokoyama → nhotta
Priority: -- → P3
Comment 9•25 years ago (Assignee)
Remove 'regression' keyword because it's reproducible with RTM.
Keywords: regression
Comment 10•25 years ago (Assignee)
Comment 12•25 years ago
Reassigning to anthonyd who owns the serializer code.
Assignee: jst → anthonyd
Comment 13•25 years ago
I'm not really sure why this is now on my plate, but oh well. If this is any sort
of priority, then someone from i18n should take it.
Setting to Future.
anthonyd
Comment 14•25 years ago (Assignee)
Reassign to nhotta.
Comment 15•25 years ago (Assignee)
Bug 59679 was fixed; I can no longer reproduce the problem using today's build.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED