Open Bug 335944 Opened 14 years ago Updated 9 years ago

Sort out how to use nsIUnicodeDecoder::Reset()

Categories

(Core :: Internationalization, defect)

defect
Not set

Tracking

()

People

(Reporter: emk, Assigned: smontagu)

Details

From bug 335816:
 ------- Comment #5 From David Baron  2006-04-28 15:46 PDT  [reply] -------

(From update of attachment 220128 [details] [diff] [review] [edit])
>+  mFirst = PR_TRUE;
>+

It seems like it would be better not to touch mFirst in Reset().  The current
semantics of Reset are that it resets all the per-character state.  I'd rather
leave it that way and set mFirst in the constructor and where you do now, and
leave the call to Reset.

(snip)

 ------- Comment #6 From David Baron  2006-04-28 15:49 PDT  [reply] -------

Also, I think Reset is used by the callers to handle encoding errors (advance
and continue decoding), and I don't think you want it to reset mFirst.

 ------- Comment #8 From Masatoshi Kimura (:emk)  2006-04-28 18:19 PDT  [reply] -------

(snip)

Hm... a comment in nsIDecoder.h is now out of date?
http://lxr.mozilla.org/seamonkey/source/intl/uconv/public/nsIUnicodeDecoder.h
| Resets the charset converter so it may be recycled for a completely 
| different and urelated buffer of data.

 ------- Comment #9 From David Baron  2006-04-28 18:34 PDT  [reply] -------

(snip)

Have you looked to see if there are callers of Reset doing what's originally
documented?  It's probably much less risky the way you wrote it, actually, so
we should probably leave the Reset behavior as you suggested.

 ------- Comment #10 From Masatoshi Kimura (:emk)  2006-04-28 18:38 PDT  [reply] -------

I didn't seek callers, but the UTF16 decoder also resets the state regarding
the BOM.
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/intl/uconv/ucvlatin/nsUCS2BEToUnicode.cpp&rev=1.19#125
| nsUTF16ToUnicodeBase::Reset()
| {
|   mState = 2;
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/intl/uconv/ucvlatin/nsUCS2BEToUnicode.cpp&rev=1.19#215
| nsUTF16ToUnicode::Convert(const char * aSrc, PRInt32 * aSrcLength,
|                           PRUnichar * aDest, PRInt32 * aDestLength)
| {
|   if(2 == mState) // first time called
QA Contact: amyy → i18n
Yeah... there are still problems here related to BOM state.  When we Reset() due to errors, we probably shouldn't reset BOM state.  However, maybe we can make the SetInputErrorBehavior code reliable throughout all decoders, and then stop doing the Reset() on error stuff.  I filed bug 638379 on that.
You need to log in before you can comment on or make changes to this bug.