7965 - [converter] need ISO-2022-CN converters

Frank Tang

Reporter

Updated

•

25 years ago

Status: NEW → ASSIGNED

Target Milestone: M11

Frank Tang

Reporter

Updated

•

25 years ago

Summary: need ISO-2022-CN converters → [converter] need ISO-2022-CN converters

Frank Tang

Reporter

Updated

•

25 years ago

Target Milestone: M11 → M10

Frank Tang

Reporter

Updated

•

25 years ago

Target Milestone: M10 → M14

Frank Tang

Reporter

Description

•

25 years ago

Change to M14 since it is Post Beta

Frank Tang

Reporter

Updated

•

25 years ago

Assignee: ftang → yueheng.xu

Status: ASSIGNED → NEW

Frank Tang

Reporter

Comment 1

•

25 years ago

You may want to split this bug into two: one for nsIUnicodeEncoder and one for
nsIUnicodeDecoder. Please mark the M build. Thanks.

yueheng.xu

Updated

•

25 years ago

Status: NEW → ASSIGNED

Frank Tang

Reporter

Comment 2

•

25 years ago

yueheng.xu: What is the status of the ISO-2022-CN converter?

Frank Tang

Reporter

Comment 3

•

25 years ago

Do we really need to support ISO-2022-CN ? See the following email exchanges-
=======================================================================
Subject: software and usage of ISO-2022-CN
Date: Fri, 01 Oct 1999 17:30:12 -0700
From: ftang@netscape.com (Yung-Fong Tang)
Organization: Netscape
To: zhf@net.tsinghua.edu.cn, zhf@net.edu.cn, hdy@tsinghua.edu.cn,
    tckao@iiidns.iii.org.tw, chung@iiidns.iii.org.tw, MRC@CAC.Washington.EDU
CC: mozilla-i18n@mozilla.org, taka@netscape.com, momoi@netscape.com,
    erik@netscape.com




Dear RFC 1922 authors:

I think I asked this question a year ago, and I don't think any
author has replied yet. Let me ask again:
1. Is there any software that currently supports ISO-2022-CN? Can you list
it? (Mark C., is it true that your IMAP 4 server supports ISO-2022-CN as a
search charset?)

2. Is there any known usage of ISO-2022-CN? Private email exchanges?
Mailing lists that use ISO-2022-CN? Newsgroups that use ISO-2022-CN?

3. Can ISO-2022-CN be replaced by GB2312 or BIG5 with Base64 in MIME? Why
not?

The same questions apply to ISO-2022-CN-EXT.

Thanks.
==========================================================================
Subject: re: software and usage of ISO-2022-CN
Date: Fri, 1 Oct 1999 17:31:25 -0700 (PDT)
From: Mark Crispin <MRC@CAC.Washington.EDU>
To: ftang@netscape.com
CC: zhf@net.tsinghua.edu.cn, zhf@net.edu.cn, hdy@tsinghua.edu.cn,
    tckao@iiidns.iii.org.tw, chung@iiidns.iii.org.tw, mozilla-i18n@mozilla.org,
    taka@netscape.com, momoi@netscape.com, erik@netscape.com




I've seen email messages in ISO-2022-CN, although typically it's been GB2312
encoded in ISO 2022 (no CNS 11643).

My IMAP server supports ISO-2022-CN, as well as GB2312/CN-GB, CN-GB12345,
BIG5/CN-BIG5.  Actually, I treat all ISO-2022-xx charsets the same; I have a
general ISO 2022 interpreter that will interpret just about any ISO 2022 text
you throw at it.

I support CNS 11643 planes 1 and 2, not 3-7 or 14.

I'm still trying to work out how CNS planes 3-7 relate to CNS plane 14 and
Unicode.  As far as I can tell, plane 14 is now CNS plane 3 (but are the
codepoints the same?  e.g., is CNS 0x32121 Unicode 0x4e28?); plane 4 is just a
repeat of the Han characters from Unicode 2.0 (albeit in different locations);
and I don't know anything about how planes 5-7 relate to Unicode.

==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Sat, 02 Oct 1999 11:55:23 +0800
From: 高天助 <tckao@iii.org.tw>
To: Mark Crispin <MRC@CAC.Washington.EDU>
CC: ftang@netscape.com, zhf@net.tsinghua.edu.cn, zhf@net.edu.cn,
    hdy@tsinghua.edu.cn, tckao@iiidns.iii.org.tw, chung@iiidns.iii.org.tw,
    mozilla-i18n@mozilla.org, taka@netscape.com, momoi@netscape.com,
    erik@netscape.com, Emily Hsu <emilyhsu@iii.org.tw>,
    "何建民 Ho" <hoho@iis.sinica.edu.tw>
References: 1




The first 6148 characters of CNS plane 14 have been moved to plane 3 and keep
the same code points. The last 171 characters of plane 14 have been distributed
into plane 4 according to stroke and radical order.
A few characters of planes 5-7 and 15 (used in personal names) have been encoded
in Unicode 2.0. We are now working to encode the other characters of planes 4-7
and 15 into plane 2 of ISO 10646, and these characters will be encoded with
UTF-16 in Unicode in the future.
Some experts reviewed RFC 1922 four months ago in Taiwan and decided that
Taiwan should not endorse RFC 1922, based on the following rationale:
        1. Only one character set name, "big-5", is registered with ICANN, but
RFC 1922 uses the name CN-Big5, which will cause more confusion for vendors and
users.
        2. Taiwan will adopt Unicode (ISO 10646) in the future, and most
software already supports or is going to support Unicode, so these experts
suggested that Taiwan should not promote any other encoding method that would
confuse users.

        Based on the above reasons, we are considering withdrawing our
endorsement from RFC 1922.
        If you have any questions about the relation between CNS and Unicode, my
colleague Emily Hsu will help make it clearer.
T.C.Kao
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Fri, 1 Oct 1999 21:13:49 -0700 (PDT)
From: Mark Crispin <MRC@CAC.Washington.EDU>
To: 高天助 <tckao@iii.org.tw>
CC: ftang@netscape.com, zhf@net.tsinghua.edu.cn, hdy@tsinghua.edu.cn,
    tckao@iiidns.iii.org.tw, chung@iiidns.iii.org.tw, mozilla-i18n@mozilla.org,
    taka@netscape.com, momoi@netscape.com, erik@netscape.com,
    Emily Hsu <emilyhsu@iii.org.tw>, 何建民 Ho <hoho@iis.sinica.edu.tw>




On Sat, 02 Oct 1999 11:55:23 +0800, =?big5?B?sKqk0adV?= wrote:
> The previous 6148 characters of CNS plane 14 has been moved to plane 3 and
> has the same code point . the last 171 charcters of plane 14 has been
> distributed into plane 4 according to the stroke and radical sequence.

Hmm.  I only have information about 4197 characters in plane 14; codepoints
0xE2121 through 0xE672A.  However, I see that some CNS plane 14 codepoints
aren't listed in my source data (the Unicode mappings file for CNS) which
suggests that these are CNS codepoints which do not have a corresponding
Unicode codepoint.  The first of these missing codepoints is 0xE2138.

So, which codepoint range went to plane 3 and which went to plane 4?

I support CNS planes 1 and 2 in my software, and I'd like to support the other
planes, but I need information about how to do this.  At this point, I'm
primarily interested in those characters which are in the BMP.

> Some expert reviewed the RFC 1922   4 months ago in Taiwan, and made the
> following decision that is Taiwan should not endorsed RFC 1922 based on the
> following rationale [...]

In my opinion, we should probably retire the names CN-BIG5 and CN-GB
immediately, in favor of GB2312 and BIG5.

I think that there is still a limited future for ISO-2022-CN as a transitional
mechanism, since we don't yet have full deployment of Unicode or 8-bit clean
links.  I have seen ISO-2022-CN used in email.

Nevertheless, I agree that we should be concentrating our efforts on Unicode.

> Base on the above reason, we are considing to withdraw our endorsement
> form RFC 1922.

I recommend the following compromise position:
 1) withdraw endorsement of CN-BIG5 and CN-GB, on the grounds that these never
    became widely deployed and other names supersede them.
 2) since ISO-2022-CN-EXT and GB-12345, etc. never became widely deployed, no
    efforts should be expended on either of these.
 3) recommend that ISO-2022-CN only be used in transitional applications, and
    that most efforts should be made for Unicode based applications.
 4) the primary focus of activity is Unicode, and the fate of RFC 1922 should
    become "historical" (as opposed to "withdrawn").

What do you think of this?

-- Mark --
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Tue, 05 Oct 1999 19:30:42 +0800
From: 高天助 <tckao@iii.org.tw>
To: ftang@netscape.com
CC: Mark Crispin <MRC@CAC.Washington.EDU>, zhf@net.tsinghua.edu.cn,
    zhf@net.edu.cn, hdy@tsinghua.edu.cn, tckao@iiidns.iii.org.tw,
    chung@iiidns.iii.org.tw, mozilla-i18n@mozilla.org, taka@netscape.com,
    momoi@netscape.com, erik@netscape.com, Emily Hsu <emilyhsu@iii.org.tw>,
    "何建民 Ho" <hoho@iis.sinica.edu.tw>
References: 1 , 2 , 3

Yung-Fong Tang wrote:
>
> How about Big5-Plus now? Will you promote it? Or is it yet another dead spec?
>

    Big5-Plus was initiated by the Research, Development, and Evaluation
Commission of the Executive Yuan. As I understand it, it will be a dead spec; no
organization will promote it.
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Tue, 5 Oct 1999 09:02:15 -0700 (PDT)
From: Mark Crispin <MRC@CAC.Washington.EDU>
To: 高天助 <tckao@iii.org.tw>
CC: ftang@netscape.com, zhf@net.tsinghua.edu.cn, hdy@tsinghua.edu.cn,
    tckao@iiidns.iii.org.tw, chung@iiidns.iii.org.tw, mozilla-i18n@mozilla.org,
    taka@netscape.com, momoi@netscape.com, erik@netscape.com,
    Emily Hsu <emilyhsu@iii.org.tw>, 何建民 Ho <hoho@iis.sinica.edu.tw>




On Tue, 05 Oct 1999 19:43:11 +0800, =?big5?B?sKqk0adV?= wrote:
> O.K., I think this statements will be more reality, can you suggest what
> should we do next step.

In my opinion, nothing needs to be done.  It has already happened.  Since RFC
1922 is Informational, there's no need to take any specific document action.
It would be too much work to issue an update to RFC 1922 or make official
statements.

Informally, if someone asks, "what should I do about RFC 1922", tell them the
following recommendations on an informal basis:
 1) RFC 1922 is becoming an historical document due to the Internet-wide
    transition to Unicode.
 2) implement GB2312 EUC and BIG5, since that's what most people use.
 3) from RFC 1922, just implement ISO-2022-CN, at least to encode GB2312 in
    ISO 2022.  It is also a good idea to implement HZ.
 4) don't use the CN-xxx names from RFC 1922, but it is a good idea to
    recognize them as aliases.
 5) don't worry about anything else in RFC 1922.
 6) implement UTF-8 now, since this will become the preferred standard for
    non-English email.
 7) if you implement your application to use Unicode internally, it will be
    much easier to follow these recommendations.

A similar situation exists with Korean.  EUC-KR is preferred in Korea, but
sometimes you still see ISO-2022-KR.

This situation -- a gradual change to obsolescence -- happens to RFCs all the
time.  It's not a big problem.  Fortunately, since the transition to UTF-8 is
happening across the entire Internet, there's no special "Chinese problem"
caused by this either.  It's an "everybody's problem".

If someone implements a multi-lingual email application, it is not difficult
to include ISO-2022-CN support, since you need the ISO 2022 capability for
Japanese.
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Wed, 06 Oct 1999 04:49:49 -0700
From: ftang@netscape.com (Yung-Fong Tang)
Organization: Netscape
To: Mark Crispin <MRC@CAC.Washington.EDU>, ftang@netscape.com,
    chung@iiidns.iii.org.tw, erik@netscape.com, zhf@net.tsinghua.edu.cn,
    Emily Hsu <emilyhsu@iii.org.tw>, hdy@tsinghua.edu.cn, taka@netscape.com,
    tckao@iiidns.iii.org.tw, momoi@netscape.com, hoho@iis.sinica.edu.tw,
    lunde@adobe.com
Newsgroups: netscape.public.mozilla.i18n
References: 1 , 2




Add Ken Lunde to the cc. I will forward him all our discussion up to this
message.

Mark Crispin wrote:

> On Tue, 05 Oct 1999 19:43:11 +0800, =?big5?B?sKqk0adV?= wrote:
> > O.K., I think this statements will be more reality, can you suggest what
> > should we do next step.
>
> In my opinion, nothing needs to be done.  It has already happened.  Since RFC
> 1922 is Informational, there's no need to take any specific document action.
> It would be too much work to issue an update to RFC 1922 or make official
> statements.
>
> Informally, if someone asks, "what should I do about RFC 1922", tell them the
> following recommendations on an informal basis:
>  1) RFC 1922 is becoming an historical document due to the Internet-wide
>     transition to Unicode.

Agree

>
>  2) implement GB2312 EUC and BIG5, since that's what most people use.

Agree

>
>  3) from RFC 1922, just implement ISO-2022-CN, at least to encode GB2312 in
>     ISO 2022.  It is also a good idea to implement HZ.

I wonder whether we should even recommend that people implement the GB-encoded
part of ISO-2022-CN. I have not heard anyone from the PRC join this discussion
yet. Is ISO-2022-CN really used in the PRC? How popular is it? Or could we even
ignore that part? (These are questions, not statements.)

I totally agree that it is a good idea to implement HZ. I have seen a lot of HZ
data on web sites and newsgroups. I have never seen any real, live ISO-2022-CN
data, not even the GB2312-encoded kind.

>
>  4) don't use the CN-xxx names from RFC 1922, but it is a good idea to
>     recognize them as aliases.

Yes, we (Netscape) have already had them as aliases in 4.x for 1 to 3 years.

>
>  5) don't worry about anything else in RFC 1922.
>  6) implement UTF-8 now, since this will become the preferred standard for
>     non-English email.
>  7) if you implement your application to use Unicode internally, it will be
>     much easier to follow these recommendations.
>
> A similar situation exists with Korean.  EUC-KR is preferred in Korea, but
> sometimes you still see ISO-2022-KR.

I think the situation is a little different for ISO-2022-KR. There are
popular Sendmail modifications that send ISO-2022-KR.

>
>
> This situation -- a gradual change to obsolescence -- happens to RFCs all the
> time.  It's not a big problem.  Fortunately, since the transition to UTF-8 is
> happening across the entire Internet, there's no special "Chinese problem"
> caused by this either.  It's an "everybody's problem".
>
> If someone implements a multi-lingual email application, it is not difficult
> to include ISO-2022-CN support, since you need the ISO 2022 capability for
> Japanese.

Strongly disagree. When we convert Unicode to ISO-2022-JP or ISO-2022-KR, it is
clear which byte combination we should convert to. But for ISO-2022-CN, the code
first needs to decide whether to encode with the GB sequence or the CNS sequence.
This is the most difficult part, i.e. "undoing CJK Unification".
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Wed, 6 Oct 1999 04:52:07 -0700 (PDT)
From: Mark Crispin <MRC@CAC.Washington.EDU>
To: ftang@netscape.com
CC: chung@iiidns.iii.org.tw, erik@netscape.com, zhf@net.tsinghua.edu.cn,
    Emily Hsu <emilyhsu@iii.org.tw>, hdy@tsinghua.edu.cn, taka@netscape.com,
    tckao@iiidns.iii.org.tw, momoi@netscape.com, hoho@iis.sinica.edu.tw,
    lunde@adobe.com




Both Annie and I have received messages encoded in ISO-2022-CN.  Some people
are using it.

> > If someone implements a multi-lingual email application, it is not
> > difficult to include ISO-2022-CN support, since you need the ISO 2022
> > capability for Japanese.
>
> Strongly disagree. When we convert Unicode to ISO-2022-JP or ISO-2022-KR, it
> is clear which byte combination we should convert to. But for ISO-2022-CN,
> the code need to first decide encode in GB seq or CNS seq. This is the most
> difficult part- e.g. "undo CJK Unification".

That's only if you support sending mail in ISO-2022-CN.  You don't have to
worry about it if you only support reading ISO-2022-CN, and send mail with a
different charset.

In other words, treat it like incoming Shift-JIS; recognize it and display it
properly, but don't generate it.

As a practical matter, I would implement sending ISO-2022-CN by sending GB
sequences unless there is no suitable GB codepoint, in which case I use the
CNS sequence.  There's too much shifting in ISO 2022 for it to be suitable for
100% CNS text.
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Fri, 08 Oct 1999 09:39:30 -0700
From: ftang@netscape.com (Yung-Fong Tang)
Organization: Netscape/AOL
To: Mark Crispin <MRC@CAC.Washington.EDU>
CC: chung@iiidns.iii.org.tw, erik@netscape.com, zhf@net.tsinghua.edu.cn,
    Emily Hsu <emilyhsu@iii.org.tw>, hdy@tsinghua.edu.cn, taka@netscape.com,
    tckao@iiidns.iii.org.tw, momoi@netscape.com, hoho@iis.sinica.edu.tw,
    lunde@adobe.com
References: 1





Mark Crispin wrote:

> Both Annie and I have received messages encoded in ISO-2022-CN.  Some people
> are using it.
>

Back to my original question: which software generates them?

>
> > > If someone implements a multi-lingual email application, it is not
> > > difficult to include ISO-2022-CN support, since you need the ISO 2022
> > > capability for Japanese.
> >
> > Strongly disagree. When we convert Unicode to ISO-2022-JP or ISO-2022-KR, it
> > is clear which byte combination we should convert to. But for ISO-2022-CN,
> > the code need to first decide encode in GB seq or CNS seq. This is the most
> > difficult part- e.g. "undo CJK Unification".
>
> That's only if you support sending mail in ISO-2022-CN.  You don't have to
> worry about it if you only support reading ISO-2022-CN, and send mail with a
> different charset.

Sure, but I thought we SHOULD generate them if we want to support RFC 1922, right?

>
>
> In other words, treat it like incoming Shift-JIS; recognize it and display it
> properly, but don't generate it.
>
> As a practical matter, I would implement sending ISO-2022-CN by sending GB
> sequences unless there is no suitable GB codepoint, in which case I use the
> CNS sequence.  There's too much shifting in ISO 2022 for it to be suitable for
> 100% CNS text.

Seems reasonable after hearing the opinions from III, Taiwan, ROC
==========================================================================
Subject: Re: software and usage of ISO-2022-CN
Date: Fri, 8 Oct 1999 11:58:52 -0700 (PDT)
From: Mark Crispin <MRC@CAC.Washington.EDU>
To: Yung-Fong Tang <ftang@netscape.com>
CC: chung@iiidns.iii.org.tw, erik@netscape.com, zhf@net.tsinghua.edu.cn,
    Emily Hsu <emilyhsu@iii.org.tw>, hdy@tsinghua.edu.cn, taka@netscape.com,
    tckao@iiidns.iii.org.tw, momoi@netscape.com, hoho@iis.sinica.edu.tw,
    lunde@adobe.com




On Fri, 08 Oct 1999 09:39:30 -0700, Yung-Fong Tang wrote:
> Mark Crispin wrote:
> > Both Annie and I have received messages encoded in ISO-2022-CN.  Some
> > people are using it.
> Back to my origional question. Which software generate them ?

I don't know, or I would have told you.  As a guess, it may be a Chinese
version of Eudora.

We've configured Pine to send ISO-2022-CN mail.  But that isn't a matter of
"Pine generates ISO-2022-CN mail"; it's a matter of "Pine can be configured to
generate ISO-2022-CN mail via external filters."  That is, the Pine user
enters and edits GB2312 text in Pine's editor; when the message is sent Pine
follows a configuration file filter rule to pass it through ncf to convert it
to ISO-2022-CN, and sets the outgoing charset=ISO-2022-CN.

When sending 8-bit GB2312 mail, Pine is very likely to convert the whole thing
into BASE64.
======================================================================
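
A note on the sending strategy discussed in the exchange above: Mark Crispin suggests emitting GB sequences unless there is no suitable GB codepoint, and only then using the CNS sequence. The following is a minimal sketch of that preference order, not code from any patch on this bug. All names (`CharsetChoice`, `EncodedChar`, `MappingTable`, `ChooseEncoding`) are hypothetical, the tables stand in for real Unicode mapping data, and the split into CNS planes 1 and 2 is an assumption based on what RFC 1922 allows in ISO-2022-CN.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Character sets an ISO-2022-CN encoder can switch between (hypothetical names).
enum class CharsetChoice { ASCII, GB2312, CNS_Plane1, CNS_Plane2 };

struct EncodedChar {
  CharsetChoice charset;
  std::uint16_t code;  // ASCII byte, or two 7-bit bytes packed as 0xHHLL
};

// Stand-in for a real Unicode-to-charset mapping table.
using MappingTable = std::unordered_map<char32_t, std::uint16_t>;

// Looks up ch in one table; fills *out and returns true on a hit.
static bool TryTable(char32_t ch, const MappingTable& table,
                     CharsetChoice charset, EncodedChar* out) {
  MappingTable::const_iterator it = table.find(ch);
  if (it == table.end())
    return false;
  out->charset = charset;
  out->code = it->second;
  return true;
}

// Prefers GB2312, then falls back to CNS 11643 plane 1, then plane 2.
// Returns false when no table covers the character.
bool ChooseEncoding(char32_t ch, const MappingTable& gb2312,
                    const MappingTable& cnsPlane1,
                    const MappingTable& cnsPlane2, EncodedChar* out) {
  assert(out != nullptr);
  if (ch < 0x80) {
    out->charset = CharsetChoice::ASCII;
    out->code = static_cast<std::uint16_t>(ch);
    return true;
  }
  return TryTable(ch, gb2312, CharsetChoice::GB2312, out) ||
         TryTable(ch, cnsPlane1, CharsetChoice::CNS_Plane1, out) ||
         TryTable(ch, cnsPlane2, CharsetChoice::CNS_Plane2, out);
}
```

A real encoder would also have to emit the designation escape sequences and shift codes whenever the chosen charset changes, which is where the "too much shifting" cost mentioned above comes from.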

Teruko Kobayashi

Updated

•

25 years ago

Keywords: beta1

bobj

Comment 4

•

25 years ago

Cleared the beta1 keyword because we will not hold Beta1 for this, but
leaving targeted for M14.  If yueheng.xu@intel.com can get this completed,
reviewed and checked-in by 2/15, we'll take it for Beta1.

Keywords: beta1

bobj

Updated

•

25 years ago

Whiteboard: patch from mozilla submitted for review

Frank Tang

Reporter

Comment 5

•

25 years ago

Removed "patch from mozilla submitted for review". The patch they sent is not
for this bug but for an unreported GBK problem.

Whiteboard: patch from mozilla submitted for review

Frank Tang

Reporter

Comment 6

•

25 years ago

This is not M14. yueheng.xu@intel.com didn't make it, and we don't need this for
beta1. Moving to M16.

Target Milestone: M14 → M16

msanz

Updated

•

24 years ago

QA Contact: teruko → ftang

yueheng.xu

Comment 7

•

24 years ago

I will do it later, since nobody has content in that encoding yet.

Target Milestone: M16 → M20

Frank Tang

Reporter

Comment 8

•

24 years ago

Reassigning this back to ftang since yueheng.xu@intel.com is no longer working on
the mozilla project. Marking it as Future since the authors of ISO-2022-CN think
it is no longer important to support ISO-2022-CN.

Assignee: yueheng.xu → ftang

Status: ASSIGNED → NEW

Target Milestone: M20 → Future

Frank Tang

Reporter

Comment 9

•

24 years ago

Marking it as assigned.

Status: NEW → ASSIGNED

Frank Tang

Reporter

Comment 10

•

23 years ago

watanabe@komadori.planet.sci.kobe-u.ac.jp:
want to add ISO-2022-CN for us?

Ervin Yan

Comment 11

•

22 years ago

ftang: I think we need to add ISO-2022-CN support to Mozilla. We cannot read
mail from Solaris Chinese dtmail that is encoded in ISO-2022-CN or
ISO-2022-CN-EXT.

I will add a patch to support the decoder for these encodings in the next few days.

My patch works on my machine now; I need to test it for several more days.

Ervin Yan

Comment 12

•

22 years ago

Attached patch patch for nsISO2022CNToUnicode.cpp (obsolete) — Details — Splinter Review

Ervin Yan

Comment 13

•

22 years ago

Attached patch patch for nsISO2022CNToUnicode.h (obsolete) — Details — Splinter Review

Ervin Yan

Comment 14

•

22 years ago

Attached patch patch for nsUCvCnModule.cpp (obsolete) — Details — Splinter Review

Ervin Yan

Comment 15

•

22 years ago

Add Brian Yuan and Masaki to CC

Ervin Yan

Comment 16

•

22 years ago

ftang: Can you review the patches? Thanx.

Ervin Yan

Comment 17

•

22 years ago

*** Bug 159863 has been marked as a duplicate of this bug. ***

Ervin Yan

Comment 18

•

22 years ago

*** Bug 159863 has been marked as a duplicate of this bug. ***

Jerry

Updated

•

22 years ago

Keywords: review

Frank Tang

Reporter

Comment 19

•

22 years ago

Comment on attachment 93393 [details] [diff] [review]
patch for nsISO2022CNToUnicode.cpp

r=ftang

Attachment #93393 - Flags: review+

Frank Tang

Reporter

Comment 20

•

22 years ago

Comment on attachment 93394 [details] [diff] [review]
patch for nsISO2022CNToUnicode.h

r=ftang. Change mState_ASCII to eState_ASCII (same as the others): m is for member
variables, e is for enums. Same issue in the .cpp file.

Attachment #93394 - Flags: review+

Frank Tang

Reporter

Comment 21

•

22 years ago

Comment on attachment 93395 [details] [diff] [review]
patch for nsUCvCnModule.cpp

r=ftang

Attachment #93395 - Flags: review+

Ervin Yan

Comment 22

•

22 years ago

Attached patch Patch for iso-2022-cn decoder (obsolete) — Details — Splinter Review

Removed the old patches.

Added a new patch that follows ftang's comments.

Attachment #93393 - Attachment is obsolete: true

Attachment #93394 - Attachment is obsolete: true

Attachment #93395 - Attachment is obsolete: true

Ervin Yan

Comment 23

•

22 years ago

Attached patch patchs to follow Alec's new converter structure (obsolete) — Details — Splinter Review

Add updated patches to follow Alec's new converter structure

Attachment #94284 - Attachment is obsolete: true

Masaki Katakai

Assignee

Comment 24

•

22 years ago

Great work, Ervin!!

I've verified ISO-2022-CN email can be browsed now.

Frank Tang

Reporter

Comment 25

•

22 years ago

Comment on attachment 95224 [details] [diff] [review]
patchs to follow Alec's new converter structure

r=ftang

Attachment #95224 - Flags: review+

Alec Flett

Comment 26

•

22 years ago

Comment on attachment 95224 [details] [diff] [review]
patchs to follow Alec's new converter structure

>Index: ucvcn/nsISO2022CNToUnicode.h
>===================================================================

>+public:
>+  nsISO2022CNToUnicode()
>+  {
>+    mState = eState_ASCII;
>+    mPlaneID = 0;

initialize these using the C++ preferred method:
nsISO2022CNToUnicode() : mState(eState_ASCII), mPlaneID(0) {}

this lets the compiler initialize the members directly, rather than assigning
to them in the constructor body

>+
>+  // Decoder handle
>+  nsIUnicodeDecoder *mGB2312_Decoder;
>+  nsIUnicodeDecoder *mEUCTW_Decoder;

you should use nsCOMPtr's for the mGB2312_Decoder and mEUCTW_Decoder
members...that way you won't have to release them in the destructor, and you
won't leak on failure.
>+
>+    if(!mGB2312_Decoder) {// failed creating a delegate converter
>+       return NS_ERROR_UNEXPECTED;
>+    } else {
>+       PRInt32  srcLen = aSrcLength;
>+       PRInt32  destLen = *aDestLength;

else after a return makes no sense...

instead:

>    if(!mGB2312_Decoder) // failed creating a delegate converter
>       return NS_ERROR_UNEXPECTED;
>
>    PRInt32  srcLen = aSrcLength;
>    PRInt32  destLen = *aDestLength;

and so forth

>+       if(res != NS_OK) {
>+          return NS_ERROR_UNEXPECTED;
>+       }

don't compare directly against NS_OK - use if (NS_FAILED(res))

the decoder itself looks fine to me though.
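
For readers without the patch at hand, the review points above (member initializer list, an owning smart pointer for the delegate, early return without `else`, and a failure-check helper instead of comparing against the success code) can be sketched in standard C++. This is only an illustration under stated assumptions: `Result`, `Failed`, `Decoder`, `AsciiDecoder` and `ISO2022CNSketch` are hypothetical stand-ins, with `std::unique_ptr` playing the role of `nsCOMPtr` and `Failed` the role of `NS_FAILED`. It is not the code that was checked in.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for nsresult and NS_FAILED.
enum Result { kOk, kErrorUnexpected };
inline bool Failed(Result r) { return r != kOk; }

// Hypothetical stand-in for the nsIUnicodeDecoder delegate interface.
class Decoder {
 public:
  virtual ~Decoder() {}
  virtual Result Convert(const char* src, int srcLen,
                         char16_t* dest, int* destLen) = 0;
};

// Trivial delegate used only to exercise the sketch: widens 7-bit bytes.
class AsciiDecoder : public Decoder {
 public:
  Result Convert(const char* src, int srcLen,
                 char16_t* dest, int* destLen) override {
    int n = srcLen < *destLen ? srcLen : *destLen;
    for (int i = 0; i < n; ++i) {
      if (static_cast<unsigned char>(src[i]) >= 0x80)
        return kErrorUnexpected;
      dest[i] = static_cast<char16_t>(src[i]);
    }
    *destLen = n;
    return kOk;
  }
};

class ISO2022CNSketch {
 public:
  // Members are set in the initializer list, not assigned in the body.
  ISO2022CNSketch() : mState(eState_ASCII), mPlaneID(0) {}

  void SetGBDecoder(std::unique_ptr<Decoder> decoder) {
    mGBDecoder = std::move(decoder);
  }

  Result ConvertGB(const char* src, int srcLen, char16_t* dest, int* destLen) {
    assert(dest != nullptr && destLen != nullptr);
    if (!mGBDecoder)  // no delegate converter: return early, no else branch
      return kErrorUnexpected;

    Result res = mGBDecoder->Convert(src, srcLen, dest, destLen);
    if (Failed(res))  // test for failure rather than comparing against kOk
      return kErrorUnexpected;
    return kOk;
  }

 private:
  enum State { eState_ASCII, eState_GB2312 };
  State mState;
  int mPlaneID;
  std::unique_ptr<Decoder> mGBDecoder;  // released automatically, no destructor needed
};
```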

Attachment #95224 - Flags: needs-work+

Ervin Yan

Comment 27

•

22 years ago

Attached patch New patches to follow Alec's instruction (obsolete) — Details — Splinter Review

Ervin Yan

Updated

•

22 years ago

Attachment #95224 - Attachment is obsolete: true

Ervin Yan

Comment 28

•

22 years ago

Attached patch Correct some typo errors in comment lines of previous patch (obsolete) — Details — Splinter Review

Sorry, there are some typo errors in comment lines in previous patch.

Attachment #95362 - Attachment is obsolete: true

Alec Flett

Comment 29

•

22 years ago

Comment on attachment 95364 [details] [diff] [review]
Correct some typo errors in comment lines of previous patch

sorry, a few other minor things:

first, this:
nsString tmpCharset;

this is a stack-based string, so you should be using nsAutoString. see
http://www.mozilla.org/projects/xpcom/string-guide.html#Concrete_Classes and 
the section on "Stack based strings" in
http://www.mozilla.org/projects/xpcom/string-quickref.html

second, I see this loop:
+    for (PRInt32 i=0; i<destLen; i++) {
+	aDest[i] = dest[i];
+    }

you should use memcpy() here instead of copying the characters one by one - it
will be significantly faster. In fact, I wonder why you don't just forward the
whole call to mEUCTW_Decoder->Convert() instead of creating temporary
variables, and then copying the values over... won't they also return
NS_OK_UDEC_MOREOUTPUT/etc? (same goes for the other converter)

I'm not marking needs-work because I'd like to hear why we need the temporary
variables. If there is a good reason, then I'll mark this patch sr=alecf as
long as you fix the nsAutoString stuff.
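
As an illustration of the memcpy() suggestion above, here is a minimal sketch in standard C++. `CopyUnits` is a hypothetical helper, and `char16_t` stands in for the PRUnichar buffers in the patch. It assumes the source and destination buffers do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies count UTF-16 code units in one call instead of an element-by-element loop.
// The two buffers must not overlap.
inline void CopyUnits(char16_t* aDest, const char16_t* aSrc, int count) {
  assert(count >= 0);
  std::memcpy(aDest, aSrc, static_cast<std::size_t>(count) * sizeof(char16_t));
}
```

Forwarding the whole call to the delegate converter, as the review also suggests, would avoid both the temporary buffer and the copy.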

Ervin Yan

Comment 30

•

22 years ago

Attached patch patch to follow Alec's instruction — Details — Splinter Review

r=ftang

Thanks to Alec for the review. I hope this patch has no problems. :-)

Attachment #95364 - Attachment is obsolete: true

Alec Flett

Comment 31

•

22 years ago

Comment on attachment 95521 [details] [diff] [review]
patch to follow Alec's instruction

excellent! sr=alecf (and r=ftang)

Attachment #95521 - Flags: superreview+

Attachment #95521 - Flags: review+

Masaki Katakai

Assignee

Comment 32

•

22 years ago

assigned to myself

Assignee: ftang → katakai

Status: ASSIGNED → NEW

Masaki Katakai

Assignee

Comment 33

•

22 years ago

patch checked into trunk. Mark this as fixed.

Also added branchOEM keyword for OEM branch.

Ervin, please verify the fix in trunk (I will also verify it) and
work on a patch for the OEM branch. I understand that
your patch attachment 94284 [details] [diff] [review] for the old structure
needs to be updated according to alecf's comments.

Good work, Ervin. Thank you for fast-reviewing, Frank and Alec!!

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

Whiteboard: branchOEM

Bradley Baetz (:bbaetz)

Comment 34

•

22 years ago

This busted HPUX:

Error 129: "./../ucvcn/nsISO2022CNToUnicode.h", line 45 # Redefinition of macro
'PMASK' differs from previous definition at ["/usr/include/sys/param.h", line 140].
    #define PMASK       0xa0

I added a |#undef PMASK| just before that line, which should fix it.
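
A self-contained sketch of this kind of fix follows. The first `#define` only simulates a system header that already defines PMASK, and its value is made up. The `#ifdef` guard is optional, since `#undef` on an undefined macro is harmless.

```cpp
// Simulates a platform header (like the sys/param.h named in the error)
// that already defines PMASK. The value here is made up.
#define PMASK 0377

// The fix: drop any earlier definition before defining the converter's own.
#ifdef PMASK
#undef PMASK
#endif
#define PMASK 0xa0

static_assert(PMASK == 0xa0, "PMASK should hold the converter's value");
```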

Masaki Katakai

Assignee

Comment 35

•

22 years ago

Attached patch patch for OEM branch — Details — Splinter Review

Jim Dunn

Updated

•

22 years ago

Whiteboard: branchOEM → branchOEM+

Masaki Katakai

Assignee

Comment 36

•

22 years ago

checked in to OEM branch.
Great work, Ervin!! and Thanks for review, Frank and Alec.

Whiteboard: branchOEM+ → branchOEM+,fixedOEM

timeless

Updated

•

22 years ago

Keywords: review

Whiteboard: branchOEM+,fixedOEM → fixedOEM

timeless

Updated

•

22 years ago

Keywords: fixedOEM

Whiteboard: fixedOEM

Roland Mainz

Comment 37

•

22 years ago

(Dumb ?!) question:
This patch contains a _decoder_ - where is the _encoder_ part for this encoding?

patch for nsISO2022CNToUnicode.cpp (Ervin Yan, 22 years ago, 17.79 KB, patch; ftang: review+)
patch for nsISO2022CNToUnicode.h (Ervin Yan, 22 years ago, 3.82 KB, patch; ftang: review+)
patch for nsUCvCnModule.cpp (Ervin Yan, 22 years ago, 1.65 KB, patch; ftang: review+)
Patch for iso-2022-cn decoder (Ervin Yan, 22 years ago, 23.25 KB, patch)
patchs to follow Alec's new converter structure (Ervin Yan, 22 years ago, 23.22 KB, patch; ftang: review+)
New patches to follow Alec's instruction (Ervin Yan, 22 years ago, 23.14 KB, patch)
Correct some typo errors in comment lines of previous patch (Ervin Yan, 22 years ago, 23.10 KB, patch)
patch to follow Alec's instruction (Ervin Yan, 22 years ago, 22.25 KB, patch; alecf: review+, superreview+)
patch for OEM branch (Masaki Katakai, 22 years ago, 22.30 KB, patch)