53133 - Need conversion of T61String data from ISO8859 to UTF8

Reporter

Description

•

25 years ago

RFC2459 suggests that applications convert data encoded using T61String from ISO8859-1 (Latin 1) encoding to UTF8. This is suggested because some applications (IE?) used T61String as a place to put 8-bit Latin character sets. This suggestion isn't strictly correct since T61 actually uses escape sequences to represent a variety of different character values. However, the current implementation of T61String in NSS doesn't convert those either. We should: 1. Allow T61String to be treated as ISO8859-1 (optionally?) 2. Implement true T61 handling Of these, the first is more important, and if it is the only choice implemented, means that applications don't need to make a decision.

Wan-Teh Chang

Comment 1

•

25 years ago

Terry, could you set the Target Milestone for this bug? Thanks.

Terry Hayes

Reporter

Updated

•

25 years ago

Target Milestone: --- → 3.3

Wan-Teh Chang

Comment 2

•

25 years ago

Reassigned to Terry and marked as an "enhancement".

Assignee: wtc → thayes

Severity: normal → enhancement

Wan-Teh Chang

Updated

•

24 years ago

Target Milestone: 3.3 → 3.4

Wan-Teh Chang

Comment 3

•

24 years ago

Set target milestone 4.0.

Target Milestone: 3.4 → 4.0

Wan-Teh Chang

Comment 4

•

23 years ago

Changed the QA contact to Bishakha.

QA Contact: sonja.mirtitsch → bishakhabanerjee

Terry Hayes

Reporter

Updated

•

23 years ago

Whiteboard: [rfc2459]

Wan-Teh Chang

Updated

•

23 years ago

Whiteboard: [rfc2459] → [rfc2459] [cert]

Wan-Teh Chang

Comment 5

•

23 years ago

Terry, what is T61String? It is not used in RFC 2459.

Jean-Marc Desperrier

Comment 6

•

22 years ago

Wan-Teh, T61String is the same thing as TeletexString.

John G. Myers

Assignee

Updated

•

22 years ago

Blocks: 197009

John G. Myers

Assignee

Updated

•

22 years ago

Assignee: thayes0993 → jgmyers

John G. Myers

Assignee

Comment 7

•

22 years ago

Attached patch Proposed fix (obsolete) — Details — Splinter Review

John G. Myers

Assignee

Updated

•

22 years ago

Attachment #138925 - Flags: review?(thayes0993)

John G. Myers

Assignee

Updated

•

22 years ago

Target Milestone: 4.0 → 3.9

Nelson Bolyard (seldom reads bugmail)

Comment 8

•

22 years ago

John, NSS 3.9 has shipped. Retargetting for NSS 3.9.1.

Target Milestone: 3.9 → 3.9.1

John G. Myers

Assignee

Comment 9

•

22 years ago

Unlikely to meet the criteria for 3.9.1, targeting at 3.10

Target Milestone: 3.9.1 → 3.10

John G. Myers

Assignee

Updated

•

22 years ago

Attachment #138925 - Flags: review?(thayes0993) → review?(MisterSSL)

Nelson Bolyard (seldom reads bugmail)

Comment 10

•

22 years ago

Comment on attachment 138925 [details] [diff] [review] Proposed fix John, Thanks for your contribution! Here in NSS land, a single r+ is sufficient to checkin a bug fix, except in the weeks just before RTM. Also, we have the practice of giving r+ to a patch, with suggested changes. This gives the patch submittor a chance to make some additional changes and check it in without an additional review. That's whay I'm doing here. 1. I suggest the new functions contain the string "iso8859_1" instead of "iso88591". 2. Please add some comments in secname.c, explaining that we're not doing real T61-to-UTF8 conversion, but rather are treating T61 strings as if they were iso 8859-1 strings. 3. Please add one-line comments in secport.h and utf8.c explaining what each of the new functions does. 4. FYI, we decided long ago to stop using PR_EXTERN and PR_IMPLEMENT macros because they defeat the ctags program. Instead, we created some new macros, named NSS_EXTERN, NSS_IMPLEMENT, NSS_EXTERN_DATA, NSS_IMPLEMENT_DATA, declared in base/nssbaset.h, that serve the same purpose and do not confuse ctags. Please use those in new code. Thanks.

Attachment #138925 - Flags: review?(MisterSSL) → review+

John G. Myers

Assignee

Comment 11

•

22 years ago

Attached patch Address most review comments — Details — Splinter Review

Addressed review comments other than comment (1), add myself as a contributor to two files. I deliberately chose names such as sec_port_iso88591_utf8_conversion_function as being less ugly than sec_port_iso8859_1_utf8_conversion_function. The convention established by the other charaset names in those functions is to remove '-' characters from charset names. It's not as if NSS is going to support any other 8859-based charset.

Attachment #138925 - Attachment is obsolete: true

John G. Myers

Assignee

Comment 12

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments Requesting checkin approval

Attachment #139078 - Flags: superreview?(wchang0222)

Nelson Bolyard (seldom reads bugmail)

Comment 13

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments second review not required for NSS at this time.

Attachment #139078 - Flags: review+

Wan-Teh Chang

Comment 14

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments This patch looks good. There are some minor problems that should be fixed before checking it in. 1. Some compilers may warn that the switch(convert) statement you added to lib/certdb/secname.c does not handle the case convert_none. 2. Question: is it T.61 or T61? 3. The 'inBuf' parameter for both PORT_ISO88591_UTF8Conversion and sec_port_iso88591_utf8_conversion_function should be declared with 'const'. Question: these two functions are identical. Why do we need two functions? 4. lib/util/secport.h (a public header) should not include nssbaset.h (a private header). You can avoid that by not using the NSS_EXTERN macro. 5. Do not use the NSS_IMPLEMENT macro in lib/util/utf8.c. Do not replace it with PR_IMPLEMENT either. Just remove it. (NSS uses *.def files to control the export of symbols from DLLs.) 6. I did not review the implementation of sec_port_iso88591_utf8_conversion_function and test_iso88591_chars in lib/util/utf8.c. I may try to do that tomorrow. If you can address 3, 4, and 5 (1 is optional), you can go ahead and check the patch into the NSS tip. Note: the Mozilla trunk is using NSS_CLIENT_TAG.

Attachment #139078 - Flags: superreview?(wchang0222) → superreview+

Wan-Teh Chang

Comment 15

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments I just realized that nssbaset.h is a public header and I contradicted Nelson's suggestion of using NSS_EXTERN and NSS_IMPLEMENT. I still think files in lib/util should not include nssbaset.h (it is a "Stan" header) and the NSS_EXTERN and NSS_IMPLEMENT macros should not be used on sec_port_iso88591_utf8_conversion_function.

John G. Myers

Assignee

Comment 16

•

22 years ago

Checking in security/nss/lib/certdb/secname.c; /cvsroot/mozilla/security/nss/lib/certdb/secname.c,v <-- secname.c new revision: 1.16; previous revision: 1.15 done Checking in security/nss/lib/util/secport.c; /cvsroot/mozilla/security/nss/lib/util/secport.c,v <-- secport.c new revision: 1.17; previous revision: 1.16 done Checking in security/nss/lib/util/secport.h; /cvsroot/mozilla/security/nss/lib/util/secport.h,v <-- secport.h new revision: 1.8; previous revision: 1.7 done Checking in security/nss/lib/util/utf8.c; /cvsroot/mozilla/security/nss/lib/util/utf8.c,v <-- utf8.c new revision: 1.4; previous revision: 1.3 done Review responses: 1. Addressed 2. The character encoding scheme is "T.61". The label has many names, NSS uses "T61". 3. Declared const. Was following the pattern set by the UCS-2 and UCS-4 to/from UTF-8 converters. The dual functions similarly follows the pattern set by the UCS-2 and UCS-4 converters, but I left out implementing PORT_SetISO88591_UTF8ConversionFunction() as I don't see any need for being able to replace the ISO-8859-1 to UTF-8 conversion function. It appears that PORT_SetUCS2_UTF8ConversionFunction() and PORT_SetUCS4_UTF8ConversionFunction() have no purpose, but merely mimic the misnamed PORT_SetUCS2_ASCIIConversionFunction(). The latter should really be named PORT_SetUCS2_PasswordConversionFunction() as it has nothing to do with ASCII. 4. Addressed 5. Addressed

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

John G. Myers

Assignee

Comment 17

•

22 years ago

Wan-Teh: If you'd like me to remove one of the two functions, let me know which function name you want to keep.

Wan-Teh Chang

Comment 18

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments Bob, Nelson, could you answer John's question in comment 17? I just reviewed the actual conversion function. Since I don't know how Latin-1 should be converted to UTF-8, I can only verify that John's code does exactly what his comments say the conversion should be. John, I have two question about the /* 7F -> 0xxxxxx */ comment. Should "7F" be "00-7F"? Should "0xxxxxx" have one more 'x'? I didn't review the test_iso88591_chars function.

Wan-Teh Chang

Comment 19

•

22 years ago

Comment on attachment 139078 [details] [diff] [review] Address most review comments I finally reviewed the test_iso88591_chars function. The only change I want to suggest is to declare the 'utf8' array with a size of 3 instead of 8: >+ unsigned char utf8[8]; It only needs to be 3 bytes to hold the conversion output (1-2 bytes) and the null byte we add at the end. This will make the test stricter. Question about the two fprintf statements: >+ fprintf(stdout, "Failed to convert ISO-8859-1 0x%02.2x to UTF-8\n", iso88591); >+ fprintf(stdout, "Wrong conversion of ISO-8859-1 0x%02.2x to UTF-8: ", iso88591); The %x format needs to match an unsigned int. Since 'iso88591' is an unsigned char, are we relying on C's automatic promotion of unsigned char function arguments to unsigned int? Other similar fprintf statements in the file use an explicit (unsigned int) cast.

Nelson Bolyard (seldom reads bugmail)

Comment 20

•

22 years ago

Wan-Teh asked me to look at the issue of the two functions PORT_ISO88591_UTF8Conversion and sec_port_iso88591_utf8_conversion_function and decide if we need to identical functions. So, I thought about it and have the following recommendation. Rename PORT_ISO88591_UTF8Conversion to PORT_T61_UTF8Conversion, and have it call a settable pointer (just as PORT_UCS2_UTF8Conversion does). By default, that pointer will point to sec_port_iso88591_utf8_conversion_function. Move the comment about treating T.61 strings as ISO-8859-1 strings to secport.c from secname.c, and ahve secname.c call PORT_T61_UTF8Conversion. Rationalle (sp?): As John observed, no one will want to replace our ISO-8859-1-to-UTF8 conversion functions with their own ISO-8859-1-to-UTF8 conversion function, but they MIGHT want to put in a TRUE T.61 conversion function, and we should enable that.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

John G. Myers

Assignee

Comment 21

•

22 years ago

Re the question in comment 19, yes we are depending on automatic promotion rules. Re comment 20, I would like to disagree with the need for a caller settable T61 conversion function. My reasons: 1) Such parsing of fields in certs belongs at the NSS level. Giving the calling application responsibility for such obscure minutiae is very poor layering. This is a quite different situation than PORT_SetUCS2_ASCIIConversionFunction(), which has the application expose knowledge about the application's own password callbacks, not about how to parse certs. 2) There is no demonstrated need for the feature. In practice, the TeletexString label means ISO-8859-1 encoding. There is no customer demonstrating deployed T.61 certs or asking support for same.

Nelson Bolyard (seldom reads bugmail)

Comment 22

•

22 years ago

The original reason that NSS has callback for the character set conversion functions was that certain products that used NSS had their own functions, some of which were licensed from third parties, and they wanted to be sure that NSS used their functions. In fact, it was only after NSS began to be used in other products that did not contain their own conversion libraries that the utf8.c file was added to NSS. Anyway, Wan-Teh asked me to render my opinion on the issue of two functions, one of which only calls the other, and I did.

John G. Myers

Assignee

Comment 23

•

22 years ago

Nelson, could we close this bug as FIXED and open a new bug to address the enhancement of being able to convert other T.61-labeled charsets?

Nelson Bolyard (seldom reads bugmail)

Comment 24

•

22 years ago

Yes.

Status: REOPENED → RESOLVED

Closed: 22 years ago → 22 years ago

Resolution: --- → FIXED

Lars Smidt

Comment 25

•

22 years ago

I would like to see the fix for this bug in NSS 3.9.1 It would be a very useful fix for me, since I am affected by the bug because my name includes a danish letter (Lars Teglgård Smidt) which means I cannot use my personal certificate with any mozilla product.

Wan-Teh Chang

Comment 26

•

22 years ago

Changed the target milestone to NSS 3.9.1. It turns out that this fix is already in the NSS_3_9_BRANCH and the NSS_CLIENT_TAG (the tag used by the Mozilla trunk). So, current Mozilla trunk builds (2004-01-28 or later) already have this fix.

Target Milestone: 3.10 → 3.9.1

Nelson Bolyard (seldom reads bugmail)

Comment 27

•

21 years ago

This bug and its fix affected all platforms and OSes.

OS: Windows NT → All

Proposed fix 22 years ago John G. Myers 8.26 KB, patch	nelson : review+	Details \| Diff \| Splinter Review
Address most review comments 22 years ago John G. Myers 9.20 KB, patch	nelson : review+ wtc : superreview+	Details \| Diff \| Splinter Review