Closed Bug 53133 Opened 25 years ago Closed 22 years ago

Need conversion of T61String data from ISO8859 to UTF8

Categories

(NSS :: Libraries, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: thayes0993, Assigned: jgmyers)

References

Details

(Whiteboard: [rfc2459] [cert])

Attachments

(1 file, 1 obsolete file)

RFC2459 suggests that applications convert data encoded using T61String from ISO8859-1 (Latin 1) encoding to UTF8. This is suggested because some applications (IE?) used T61String as a place to put 8-bit Latin character sets. This suggestion isn't strictly correct since T61 actually uses escape sequences to represent a variety of different character values. However, the current implementation of T61String in NSS doesn't convert those either. We should: 1. Allow T61String to be treated as ISO8859-1 (optionally?) 2. Implement true T61 handling Of these, the first is more important, and if it is the only choice implemented, means that applications don't need to make a decision.
Terry, could you set the Target Milestone for this bug? Thanks.
Target Milestone: --- → 3.3
Reassigned to Terry and marked as an "enhancement".
Assignee: wtc → thayes
Severity: normal → enhancement
Target Milestone: 3.3 → 3.4
Set target milestone 4.0.
Target Milestone: 3.4 → 4.0
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
Whiteboard: [rfc2459]
Whiteboard: [rfc2459] → [rfc2459] [cert]
Terry, what is T61String? It is not used in RFC 2459.
Wan-Teh, T61String is the same thing as TeletexString.
Blocks: 197009
Assignee: thayes0993 → jgmyers
Attached patch Proposed fix (obsolete) — Splinter Review
Attachment #138925 - Flags: review?(thayes0993)
Target Milestone: 4.0 → 3.9
John, NSS 3.9 has shipped. Retargetting for NSS 3.9.1.
Target Milestone: 3.9 → 3.9.1
Unlikely to meet the criteria for 3.9.1, targeting at 3.10
Target Milestone: 3.9.1 → 3.10
Attachment #138925 - Flags: review?(thayes0993) → review?(MisterSSL)
Comment on attachment 138925 [details] [diff] [review] Proposed fix John, Thanks for your contribution! Here in NSS land, a single r+ is sufficient to checkin a bug fix, except in the weeks just before RTM. Also, we have the practice of giving r+ to a patch, with suggested changes. This gives the patch submittor a chance to make some additional changes and check it in without an additional review. That's whay I'm doing here. 1. I suggest the new functions contain the string "iso8859_1" instead of "iso88591". 2. Please add some comments in secname.c, explaining that we're not doing real T61-to-UTF8 conversion, but rather are treating T61 strings as if they were iso 8859-1 strings. 3. Please add one-line comments in secport.h and utf8.c explaining what each of the new functions does. 4. FYI, we decided long ago to stop using PR_EXTERN and PR_IMPLEMENT macros because they defeat the ctags program. Instead, we created some new macros, named NSS_EXTERN, NSS_IMPLEMENT, NSS_EXTERN_DATA, NSS_IMPLEMENT_DATA, declared in base/nssbaset.h, that serve the same purpose and do not confuse ctags. Please use those in new code. Thanks.
Attachment #138925 - Flags: review?(MisterSSL) → review+
Addressed review comments other than comment (1), add myself as a contributor to two files. I deliberately chose names such as sec_port_iso88591_utf8_conversion_function as being less ugly than sec_port_iso8859_1_utf8_conversion_function. The convention established by the other charaset names in those functions is to remove '-' characters from charset names. It's not as if NSS is going to support any other 8859-based charset.
Attachment #138925 - Attachment is obsolete: true
Comment on attachment 139078 [details] [diff] [review] Address most review comments Requesting checkin approval
Attachment #139078 - Flags: superreview?(wchang0222)
Comment on attachment 139078 [details] [diff] [review] Address most review comments second review not required for NSS at this time.
Attachment #139078 - Flags: review+
Comment on attachment 139078 [details] [diff] [review] Address most review comments This patch looks good. There are some minor problems that should be fixed before checking it in. 1. Some compilers may warn that the switch(convert) statement you added to lib/certdb/secname.c does not handle the case convert_none. 2. Question: is it T.61 or T61? 3. The 'inBuf' parameter for both PORT_ISO88591_UTF8Conversion and sec_port_iso88591_utf8_conversion_function should be declared with 'const'. Question: these two functions are identical. Why do we need two functions? 4. lib/util/secport.h (a public header) should not include nssbaset.h (a private header). You can avoid that by not using the NSS_EXTERN macro. 5. Do not use the NSS_IMPLEMENT macro in lib/util/utf8.c. Do not replace it with PR_IMPLEMENT either. Just remove it. (NSS uses *.def files to control the export of symbols from DLLs.) 6. I did not review the implementation of sec_port_iso88591_utf8_conversion_function and test_iso88591_chars in lib/util/utf8.c. I may try to do that tomorrow. If you can address 3, 4, and 5 (1 is optional), you can go ahead and check the patch into the NSS tip. Note: the Mozilla trunk is using NSS_CLIENT_TAG.
Attachment #139078 - Flags: superreview?(wchang0222) → superreview+
Comment on attachment 139078 [details] [diff] [review] Address most review comments I just realized that nssbaset.h is a public header and I contradicted Nelson's suggestion of using NSS_EXTERN and NSS_IMPLEMENT. I still think files in lib/util should not include nssbaset.h (it is a "Stan" header) and the NSS_EXTERN and NSS_IMPLEMENT macros should not be used on sec_port_iso88591_utf8_conversion_function.
Checking in security/nss/lib/certdb/secname.c; /cvsroot/mozilla/security/nss/lib/certdb/secname.c,v <-- secname.c new revision: 1.16; previous revision: 1.15 done Checking in security/nss/lib/util/secport.c; /cvsroot/mozilla/security/nss/lib/util/secport.c,v <-- secport.c new revision: 1.17; previous revision: 1.16 done Checking in security/nss/lib/util/secport.h; /cvsroot/mozilla/security/nss/lib/util/secport.h,v <-- secport.h new revision: 1.8; previous revision: 1.7 done Checking in security/nss/lib/util/utf8.c; /cvsroot/mozilla/security/nss/lib/util/utf8.c,v <-- utf8.c new revision: 1.4; previous revision: 1.3 done Review responses: 1. Addressed 2. The character encoding scheme is "T.61". The label has many names, NSS uses "T61". 3. Declared const. Was following the pattern set by the UCS-2 and UCS-4 to/from UTF-8 converters. The dual functions similarly follows the pattern set by the UCS-2 and UCS-4 converters, but I left out implementing PORT_SetISO88591_UTF8ConversionFunction() as I don't see any need for being able to replace the ISO-8859-1 to UTF-8 conversion function. It appears that PORT_SetUCS2_UTF8ConversionFunction() and PORT_SetUCS4_UTF8ConversionFunction() have no purpose, but merely mimic the misnamed PORT_SetUCS2_ASCIIConversionFunction(). The latter should really be named PORT_SetUCS2_PasswordConversionFunction() as it has nothing to do with ASCII. 4. Addressed 5. Addressed
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Wan-Teh: If you'd like me to remove one of the two functions, let me know which function name you want to keep.
Comment on attachment 139078 [details] [diff] [review] Address most review comments Bob, Nelson, could you answer John's question in comment 17? I just reviewed the actual conversion function. Since I don't know how Latin-1 should be converted to UTF-8, I can only verify that John's code does exactly what his comments say the conversion should be. John, I have two question about the /* 7F -> 0xxxxxx */ comment. Should "7F" be "00-7F"? Should "0xxxxxx" have one more 'x'? I didn't review the test_iso88591_chars function.
Comment on attachment 139078 [details] [diff] [review] Address most review comments I finally reviewed the test_iso88591_chars function. The only change I want to suggest is to declare the 'utf8' array with a size of 3 instead of 8: >+ unsigned char utf8[8]; It only needs to be 3 bytes to hold the conversion output (1-2 bytes) and the null byte we add at the end. This will make the test stricter. Question about the two fprintf statements: >+ fprintf(stdout, "Failed to convert ISO-8859-1 0x%02.2x to UTF-8\n", iso88591); >+ fprintf(stdout, "Wrong conversion of ISO-8859-1 0x%02.2x to UTF-8: ", iso88591); The %x format needs to match an unsigned int. Since 'iso88591' is an unsigned char, are we relying on C's automatic promotion of unsigned char function arguments to unsigned int? Other similar fprintf statements in the file use an explicit (unsigned int) cast.
Wan-Teh asked me to look at the issue of the two functions PORT_ISO88591_UTF8Conversion and sec_port_iso88591_utf8_conversion_function and decide if we need to identical functions. So, I thought about it and have the following recommendation. Rename PORT_ISO88591_UTF8Conversion to PORT_T61_UTF8Conversion, and have it call a settable pointer (just as PORT_UCS2_UTF8Conversion does). By default, that pointer will point to sec_port_iso88591_utf8_conversion_function. Move the comment about treating T.61 strings as ISO-8859-1 strings to secport.c from secname.c, and ahve secname.c call PORT_T61_UTF8Conversion. Rationalle (sp?): As John observed, no one will want to replace our ISO-8859-1-to-UTF8 conversion functions with their own ISO-8859-1-to-UTF8 conversion function, but they MIGHT want to put in a TRUE T.61 conversion function, and we should enable that.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Re the question in comment 19, yes we are depending on automatic promotion rules. Re comment 20, I would like to disagree with the need for a caller settable T61 conversion function. My reasons: 1) Such parsing of fields in certs belongs at the NSS level. Giving the calling application responsibility for such obscure minutiae is very poor layering. This is a quite different situation than PORT_SetUCS2_ASCIIConversionFunction(), which has the application expose knowledge about the application's own password callbacks, not about how to parse certs. 2) There is no demonstrated need for the feature. In practice, the TeletexString label means ISO-8859-1 encoding. There is no customer demonstrating deployed T.61 certs or asking support for same.
The original reason that NSS has callback for the character set conversion functions was that certain products that used NSS had their own functions, some of which were licensed from third parties, and they wanted to be sure that NSS used their functions. In fact, it was only after NSS began to be used in other products that did not contain their own conversion libraries that the utf8.c file was added to NSS. Anyway, Wan-Teh asked me to render my opinion on the issue of two functions, one of which only calls the other, and I did.
Nelson, could we close this bug as FIXED and open a new bug to address the enhancement of being able to convert other T.61-labeled charsets?
Yes.
Status: REOPENED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
I would like to see the fix for this bug in NSS 3.9.1 It would be a very useful fix for me, since I am affected by the bug because my name includes a danish letter (Lars Teglgård Smidt) which means I cannot use my personal certificate with any mozilla product.
Changed the target milestone to NSS 3.9.1. It turns out that this fix is already in the NSS_3_9_BRANCH and the NSS_CLIENT_TAG (the tag used by the Mozilla trunk). So, current Mozilla trunk builds (2004-01-28 or later) already have this fix.
Target Milestone: 3.10 → 3.9.1
This bug and its fix affected all platforms and OSes.
OS: Windows NT → All
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: