Unicode 3.1 updated the definition of UTF-8 to help prevent security issues. We need to modify the UTF-8 converter to not interpret "non-shortest forms". See http://www.unicode.org/unicode/reports/tr27/index.html:

The current conformance clause C12 in The Unicode Standard, Version 3.0 forbids the generation of "non-shortest form" UTF-8, and forbids the interpretation of illegal sequences, but not the interpretation of "non-shortest form". Where software does interpret the non-shortest forms, security issues can arise. For example:

* Process A performs security checks, but does not check for non-shortest forms.
* Process B accepts the byte sequence from process A, and transforms it into UTF-16 while interpreting non-shortest forms.
* The UTF-16 text may then contain characters that should have been filtered out by process A.

To address this issue, the Unicode Technical Committee has modified the definition of UTF-8 to forbid conformant implementations from interpreting non-shortest forms for BMP characters, and clarified some of the conformance clauses.
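To make the Process A / Process B scenario concrete, here is a minimal Python sketch. The names `process_a_filter` and `naive_decode` are hypothetical and exist only for illustration; they are not taken from the Mozilla converter. The sketch assumes Process A filters on raw bytes and Process B decodes by bit pattern only, which is the lenient behaviour the quoted text warns about.

```python
def process_a_filter(data: bytes) -> bool:
    """Hypothetical Process A: accept input only if it has no '/' byte (0x2F)."""
    return b"/" not in data


def naive_decode(data: bytes) -> str:
    """Hypothetical Process B: decode UTF-8 by bit pattern only.

    Lead and trail bytes are checked for their marker bits, but the decoded
    value is never compared with the shortest possible encoding, so
    non-shortest forms are interpreted instead of rejected.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            cp, trailing = b, 0
        elif 0xC0 <= b <= 0xDF:
            cp, trailing = b & 0x1F, 1
        elif 0xE0 <= b <= 0xEF:
            cp, trailing = b & 0x0F, 2
        elif 0xF0 <= b <= 0xF7:
            cp, trailing = b & 0x07, 3
        else:
            raise ValueError("bad lead byte")
        if i + trailing >= len(data):
            raise ValueError("truncated sequence")
        for k in range(1, trailing + 1):
            t = data[i + k]
            if t & 0xC0 != 0x80:
                raise ValueError("bad trail byte")
            cp = (cp << 6) | (t & 0x3F)
        out.append(chr(cp))
        i += trailing + 1
    return "".join(out)


# <C0 AF> is a two-byte, non-shortest encoding of U+002F ('/').
payload = b"..\xC0\xAF..\xC0\xAFetc"
print(process_a_filter(payload))  # True: the filter sees no 0x2F byte
print(naive_decode(payload))      # ../../etc
```

A conformant decoder rejects the same payload; Python's built-in `payload.decode("utf-8")`, for instance, raises `UnicodeDecodeError` on the `C0` byte.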
Nominating this for nsenterprise.
We also need to explore if there is a way to update the UTF-8 converter for earlier versions of Netscape 6.
Actually this may have been updated for Unicode 3.01: http://www.unicode.org/unicode/uni2errata/UTF-8_Corrigendum.html

Updated Summary from: Update UTF-8 for Unicode 3.1 conformance
to: Update UTF-8 for Unicode 3.01 conformance

Added nsbranch keyword.

momoi> We also need to explore if there is a way to
momoi> update the UTF-8 converter for earlier versions of
momoi> Netscape 6.

If the fix is just in the UTF-8 converter, we should be able to provide an updated XPCOM converter module for Netscape 6.0 and 6.1. Frank, is that correct?
In a Netscape internal bug, shanjian mentioned that achieving efficiency with the proposed change in the UTF-8 converter may take some additional work.
I did not realize Unicode 3.1 addressed this problem when I wrote that. I guess Unicode 3.1 must have a conversion program on the companion CD. If that is the case, we can just borrow that code, and the implementation shouldn't take a lot of time.
Isn't it just a matter of checking bit patterns as described in Table 3.1B in http://www.unicode.org/unicode/reports/tr27/index.html:

Table 3.1B. Legal UTF-8 Byte Sequences

Code Points           1st Byte   2nd Byte   3rd Byte   4th Byte
U+0000..U+007F        00..7F
U+0080..U+07FF        C2..DF     80..BF
U+0800..U+0FFF        E0         A0..BF     80..BF
U+1000..U+FFFF        E1..EF     80..BF     80..BF
U+10000..U+3FFFF      F0         90..BF     80..BF     80..BF
U+40000..U+FFFFF      F1..F3     80..BF     80..BF     80..BF
U+100000..U+10FFFF    F4         80..8F     80..BF     80..BF

Table 3.1B. lists all of the byte sequences that are legal in UTF-8. A range of byte values such as A0..BF indicates that any byte from A0 to BF (inclusive) is legal in that position. Any byte value outside of the ranges listed is illegal. For example, the byte sequence <C0 AF> is illegal since C0 is not legal in the 1st Byte column. The byte sequence <E0 9F 80> is illegal since in the row where E0 is legal as a first byte, 9F is not legal as a second byte. The byte sequence <F4 80 83 92> is legal, since every byte in that sequence matches a byte range in a row of the table (the last row).

* Cases where a trailing byte range is not 80..BF are underlined in the table to draw attention to them. These occur only in the second byte of a sequence.
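The table check described above can be sketched directly in Python. This is an illustration of the table-driven idea only, not the converter's actual code; `TABLE_3_1B` and `is_legal_utf8` are names made up for the example. It follows Table 3.1B exactly as quoted here; later versions of the Unicode Standard additionally exclude ED A0..BF (surrogates), which this sketch does not do.

```python
# Each row: (first-byte range, second-byte range, number of trailing bytes).
# Third and fourth bytes are always 80..BF, so only the second byte varies.
TABLE_3_1B = [
    ((0x00, 0x7F), None, 0),
    ((0xC2, 0xDF), (0x80, 0xBF), 1),
    ((0xE0, 0xE0), (0xA0, 0xBF), 2),
    ((0xE1, 0xEF), (0x80, 0xBF), 2),
    ((0xF0, 0xF0), (0x90, 0xBF), 3),
    ((0xF1, 0xF3), (0x80, 0xBF), 3),
    ((0xF4, 0xF4), (0x80, 0x8F), 3),
]


def is_legal_utf8(data: bytes) -> bool:
    """Return True if every sequence in data matches a row of Table 3.1B."""
    i = 0
    while i < len(data):
        first = data[i]
        for (lo, hi), second, trailing in TABLE_3_1B:
            if lo <= first <= hi:
                break
        else:
            # First byte is in no row: C0, C1, F5..FF, or a stray 80..BF.
            return False
        tail = data[i + 1 : i + 1 + trailing]
        if len(tail) != trailing:
            return False  # sequence is cut off at the end of the input
        for k, b in enumerate(tail):
            lo2, hi2 = second if k == 0 else (0x80, 0xBF)
            if not lo2 <= b <= hi2:
                return False
        i += 1 + trailing
    return True


# The three examples from the text under the table:
print(is_legal_utf8(bytes([0xC0, 0xAF])))              # False
print(is_legal_utf8(bytes([0xE0, 0x9F, 0x80])))        # False
print(is_legal_utf8(bytes([0xF4, 0x80, 0x83, 0x92])))  # True
```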
The simplest implementation would look like that. But we probably want to optimize the code and try to achieve the same result with little or no extra performance cost. I think somebody in the Unicode community has already done this for us, so we can just borrow the code or the algorithm.
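One common way to get the same rejections cheaply is to decode by bit pattern as before and then do a single comparison per sequence: the decoded value must be at least the smallest code point that needs that many bytes, and at most U+10FFFF. The Python sketch below shows that idea under those assumptions; it is not the code from the attached patch, and `MIN_FOR_LENGTH` and `decode_shortest_only` are hypothetical names.

```python
# Smallest code point that legitimately needs a sequence of each length.
MIN_FOR_LENGTH = {1: 0x0000, 2: 0x0080, 3: 0x0800, 4: 0x10000}


def decode_shortest_only(data: bytes) -> str:
    """Decode UTF-8, rejecting non-shortest forms with one range check."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            cp, length = b, 1
        elif 0xC0 <= b <= 0xDF:
            cp, length = b & 0x1F, 2
        elif 0xE0 <= b <= 0xEF:
            cp, length = b & 0x0F, 3
        elif 0xF0 <= b <= 0xF7:
            cp, length = b & 0x07, 4
        else:
            raise ValueError("bad lead byte")
        if i + length > len(data):
            raise ValueError("truncated sequence")
        for t in data[i + 1 : i + length]:
            if t & 0xC0 != 0x80:
                raise ValueError("bad trail byte")
            cp = (cp << 6) | (t & 0x3F)
        # The only extra work compared with a lenient decoder:
        if cp < MIN_FOR_LENGTH[length] or cp > 0x10FFFF:
            raise ValueError("non-shortest form or out of range")
        out.append(chr(cp))
        i += length
    return "".join(out)
```

With these bounds the check accepts and rejects the same byte sequences as the rows of Table 3.1B: C0 and C1 always decode below U+0080, E0 80..9F decodes below U+0800, F0 80..8F decodes below U+10000, and F4 90.. and above decodes past U+10FFFF.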
security issue. also, easy to fix. moz0.9.4
The security issue here is: what do we do to help *poorly* written 3rd party apps avoid parsing errors?
Since the code is poorly indented, I also changed a lot of tabs and spaces, and the diff was made with -uw. Please ignore the ugly tab/space layout in the patch. The check-in will look fine, with the whole file following Mozilla indentation. jbetak/shanjian, can you review this code?
This got our attention because a real security hole exists somewhere.
The security hole in webmail has already been fixed by the webmail team. The importance of this fix is lower now.
Lowered, but still high. If we fix this, then our client can foil any future similar exploits which use non-shortest forms of UTF-8 for spoofing.
fully tested. Need code review.
Two decisions we need to make: (1) Do we make new XPCOM converter modules available for 6.1? (The solution for 6.0 users should be to upgrade to 6.1 + the new XPCOM converter module.) (2) If we do (1), should we create a 6.11 or do a silent upgrade? If we do a silent upgrade, how do users know if they have the fix or not? Do we have them check the size and date of the converter module?
> If we fix this, then our client can foil any future similar
> exploits which use non-shortest forms of UTF-8 for spoofing.

There are non-Netscape webmail services in which the exploit is still problematical. (Hotmail & Yahoo, for example.) Mozilla/NS 6 users use these services and we should not be contributing to a security problem.
I will be on vacation starting 9/6. If I don't get an approval for this by tomorrow noon, I will check it in after 9/17.
Comment on attachment 47746 Check for Unicode byte waiting for /a
I'll assign this bug to myself while ftang is on vacation.
This has an r= and sr=. Just awaiting an a=, and FTang says it addresses a security issue (e.g. " . . .*poorly* written 3rd party apps avoid parsing errors?"). Nominating for PDT+. Removing nsenterprise from the keywords, as I do not believe it is an enterprise issue.
Comment on attachment 47746 Check for Unicode byte Fully tested, per ftang's comment on 2001-09-04 10:53
0.9.4 is out the door.
U got the PDT+. Pls check it in ASAP
Checked into 0_9_4_BRANCH. I'll check into the trunk once opened.
Someone said N4.x has the same problem, but I looked at it: N4.x does not have the same problem that 6.1 and IE do.
Frank's test case has passed on the 09-20 Branch build / Windows2000. Will verify it on the Trunk build once it gets checked in there.
> Someone said N4.x has the same problem, but I looked at it: N4.x
> does not have the same problem that 6.1 and IE do.

Both Takagi-san and I reproduced the problem with 4.78 at Netscape Webmail -- not once but several times. I also looked at Yahoo and Hotmail and they also had the problem. None of these sites exhibit the problem today with Comm 4.78 and I gather that they fixed the problem. Is it possible that your test case only covers one possible way the exploit works?
Checked into the trunk.
Frank's testcase has passed on the 09-24 trunk build / Win2000-CN. Marking it as verified; please reopen if some other case might cause the problem.