Closed
Bug 236941
Opened 20 years ago
Closed 20 years ago
UTF-8 converter looses full lines of text if they contain invalid characters
Categories
(MailNews Core :: Internationalization, defect)
MailNews Core
Internationalization
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jmdesp, Assigned: jmdesp)
References
(Blocks 1 open bug)
Details
(Keywords: fixed1.7, intl, regression, Whiteboard: fixed-aviary1.0)
Attachments
(2 files, 8 obsolete files)
2.15 KB,
message/rfc822
|
Details | |
5.32 KB,
patch
|
jshin1987
:
review+
Bienvenu
:
superreview+
chofmann
:
approval1.7+
|
Details | Diff | Splinter Review |
Tested on 2004022608 Reproduction : The attached RFC822 message demonstrates the problem. After "quelque temps un message d'une", several lines of quoted text are not displayed, and then back again. That the lines that are not displayed all contain some invalid characters (encoded in ISO-8859 when the message is declared in UTF-8). The problem occurs in mail, and also when displaying a RFC822 message in browser. It does not happen in normal web pages. This is a regression over bug 35004. I didn't see a duplicate, but I just seem to never be able to find them. If you can confirm there is none, mark as new.
Assignee | ||
Comment 1•20 years ago
|
||
Assignee | ||
Comment 2•20 years ago
|
||
If you visualize inside browser, you just have to do a "view source" to see the missing lines.
Comment 3•20 years ago
|
||
I can't even save it from Mozilla, no "Save Link Target As", no Shift-Click or "Save Page As" when displayed in Browser as page or as source. Does Mozilla try to interpret the file even when saving from a link?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 4•20 years ago
|
||
Uhm, sorry. I just installed the current nightly before reading this bug and ran straight into bug 236903 ...
Assignee | ||
Comment 5•20 years ago
|
||
I try to set the encoding to some other values where the caracters would be illegal to see if they are missing lines too, but without much success except for the encoding HZ. But I think that the decoders are usually quite tolerant, and try to make sense of many different things, so that it seldom happens that they give out a illegal error even if they something really strange. So I still think this is probably a general problem, not specific to UTF68, and that with the string reorganization the decoding is now by line, and does not handle errors on a sub-line level.
Keywords: intl
Comment 7•20 years ago
|
||
I'm not yet properly set up with a developing and debugging environment, but I suspect this may be a regression caused by bug 228543. Jungshik, are the following two assumptions correct? (1) The fix for bug 228543 made mailnews use CopyUTF8toUTF16() for UTF-8 mails, which previously would have been decoded by nsUTF8ToUnicode. (2) When CopyUTF8toUTF16() encounters invalid sequences, it drops the rest of the input instead of inserting a replacement character and continuing
Assignee | ||
Comment 8•20 years ago
|
||
Oups, it seems I should have added it in the comment earlier, I tested the regression appeard on the day bug 228543 was checked in, and Jungshik confirmed me by mail he thinks the call to CopyUTF8toUTF16 might be responsible for the disparition of the string as you suspect. I still need some time to confirm on a build environnement this is it. I suggested him a solution by simply getting rid of lines 152 to 155 here : http://lxr.mozilla.org/mozilla/source/mailnews/base/util/nsMsgI18N.cpp#152 but he doesnt't like too much the idea of getting rid of this optimization all together. I myself do think it's better to use the generic conversion routines for all data that comes directy from the external world, and if that call is too slow, then that the solution would better be to deCOMtaminate it and then add the optimisations in the deCOMtaminated version.
Comment 9•20 years ago
|
||
*** Bug 239236 has been marked as a duplicate of this bug. ***
Comment 10•20 years ago
|
||
The critical point of failure is MimeInlineTextPlain_parse_line in mimetpla.cpp. When making the conversion, we assume the line to be in UTF-8 - and fail. The result is an empty line. This patch adds a simple fallback: if the conversion fails, assume the charset to be ISO-8859-1.
Assignee: smontagu → mnyromyr
Status: NEW → ASSIGNED
Updated•20 years ago
|
Attachment #145186 -
Flags: superreview?(bienvenu)
Attachment #145186 -
Flags: review?(jshin)
Updated•20 years ago
|
OS: Windows 2000 → All
Hardware: PC → All
Assignee | ||
Comment 11•20 years ago
|
||
This is not the clean way to do it Karsten. You can not just assume the input encoding is in truth ISO-8859-1 on a wide scale, and just silently ignore invalid charset declaration is really bad practice. But I'm of course happy someone has more time than to test how this bug can be solved. The code you modified calls the nsMsgI18NConvertToUnicode function and that where we should make sure the return is coherent with what browser does, that is *not* to drop the full line if the input encoding is invalid, but to show the bad part as diamond characters. In the discussion with Jshin that made the change that very probably caused the bug, he said he thinks it's the fact to CopyUTF8toUTF16 inside nsMsgI18NConvertToUnicode that drops the lines, and we should test to either remove CopyUTF8toUTF16 (only remove three lines of code) or to test if the string is truly UTF-8 by making a IsUTF8 call first before calling CopyUTF8toUTF16. This assume the default encoder will do the same job as browser for bad UTF-8. Can you test if either of the two works ? That would make it a very minor change that would have a better chance of being accepted fast (I'd love it to be in 1.7 but I'm not holding my breath).
Comment 12•20 years ago
|
||
Comment on attachment 145186 [details] [diff] [review] catching failed conversions > This is not the clean way to do it Karsten. You can not just assume > the input encoding is in truth ISO-8859-1 on a wide scale Yes, you're right. I completely forgot the code for the charset "guessing" in case of failure. > silently ignore invalid charset declaration is really bad practice. Well, that depends. Throwing charset alerts when *displaying* would be rather nagging. But *saving* and similar operations shouldn't change the charset anyway. > The code you modified calls the nsMsgI18NConvertToUnicode function > and that where we should make sure the return is coherent with what > browser does, that is *not* to drop the full line if the input > encoding is invalid, but to show the bad part as diamond characters. That is an option, but is it "nice"? We could give the charset sniffer a try and /display/ at least what might have been meant if the encoding had been correct. > In the discussion with Jshin that made the change that very probably > caused the bug, he said he thinks it's the fact to CopyUTF8toUTF16 > inside nsMsgI18NConvertToUnicode that drops the lines, It is called CopyUTF8toUTF16, that's right; the actual failure assertion is thrown in ConvertUTF8toUTF16::write. Like AppendUTF8toUTF16/CopyUTF8toUTF16, that's outside of MailNews and shouldn't be tampered with here. > and we should test to either remove CopyUTF8toUTF16 (only remove > three lines of code) That won't suffice, because the CopyUTF8toUTF16 calls I eliminated in mimetpla.cpp would still drop lines. > or to test if the string is truly UTF-8 by making a IsUTF8 call first > before calling CopyUTF8toUTF16. That would be extremely "expensive". We won't need that check most of the time anyway. I think it's better to make as sure as possible that the charset is right and only take other measures if that fails. > Can you test if either of the two works ? That would make it a very > minor change that would have a better chance of being accepted fast I'll have a look. > (I'd love it to be in 1.7 but I'm not holding my breath). I recommend not holding it, then. ;-)
Attachment #145186 -
Attachment is obsolete: true
Attachment #145186 -
Flags: superreview?(bienvenu)
Attachment #145186 -
Flags: review?(jshin)
Comment 13•20 years ago
|
||
The removal of the UTF-8 check in nsMsgI18N.cpp:152-154 does not solve the bug, because the following routine does not produce any output. It does not cause any assertions either, so it may be useful to remove the call to avoid cluttering of the debug console; I haven't done that in this patch, though. This updated patch now does this if we're allowed to use the "real" charset (ie. no overruling or 'global' autodetection is set): - try to use the charset given in the content-type header - if that fails, try to autodetect the charset of this line - if that fails, take the raw bytes as is (that is, as ASCII) Let's see what Jshin thinks...
Updated•20 years ago
|
Attachment #145223 -
Flags: review?(jshin)
Comment 14•20 years ago
|
||
Comment on attachment 145223 [details] [diff] [review] autodetect charset if possible I'd not do this way. I don't think we want to recover from the mislabelled UTF-8 lines. Just signalling to users that there's something wrong would be more than enough. A lot simpler way is to call IsUTF8 (that is not as expensive as you think after ASCII lines are filtered out) in nsMsgI18N.cpp for lines labelled as 'UTF-8'. If it fails the test, it can fall through. I've already made a patch for that on top of my patch for bug 229032. I planned to upload the patch once bug 229032 is fixed. Unfortunately, the patch for bug 229032 will not go in until 1.8alpha. Anyway, if this bug is regarded as serious enough (IMHO, it doesn't often very often) to be fixed in 1.7, I can make a patch against the current trunk.
Attachment #145223 -
Flags: review?(jshin)
Assignee | ||
Comment 15•20 years ago
|
||
Karsten "Mnyromyr" Düsterloh > This updated patch now does this if we're allowed to use the "real" charset > (ie. no overruling or 'global' autodetection is set): > - try to use the charset given in the content-type header > - if that fails, try to autodetect the charset of this line There's already a general framework in mail/news to decide what encoding must be applied to message. If change are needed to handle some case (like the messages labeled with wrong charset described in Bug 239236), it should be done at this level. Your point is to automatically run a charset detection if the message is invalid in the selected charset. This removes control to the user, he can no more force the charset to use, which is in some case useful. I think too Bug 239236 is not truly a duplicate of this bug, because the question of what should be done with messages that have an unknown charset is quite unrelated to how we should best handle invalid content in messages that we have already determined we will display in UTF-8. Jshin > I don't think we want to recover from the mislabelled > UTF-8 lines. - Open in browser a page in ISO-8859-1. - Force encoding to UTF-8 - Invalid characters are replaced by question marks inside a black diamond. - *this* is the old behaviour of mail/news for invalid UTF-8 and this is what we must return to (display the user as much as can be properly interpreted). And I expect there is an easy way to do it, I just don't know exactly how :-)
Comment 16•20 years ago
|
||
(In reply to comment #15) > Jshin > > I don't think we want to recover from the mislabelled > > UTF-8 lines. > - Invalid characters are replaced by question marks inside a black diamond. > - *this* is the old behaviour of mail/news for invalid UTF-8 and this is what That's exactly what I'm gonna do. Karsten's patch tries to go a lot further than that, which I don't think is a good idea as you wrote and I wrote earlier.
Comment 17•20 years ago
|
||
> A lot simpler way is to call IsUTF8 (that is not as expensive as you > think after ASCII lines are filtered out) in nsMsgI18N.cpp for lines > labelled as 'UTF-8'. If it fails the test, it can fall through. Jshin, could you please test your fix with the situation and the testcase of bug 239236? The problem isn't just illegal characters in lines labelled as UTF-8, but also lines with an *unknown* charset. These lines run into the CopyUTF8toUTF16 because they're *wrongly* *assumed* to be UTF-8!
Updated•20 years ago
|
Assignee: mnyromyr → smontagu
Status: ASSIGNED → NEW
Comment 18•20 years ago
|
||
Sorry, I had missed some bugmail. I'm stepping down, feel free to not use anything from my patches here and over in 239941. :) (I still do think that *displaying* garbage is not a good thing[tm], since we don't *send* it - but at least we shouldn't *drop* lines.)
Assignee | ||
Comment 19•20 years ago
|
||
I'm trying to write a working patch that will replace invalid characters with 0xFFFD as is done in the browser. Line 343 of mimetpla.cpp tests this (obj->options->format_out == nsMimeOutput::nsMimeMessageSaveAs and decide if true, the line is UTF-8 and CopyUTF8toUTF16 is adequate to copy it. Unfortunately when displaying the sample attachment, this test holds true. Currently I'm in favor of suppressing the test alltogether, mimetpla.cpp is not the right place to recover encoding problems.
Assignee | ||
Comment 20•20 years ago
|
||
Jshin: This line 338 of mimetpla.cpp nsDependentCString inputStr(line, length); is raising an assertion everytime it's executed that 'line' should be zero terminated. It's not blocking, and I don't know how I should change it, but it certainly needs to be corrected.
Assignee | ||
Comment 21•20 years ago
|
||
The change to mimetpla is very minor, I commented out the line I removed. Removing the line and reindenting made it look more complex. The change to nsMsgI18N may look more important, but in fact there is no new code in that. I'm juste replacing the method jshin used to convert by the one in nsXMLHttpRequest::ConvertBodyToText. The code is nothing but a line by line copy (a few variable renamed) from http://lxr.mozilla.org/mozilla/source/extensions/xmlextras/base/src/nsXMLHttpRequest.cpp#523 jshin, can you review this ?
Assignee | ||
Comment 22•20 years ago
|
||
I uploaded the wrong file :-( That was the one with the line removed and reindentation from mimetpla.cpp, and without the change to nsMsgI18N.
Attachment #145824 -
Attachment is obsolete: true
Comment 23•20 years ago
|
||
> This line 338 of mimetpla.cpp
> nsDependentCString inputStr(line, length);
> is raising an assertion everytime it's executed that 'line' should be zero
> terminated.
It should be fairly safe to terminate it, because of the +1 in the definition of
desired_size in mime_LineBuffer in mimebuf.cpp:
line[length] = 0; // see desired_size in mime_LineBuffer in mimebuf.cpp
nsDependentCString inputStr(line, length);
Comment 24•20 years ago
|
||
I tested patch 145825 and it did indeed solve the problem with the 'rfc 822 testcase' here (all non-UTF-8 characters are shown as ? in my font). But it completely breaks the display of the testcase in bug 239236: the entire body is missing! If this patch gets checked in, bug 239236 needs to be reopened and will be even worse...
Assignee | ||
Comment 25•20 years ago
|
||
Can you investigate the regression more carefully, Karsten ? I don't have time for that today. Of course, I'll make any needed change so that we get acceptable result for bug 239236. I'm quite surprised it can regress something still. All the patch does is to try harder when the converter fails.
Comment 26•20 years ago
|
||
The problem with the test case of bug 239236 is this: mimetpla.cpp:466: { // convert back to mailCharset before writing. rv = nsMsgI18NConvertFromUnicode(nsDependentCString(mailCharset), lineResultUnichar, outString); NS_ENSURE_SUCCESS(rv, -1); } Since the mailCharset is illegal/unknown, nsMsgI18NConvertFromUnicode will always return an error and hence force NS_ENSURE_SUCCESS to quit the function without outputting any data...
Assignee | ||
Comment 27•20 years ago
|
||
OK, the regression came from blindly removing the nsMimeOutput::nsMimeMessageSaveAs test. In bug 239236, we are not able to convert the input *at all*, so blindly applying CopyUTF8toUTF16 was better than trying to get an adequate decoder, because at least it gave a line of result when the input line was fully US-ASCII. I've been reading some more of the source. When we arrive in mimetpla.cpp:466, the input string ought to have *already* been transcoded to utf-8, this must be the reason why when nsMimeMessageSaveAs is true we use CopyUTF8toUTF16 blindly. MimeInlineText_convert_and_parse_line does the conversion around here http://lxr.mozilla.org/mozilla/source/mailnews/mime/src/mimetext.cpp#381 and then calls obj->clazz->parse_line that leads to mimetpla.cpp:466 In that code 'charset_conversion_fn' should do the conversion. If it returns a void pointer and a valid return code, the input string is used directly instead of a converted string. This is that line : http://lxr.mozilla.org/mozilla/source/mailnews/mime/src/mimetext.cpp#406 In practice, the implementation use the mime_convert_charset wrapper around ConvertUsingEncoderAndDecoder, that makes sure that in case of failure, we *always* returns a positive error code and a NULL return value. "If we fail, we just use unconverted input" looks really ugly to me and opening an avenue for an attaquer to be able to insert arbitrary content. In bug 236941, charset_conversion_fn fails because the input is not valid WRT utf-8 encoding rules, in bug 239236, it fails because the charset does not exists. In both case, we end up with the raw input where we are expecting already converted data. I think the adequate treatement for bug 239236 should be to use US-ASCII when the input charset is unknown and make it fall in a loop that like my patch inserts 0xFFFD for every undecodable character (that would be every character above 0x80). The question is about *where* it's best to do it.
Assignee | ||
Comment 28•20 years ago
|
||
This new patch is quite cleaner, I'm no more taking the risk of doing anything dangerous. I test that the string is not utf-8 before ignoring nsMimeMessageSaveAs, and I use a fall-back charset if there's no encoding for input. That fall-back is UTF-8, because 'us-ascii' is mapped to 'cp1252', and I want a warning that somethings wrong with the message. We know at this stage the message is not UTF-8, so that's for the best, the user will be visually warned something has gone wrong. (If currently unknown charsets are to be displayed as ISO-8859-1, the correct mechanisme to use is to add them to the alias list for ISO-8859-1, not to silently map anything unknown) This patch works, and it makes MimeInlineTextPlain_parse_line rock solid WRT problem in content that ought to be UTF-8. It probably would be cleaner to always feed MimeInlineTextPlain_parse_line rock with the valid UTF-8 it should get, but I'd prefer someone with a better knowledge of the intrisics of mailnews to do that kind of change. I just wanted a patch as simple as possible to have a chance of getting it in the Mozilla 1.7 branch.
Assignee | ||
Updated•20 years ago
|
Attachment #145825 -
Attachment is obsolete: true
Comment 29•20 years ago
|
||
Comment on attachment 146038 [details] [diff] [review] Patch that replace invalid char with 0xFFFD, corrected Sorry it took me so long. >Index: mimetpla.cpp >+ line[length] = 0; // see desired_size in mime_LineBuffer in mimebuf.cpp Can you elaborate on this? > nsDependentCString inputStr(line, length); BTW, mimetpfl.cpp needs to be fixed the same way as mimetpla.cpp >Index: nsMsgI18N.cpp >+ PRUnichar * outBuffer = >+ NS_STATIC_CAST(PRUnichar*, nsMemory::Alloc((outBufferLength + 1) * >+ sizeof(PRUnichar))); You can use nsAutoBuffer<PRUnichar, 512> here to avoid the memory allocation in most cases. With that change, this patch can go into 1.7branch. For 1.8alpha, the patch needs to be changed a bit more.
Assignee | ||
Comment 30•20 years ago
|
||
Hum, I have a different patch since yesterday. It's almost the same code (slightly slimpler), but in mimetext.cpp and mimemoz2.cpp. This is where mozilla originally tries to convert the string, and keeps the same string if the conversion fails (which is a non-sense, but if we go there we'd end up rewriting all of mailnews). I conceptually feel it a lot better to correct there than in mimetpla.cpp. This also means I don't need to guess something went wrong by adding lot of "isutf8" testing, I now correct the problem "at the source". >>Index: mimetpla.cpp >>+ line[length] = 0; // see desired_size in mime_LineBuffer in mimebuf.cpp > > Can you elaborate on this? See comment #23 about that. With the new patch, when the input line arrives there, it is already correctly zero terminated, there is no assertion anymore, and no reason to change anything in mimetpla.cpp. > BTW, mimetpfl.cpp needs to be fixed the same way as mimetpla.cpp I did a quick check and it seems that with the new patch the change applies to both normal and flowed text without needing to change two things. > You can use nsAutoBuffer<PRUnichar, 512> here to avoid the memory allocation > in most cases. The new patch is a bit shorter, I didn't need to change anything to the allocation already done, so the problem doesn't occur. Sorry about the late change, I wasn't really satisfied with the solution in the first patch, but it took a while to get the time to rewrite it.
Attachment #146038 -
Attachment is obsolete: true
Assignee | ||
Comment 31•20 years ago
|
||
Comment on attachment 146932 [details] [diff] [review] Patch that does the change in mimemoz2.cpp instead The patch does not need to be changed to apply to the 1.7 branch I checked it works well under linux, for flowed too.
Attachment #146932 -
Flags: superreview?(bienvenu)
Attachment #146932 -
Flags: review?(jshin)
Attachment #146932 -
Flags: approval1.7?
Comment on attachment 146932 [details] [diff] [review] Patch that does the change in mimemoz2.cpp instead The patch does not need to be changed to apply to the 1.7 branch Please attach unified diffs (cvs diff -pu8), and please request approval after you have review and super-review (and preferably after landing on the trunk).
Attachment #146932 -
Flags: approval1.7?
Assignee | ||
Comment 33•20 years ago
|
||
Comment on attachment 146932 [details] [diff] [review] Patch that does the change in mimemoz2.cpp instead The patch does not need to be changed to apply to the 1.7 branch Index: mimetext.cpp =================================================================== RCS file: /cvsroot/mozilla/mailnews/mime/src/mimetext.cpp,v retrieving revision 1.46 diff -p -u -8 -r1.46 mimetext.cpp --- mimetext.cpp 17 Apr 2004 18:33:13 -0000 1.46 +++ mimetext.cpp 29 Apr 2004 18:40:53 -0000 @@ -385,16 +385,20 @@ MimeInlineText_convert_and_parse_line(ch SetMailCharacterSetToMsgWindow(obj, text->charset); } } } //initiate decoder if not yet if (text->inputDecoder == nsnull) MIME_get_unicode_decoder(text->charset, getter_AddRefs(text->inputDecoder)); + // If no decoder found, use ""UTF-8"", that will map most non-US-ASCII chars as + // Invalid. A pure-ASCII only decoder would be better, but there is none + if (text->inputDecoder == nsnull) + MIME_get_unicode_decoder("UTF-8", getter_AddRefs(text->inputDecoder)); if (text->utf8Encoder == nsnull) MIME_get_unicode_encoder("UTF-8", getter_AddRefs(text->utf8Encoder)); PRBool useInputCharsetConverter = obj->options->m_inputCharsetToUnicodeDecoder && !nsCRT::strcasecmp(text->charset, obj->options->charsetForCachedInputDecoder.get()); if (useInputCharsetConverter) status = obj->options->charset_conversion_fn(line, length, text->charset, Index: mimemoz2.cpp =================================================================== RCS file: /cvsroot/mozilla/mailnews/mime/src/mimemoz2.cpp,v retrieving revision 1.207 diff -p -u -8 -r1.207 mimemoz2.cpp --- mimemoz2.cpp 17 Apr 2004 18:33:13 -0000 1.207 +++ mimemoz2.cpp 29 Apr 2004 18:40:55 -0000 @@ -760,31 +760,64 @@ int ConvertUsingEncoderAndDecoder(const else { unichars = localbuf; unicharLength = klocalbufsize+1; } if (unichars == nsnull) { rv = NS_ERROR_OUT_OF_MEMORY; } else { - // convert to unicode - rv = decoder->Convert(stringToUse, &srcLen, unichars, &unicharLength); + // convert to unicode, replacing failed chars with 0xFFFD as in + // the methode used in nsXMLHttpRequest::ConvertBodyToText and nsScanner::Append + const char * inBuffer = stringToUse; + PRInt32 totalChars = 0, + outBufferIndex = 0, + outLen = unicharLength; + PRInt32 dataLen = srcLen; + + do { + PRInt32 inBufferLength = dataLen; + rv = decoder->Convert(inBuffer, + &inBufferLength, + &unichars[outBufferIndex], + &outLen); + totalChars += outLen; + if (NS_FAILED(rv)) { + // We consume one byte, replace it with U+FFFD + // and try the conversion again. + unichars[outBufferIndex + outLen++] = (PRUnichar)0xFFFD; + outBufferIndex += outLen; + outLen = unicharLength - (++totalChars); + + decoder->Reset(); + + if((inBufferLength + 1) > dataLen) { + inBufferLength = dataLen; + } else { + inBufferLength++; + } + + inBuffer = &inBuffer[inBufferLength]; + dataLen -= inBufferLength; + } + } while ( NS_FAILED(rv) && (dataLen > 0) ); + if (NS_SUCCEEDED(rv)) { - rv = encoder->GetMaxLength(unichars, unicharLength, &dstLength); + rv = encoder->GetMaxLength(unichars, totalChars, &dstLength); // allocale an output buffer dstPtr = (char *) PR_Malloc(dstLength + 1); if (dstPtr == nsnull) { rv = NS_ERROR_OUT_OF_MEMORY; } else { PRInt32 buffLength = dstLength
Attachment #146932 -
Attachment description: Patch that does the change in mimemoz2.cpp instead → Patch that does the change in mimemoz2.cpp instead
The patch does not need to be changed to apply to the 1.7 branch
Assignee | ||
Comment 34•20 years ago
|
||
Next time I'll remember what "edit attachment as comment" really means :-( I hope jshin has some time to review the new version now. This patch can be applied to trunk and 1.7 branch as is.
Attachment #146932 -
Attachment is obsolete: true
Assignee | ||
Updated•20 years ago
|
Attachment #147329 -
Flags: superreview?(bienvenu)
Attachment #147329 -
Flags: review?(jshin)
Assignee | ||
Updated•20 years ago
|
Attachment #146932 -
Flags: superreview?(bienvenu)
Attachment #146932 -
Flags: review?(jshin)
Comment 35•20 years ago
|
||
Comment on attachment 147329 [details] [diff] [review] Patch that does the change in mimemoz2.cpp instead r=jshin >+ const char * inBuffer = stringToUse; >+ PRInt32 totalChars = 0, >+ outBufferIndex = 0, >+ outLen = unicharLength; >+ PRInt32 dataLen = srcLen; >+ >+ do { >+ PRInt32 inBufferLength = dataLen; >+ rv = decoder->Convert(inBuffer, >+ &inBufferLength, >+ &unichars[outBufferIndex], >+ &outLen); >+ totalChars += outLen; >+ if (NS_FAILED(rv)) { >+ // We consume one byte, replace it with U+FFFD >+ // and try the conversion again. >+ unichars[outBufferIndex + outLen++] = (PRUnichar)0xFFFD; |PRUnichar(0xFFFD)| seems better. >+ outBufferIndex += outLen; >+ outLen = unicharLength - (++totalChars); It might be nice to add some comments about totalChars, dataLen, etc.
Comment 36•20 years ago
|
||
If I'm reading this correctly, if((inBufferLength + 1) > dataLen) { inBufferLength = dataLen; } else { inBufferLength++; could be: if ((inBufferLength + 1) > dataLen) break; inBufferLength++; is that correct? Other than that, it looks fine to me.
Assignee | ||
Comment 37•20 years ago
|
||
The comments tempted me into rewriting the loop, especially bienvenu's comment that it's OK to use 'break'. New version is a bit simpler, has more comment, does not call decoder->Convert for a zero length string if the last byte is invalid (as the code does a decoder->Reset() before, it really was not useful). The code was dependant on the fact that call would return NS_SUCCEEDED. I tested border cases (last character of the line invalid, all characters in the line invalid, format=flowed) A "if (NS_SUCCEEDED(rv)" test goes out, and this changes the indentation of the encoder-> part, but the lines did not change at all truly.
Attachment #147329 -
Attachment is obsolete: true
Assignee | ||
Comment 38•20 years ago
|
||
Preceeding version was not the right patch file :-( comment #37 still applies about simplifying the loop, adding comments.
Assignee | ||
Updated•20 years ago
|
Attachment #147732 -
Attachment is obsolete: true
Assignee | ||
Updated•20 years ago
|
Attachment #147329 -
Flags: superreview?(bienvenu)
Attachment #147329 -
Flags: review?(jshin)
Assignee | ||
Updated•20 years ago
|
Attachment #147733 -
Flags: superreview?(bienvenu)
Attachment #147733 -
Flags: review?(jshin)
Comment 39•20 years ago
|
||
Comment on attachment 147733 [details] [diff] [review] New version of the patch r=jshin sorry for the delay
Attachment #147733 -
Flags: review?(jshin) → review+
Assignee | ||
Comment 40•20 years ago
|
||
Bienvenu, can you SR this patch or forward to someone who can ? Too bad 1.8a1 is already out (I should have realized this was coming !), but at least let's not loose more time ... Scott, in my opinion this patch should go in aviary 1.0 branch too, but I don't know what is the proper way to request that. Maybe you can do the SR if bienvenu can not.
Updated•20 years ago
|
Attachment #147733 -
Flags: superreview?(bienvenu) → superreview+
Comment 41•20 years ago
|
||
thanks a lot for the heads up. I checked this into the aviary 1.0 branch
Whiteboard: fixed-aviary1.0
Assignee | ||
Comment 42•20 years ago
|
||
OK, I could verify the fix in the 2004-06-01-02-0.7 TB nightly. Who can check in trunk ?
Assignee | ||
Comment 43•20 years ago
|
||
*** Bug 243747 has been marked as a duplicate of this bug. ***
Comment 44•20 years ago
|
||
Ok. I can check that into the trunk. How about 1.7branch? Has it been taken care of? Apparently not because even a1.7 was not asked for, yet.
Assignee | ||
Comment 45•20 years ago
|
||
Comment on attachment 147733 [details] [diff] [review] New version of the patch Requesting approval for 1.7
Attachment #147733 -
Flags: approval1.7?
Comment 46•20 years ago
|
||
Comment on attachment 147733 [details] [diff] [review] New version of the patch a=chofmann for 1.7
Attachment #147733 -
Flags: approval1.7? → approval1.7+
Comment 47•20 years ago
|
||
fix checked into 1.7 branch
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Whiteboard: fixed-aviary1.0 → fixed-aviary1.0,fixed1.7
Updated•20 years ago
|
Product: MailNews → Core
Updated•16 years ago
|
Product: Core → MailNews Core
You need to log in
before you can comment on or make changes to this bug.
Description
•