Closed Bug 468351 Opened 16 years ago Closed 15 years ago

display of header values with unencoded special characters broken

Categories

(MailNews Core :: Internationalization, defect)

defect
Not set
normal

Tracking

(status1.9.1 .4-fixed)

RESOLVED FIXED
Thunderbird 3.0b4
Tracking Status
status1.9.1 --- .4-fixed

People

(Reporter: mnyromyr, Assigned: emk)

References

(Depends on 1 open bug)

Details

(Keywords: regression)

Attachments

(4 files, 2 obsolete files)

Since bug 254519 is fixed, header values containing unencoded special characters (eg. umlauts) break at the first of these characters in the headerpane display. 
Thunderbird doesn't seem to be affected.
Note: unencoded special characters are not allowed in the headers anyway.
Yes, but we did "guess" before what was meant; and even without guessing we should replace illegal characters by � (u+FFFD)...
...and not drop the rest of the string.
Thunderbird trunk is affected as well.
Steps to reproduce:
- Drop the attached mbox into your eg. Local Folders folder.
- Set your default display encoding to UTF-8.
- Get a message with unencoded ISO-8859-1 characters like the one attached.
  - The Subject: in threadpane and headerpane shows only �s from the first
    illegal byte onward instead of just replacing the offending byte.
  - The From: is blank in the threadpane; the From: is cut off at the first
    illegal byte in the headerpane.
  - The illegal umlauts in the body are correctly replaced by one � each.
The frontend gets passed already broken stuff, a debug build will show:

###!!! ASSERTION: not a UTF8 string: 'Error', file ../../dist/include/string/nsUTF8Utils.h, line 130
###!!! ASSERTION: length mismatch: 'calculator.Length() == converter.Length()', file /home/kd/projekte/mozilla/mozilla.org/src/trunk/mozilla/xpcom/string/src/nsReadableUtils.cpp, line 444
###!!! ASSERTION: Not a UTF-8 string. This code should only be used for converting from known UTF-8 strings.: 'Error', file ../../../dist/include/string/nsUTF8Utils.h, line 516
###!!! ASSERTION: not a UTF8 string: 'Error', file ../../dist/include/string/nsUTF8Utils.h, line 130
###!!! ASSERTION: Input wasn't UTF8 or incorrect length was calculated: 'Error', file /home/kd/projekte/mozilla/mozilla.org/src/trunk/mozilla/xpcom/string/src/nsReadableUtils.cpp, line 287
Assignee: nobody → smontagu
Component: MailNews: Message Display → Internationalization
Product: SeaMonkey → MailNews Core
QA Contact: message-display → mailnews.i18n
Bug 471423 also has an unencoded Big5 testcase.
I'm currently investigating the issue.
Assignee: smontagu → mnyromyr
(In reply to comment #0)
> Since bug 254519 is fixed, header values containing unencoded special
> characters (eg. umlauts) break at the first of these characters in the
> headerpane display. 

And breaks fully in the threadpane.

Screenshot:
http://www.flickr.com/photos/83837423@N00/3371877887/sizes/l/
How big of a regression is this? How visible is it to most users? Does it only happen for emails received from broken clients?
Flags: blocking-thunderbird3?
Attached image example
Here is anexample. Look at the different effect in threadpane, headerpane and compose window. I am seeing this nearly daily on newsgroups.
The regression is very visible in non-English environments where e.g. (older?) Outlook (and Express) versions are common.
I still think that the fix in bug 254519 was partly wrong (and that taught me not to trunk-port patches for others without looking closely at them, even if they are reviewed :-( ), but I haven't had the time yet to dig into this for real...
Flags: blocking-thunderbird3? → blocking-thunderbird3+
OS: Linux → All
Hardware: x86 → All
Karsten, are you still busy?
I've made a patch to fix the problem. Can I take your bug?
Sure!
Assignee: mnyromyr → VYV03354
Attached patch patch (mailnews part) (obsolete) — Splinter Review
We can't use CopyUTF8toUTF16 / NS_ConvertUTF8toUTF16 because message header may contain any octet sequence.
Also, we need to see folder charset in case charset is not available from the message itself.
Attachment #379331 - Flags: review?(bienvenu)
>  - The Subject: in threadpane and headerpane shows only �s from the first
>    illegal byte onward instead of just replacing the offending byte.
This is not a regression of bug 254519. We didn't change subject handling.
That said, I've also made a patch for this issue.
This patch is independent of the other parts of patch.
Attachment #379332 - Flags: review?(smontagu)
Status: NEW → ASSIGNED
Attachment #379332 - Flags: review?(smontagu) → review+
Attachment #379332 - Flags: superreview?(bienvenu)
Another reliable way of triggering the assertions in comment #5:

- Subscribe to news.newsfan.net, then subscribe to any Chinese newsgroup (which shows as a jumbled mix of characters)
- Subscribe to a Chinese group, then download messages.

Lots of assertions shown.

Gentle nudge to bienvenu for the review + superreview. :)
Bienvenu, could you review the patches?
Target Milestone: --- → Thunderbird 3.0b4
Attachment #379331 - Flags: superreview?(bienvenu)
Attachment #379331 - Flags: review?(neil)
Attachment #379331 - Flags: review?(bienvenu)
Comment on attachment 379331 [details] [diff] [review]
patch (mailnews part)

sorry for the delay - Neil might be better for looking at the string stuff, if he has time...
Attachment #379331 - Flags: superreview?(bienvenu)
Attachment #379331 - Flags: superreview+
Attachment #379331 - Flags: review?(neil)
Attachment #379331 - Flags: review?(bienvenu)
Comment on attachment 379331 [details] [diff] [review]
patch (mailnews part)

>       rv = aHdr->GetCharset(getter_Copies(charset));
>+      if (NS_FAILED(rv) || charset.IsEmpty() || charset.Equals("us-ascii") ||
>+          charsetOverride)
I would write this as
if (charsetOverride ||
    NS_FAILED(aHdr->GetCharset(getter_Copies(charset)) || 
    charset.IsEmpty() || 
    charset.Equals("us-ascii"))
(lines wrapped for Bugzilla readability)

>+    const char* cur = inString.get();
>+    const char* end = cur + inString.Length();
Use BeginReading() and EndReading()

>+      PRInt32 c = PRUint8(*cur++);
>+      if (c & 0x80)
char c = *cur++;
if (c & char(0x80))
[xpcom/string actually uses char(~0x7F)]

>+  nsAutoString utf16Text;
>+  nsresult rv = ConvertToUnicode(charset.get(), inString, utf16Text);
>+  if (NS_FAILED(rv))
I think I'd prefer
if (NS_SUCCEDED(rv))
{
  CopyUTF16toUTF8(utf16Text, outString);
  return;
}

>+    // EF BF BD (UTF-8 encoding of U+FFFD)
>+    static const char utf8ReplacementChar[] = "\357\277\275";
Use NS_NAMED_LITERAL_STRING

>+    ConvertRawBytesToUTF16(value.get(), opt->default_charset, output);
>+    CopyUTF16toUTF8(output, value);
Use ConvertRawBytesToUTF8

Rest looks OK to me but I didn't test it so setting sr+ instead of r+
Attachment #379331 - Attachment is obsolete: true
Attachment #389240 - Flags: superreview+
Attachment #389240 - Flags: review?(bienvenu)
Attachment #379331 - Flags: review?(bienvenu)
Comment on attachment 379332 [details] [diff] [review]
patch (core part)
[Checkin: Comment 38 & 43]

thx for the patch.
Attachment #379332 - Flags: superreview?(bienvenu) → superreview+
Attachment #389240 - Flags: review?(bienvenu) → review+
Comment on attachment 389240 [details] [diff] [review]
[backed out] (mailnews part) addressed review comment

thx for the patch!
Masatoshi, if you run a debug build, and click on this message, we get a ton of assertions in nsMimeConverter::DecodeMimeHeader() because the string isn't a UTF8 string. I'm wondering if you could look at that code and propose a fix for that assertion...
Keywords: checkin-needed
(In reply to comment #26)
> Masatoshi, if you run a debug build, and click on this message, we get a ton of
> assertions in nsMimeConverter::DecodeMimeHeader() because the string isn't a
> UTF8 string.
If you get assertions without a patch, my patch should also fix this.
If my patch doesn't help, please attach an offending message. I get no assertions using the patched build at the moment.
Does it matter if the mozilla-central and comm-central patches land separately?

Are they both needed to fix the bug?

Will we be able to get the mozilla-central part on 1.9.1?
(In reply to comment #28)
> Does it matter if the mozilla-central and comm-central patches land separately?
You can land both patches separately.
> Are they both needed to fix the bug?
The mozilla-central patch is required to fix the subject issue. The comm-central one is for all the rest.
> Will we be able to get the mozilla-central part on 1.9.1?
It can be applied to 191 cleanly.
(In reply to comment #27)

> If you get assertions without a patch, my patch should also fix this.
> If my patch doesn't help, please attach an offending message. I get no
> assertions using the patched build at the moment.

I hadn't applied the mozilla-central patch - I'm not getting any assertions anymore - the assertions were from the test case in this bug. Thx for fixing this!
Comment on attachment 389240 [details] [diff] [review]
[backed out] (mailnews part) addressed review comment

Checked in: http://hg.mozilla.org/comm-central/rev/a8ac240e2ba4
Attachment #389240 - Attachment description: addressed review comment → [checked in] (mailnews part) addressed review comment
Comment on attachment 389240 [details] [diff] [review]
[backed out] (mailnews part) addressed review comment

Backed out, this caused test failures (that look like crashes) in the mailnews tests:

TEST-UNEXPECTED-FAIL | /buildbot/linux-comm-1.9.1-check/build/objdir/mozilla/_tests/xpcshell/test_bayes/unit/test_bug228675.js | test failed (with xpcshell return code: -11), see following log:
  >>>>>>>
  ### XPCOM_MEM_LEAK_LOG defined -- logging leaks to /tmp/runxpcshelltests_leaks.log
TEST-INFO | (xpcshell/head.js) | test 1 pending
Directory request for: MailD that we (mailDirService.js) are not handling, leaving it to another handler.
Directory request for: MFCaF that we (mailDirService.js) are not handling, leaving it to another handler.
Directory request for: DefRt that we (mailDirService.js) are not handling, leaving it to another handler.
TEST-INFO | (xpcshell/head.js) | test 2 pending
Directory request for: CurWorkD that we (mailDirService.js) are not handling, leaving it to another handler.
TEST-INFO | (xpcshell/head.js) | test 2 finished
TEST-INFO | (xpcshell/head.js) | running event loop

  <<<<<<<

and several other files.
Attachment #389240 - Attachment description: [checked in] (mailnews part) addressed review comment → [backed out] (mailnews part) addressed review comment
Yep, it's a crash. Doesn't seem related to the bayes code, which the test is purportedly about. Here's a partial stack:

 	msvcr80d.dll!1023b130() 	
 	[Frames below may be incorrect and/or missing, no symbols loaded for msvcr80d.dll]	
>	mime.dll!nsCharTraits<char>::length(const char * s=0x00000000)  Line 629 + 0x9 bytes	C++
 	mime.dll!nsDependentCString::nsDependentCString(const char * data=0x00000000)  Line 90 + 0x12 bytes	C++
 	mime.dll!MimeHeaders_convert_header_value(MimeDisplayOptions * opt=0x03610f60, nsCString & value={...}, int convert_charset_only=0x00000001)  Line 76 + 0x13 bytes	C++
 	mime.dll!MimeHeaders_write_all_headers(MimeHeaders * hdrs=0x02402df8, MimeDisplayOptions * opt=0x03610f60, int attachment=0x00000000)  Line 616 + 0x11 bytes	C++
 	mime.dll!MimeMessage_write_headers_html(MimeObject * obj=0x03611200)  Line 757 + 0x15 bytes	C++
 	mime.dll!MimeMessage_close_headers(MimeObject * obj=0x03611200)  Line 428 + 0x9 bytes	C++
 	mime.dll!MimeMessage_parse_line(const char * aLine=0x023fa490, int aLength=0x00000001, MimeObject * obj=0x03611200)  Line 277 + 0x9 bytes	C++
 	mime.dll!convert_and_send_buffer(char * buf=0x023fa490, int length=0x00000001, int convert_newlines_p=0x00000001, int (char *, unsigned int, void *)* per_line_fn=0x02cccce0, void * closure=0x03611200)  Line 184 + 0xf bytes	C++
 	mime.dll!mime_LineBuffer(const char * net_buffer=0x03622b5c, int net_buffer_size=0x0000003b, char * * bufferP=0x03611228, int * buffer_sizeP=0x03611230, unsigned int * buffer_fpP=0x03611238, int convert_newlines_p=0x00000001, int (char *, unsigned int, void *)* per_line_fn=0x02cccce0, void * closure=0x03611200)  Line 272 + 0x1d bytes	C++
 	mime.dll!MimeObject_parse_buffer(const char * buffer=0x03622ac0, int size=0x000000d7, MimeObject * obj=0x03611200)  Line 275 + 0x31 bytes	C++
 	mime.dll!mime_display_stream_write(_nsMIMESession * stream=0x036112d0, const char * buf=0x03622ac0, int size=0x000000d7)  Line 949 + 0x16 bytes	C++
 	mime.dll!nsStreamConverter::OnDataAvailable(nsIRequest * request=0x0361001c, nsISupports * ctxt=0x0360f848, nsIInputStream * aIStream=0x03611390, unsigned int sourceOffset=0x00000000, unsigned int aLength=0x000000d7)  Line 957 + 0x1a bytes	C++
 	msglocal.dll!nsMailboxProtocol::ReadMessageResponse(nsIInputStream * inputStream=0x03611390, unsigned int sourceOffset=0x00000000, unsigned int length=0x000000d7)  Line 585 + 0x50 bytes	C++
 	msglocal.dll!nsMailboxProtocol::ProcessProtocolState(nsIURI * url=0x0360f850, nsIInputStream * inputStream=0x03611390, unsigned int offset=0x00000000, unsigned int length=0x000000d7)  Line 682 + 0x14 bytes	C++
 	msgbsutl.dll!nsMsgProtocol::OnDataAvailable(nsIRequest * request=0x036119f0, nsISupports * ctxt=0x0360f848, nsIInputStream * inStr=0x03611390, unsigned int sourceOffset=0x00000000, unsigned int count=0x000000d7)  Line 351 + 0x22 bytes	C++
 	necko.dll!nsInputStreamPump::OnStateTransfer()  Line 508 + 0x40 bytes	C++
 	necko.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream * stream=0x03611390)  Line 398 + 0xb bytes	C++
 	xpcom_core.dll!nsInputStreamReadyEvent::Run()  Line 112	C++
 	xpcom_core.dll!nsThread::ProcessNextEvent(int mayWait=0x00000001, int * result=0x0012bcd8)  Line 511	C++
 	xpcom_core.dll!NS_InvokeByIndex_P(nsISupports * that=0x00000008, unsigned int methodIndex=0x00000002, unsigned int paramCount=0x0012bcc8, nsXPTCVariant * params=0x02100178)  Line 102	C++
Comment on attachment 389240 [details] [diff] [review]
[backed out] (mailnews part) addressed review comment

>+void nsMsgI18NConvertRawBytesToUTF16(const nsCString& inString, 
>+                                     const nsCString& charset,
>+                                     nsAString& outString)
>+{
>+  if (IsUTF8(inString))
>+  {
>+    CopyUTF8toUTF16(inString, outString);
>+    return;
>+  }
>+
>+  nsresult rv = ConvertToUnicode(charset.get(), inString, outString);
ConvertToUnicode accepts a null charset (equivalent to ASCII), so to fix the crash we could change the signature of the second parameter of these conversion functions to const char *charset and pass charset.get() in everywhere...

>+    ConvertRawBytesToUTF8(value, nsDependentCString(opt->default_charset),
>+                          output);
...except here where you just pass opt->default_charset of course.
Changed charset parameter per comment #34.
This patch passed all mailnews tests.
Attachment #390590 - Flags: superreview?(neil)
Attachment #390590 - Flags: review?(bienvenu)
Attachment #389240 - Attachment is obsolete: true
Keywords: checkin-needed
Attachment #390590 - Flags: superreview?(neil) → superreview+
Whiteboard: [needs review bienvenu]
Comment on attachment 390590 [details] [diff] [review]
patch (mailnews part) v3
[Checkin: Comment 37]

yes, all mailnews tests pass. Thx for the new patch.
Attachment #390590 - Flags: review?(bienvenu) → review+
Keywords: checkin-needed
Whiteboard: [needs review bienvenu]
Attachment #390590 - Attachment description: patch (mailnews part) v3 → [checked in] patch (mailnews part) v3
Comment on attachment 390590 [details] [diff] [review]
patch (mailnews part) v3
[Checkin: Comment 37]

Checked in: http://hg.mozilla.org/comm-central/rev/1ad44651b264

I'll do the mozilla-central patch tomorrow.
Attachment #379332 - Attachment description: patch (core part) → [checked in to trunk] patch (core part)
Marking as fixed and adding tb3needs. The tb3needs flag will keep this on the lists so that we can hopefully drive the core patch into 1.9.1.

Masatoshi: please can you request approval on the core patch in a day or so once this has baked a little, and give the appropriate risk analysis/details that drivers need? Thanks.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [tb3needs][baking on trunk][MC-fixed][Core-fixed-on-trunk]
Attachment #379332 - Flags: approval1.9.1.2?
Comment on attachment 379332 [details] [diff] [review]
patch (core part)
[Checkin: Comment 38 & 43]

The changed code path will be executed only if aDefaultCharset is specified.
http://mxr.mozilla.org/mozilla-central/source/netwerk/mime/src/nsMIMEHeaderParamImpl.cpp#572
However, browser calls it without aDefaultCharset.
http://mxr.mozilla.org/mozilla-central/source/netwerk/mime/src/nsMIMEHeaderParamImpl.cpp#101
And this is an only caller from browser. Therefore this change should not affect Firefox. (I'm not sure why this code belongs to the core...)
Attachment #379332 - Flags: approval1.9.1.2? → approval1.9.1.3?
Comment on attachment 379332 [details] [diff] [review]
patch (core part)
[Checkin: Comment 38 & 43]

Not for 1.9.1.2.
Whiteboard: [tb3needs][baking on trunk][MC-fixed][Core-fixed-on-trunk] → [tb3needs][Core-fixed-on-trunk][needs branch approval]
Whiteboard: [tb3needs][Core-fixed-on-trunk][needs branch approval] → [tb3needs][Core-fixed-on-trunk][awaiting branch approval]
Comment on attachment 379332 [details] [diff] [review]
patch (core part)
[Checkin: Comment 38 & 43]

Approved for 1.9.1.4, a=dveditz for release-drivers
Attachment #379332 - Flags: approval1.9.1.3? → approval1.9.1.4+
Whiteboard: [tb3needs][Core-fixed-on-trunk][awaiting branch approval] → [tb3needs][Core-fixed-on-trunk][need branch landing]
Attachment #390590 - Attachment description: [checked in] patch (mailnews part) v3 → patch (mailnews part) v3 [Checkin: Comment 37]
Comment on attachment 379332 [details] [diff] [review]
patch (core part)
[Checkin: Comment 38 & 43]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/9d244916bf08
Attachment #379332 - Attachment description: [checked in to trunk] patch (core part) → patch (core part) [Checkin: Comment 38 & 43]
Keywords: checkin-needed
Whiteboard: [tb3needs][Core-fixed-on-trunk][need branch landing]
If IMAP, same problem as this bug still occurs even after fix of this bug.
Bug 513472 is IMAP version of this bug.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: