Closed Bug 1251120 Opened 7 years ago Closed 7 years ago

crash in nsMsgI18NConvertFromUnicode

Categories

(Thunderbird :: General, defect)

x86
Windows NT
defect
Not set
critical

Tracking

(thunderbird45+ fixed, thunderbird46 fixed, thunderbird47 fixed, thunderbird48 fixed)

RESOLVED FIXED
Thunderbird 48.0
Tracking Status
thunderbird45 + fixed
thunderbird46 --- fixed
thunderbird47 --- fixed
thunderbird48 --- fixed

People

(Reporter: wsmwk, Assigned: jorgk-bmo)

References

Details

(Keywords: crash, regression, topcrash-thunderbird, Whiteboard: [regression:TB45])

Crash Data

Attachments

(1 file, 4 obsolete files)

#4 crash for 45.0b1.  I anticpate same for 45.0b2.

near zero crash rate for other  channels.
big uptick in crash rate starts with 45.0a1 buildid 20151119030239 

bp-6265c5dd-a251-4ca3-ae17-78c3d2160222 (gico)
is typical whacko stack
0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)	c:/builds/moz2_slave/tb-rel-c-beta-w32_bld-00000000/build/mailnews/base/util/nsMsgI18N.cpp:106
1 	msvcrt.dll	_atan_pentium4	

others ...

bp-f5d37a11-f994-4f3e-b5b6-253892160222
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)
1 	freebl3.dll	camellia_decrypt256
2 	windowscodecs.dll	CIFDPaddingManager::WritePaddingToStream(IStream*, int, unsigned long*)

bp-aa022582-881b-4f31-8d4c-7bc272151103 41.0b2
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool)	c:/builds/moz2_slave/tb-rel-c-beta-w32_bld-00000000/build/mailnews/base/util/nsMsgI18N.cpp:93
1 	dwrite.dll	_SEH_epilog4_GS	

less whacko
bp-aac9188c-17df-4ef0-a8fb-d8ef22151124 45.0a1
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)	c:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgI18N.cpp:106
1 	xul.dll	nsMsgCompose::SendMsg(int, nsIMsgIdentity*, char const*, nsIMsgWindow*, nsIMsgProgress*)	c:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/compose/src/nsMsgCompose.cpp:1177
2 	xul.dll	XPTC__InvokebyIndex	xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_64.asm
still #4 crash for beta 45. I've sent a reproducible testcase to magnus and kent.  So would be great to fix for 45.0

With the email in an imap folder, edit message as new (it should have 2 attachments). Change recipient and click send. crash. email gets to recipient without the 2 attachments.

No crash if email opened as new from local folder. No crash with version 38.

The person who supplied the testcase said he crashes if he *replies* to the email in an imap folder. I could not make it crash with those steps
oldest 45.0a1 crash I find is 20151118030204 bp-7a80ad15-d5df-473b-a36a-4999c2151124 - increases perhaps start in that time period ~2015-11-18
checked that it also crashes nightly bp-8d0c9adf-e40f-4950-9c38-e86892160304

And correction to comment 1, email does NOT email gets to recipient.
Can't reproduce, and the crash id line is off...
(In reply to Magnus Melin from comment #4)
> Can't reproduce, and the crash id line is off...

You mean c:/builds/moz2_slave/tb-c-cen-w32-ntly-000000000000/build/mailnews/base/util/nsMsgI18N.cpp:106  ?

I narrowed the regression range to 45.0a1 2015-11-14 works, 2015-11-15 fails.
http://hg.mozilla.org/comm-central/pushloghtml?startdate=2015-11-14%2003:05:00&enddate=2015-11-15%2003:05:00
so suspect http://hg.mozilla.org/comm-central/rev/15454eb97bbe bug 1202401

revised steps
- start in safe mode, never sent any email, no saved passwords
- email in an imap folder
- edit message as new (it should have 2 attachments)
- Change recipient and click send. 
crash before geting smtp password prompt
Blocks: 1202401
Whiteboard: [regression:TB45?] → [regression:TB45]
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #5)
> (In reply to Magnus Melin from comment #4)
> > Can't reproduce, and the crash id line is off...
> 
> You mean
> c:/builds/moz2_slave/tb-c-cen-w32-ntly-000000000000/build/mailnews/base/util/
> nsMsgI18N.cpp:106  ?

Yes - http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgI18N.cpp#106

> revised steps
> - start in safe mode, never sent any email, no saved passwords
> - email in an imap folder
> - edit message as new (it should have 2 attachments)
> - Change recipient and click send. 
> crash before geting smtp password prompt

Can't reproduce, even with no saved smtp password.
If you can reproduce, what are your charset settings for incoming/outgoing?
Send me the test case please.
Flags: needinfo?(vseerror)
Well, this is interesting. My viewing text encoding was set to chinese simplifed. I changed it to unicode, ruan same steps, no crash
Flags: needinfo?(vseerror)
Wayne, can you give the exact settings you have for incoming/outgoing encoding. 
And does it matter if the password is saved or not?
Flags: needinfo?(vseerror)
password logged in or not is not a necessary step. I mention it only to clearly demonstrate that the crash is during the message composition, not the sending.

revised steps
- start in safe mode never sent any email
- click the email in an imap folder
- set character encoding to chinese simplified
- edit message as new (it should have 2 attachments)
- Change recipient and sender
- click send. 

Note, I can't explain it, but if you first successfully send a message using another character encoding, then use the steps above, the message WILL send OK.  mail.compose.max_recycled_windows 1 and 0 both fail
Flags: needinfo?(vseerror)
Wayne, I'm sure you need a special message to make this crash, right? Can you please make this message accessible to me. I tried an e-mail with two attachments from an IMAP folder, viewing it in "Chinese, Simplified", edit as new, change sender/recipient, sent, and no crash. Surely this depends on the encoding of the message you're editing as new.
I've emailed it again - should be in your inbox.
Still can't reproduce on linux.
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #13)
> I've emailed it again - should be in your inbox.
Well, nice of you that you mailed it twice, but both messages went to my SPAM folder.

Now I've got the message. Basically, the attachment was a mailbox with one message in it.

I changed the ">From" to a "From" in the first line and stored the mailbox as a *local folder*.

Then I replied to the messages and upon sending it, it crashes straight way with my local compile. No fancy "edit as new" or "safe mode". Oh, all is as described: Message has two attachments and shows as "Chinese, Simplified".

Sadly, not so easy to debug. I get this:
A buffer overrun has occurred in thunderbird.exe which has corrupted the program's internal state. Press Break to debug the program or Continue to terminate the program. For more details please see Help topic 'How to debug Buffer Overrun Issues'.
with a break here at the end of nsresult nsMsgI18NConvertFromUnicode().

  rv = encoder->Finish(localbuf, &dstLength);
  if (NS_SUCCEEDED(rv)) {
    outString.Append(localbuf, dstLength);
    return !mappingFailure ? rv: NS_ERROR_UENC_NOMAPPING;
  }
  return rv;
} <<<<<=====

The exception is thrown indicating this line. So what most likely happens is that some buffer gets overwritten and the stack is corrupted and the program can even return.
Attached patch Proposed fix (v1). (obsolete) — Splinter Review
Ugly, ugly, ugly. You give it a buffer and a length, and it overruns the buffer. On different systems that may not occur or the one byte overrun might not be critical. On Windows it's absolutely fatal.

Making the buffer one byte longer gets rid of the clearly very reproducible crash.

Before we call finish, we must reset the buffer length to the original length, that was also wrong.
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Attachment #8729229 - Flags: review?(mkmelin+mozilla)
Attached patch Proposed fix (v2). (obsolete) — Splinter Review
Ugly, ugly, ugly. I filled the buffer at the end to see by how much it would be overrun and it overran by two bytes. So give it ten bytes more and keep our fingers crossed.

We should follow up with an M-C bug. Maybe I don't have all that much confidence that the conversion was actually right :-(
Attachment #8729229 - Attachment is obsolete: true
Attachment #8729229 - Flags: review?(mkmelin+mozilla)
Attachment #8729241 - Flags: review?(mkmelin+mozilla)
Magnus, if you want to play around with it, just set
localbuf[512]='M';
localbuf[513]='a';
localbuf[514]='g';
localbuf[515]='n';
localbuf[516]='u';
localbuf[517]='s';
and see how much is overwritten on Linux. On Windows I get two bytes overrun.
Attached patch Diagnostic patch (obsolete) — Splinter Review
I wanted to know how often the buffer is overrun so I added some debug.

This is the result:
The buffer is overrun one with the string ",2" as one can see in the debug here:
===== Returned length 514
===== Buffer from index 450 |2-23?0,200:10</div>
                        <div><b>To:</b>?0,2gnu|
===== Buffer from index 0 |<a
                            href="mailto:alici|

The "Magnus" I placed at the end is changed to ",2gnu".

No information is lost, since Convert() tells us that it returned more than the buffer can hold. Good one.

So my patch is OK to use to fix the crash and we need to tell the M-C guys to fix their stuff.
Attached patch Proposed fix/workaround (v3). (obsolete) — Splinter Review
OK, I've added a warning. So we now see
[6328] WARNING: encoder->Convert() returned 514 characters. Limit = 512: file c:/mozilla-source/comm-central/mailnews/base/util/nsMsgI18N.cpp, line 94
instead of a crash.
Attachment #8729241 - Attachment is obsolete: true
Attachment #8729241 - Flags: review?(mkmelin+mozilla)
Attachment #8729378 - Flags: review?(mkmelin+mozilla)
I'm not sure that all the paths in
https://dxr.mozilla.org/mozilla-central/source/intl/uconv/ucvcn/nsUnicodeToGBK.cpp#186
check before they write into the buffer.

Simon, we have a case here were we ask for 512 bytes and get 514. That overruns our buffer and crashes on some platforms.
Flags: needinfo?(smontagu)
Comment on attachment 8729378 [details] [diff] [review]
Proposed fix/workaround (v3).

Review of attachment 8729378 [details] [diff] [review]:
-----------------------------------------------------------------

Great work! r+=me with some nits.

::: mailnews/base/util/nsMsgI18N.cpp
@@ +77,5 @@
>    int32_t originalUnicharLength = inString.Length();
>    int32_t srcLength;
>    int32_t dstLength;
> +  char localbuf[512+10]; // We have seen cases were the buffer was overrun
> +                         // by two (!!) bytes. So give it ten bytes more.

Please file the m-c bug, and reference the bug number here. That makes it easier to remove this workaround when the m-c bug is fixed.

@@ +87,5 @@
>    while (consumedLen < originalUnicharLength) {
>      srcLength = originalUnicharLength - consumedLen;  
>      dstLength = 512;
>      rv = encoder->Convert(currentSrcPtr, &srcLength, localbuf, &dstLength);
> +    if (dstLength > 512) {

You don't want the sprintf in release builds since the warning will not print, so wrap this entire block in #ifdef DEBUG
Attachment #8729378 - Flags: review?(mkmelin+mozilla) → review+
Depends on: 1255863
To be continued in bug 1255863.
Flags: needinfo?(smontagu)
Done. Thanks for the quick review.
Attachment #8729374 - Attachment is obsolete: true
Attachment #8729378 - Attachment is obsolete: true
Attachment #8729649 - Flags: review+
Keywords: checkin-needed
FTR: using the diagnosis patch (attachment 8729374 [details] [diff] [review]) I still don't reproduce on linux.
Hmm, what can I say. Have you printed dstLength as well? I clearly see that 514 bytes are returned.
Well, unless something was fixed in M-C in the last few days. I refreshed from M-C five days ago.
I trust you agree that my workaround is harmless and can be used to avoid a crash.
Comment on attachment 8729649 [details] [diff] [review]
Proposed fix/workaround (v3a) - review nits fixed.

[Approval Request Comment]
Regression caused by (bug #): Caused indirectly by bug 1202401, where we started to use a faulty M-C function. See bug 1255863.
User impact if declined: Nice crashes ;-)
Testing completed (on c-c, etc.): Manual.
Risk to taking this patch (and alternatives if risky):
Not risky. We just provide a larger buffer to cater for expected overrun. Manual inspections shows that using the overrun bytes makes the thing work.
Attachment #8729649 - Flags: approval-comm-release?
Attachment #8729649 - Flags: approval-comm-beta?
Attachment #8729649 - Flags: approval-comm-aurora+
I've just tried this on last week's Earlybird (TB 46) and this weeks Earlybird (TB 47). When I reply to the e-mail, I get an immediate crash.

Landed on Aurora (TB 47) for further testing and because we need it for TB 45.
https://hg.mozilla.org/releases/comm-aurora/rev/2f90e3dba2f0

C-C is currently closed.
Attachment #8729649 - Flags: approval-comm-release? → approval-comm-esr45?
A bit of further investigation sais we don't seem run through this path on linux for the testcase in question... 
(Won't investigate further.)
Are we using the same test case? I'm using  a message "Subject: Re: Re: outdoor furniture price list from Origin Furnishings". The mailbox is 30149 lines long and has 2344941 bytes. charset="GB18030".
Target Milestone: --- → Thunderbird 48.0
I guess I was too tired and forgot to do enough of the steps to reproduce. Tried it now and I do get one case where length is 513 (so one too much). It just doesn't result in any crash on linux.
Without looking at the decoder in much detail, I think it overruns by at most one "character" (so 1-4? bytes) of the target encoding. I think with 10 extra bytes we're on the safe side until M-C fix the bug. Whether or not it crashes of course depends on the memory layout. As I said, all Windows versions I tried crashed on a simple reply to the message, no fancy steps necessary.
I'll land it on C-C myself using a transplant from Aurora.
Keywords: checkin-needed
Comment on attachment 8729649 [details] [diff] [review]
Proposed fix/workaround (v3a) - review nits fixed.

http://hg.mozilla.org/releases/comm-beta/rev/d67233a60c72
http://hg.mozilla.org/releases/comm-esr45/rev/60354f9683ed
Attachment #8729649 - Flags: approval-comm-esr45?
Attachment #8729649 - Flags: approval-comm-esr45+
Attachment #8729649 - Flags: approval-comm-beta?
Attachment #8729649 - Flags: approval-comm-beta+
Whiteboard: [regression:TB45] → [regression:TB45][checkin-needed comm-central]
Whiteboard: [regression:TB45][checkin-needed comm-central] → [regression:TB45]
https://hg.mozilla.org/comm-central/rev/8a7440769df9
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Although the buffer overrun is indeed our (m-c) fault, the conversion code is also wrong. Either handle NS_OK_UENC_MOREOUTPUT or use GetMaxLength(). You don't have to wait for m-c fix if you use GetMaxLength().
Actually, we analysed this in bug 1263088 and the C-C conversion code is correct.
You need to log in before you can comment on or make changes to this bug.