crash in nsMsgI18NConvertFromUnicode

RESOLVED FIXED in Thunderbird 48.0

Status

Thunderbird
General
--
critical
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: wsmwk, Assigned: Jorg K (GMT+2))

Tracking

({crash, regression, topcrash-thunderbird})

unspecified
Thunderbird 48.0
x86
Windows NT
crash, regression, topcrash-thunderbird
Dependency tree / graph

Thunderbird Tracking Flags

(thunderbird45+ fixed, thunderbird46 fixed, thunderbird47 fixed, thunderbird48 fixed)

Details

(Whiteboard: [regression:TB45], crash signature)

Attachments

(1 attachment, 4 obsolete attachments)

#4 crash for 45.0b1.  I anticpate same for 45.0b2.

near zero crash rate for other  channels.
big uptick in crash rate starts with 45.0a1 buildid 20151119030239 

bp-6265c5dd-a251-4ca3-ae17-78c3d2160222 (gico)
is typical whacko stack
0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)	c:/builds/moz2_slave/tb-rel-c-beta-w32_bld-00000000/build/mailnews/base/util/nsMsgI18N.cpp:106
1 	msvcrt.dll	_atan_pentium4	

others ...

bp-f5d37a11-f994-4f3e-b5b6-253892160222
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)
1 	freebl3.dll	camellia_decrypt256
2 	windowscodecs.dll	CIFDPaddingManager::WritePaddingToStream(IStream*, int, unsigned long*)

bp-aa022582-881b-4f31-8d4c-7bc272151103 41.0b2
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool)	c:/builds/moz2_slave/tb-rel-c-beta-w32_bld-00000000/build/mailnews/base/util/nsMsgI18N.cpp:93
1 	dwrite.dll	_SEH_epilog4_GS	

less whacko
bp-aac9188c-17df-4ef0-a8fb-d8ef22151124 45.0a1
 0 	xul.dll	nsMsgI18NConvertFromUnicode(char const*, nsString const&, nsACString_internal&, bool, bool)	c:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgI18N.cpp:106
1 	xul.dll	nsMsgCompose::SendMsg(int, nsIMsgIdentity*, char const*, nsIMsgWindow*, nsIMsgProgress*)	c:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/compose/src/nsMsgCompose.cpp:1177
2 	xul.dll	XPTC__InvokebyIndex	xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_64.asm
still #4 crash for beta 45. I've sent a reproducible testcase to magnus and kent.  So would be great to fix for 45.0

With the email in an imap folder, edit message as new (it should have 2 attachments). Change recipient and click send. crash. email gets to recipient without the 2 attachments.

No crash if email opened as new from local folder. No crash with version 38.

The person who supplied the testcase said he crashes if he *replies* to the email in an imap folder. I could not make it crash with those steps
status-thunderbird45: --- → affected
status-thunderbird46: --- → affected
tracking-thunderbird45: --- → ?
Keywords: regression, regressionwindow-wanted
oldest 45.0a1 crash I find is 20151118030204 bp-7a80ad15-d5df-473b-a36a-4999c2151124 - increases perhaps start in that time period ~2015-11-18
checked that it also crashes nightly bp-8d0c9adf-e40f-4950-9c38-e86892160304

And correction to comment 1, email does NOT email gets to recipient.

Comment 4

a year ago
Can't reproduce, and the crash id line is off...
(In reply to Magnus Melin from comment #4)
> Can't reproduce, and the crash id line is off...

You mean c:/builds/moz2_slave/tb-c-cen-w32-ntly-000000000000/build/mailnews/base/util/nsMsgI18N.cpp:106  ?

I narrowed the regression range to 45.0a1 2015-11-14 works, 2015-11-15 fails.
http://hg.mozilla.org/comm-central/pushloghtml?startdate=2015-11-14%2003:05:00&enddate=2015-11-15%2003:05:00
so suspect http://hg.mozilla.org/comm-central/rev/15454eb97bbe bug 1202401

revised steps
- start in safe mode, never sent any email, no saved passwords
- email in an imap folder
- edit message as new (it should have 2 attachments)
- Change recipient and click send. 
crash before geting smtp password prompt
Blocks: 1202401
Keywords: regressionwindow-wanted
Whiteboard: [regression:TB45?] → [regression:TB45]

Comment 6

a year ago
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #5)
> (In reply to Magnus Melin from comment #4)
> > Can't reproduce, and the crash id line is off...
> 
> You mean
> c:/builds/moz2_slave/tb-c-cen-w32-ntly-000000000000/build/mailnews/base/util/
> nsMsgI18N.cpp:106  ?

Yes - http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgI18N.cpp#106

> revised steps
> - start in safe mode, never sent any email, no saved passwords
> - email in an imap folder
> - edit message as new (it should have 2 attachments)
> - Change recipient and click send. 
> crash before geting smtp password prompt

Can't reproduce, even with no saved smtp password.

Comment 7

a year ago
If you can reproduce, what are your charset settings for incoming/outgoing?
(Assignee)

Comment 8

a year ago
Send me the test case please.
Flags: needinfo?(vseerror)
Well, this is interesting. My viewing text encoding was set to chinese simplifed. I changed it to unicode, ruan same steps, no crash
Flags: needinfo?(vseerror)

Comment 10

a year ago
Wayne, can you give the exact settings you have for incoming/outgoing encoding. 
And does it matter if the password is saved or not?
Flags: needinfo?(vseerror)
password logged in or not is not a necessary step. I mention it only to clearly demonstrate that the crash is during the message composition, not the sending.

revised steps
- start in safe mode never sent any email
- click the email in an imap folder
- set character encoding to chinese simplified
- edit message as new (it should have 2 attachments)
- Change recipient and sender
- click send. 

Note, I can't explain it, but if you first successfully send a message using another character encoding, then use the steps above, the message WILL send OK.  mail.compose.max_recycled_windows 1 and 0 both fail
Flags: needinfo?(vseerror)
(Assignee)

Comment 12

a year ago
Wayne, I'm sure you need a special message to make this crash, right? Can you please make this message accessible to me. I tried an e-mail with two attachments from an IMAP folder, viewing it in "Chinese, Simplified", edit as new, change sender/recipient, sent, and no crash. Surely this depends on the encoding of the message you're editing as new.
I've emailed it again - should be in your inbox.

Comment 14

a year ago
Still can't reproduce on linux.
(Assignee)

Comment 15

a year ago
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #13)
> I've emailed it again - should be in your inbox.
Well, nice of you that you mailed it twice, but both messages went to my SPAM folder.

Now I've got the message. Basically, the attachment was a mailbox with one message in it.

I changed the ">From" to a "From" in the first line and stored the mailbox as a *local folder*.

Then I replied to the messages and upon sending it, it crashes straight way with my local compile. No fancy "edit as new" or "safe mode". Oh, all is as described: Message has two attachments and shows as "Chinese, Simplified".

Sadly, not so easy to debug. I get this:
A buffer overrun has occurred in thunderbird.exe which has corrupted the program's internal state. Press Break to debug the program or Continue to terminate the program. For more details please see Help topic 'How to debug Buffer Overrun Issues'.
with a break here at the end of nsresult nsMsgI18NConvertFromUnicode().

  rv = encoder->Finish(localbuf, &dstLength);
  if (NS_SUCCEEDED(rv)) {
    outString.Append(localbuf, dstLength);
    return !mappingFailure ? rv: NS_ERROR_UENC_NOMAPPING;
  }
  return rv;
} <<<<<=====

The exception is thrown indicating this line. So what most likely happens is that some buffer gets overwritten and the stack is corrupted and the program can even return.
(Assignee)

Comment 16

a year ago
Created attachment 8729229 [details] [diff] [review]
Proposed fix (v1).

Ugly, ugly, ugly. You give it a buffer and a length, and it overruns the buffer. On different systems that may not occur or the one byte overrun might not be critical. On Windows it's absolutely fatal.

Making the buffer one byte longer gets rid of the clearly very reproducible crash.

Before we call finish, we must reset the buffer length to the original length, that was also wrong.
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Attachment #8729229 - Flags: review?(mkmelin+mozilla)
(Assignee)

Comment 17

a year ago
Created attachment 8729241 [details] [diff] [review]
Proposed fix (v2).

Ugly, ugly, ugly. I filled the buffer at the end to see by how much it would be overrun and it overran by two bytes. So give it ten bytes more and keep our fingers crossed.

We should follow up with an M-C bug. Maybe I don't have all that much confidence that the conversion was actually right :-(
Attachment #8729229 - Attachment is obsolete: true
Attachment #8729229 - Flags: review?(mkmelin+mozilla)
Attachment #8729241 - Flags: review?(mkmelin+mozilla)
(Assignee)

Comment 18

a year ago
Magnus, if you want to play around with it, just set
localbuf[512]='M';
localbuf[513]='a';
localbuf[514]='g';
localbuf[515]='n';
localbuf[516]='u';
localbuf[517]='s';
and see how much is overwritten on Linux. On Windows I get two bytes overrun.
(Assignee)

Comment 19

a year ago
Created attachment 8729374 [details] [diff] [review]
Diagnostic patch

I wanted to know how often the buffer is overrun so I added some debug.

This is the result:
The buffer is overrun one with the string ",2" as one can see in the debug here:
===== Returned length 514
===== Buffer from index 450 |2-23?0,200:10</div>
                        <div><b>To:</b>?0,2gnu|
===== Buffer from index 0 |<a
                            href="mailto:alici|

The "Magnus" I placed at the end is changed to ",2gnu".

No information is lost, since Convert() tells us that it returned more than the buffer can hold. Good one.

So my patch is OK to use to fix the crash and we need to tell the M-C guys to fix their stuff.
(Assignee)

Comment 20

a year ago
Created attachment 8729378 [details] [diff] [review]
Proposed fix/workaround (v3).

OK, I've added a warning. So we now see
[6328] WARNING: encoder->Convert() returned 514 characters. Limit = 512: file c:/mozilla-source/comm-central/mailnews/base/util/nsMsgI18N.cpp, line 94
instead of a crash.
Attachment #8729241 - Attachment is obsolete: true
Attachment #8729241 - Flags: review?(mkmelin+mozilla)
Attachment #8729378 - Flags: review?(mkmelin+mozilla)
(Assignee)

Comment 21

a year ago
I'm not sure that all the paths in
https://dxr.mozilla.org/mozilla-central/source/intl/uconv/ucvcn/nsUnicodeToGBK.cpp#186
check before they write into the buffer.

Simon, we have a case here were we ask for 512 bytes and get 514. That overruns our buffer and crashes on some platforms.
Flags: needinfo?(smontagu)
Comment on attachment 8729378 [details] [diff] [review]
Proposed fix/workaround (v3).

Review of attachment 8729378 [details] [diff] [review]:
-----------------------------------------------------------------

Great work! r+=me with some nits.

::: mailnews/base/util/nsMsgI18N.cpp
@@ +77,5 @@
>    int32_t originalUnicharLength = inString.Length();
>    int32_t srcLength;
>    int32_t dstLength;
> +  char localbuf[512+10]; // We have seen cases were the buffer was overrun
> +                         // by two (!!) bytes. So give it ten bytes more.

Please file the m-c bug, and reference the bug number here. That makes it easier to remove this workaround when the m-c bug is fixed.

@@ +87,5 @@
>    while (consumedLen < originalUnicharLength) {
>      srcLength = originalUnicharLength - consumedLen;  
>      dstLength = 512;
>      rv = encoder->Convert(currentSrcPtr, &srcLength, localbuf, &dstLength);
> +    if (dstLength > 512) {

You don't want the sprintf in release builds since the warning will not print, so wrap this entire block in #ifdef DEBUG
Attachment #8729378 - Flags: review?(mkmelin+mozilla) → review+
(Assignee)

Updated

a year ago
Depends on: 1255863
(Assignee)

Comment 23

a year ago
To be continued in bug 1255863.
Flags: needinfo?(smontagu)
(Assignee)

Comment 24

a year ago
Created attachment 8729649 [details] [diff] [review]
Proposed fix/workaround (v3a) - review nits fixed.

Done. Thanks for the quick review.
Attachment #8729374 - Attachment is obsolete: true
Attachment #8729378 - Attachment is obsolete: true
Attachment #8729649 - Flags: review+
(Assignee)

Updated

a year ago
Keywords: checkin-needed

Comment 25

a year ago
FTR: using the diagnosis patch (attachment 8729374 [details] [diff] [review]) I still don't reproduce on linux.
(Assignee)

Comment 26

a year ago
Hmm, what can I say. Have you printed dstLength as well? I clearly see that 514 bytes are returned.
Well, unless something was fixed in M-C in the last few days. I refreshed from M-C five days ago.
I trust you agree that my workaround is harmless and can be used to avoid a crash.
(Assignee)

Updated

a year ago
status-thunderbird47: --- → affected
status-thunderbird48: --- → affected
(Assignee)

Comment 27

a year ago
Comment on attachment 8729649 [details] [diff] [review]
Proposed fix/workaround (v3a) - review nits fixed.

[Approval Request Comment]
Regression caused by (bug #): Caused indirectly by bug 1202401, where we started to use a faulty M-C function. See bug 1255863.
User impact if declined: Nice crashes ;-)
Testing completed (on c-c, etc.): Manual.
Risk to taking this patch (and alternatives if risky):
Not risky. We just provide a larger buffer to cater for expected overrun. Manual inspections shows that using the overrun bytes makes the thing work.
Attachment #8729649 - Flags: approval-comm-release?
Attachment #8729649 - Flags: approval-comm-beta?
Attachment #8729649 - Flags: approval-comm-aurora+
(Assignee)

Comment 28

a year ago
I've just tried this on last week's Earlybird (TB 46) and this weeks Earlybird (TB 47). When I reply to the e-mail, I get an immediate crash.

Landed on Aurora (TB 47) for further testing and because we need it for TB 45.
https://hg.mozilla.org/releases/comm-aurora/rev/2f90e3dba2f0

C-C is currently closed.
status-thunderbird47: affected → fixed
tracking-thunderbird45: ? → +
(Assignee)

Updated

a year ago
Attachment #8729649 - Flags: approval-comm-release? → approval-comm-esr45?

Comment 29

a year ago
A bit of further investigation sais we don't seem run through this path on linux for the testcase in question... 
(Won't investigate further.)
(Assignee)

Comment 30

a year ago
Are we using the same test case? I'm using  a message "Subject: Re: Re: outdoor furniture price list from Origin Furnishings". The mailbox is 30149 lines long and has 2344941 bytes. charset="GB18030".

Updated

a year ago
Target Milestone: --- → Thunderbird 48.0
(Assignee)

Comment 31

a year ago
Since this was landed on Aurora (comment #28) I've tried
http://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-aurora/thunderbird-47.0a2.en-US.win32.installer.exe
Works as expected.

Comment 32

a year ago
I guess I was too tired and forgot to do enough of the steps to reproduce. Tried it now and I do get one case where length is 513 (so one too much). It just doesn't result in any crash on linux.
(Assignee)

Comment 33

a year ago
Without looking at the decoder in much detail, I think it overruns by at most one "character" (so 1-4? bytes) of the target encoding. I think with 10 extra bytes we're on the safe side until M-C fix the bug. Whether or not it crashes of course depends on the memory layout. As I said, all Windows versions I tried crashed on a simple reply to the message, no fancy steps necessary.
(Assignee)

Comment 34

a year ago
I'll land it on C-C myself using a transplant from Aurora.
Keywords: checkin-needed
Comment on attachment 8729649 [details] [diff] [review]
Proposed fix/workaround (v3a) - review nits fixed.

http://hg.mozilla.org/releases/comm-beta/rev/d67233a60c72
http://hg.mozilla.org/releases/comm-esr45/rev/60354f9683ed
Attachment #8729649 - Flags: approval-comm-esr45?
Attachment #8729649 - Flags: approval-comm-esr45+
Attachment #8729649 - Flags: approval-comm-beta?
Attachment #8729649 - Flags: approval-comm-beta+

Updated

a year ago
status-thunderbird45: affected → fixed
status-thunderbird46: affected → fixed
Whiteboard: [regression:TB45] → [regression:TB45][checkin-needed comm-central]

Updated

a year ago
Whiteboard: [regression:TB45][checkin-needed comm-central] → [regression:TB45]
(Assignee)

Comment 36

a year ago
https://hg.mozilla.org/comm-central/rev/8a7440769df9
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-thunderbird48: affected → fixed
Resolution: --- → FIXED
Although the buffer overrun is indeed our (m-c) fault, the conversion code is also wrong. Either handle NS_OK_UENC_MOREOUTPUT or use GetMaxLength(). You don't have to wait for m-c fix if you use GetMaxLength().
(Assignee)

Comment 38

a year ago
Actually, we analysed this in bug 1263088 and the C-C conversion code is correct.
You need to log in before you can comment on or make changes to this bug.