Crash in nsMapiHook::PopulateCompFieldsW

RESOLVED FIXED in Thunderbird 67.0

Status

defect
--
critical
RESOLVED FIXED
4 months ago
2 months ago

People

(Reporter: wsmwk, Assigned: jorgk)

Tracking

({crash})

Thunderbird 67.0
x86
Windows 7

Thunderbird Tracking Flags

(thunderbird66 fixed, thunderbird67 fixed)

Details

(crash signature)

Attachments

(2 attachments, 6 obsolete attachments)

Reporter

Description

4 months ago

#1 crash for 65.0b4. 50 crashes in 7 hours for 20 users.

This bug is for crash report bp-81c27caa-267b-46eb-b740-7a1b00190130.

Top 10 frames of crashing thread:

0 xul.dll nsMapiHook::PopulateCompFieldsW comm/mailnews/mapi/mapihook/src/msgMapiHook.cpp:701
1 xul.dll CMapiImp::SendMailW comm/mailnews/mapi/mapihook/src/msgMapiImp.cpp:264
2 rpcrt4.dll Invoke
3 rpcrt4.dll _imp_load__FreeAddrInfoW
4 ole32.dll ole32.dll@0x1407f5
5 ole32.dll ChannelProcessInitialize
6 ole32.dll ChannelWrapper_QueryInterface
7 ole32.dll CCtxComChnl::ContextInvoke
8 ole32.dll MTAInvoke d:\w7rtm\com\ole32\com\dcomrem\callctrl.cxx:2097
9 ole32.dll STAInvoke

Assignee

Comment 1

4 months ago
Posted patch 1523818-crash.patch (obsolete) — Splinter Review

Looks like we forgot to zero out this structure. I'm not sure what the {} were meant to do.

Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #9040038 - Flags: review?(mikekaganski)

Comment 2

4 months ago
Comment on attachment 9040038 [details] [diff] [review]
1523818-crash.patch

Review of attachment 9040038 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mapi/mapihook/src/msgMapiImp.cpp
@@ -247,5 @@
>  
>      MOZ_LOG(MAPI, mozilla::LogLevel::Debug, ("CMapiImp::SendMailW using flags %d\n", aFlags));
>  
>      // Handle possible nullptr argument.
> -    nsMapiMessageW Message{};

https://en.cppreference.com/w/cpp/language/zero_initialization
Assignee

Comment 3

4 months ago

Interesting, I've never seen it. So why would msgMapiHook.cpp:701 crash? if (aMessage->lpOriginator). From the crash stats and our own experience, Windows prefers to use SendMailW so we get into this code patch.

Strange that it doesn't crash for us.

Wayne, do you have access to user comments? Maybe someone is stating how exactly they've been invoking this, it wouldn't have been through "Sent to > Mail recipient".

A bit sad that we stopped crashes in SendMail (without the W) for 64bit users and created a new crash elsewhere. The reports I looked at were all for x86, so 32bit :-(

Flags: needinfo?(vseerror)
MOZ_LOG(MAPI, mozilla::LogLevel::Debug, ("CMapiImp::SendMailW flags=%x subject: %s sender: %s\n",
        aFlags,

Parameter aFlags is of type long. Should the format string use "flags=%lx" ?
Not sure if that can cause a side effect here.

Is NS_ConvertUTF16toUTF8(NULL).get() allowed? That's what
NS_ConvertUTF16toUTF8(aMessage->lpszSubject).get(),
will do, if aMessage is NULL, and Message is zero-initialized.

Assignee

Comment 6

4 months ago

Right, thanks for looking for possible crash reasons. I think all that stuff is only evaluated if logging is switched on. That said, we can fix this issue. The crash is in PopulateCompFieldsW and that doesn't have logging.

I guess after being so happy about getting 64bit going, we need to compile 32bit again and check it. Maybe the crash comes from "normal" "Sent to > Mail recipient". That would be bad.

701 if (aMessage->lpOriginator)
702 aCompFields->SetFrom(nsDependentString(aMessage->lpOriginator->lpszAddress));

I wonder if a crash on line 702 could get reported as crashing on line 701. Maybe that's possible with compiler optimizations?

If yes, could we crash because aMessage->lpOriginator is non-NULL, but aMessage->lpOriginator->lpszAddress is NULL?

Also, it makes me a bit nervous that we pass a pointer to a local variable around, and hope that it has been correctly initialized. It might be fine, but maybe it makes it more difficult for us to see a side effect. I suggest to simply the code to avoid that local variable.

(In reply to Kai Engert (:kaie:) from comment #7)

I suggest to simply the code to avoid that local variable.

Like in this attachment.

Assignee

Comment 9

4 months ago

Thanks, I guess you found the crash :-) - Likely that we're one line off and it's here:
aCompFields->SetFrom(nsDependentString(aMessage->lpOriginator->lpszAddress));
If you pass in a null, you crash.

I'd take that hunk and the improvements to the logging, but not the other, well, overzealous checking. I'll do a patch, OK?

If you want to try the simpler change first, that's fine, but if it doesn't fix it, we'll try the overzealous checking next, ok? ;-)

Assignee

Comment 11

4 months ago
Posted patch no-local-pointer.patch (v2) (obsolete) — Splinter Review

How about this. Just fixing the crash and the logging (and the tabs).

Attachment #9040038 - Attachment is obsolete: true
Attachment #9040038 - Flags: review?(mikekaganski)
Attachment #9040247 - Flags: feedback?(kaie)
Assignee

Comment 12

4 months ago
Posted patch no-local-pointer.patch (v2b) (obsolete) — Splinter Review

Or like this. Dumping empty strings is never helpful.

Attachment #9040247 - Attachment is obsolete: true
Attachment #9040247 - Flags: feedback?(kaie)
Flags: needinfo?(vseerror)
Attachment #9040248 - Flags: feedback?(kaie)

If you keep
aMessage=&Message;
then you don't need the initial aMessage&& in the log statements.

Use flags=%lx in both places.

If PopulateCompFieldsW and PopulateCompFields require that aMessage is non-null, then maybe they should check on entry, and return with failure if NULL. Yes, it's unlikely that it ever gets called with a NULL parameter given the current code, but it would document the function's expectation. Also, if we'll still crash at the same place, we know it cannot be a null pointer being passed in, but a corrupt pointer.

Assignee

Comment 14

4 months ago
Posted patch no-local-pointer.patch (v2c) (obsolete) — Splinter Review

Tweaks as per previous comment.

Attachment #9040248 - Attachment is obsolete: true
Attachment #9040248 - Flags: feedback?(kaie)
Attachment #9040252 - Flags: feedback?(kaie)
Assignee

Comment 15

4 months ago
Posted patch no-local-pointer.patch (v2d) (obsolete) — Splinter Review

Sorry, there needed to be one more check.

Attachment #9040252 - Attachment is obsolete: true
Attachment #9040252 - Flags: feedback?(kaie)
Attachment #9040254 - Flags: feedback?(kaie)
Comment on attachment 9040254 [details] [diff] [review]
no-local-pointer.patch (v2d)

r=kaie

I'd prefer to add
  if (!aMessage) return NS_ERROR_FAILURE;
to the beginning of both PopulateCompFieldsWithConversion and PopulateCompFieldsW, but up to you, r+ with or without this change.
Attachment #9040254 - Flags: feedback?(kaie) → feedback+
Assignee

Comment 17

4 months ago
Posted patch no-local-pointer.patch (v3) (obsolete) — Splinter Review

Well, what you're suggesting will change the behaviour of the system. How about returning early with OK since that's the net effect anyway.

I was going to land this with you as the author since you spotted the problem. Hence the f? to make sure you're happy with it since it's put in your name :-)

Attachment #9040254 - Attachment is obsolete: true
Attachment #9040267 - Flags: review+
Attachment #9040267 - Flags: feedback?(kaie)

(In reply to Jorg K (GMT+1) from comment #17)

Well, what you're suggesting will change the behaviour of the system.

No. Currently, if PopulateCompFieldsW* gets called with aMessage==NULL, it will crash, because the code will dereference the bad aMessage pointer.

My suggestion to return immediately will prevent such a crash, and ensure we'll fail/abort the activity in a safe way

If we still crash inside PopulateCompFieldsW*, then we will know that it isnn't because of a NULL pointer, but because of a corrupted pointer.

Reporter

Comment 19

4 months ago

Some comments

  • send an email from Quickbooks
  • trying to attach photo's from windows photo program
  • trying to send an attachment to an email, TB said someone else was using my email, I said yes and Tbird crashed
Assignee

Comment 20

4 months ago
Comment on attachment 9040267 [details] [diff] [review]
no-local-pointer.patch (v3)

(In reply to Kai Engert (:kaie:) from comment #18)

> No. Currently, if PopulateCompFieldsW* gets called with aMessage==NULL, it will crash, because the code will dereference the bad aMessage pointer.
> 
> My suggestion to return immediately will prevent such a crash, and ensure we'll fail/abort the activity in a safe way

With the trickery of passing the address of local variable, null will never be passed. So my check is nonsense. I'll move the check into the caller.
Attachment #9040267 - Attachment is obsolete: true
Attachment #9040267 - Flags: feedback?(kaie)
Assignee

Comment 21

4 months ago

How about this? Close to your original suggestion.

Attachment #9040322 - Flags: feedback?(kaie)

Comment 22

4 months ago
Comment on attachment 9040322 [details] [diff] [review]
no-local-pointer.patch (v4)

Ok, this is a good simplification of the original suggestion. It has a slight side effect, v4 will skip calling the SetTo/SetCc/SetBcc functions with the empty string. However, those fields should already be empty, so the change should be fine.
Attachment #9040322 - Flags: feedback?(kaie) → feedback+
Comment hidden (obsolete)
Reporter

Comment 24

4 months ago

Perhaps related crash sig nsTSubstring<T>::Assign | nsTSubstring<T>::Assign | nsMapiHook::PopulateCompFieldsW ?

bp-b34ad4f5-bf95-4f69-b414-049850190131 bp-b1ae66d4-234a-437e-8625-5c9b50190131 bp-8efc6d9d-774c-403c-869f-ca7810190130 bp-cc1d4526-7d05-4e8b-a418-6847b0190130

Assignee

Comment 25

4 months ago

(In reply to Kai Engert (:kaie:) from comment #22)

However, those fields should already be empty, so the change should be fine.

Yes, they should be initialised.

Comment 26

4 months ago

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/bab34c7694e5
Don't pass null to nsDependentString() to fix crash in nsMapiHook::PopulateCompFieldsW. r=jorgk

Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Resolution: --- → FIXED
Assignee

Updated

4 months ago
Target Milestone: --- → Thunderbird 67.0
Assignee

Updated

4 months ago
Attachment #9040322 - Flags: review+
Attachment #9040322 - Flags: approval-comm-esr60+
Attachment #9040322 - Flags: approval-comm-beta+

Comment 27

4 months ago

I'm really sorry for the mistakes. Thank you for fixing my bug(s)!

(In reply to Wayne Mery (:wsmwk) from comment #24)

Perhaps related crash sig nsTSubstring<T>::Assign | nsTSubstring<T>::Assign | nsMapiHook::PopulateCompFieldsW ?

I've been staring at that code for a while.

Those crashes on line 764 suggest that a valid pointer, which points to invalid/random memory locations, is passed to the function.

However, if that's what happened, then it's very surprising it didn't crash earlier in the function, like on line 714, 755, or inside HandleAttachmentsW. If the contents were really random, it's unlikely that most fields were null (and caused these earlier lines to be skipped).

Could there be a problem with the Unicode/UTF-16 processing on line 764 ? Could the arriving data be in an unexpected encoding, that we treat incorrectly? The string code crashes while trying to obtain the length of the input.

FWIW, we had 893 crashes in three days, but only 4 crashes like those from comment 24. Maybe we shouldn't worry too much about comment 24, and check later, after we have crash data on a build with the attached fix.

Reporter

Updated

3 months ago
See Also: → 1527450

Comment 31

3 months ago

behavior change when sending mail via MAPI

Before this bugfix was introduced, when sending a letter via MAPI, a third-party application opened the form for editing the letter before sending it and it was possible to add a text or make changes. And you had to press the "Send" button to send it.

Now sending the letter immediately (or if not turned off, it shows a warning that a third-party application is trying to send a letter without your knowledge)

Assignee

Comment 32

3 months ago

Let's continue the discussion in bug 1527450. Meanwhile we had lots of MAPI "fun" in bug 1530820 where users reported problems after an auto-upgrade from 60.5.1 to 60.5.2 which shipped a corrected MAPI library. The solution seems to be to install again over the top.

In your case I suggest you reinstall TB 66 beta 2 again which might solve the issue. We have withdrawn MAPISendMailW() and all should return to normal now.

Assignee

Updated

2 months ago
Attachment #9040322 - Flags: approval-comm-esr60+ → approval-comm-esr60?
Assignee

Comment 33

2 months ago
Comment on attachment 9040322 [details] [diff] [review]
no-local-pointer.patch (v4)

Let's leave some of the fun for TB 68. We had enough MAPI chagrin in TB 60.
Attachment #9040322 - Flags: approval-comm-esr60?
You need to log in before you can comment on or make changes to this bug.