Closed Bug 1398807 Opened 7 years ago Closed 3 years ago

Crash in nsMsgCompose::CloseWindow, with nsMsgCompose object destroyed?

Categories

(MailNews Core :: Composition, defect)

Unspecified
All
defect

Tracking

(thunderbird_esr78 affected, thunderbird_esr91+ verified, thunderbird92+ verified)

RESOLVED FIXED
93 Branch
Tracking Status
thunderbird_esr78 --- affected
thunderbird_esr91 + verified
thunderbird92 + verified

People

(Reporter: wsmwk, Assigned: mkmelin)

References

(Depends on 2 open bugs)

Details

(Keywords: crash, topcrash-thunderbird)

Crash Data

Attachments

(2 files)

~20% of crashes are Mac, which is pretty high for Mac

This bug was filed from the Socorro interface and is 
report bp-ba56e1db-d50f-4f49-a6ef-c330d0170903.
=============================================================
0 	XUL	nsMsgCompose::CloseWindow()	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgCompose.cpp:1565
1 	XUL	nsMsgComposeSendListener::OnStopCopy(nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgCompose.cpp:3923
2 	XUL	nsMsgComposeAndSend::MaybePerformSecondFCC(nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgSend.cpp:3965
3 	XUL	nsMsgComposeAndSend::OnStopOperation(nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgSend.cpp:3924
4 	XUL	nsMsgFilterAfterTheFact::AdvanceToNextFolder()	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/search/src/nsMsgFilterService.cpp:362
5 	XUL	nsMsgApplyFiltersToMessages::RunNextFilter()	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/search/src/nsMsgFilterService.cpp:1102
6 	XUL	nsMsgFilterAfterTheFact::AdvanceToNextFolder()	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/search/src/nsMsgFilterService.cpp:464
7 	XUL	nsMsgFilterService::ApplyFilters(int, nsIArray*, nsIMsgFolder*, nsIMsgWindow*, nsIMsgOperationListener*)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/search/src/nsMsgFilterService.cpp:1129
8 	XUL	nsMsgComposeAndSend::FilterSentMessage()	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgSend.cpp:3894
9 	XUL	nsMsgComposeAndSend::NotifyListenerOnStopCopy(nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgSend.cpp:3853
10 	XUL	CopyListener::OnStopCopy(nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/compose/src/nsMsgCopy.cpp:116
11 	XUL	nsMsgCopyService::ClearRequest(nsCopyRequest*, nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/src/nsMsgCopyService.cpp:210
12 	XUL	nsMsgCopyService::NotifyCompletion(nsISupports*, nsIMsgFolder*, nsresult)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/base/src/nsMsgCopyService.cpp:695
13 	XUL	nsMsgLocalMailFolder::OnCopyCompleted(nsISupports*, bool)	/builds/slave/tb-rel-c-esr52-m64_bld-0000000/build/mailnews/local/src/nsLocalMailFolder.cpp:1483
bp-0b4db255-2d44-43cf-83ec-95f610170902 writes "I cannot send out mails and messages".

But almost none of the crash reporters have sent more than one crash report.
Hmm, this is the code that crashes:
NS_IMETHODIMP nsMsgCompose::CloseWindow(void)
{
...
  if (m_baseWindow)
  {
...
    nsIBaseWindow * window = m_baseWindow;
    m_baseWindow = nullptr;
    rv = window->Destroy(); <=== 1565

Hard to see why that would crash although I don't understand the pointer acrobatics here. Nulling out m_baseWindow might destroy the window if no reference is left. Strange. Kent, do you have a better understanding?

Why not just:
rv = m_baseWindow->Destroy();
m_baseWindow = nullptr;
Flags: needinfo?(rkent)
The only reason I can think of for this code is doing what they also suggested for m_editor, making sure that any other uses in nsMsgCompose are not used. But this makes no sense, because this location is the only current use in nsMsgCompose of m_baseWindow.

Looking at the crash report, it looks to me like the value of window ("Crash Address") is 0xffffffffe5e5e5f9. I'll guess that the value represents destroyed memory, but an offset into it. What could be destroyed? My best guess would be the nsMsgCompose object itself, with the difference between 0xffffffffe5e5e5f9 and 0xffffffffe5e5e5e5 being the location of m_baseWindow in "this". But it is not at all clear how that could happen.
Flags: needinfo?(rkent)
Mac user in bp-5a142ebf-b394-49ad-86d7-663130180106 writes "the message is well sent but it is impossible for thunderbird to copy the message in the "sent" folder. Actually I tend to think that the "sent" folder is not named "sent" but something like "sent messages" or "messages envoyés" as I use a french version of my mailbox (this is a outlook web application hosted and managed by OVH). This doesn't occur with another mail software. This is a bit of a pity because I lose trace of my sent messages"

I wonder if some users don't crash, but instead experience hangs like bug 1381485 (which has a stacktrace)
Component: General → Composition
Product: Thunderbird → MailNews Core
Crash rate is up 30% since early November, but unclear why. Mac/Windows ratio is the same, about 50-50.

I suspect for Mac this is also related Bug 1381485 - Hangs frequently while sending imap mail while copying message to imap Sent folder on Mac. No problem if Sent is set to local folder. deadlock on CGLClearDrawable?
Blocks: 1381485

Crash rate has continued to increase with users adopting version 60 - about 50% higher than 4 months ago. About 2/3 of crashes are Mac

Wayne (and others), is it possible this bug is related to screen resolution (and maybe even frame rates, as a by product of screen size)? That could explain why it affects a higher than expected number of Mac users. Is it possible to filter reports of this bug based on system type or screen resolution directly (ie Macbook vs iMac - with iMacs generally having higher resolutions - 4K or 5K for modern ones)

See: Bug #1386701

bp-ae4492ad-34fd-4837-8788-afbc00190922 68.1.0

(In reply to Scott from comment #7)

Wayne (and others), is it possible this bug is related to screen resolution (and maybe even frame rates, as a by product of screen size)? That could explain why it affects a higher than expected number of Mac users. Is it possible to filter reports of this bug based on system type or screen resolution directly (ie Macbook vs iMac - with iMacs generally having higher resolutions - 4K or 5K for modern ones)

See: Bug #1386701

I don't know about current versions, but certainly not for older versions

Summary: Crash in nsMsgCompose::CloseWindow → Crash in nsMsgCompose::CloseWindow, with nsMsgCompose object destroyed? [Mac]

The screen resolution comment has got me thinking.
These days about half my screen time I'm running dual display on a 2016 MacBook Pro, with the addition of a Phillips 43" monitor (link below)
I can remember the last two cases of TB freezing and perhaps they happened while running dual display described above.
My tickets include Apple crash report, presumably with hardware info, so the tickets may be able to confirm this.

Back a little further, I can't recall if the Thuderbird lockups on Macbook Pro were with dual display or not. I didn't create tickets for the initial crashes.

Back a few months I did have Thunderbird lockups on my work system. A Fedora Linux setup, an old PC with two old LCD monitors. One HDMI, one DVI. I haven't used that system in months.

Hardware setup (current):
Macbook Pro 2016, MacOs 10.15.3
https://www.philips.com.au/c-p/BDM4350UC_75/brilliance-4k-ultra-hd-lcd-display
https://www.apple.com/au/shop/product/MUF82ZA/A/usb-c-digital-av-multiport-adapter

Hope this helps.

(In reply to Wayne Mery (:wsmwk) from comment #8)

bp-ae4492ad-34fd-4837-8788-afbc00190922 68.1.0

(In reply to Scott from comment #7)

Wayne (and others), is it possible this bug is related to screen resolution (and maybe even frame rates, as a by product of screen size)? That could explain why it affects a higher than expected number of Mac users. Is it possible to filter reports of this bug based on system type or screen resolution directly (ie Macbook vs iMac - with iMacs generally having higher resolutions - 4K or 5K for modern ones)

See: Bug #1386701

I don't know about current versions, but certainly not for older versions

(In reply to Mike from comment #10)

The screen resolution comment has got me thinking.

In the main bug thread (https://bugzilla.mozilla.org/show_bug.cgi?id=1381485), a number of have been testing a preference change which forces Thunderbird to use hardware acceleration in the mac client.

It has shown very promising results for several off us (including all the users in my office that had the problem). I think at the very least there is correlation here between screen resolution and the integrated GPU being overworked and the bug occurring.

Head on over there to try the preference change for yourself.

No longer blocks: 1381485
Depends on: 1381485, 1508649, 1592338

I'm liking my comment 4 theory about Macs. 74% of version 68 crashes are Mac, for version 78 only 24% are Mac (76% windows). i.e. the crash rate for Mac is much lower where HWA is enabled in version 78.

version 78 crashes:
Mac bp-bc83834b-9dc3-48e4-a96e-4935b0200819
Windows bp-5eb95e75-d2d2-4efe-ae14-7f72e0200911

Flags: needinfo?(benc)

For 78.11.0, this signature isn't in the top 300.

Severity: critical → S4
Whiteboard: [rare]
Attached image beta crash rate.png

Something changed (a new bug) to make the existing crash signature much worse ...

beta 90 and 91 crash rate are crazily increased as seen in the graph https://crash-stats.mozilla.org/signature/?product=Thunderbird&release_channel=%21release&signature=nsMsgCompose%3A%3ACloseWindow&date=%3E%3D2021-05-17T00%3A08%3A00.000Z&date=%3C2021-08-17T00%3A08%3A00.000Z#graphs

Crash rank for 78.12.0 is outside 200
Crash rank for 91 is solidly 7

bp-cd7d7539-feb4-4ce1-a2bf-5a1b00210816 92.0 Windows
0 xul.dll nsMsgCompose::CloseWindow() comm/mailnews/compose/src/nsMsgCompose.cpp:1336
1 xul.dll nsMsgComposeSendListener::OnStopCopy(nsresult) comm/mailnews/compose/src/nsMsgCompose.cpp:3314
2 xul.dll NS_InvokeByIndex
3 xul.dll static XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) js/xpconnect/src/XPCWrappedNative.cpp:1130
4 xul.dll XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) js/xpconnect/src/XPCWrappedNativeJSOps.cpp:921
5 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) js/src/vm/Interpreter.cpp:489
6 xul.dll Interpret(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:3242
7 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:371
8 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) js/src/vm/Interpreter.cpp:521
9 xul.dll js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason) js/src/vm/Interpreter.cpp:566
10 xul.dll js::fun_apply(JSContext*, unsigned int, JS::Value*) js/src/vm/JSFunction.cpp:1151
11 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) js/src/vm/Interpreter.cpp:489
12 xul.dll Interpret(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:3242
13 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:371
14 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) js/src/vm/Interpreter.cpp:521
15 xul.dll js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason) js/src/vm/Interpreter.cpp:566
16 xul.dll JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) js/src/vm/CallAndConstruct.cpp:53
17 xul.dll nsXPCWrappedJS::CallMethod(unsigned short, nsXPTMethodInfo const*, nsXPTCMiniVariant*) js/xpconnect/src/XPCWrappedJSClass.cpp:973
18 xul.dll PrepareAndDispatch(nsXPTCStubBase*, unsigned int, unsigned int*, unsigned int*) xpcom/reflect/xptcall/md/win32/xptcstubs.cpp:79
19 xul.dll SharedStub() xpcom/reflect/xptcall/md/win32/xptcstubs.cpp:98
20 xul.dll nsTimerImpl::Fire(int) xpcom/threads/nsTimerImpl.cpp:618

bp-08ee67ee-58ef-4815-8728-1208e0210816 91.0 Mac
compare version 78 bp-52f09315-8e7c-4eaf-8a30-1446d0210815

If refcount of m_baseWindow is 1, the following becomes UAF and will crash.

NS_IMETHODIMP nsMsgCompose::CloseWindow(void) {
...
    nsIBaseWindow* window = m_baseWindow;
    m_baseWindow = nullptr;
    rv = window->Destroy();

So we should use forget() or std::move like the following.

    nsCOMPtr<nsIBaseWindow> window = m_baseWindow.forget();
    rv = window->Destroy();
Flags: needinfo?(mkmelin+mozilla)
Whiteboard: [rare]

Thanks for the analysis @m_kato!! Makes sense to me.

Flags: needinfo?(mkmelin+mozilla)
Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Severity: S4 → S2
OS: macOS → All
Summary: Crash in nsMsgCompose::CloseWindow, with nsMsgCompose object destroyed? [Mac] → Crash in nsMsgCompose::CloseWindow, with nsMsgCompose object destroyed?

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/b4081109cf60
fix crash in nsMsgCompose::CloseWindow. r=benc

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch

Ready for beta uplift?
#10 crash on beta, so we should know soon whether it helps. (but there are zero crashes on daily)

Flags: needinfo?(mkmelin+mozilla)

Comment on attachment 9237271 [details]
Bug 1398807 - fix crash in nsMsgCompose::CloseWindow. r=benc

[Approval Request Comment]
User impact if declined: may crash
Testing completed (on c-c, etc.): c-c
Risk to taking this patch (and alternatives if risky): should be safe

Flags: needinfo?(mkmelin+mozilla)
Attachment #9237271 - Flags: approval-comm-esr91?
Attachment #9237271 - Flags: approval-comm-beta?

Comment on attachment 9237271 [details]
Bug 1398807 - fix crash in nsMsgCompose::CloseWindow. r=benc

[Triage Comment]
Approved for beta

Attachment #9237271 - Flags: approval-comm-beta? → approval-comm-beta+

No crashes so far for 92.0b4, but it's several days too soon to say that the crash is gone.

Comment on attachment 9237271 [details]
Bug 1398807 - fix crash in nsMsgCompose::CloseWindow. r=benc

[Triage Comment]
Approved for esr91

Attachment #9237271 - Flags: approval-comm-esr91? → approval-comm-esr91+

92.0b4 looks clean

v78 crash rate is much lower than v91's (ranks ~250), so I don't think this is worth uplifting to 78.

Crash stats is clean for 91.1.0

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: