Closed Bug 1503821 Opened 3 years ago Closed 4 months ago

Crash in nsMsgDBFolder::ContainsChildNamed via import's runnable context

Categories

(MailNews Core :: Import, defect)

x86
Windows
defect
Not set
critical

Tracking

(thunderbird_esr68 affected, thunderbird_esr78 fixed, thunderbird69 wontfix, thunderbird70 wontfix, thunderbird78 affected, thunderbird79 affected)

RESOLVED FIXED
Thunderbird 79.0
Tracking Status
thunderbird_esr68 --- affected
thunderbird_esr78 --- fixed
thunderbird69 --- wontfix
thunderbird70 --- wontfix
thunderbird78 --- affected
thunderbird79 --- affected

People

(Reporter: wsmwk, Unassigned)

References

Details

(4 keywords, Whiteboard: [regression:TB64][fixed by bug 272292])

Crash Data

Attachments

(2 files)

This crash is new in version 64

bp-252ddc2c-7913-4c26-8f7d-7a5220181101.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll nsMsgDBFolder::ContainsChildNamed comm/mailnews/base/util/nsMsgDBFolder.cpp:4183
1 xul.dll ContainsChildNamedRunnable::Run comm/mailnews/import/src/nsImportMail.cpp:1064
2 xul.dll nsThreadSyncDispatch::Run xpcom/threads/nsThreadSyncDispatch.h:39
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1245
4 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:530
5 xul.dll nsXULWindow::ShowModal xpfe/appshell/nsXULWindow.cpp:411
6 xul.dll nsContentTreeOwner::ShowAsModal xpfe/appshell/nsContentTreeOwner.cpp:489
7 xul.dll nsWindowWatcher::OpenWindowInternal toolkit/components/windowwatcher/nsWindowWatcher.cpp:1264
8 xul.dll nsWindowWatcher::OpenWindow2 toolkit/components/windowwatcher/nsWindowWatcher.cpp:412
9 xul.dll nsGlobalWindowOuter::OpenInternal dom/base/nsGlobalWindowOuter.cpp:7083

=============================================================
Whiteboard: [regression:TB64]
Version: 59 → 64
Crash at https://hg.mozilla.org/releases/comm-beta/annotate/0a10c274a1fe/mailnews/base/util/nsMsgDBFolder.cpp#l4024

  *containsChild = child != nullptr;

If there's no operator precedence mixup here I don't know how this could crash. 
But then again, I do prefer parenthesis for things like this so you don't have to worry.
It looks like something has gone wrong in the import context from where this function is called:

nsImportMail.cpp:1064
NS_IMETHODIMP ContainsChildNamedRunnable::Run()
{
  mResult = m_folder->ContainsChildNamed(m_name, m_result);  // 1064
  return NS_OK;  // Sync runnable must return OK.
}

#9 ranked crash for 68.0.

Ben, any thoughts?

Flags: needinfo?(benc)

(presumably this is still happening in version 70)

OS: Windows 10 → Windows
Summary: Crash in nsMsgDBFolder::ContainsChildNamed via import → Crash in nsMsgDBFolder::ContainsChildNamed via import's runnable context

Unfortunately no test cases have emerged.

Any speculative approaches? If so we could test on beta, which has 2-3 crashes per week.

Flags: needinfo?(jorgk)

Over 400 crashes per week - not good for a function which could gain us users.

(In reply to Jorg K (GMT+1) (PTO to 15th Dec 2019, sporadically reading bugmail) from comment #2)

It looks like something has gone wrong in the import context from where this
function is called:

nsImportMail.cpp:1064
NS_IMETHODIMP ContainsChildNamedRunnable::Run()
{
mResult = m_folder->ContainsChildNamed(m_name, m_result); // 1064
return NS_OK; // Sync runnable must return OK.
}

Is this still the case?

Flags: needinfo?(mkmelin+mozilla)

Ryan, any comments or reporters stand out that are worth contacting?

Flags: needinfo?(ryanvm)
Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(ryanvm)
Flags: needinfo?(jorgk-bmo)
Flags: needinfo?(benc)

Hard to say if it would help. But shouldn't hurt.

Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Attachment #9114839 - Flags: review?(benc)
Flags: needinfo?(ryanvm)
Comment on attachment 9114839 [details] [diff] [review]
bug1503821_containschildnamed_crash.patch

Review of attachment 9114839 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/import/src/nsImportMail.cpp
@@ +922,5 @@
>  }
>  
>  nsresult ProxyContainsChildNamed(nsIMsgFolder *aFolder, const nsAString &aName,
>                                   bool *aResult) {
> +  NS_ENSURE_ARG(aFolder);

Looks good to me and I think it should be added. But I don't think it'll help.
(I'd say this should be a hard MOZ_ASSERT, if we weren't trying to track down this annoying bug.)
Attachment #9114839 - Flags: review?(benc) → review+

This is a weird bug, and I don't see how it's screwing up.
There's only one path through the code, so it's easy enough to trace the execution. Just to set it straight in my own head:

ProxyContainsChildNamed() is called from only one place), in the import thread.

It creates a ContainsChildNamedRunnable which is synchronously dispatched to the main thread.
(Other events on the import thread could get serviced before the dispatch returns... but unless they kill the import thread I can't see them interfering).

The crash occurs in nsMsgDBFolder::ContainsChildNamed(const nsAString &name, bool *containsChild) on this line:

*containsChild = child != nullptr;

It's an EXCEPTION_ACCESS_VIOLATION_WRITE which I can only see as containsChild pointing to invalid memory. But containsChild should still be valid, pointing to the local bool out in import thread... which should still be in scope as the dispatch is synchronous.

It's another one that I think would be easy if we had a test case where we could catch it in the debugger...

Can anyone suggest a simple way I can get this codepath to run (on Linux)? Do I need to have some other email client installed to import from? Or is there some way to import an mbox file that would go through here?
Even just stepping through a non-crashing run in the debugger might shed some light.

It's another one that I think would be easy if we had a test case where we could catch it in the debugger...

Hopefully Ryan can dig up some dirt.

Blocks: 1580249

(In reply to Wayne Mery (:wsmwk) from comment #8)

Ryan, any comments or reporters stand out that are worth contacting?

Importing from Outlook appears to be a very common theme in the user comments. Occasional mentions of Windows Live Mail too, but mostly Outlook.

Flags: needinfo?(ryanvm)

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/66f561a80e1a
try to prevent crash in nsMsgDBFolder::ContainsChildNamed. r=benc

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

Let's keep this open and see how it goes.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: --- → Thunderbird 73.0

It'll take 2-4 weeks of 73 beta to know whether it helped.

(In reply to Wayne Mery (:wsmwk) from comment #17)

It'll take 2-4 weeks of 73 beta to know whether it helped.

As of today we're only one week into beta 73 (shipped on 17th and cranked the rate on Monday). So far two crashes, both from the same user
bp-c9f0c640-8477-4bf3-acdb-4043f0200121

Flags: needinfo?(mkmelin+mozilla)

Mystery.

Assignee: mkmelin+mozilla → nobody
Flags: needinfo?(mkmelin+mozilla)
Target Milestone: Thunderbird 73.0 → ---

Might be the same kind of thing as in bug 1629204 - problematic use of GetSubFolders()

(In reply to Magnus Melin [:mkmelin] from comment #21)

Might be the same kind of thing as in bug 1629204 - problematic use of GetSubFolders() [v.fixed for 68.9.0]

But the signature is gone approx August 7 with 78.1.1 (released at rate 100), and beta 80 (the last beta crash is buildid 20200726170737 = 79.0b3, the last beta of 79)

The fix must be from https://hg.mozilla.org/releases/comm-esr78/pushloghtml?fromchange=THUNDERBIRD_78_0_RELEASE&tochange=THUNDERBIRD_78_1_1_RELEASE&full=1 The only likely bug in that list would seem to be Bug 272292 - unable to import Seamonkey data into Thunderbird - which never shipped in beta

What do you think?

(or bugzilla query of 78.1.1 uplifts https://mzl.la/3pFTdvQ from 80 or 81 (which doesn't include Bug 272292) if all 78.1.1 uplifts follow Rob's comment convention of "78.1.1:". )

Flags: needinfo?(rob)
Flags: needinfo?(mkmelin+mozilla)
Attached image drop in crash rate
Duplicate of this bug: 1580249

from bug 1580249 comment 2 "Ahh, this looks like exactly the same problem as Bug 1503821: an illegal write from a runnable dispatched from the import thread."

(In reply to Wayne Mery (:wsmwk) from comment #22)

(or bugzilla query of 78.1.1 uplifts https://mzl.la/3pFTdvQ from 80 or 81 (which doesn't include Bug 272292) if all 78.1.1 uplifts follow Rob's comment convention of "78.1.1:". )

I've tried to follow that convention as it was suggested by Jörg when I first started doing the uplifts. There may be releases or uplifts done by others that did not follow that convention.

Flags: needinfo?(rob)

Yes very likely bug 272292 fixed it.

Status: REOPENED → RESOLVED
Closed: 1 year ago4 months ago
Flags: needinfo?(mkmelin+mozilla)
Resolution: --- → FIXED
Whiteboard: [regression:TB64] → [regression:TB64][fixed by bug 272292]
Target Milestone: --- → Thunderbird 79.0
You need to log in before you can comment on or make changes to this bug.