Closed Bug 555539 Opened 14 years ago Closed 9 years ago

crash [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] due to ref-counting

Categories

(MailNews Core :: Filters, defect)

x86
All
defect
Not set
critical

Tracking

(Not tracked)

VERIFIED WORKSFORME

People

(Reporter: wsmwk, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

crash [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()]
about 16 per week

xref bug 537017 and bug 537018

bp-dd25d431-47d5-4acc-b57c-693582100322 linux
bp-722d58f4-eb6c-4146-b5f1-5a4d92100301 windows
0	xpcom_core.dll	nsCOMPtr_base::~nsCOMPtr_base	 objdir-tb/mozilla/xpcom/build/nsCOMPtr.cpp:81
1	thunderbird.exe	nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact	mailnews/base/search/src/nsMsgFilterService.cpp:332
2	thunderbird.exe	nsMsgFilterAfterTheFact::`scalar deleting destructor'	
3	thunderbird.exe	nsCommandHandler::Release	content/xul/document/src/nsXULDocument.cpp:4592
4	xpcom_core.dll	nsCOMPtr_base::~nsCOMPtr_base	objdir-tb/mozilla/xpcom/build/nsCOMPtr.cpp:81
5	thunderbird.exe	nsTArray<nsCOMPtr<nsIMsgUserFeedbackListener> >::DestructRange	objdir-tb/mozilla/dist/include/xpcom/nsTArray.h:848
6	thunderbird.exe	nsTArray<nsCOMPtr<nsIMsgSearchNotify> >::RemoveElementsAt	objdir-tb/mozilla/dist/include/xpcom/nsTArray.h:663
7	thunderbird.exe	nsAutoTObserverArray<nsCOMPtr<nsIDBChangeListener>,0>::Clear	objdir-tb/mozilla/dist/include/xpcom/nsTObserverArray.h:242
8	thunderbird.exe	nsMsgMailNewsUrl::SetUrlState	mailnews/base/util/nsMsgMailNewsUrl.cpp:136
9	thunderbird.exe	nsImapMailFolder::SetUrlState	mailnews/imap/src/nsImapMailFolder.cpp:6604
10	xpcom_core.dll	NS_InvokeByIndex_P	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
emailed reporter of bp-aaa22c35-3ffd-4168-a221-9fdb82100108 (pekka)
~top 100 crash for v3.1.7

bp-b1a265fe-8651-4ead-a229-cb3b22101115 (ed)
EXCEPTION_ACCESS_VIOLATION_READ
0x8
0	xpcom_core.dll	nsCOMPtr_base::~nsCOMPtr_base	objdir-tb/mozilla/xpcom/build/nsCOMPtr.cpp:81
1	thunderbird.exe	nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact	mailnews/base/search/src/nsMsgFilterService.cpp:332
2	thunderbird.exe	nsMsgFilterAfterTheFact::`vector deleting destructor'	
3	thunderbird.exe	nsCommandHandler::Release	mailnews/imap/src/nsImapOfflineSync.cpp:61
4	xpcom_core.dll	nsCOMPtr_base::~nsCOMPtr_base	objdir-tb/mozilla/xpcom/build/nsCOMPtr.cpp:81
5	thunderbird.exe	nsTArray<nsCOMPtr<nsIAutoSyncMgrListener> >::DestructRange	objdir-tb/mozilla/dist/include/nsTArray.h:862
6	thunderbird.exe	nsTArray<nsCOMPtr<nsIAutoSyncMgrListener> >::RemoveElementsAt	objdir-tb/mozilla/dist/include/nsTArray.h:663
7	thunderbird.exe	nsAutoTObserverArray<nsCOMPtr<nsIDBChangeListener>,0>::Clear	objdir-tb/mozilla/dist/include/nsTObserverArray.h:242
8	thunderbird.exe	nsMsgMailNewsUrl::SetUrlState	mailnews/base/util/nsMsgMailNewsUrl.cpp:137
9	thunderbird.exe	nsImapMailFolder::SetUrlState	mailnews/imap/src/nsImapMailFolder.cpp:6640 


nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() for linux
bp-97e0f233-af73-4840-ac7b-ef3182101217
SIGSEGV
0x2000200
0		@0x2000200	
1	thunderbird-bin	nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact	nsCOMPtr.h:469
2	thunderbird-bin	nsMsgFilterAfterTheFact::Release	mailnews/base/search/src/nsMsgFilterService.cpp:314
3	libxpcom_core.so	nsCOMPtr_base::~nsCOMPtr_base	nsCOMPtr.cpp:81
4	thunderbird-bin	nsTArray<nsCOMPtr<nsIUrlListener> >::DestructRange	nsCOMPtr.h:469
5	thunderbird-bin	nsTArray<nsCOMPtr<nsIUrlListener> >::RemoveElementsAt	nsTArray.h:663
6	thunderbird-bin	nsAutoTObserverArray<nsCOMPtr<nsIUrlListener>, 0u>::Clear	nsTArray.h:674
7	thunderbird-bin	nsMsgMailNewsUrl::SetUrlState	mailnews/base/util/nsMsgMailNewsUrl.cpp:137 


Most (>60%) crash comments are not in English:
Thunderbird wordt regelmatig en ongecontroleerd afgesloten. Als informatie ? verschijnt dan de wizzard <Mozilla-crashreporter> in beeld.
emails empfangen
Der Absturz passiert neuerdings immer, wenn ich eine E-Mail versende.
Clicked on Inbox of Hotmail account
anhang zur mail gelesen
is this a good clue? ...

iheart of bp-4a13f4fd-99d5-4229-b6d8-9fed02101026  writes
"if I am recalling correctly, my crash issues stopped when I disabled offline use for the imap mailbox I check"
triggered by auotsync?

(In reply to comment #4) 
> iheart of bp-4a13f4fd-99d5-4229-b6d8-9fed02101026  writes
> "if I am recalling correctly, my crash issues stopped when I disabled offline
> use for the imap mailbox I check"

more specifically, crashes stopped after disabling Folder Properties -> Synchronization
Has anyone who sees this been running filters manually? That's what the stack would seem to indicate, but I suspect this has to do with autosync and not running filters manually (or perhaps having both happen at the same time)
Crash Signature: [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()]
(In reply to David :Bienvenu from comment #6)
> Has anyone who sees this been running filters manually? That's what the
> stack would seem to indicate, but I suspect this has to do with autosync and
> not running filters manually (or perhaps having both happen at the same time)

perhaps both? unfortunately not enough comments to know. (plus, I doubt the numbers of manual filter users is sufficient to surface the issue in crash stats)

most comments definitely about automatic:
* Email Pop3 just came in and filter should move it to different folder.
* Crashed on new mail arrival

but does bp-a11319bc-b33c-4b46-abfe-4b15e2110810 suggest manual filter?
* "I tested a new filter, mail from imap to imap folder"
(unfortunately, no email address for the reporter)


more signatures from https://crash-stats.mozilla.com/query/query?product=Thunderbird&version=ALL%3AALL&range_value=4&range_unit=weeks&date=08%2F13%2F2011+08%3A05%3A40&query_search=signature&query_type=contains&query=nsMsgFilterAfterTheFact&reason=&build_id=&process_type=any&hang_type=any&do_query=1

nsRefPtr<nsTypedSelection>::~nsRefPtr<nsTypedSelection>() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() (version 5 only, windows)

nsCOMPtr_base::~nsCOMPtr_base | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact   (Mac+linux)
crash rate of 3.x seems higher.
almost no v5 crashes
no crash so far for v7

v6 bp-c7d31643-f1d0-4218-a1fe-89dba2110822
v7.01.multiple crashes per day - easy to reproduce.
Works fine on manual filtering, not on autofiltering.
Ask me for details or if you want some debugtools installed on client.
Filed 30-ish crash reports so far.
--Fred
Do you have an idea which filter is causing the crash ?
Yes sir, I do.
I run 2 filters, which one is on a medium traffic mailing list.
On the mailclient I got two accounts setup, each with IMAP+TLS/SSL.
The filter move emails from one account to the other.

But I do believe the filtering code itself is not a problem, read why below.
The crash only occur when filtering is set to run automatically. 

<wild guess>
Sessions for the two accounts has not completed, at the time the filter is moving the emails, hence the crash.
</wild guess>
Just upgraded to 8.0, same behavior as above.
I run Thunderbird 8 in debugmode, caught the crash, and uploaded the compressed logfile of it to http://fredrikhansen.se/thunderbird_1-121-20_1409.rar
I hope that will be useful for futher narrowing this one down.
I assume there is personal data in this kind of file.
--Fred
fredik also gets the crash signature of bug 696365 nsRefPtr<nsTypedSelection>::~nsRefPtr<nsTypedSelection>() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() as in bp-e35b30aa-69d4-4328-8037-b2f7d2110810  ...

and signature mozalloc_abort(char const* const) | mozalloc_handle_oom() | moz_xrealloc with no bug report yet.
Blocks: 696365
Mac and linux  [@ nsCOMPtr_base::~nsCOMPtr_base | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact ] eg. bp-e1a1c9b7-3907-49e0-84b1-a066a2111229


nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()  might also be the same
bp-fed2cc33-0709-4f23-86e2-72f702120102 has a reporter.
Crash Signature: [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] → [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] [@ nsCOMPtr_base::~nsCOMPtr_base | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact ]
don't know if this will help, but using ref ptr is better than what we have. The search session owns the nsMsgFilterAfterTheFact object until the search finishes, so we shouldn't need to do the self-ownership hack.
Assignee: nobody → dbienvenu
Attachment #612033 - Flags: review?(neil)
Comment on attachment 612033 [details] [diff] [review]
clean up ref-counting

> nsMsgFilterAfterTheFact::nsMsgFilterAfterTheFact(nsIMsgWindow *aMsgWindow, nsIMsgFilterList *aFilterList, nsISupportsArray *aFolderList)
> {
>-
>-  NS_ADDREF(this); // we own ourselves, and will release ourselves when execution is done.
Kill it with fire!

>   if (m_searchSession)
>     m_searchSession->UnregisterListener(this);
> 
>   if (m_filters)
>     (void)m_filters->FlushLogIfNecessary();
You said that the search session owns this object, could it potentially get destroyed before this function returns?

>diff --git a/mailnews/base/search/src/nsMsgSearchTerm.cpp b/mailnews/base/search/src/nsMsgSearchTerm.cpp
>diff --git a/mailnews/imap/src/nsImapMailFolder.cpp b/mailnews/imap/src/nsImapMailFolder.cpp
Part of some other patch?
(In reply to neil@parkwaycc.co.uk from comment #17)
> You said that the search session owns this object, could it potentially get
> destroyed before this function returns?

I guess I should have said they own each other; That cycle is broken in ::OnEndExecution. But I'm not sure what happens in the multiple folder case. I'll have to look at that, because we create a new search session for each folder.

the nsMsgSearchTerm stuff is definitely part of some other patch; sorry about that.

> /nsImapMailFolder.cpp
> Part of some other patch?
Actually, it's really part of the patch. FilterAfterTheFact doesn't seem to be creating an undo object so I was hitting an assertion there.
(In reply to David Bienvenu from comment #18)
> (In reply to comment #17)
> > You said that the search session owns this object, could it potentially get
> > destroyed before this function returns?
> I guess I should have said they own each other; That cycle is broken in
> ::OnEndExecution.
But the old code broke the cycle at the end of OnEndExecution, where it's definitely safe. The new code is less clear...
also nsRefPtr<nsCanvasPatternAzure>::~nsRefPtr<nsCanvasPatternAzure>() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()
Crash Signature: [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] [@ nsCOMPtr_base::~nsCOMPtr_base | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact ] → [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] [@ nsCOMPtr_base::~nsCOMPtr_base | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact ] [@ nsRefPtr<nsCanvasPatternAzure>::~nsRefPtr<nsCanvasPatternAzure>() | nsMsg…
=> ASSIGNED  (assuming bienvenu plans to finish this off)
Status: NEW → ASSIGNED
I think I have a user who could test a patch via tryserver build
can you make a tryserver build?
(In reply to neil@parkwaycc.co.uk from comment #19)
> (In reply to David Bienvenu from comment #18)
> > (In reply to comment #17)
> > > You said that the search session owns this object, could it potentially get
> > > destroyed before this function returns?
> > I guess I should have said they own each other; That cycle is broken in
> > ::OnEndExecution.
> But the old code broke the cycle at the end of OnEndExecution, where it's
> definitely safe. The new code is less clear...

FWIW, top 150 crash for version 17 (no longer top 100)
Flags: needinfo?(mozilla)
I think Neil is right that the multiple folder case is broken by my patch.
Flags: needinfo?(mozilla)
newer signature nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()
Crash Signature: nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] → nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] [@ nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ]
Whiteboard: [patchlove]
(In reply to Wayne Mery (:wsmwk) from comment #24)
> FWIW, top 150 crash for version 17 (no longer top 100)

still occurring, but now not even top 300. So I wonder what has caused the drop in rank.

The few users I had contact with indicated in February/March time frame that they no longer see the crash.

FWIW, in discussing https://bugzilla.mozilla.org/show_bug.cgi?id=797710#c3 with rkent, rkent writes 
"The memory management in nsMsgFilterAfterTheFact is quite fragile, which is what bug 555539 was trying to clean up. I would expect that if that bug would land, then many other crashes of nsMsgFilterAfterTheFact would also go away (though other issues might replace them). They may have different root causes, but the memory management currently leads to a crash with minor programming errors (as my ExQuilla case showed.)"
Blocks: 797710
Crash Signature: nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] [@ nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] → nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] [@ nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact() ] [@ nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact ]
rate of crashes increased in version 24.
unclear why
Status: ASSIGNED → NEW
Depends on: 695671
Assignee: mozilla → nobody
Summary: crash [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] → crash [@ nsCOMPtr_base::~nsCOMPtr_base() | nsMsgFilterAfterTheFact::~nsMsgFilterAfterTheFact()] due to ref-counting
Wayne asked me what I thought we should do with this patch.

I don't understand comment 25 "the multiple folder case is broken by my patch"  Both before and after bug 695671, operations are done sequentially, only returning if there is an active async operation (which holds a reference). I don't understand how multiple folders make any difference, nothing is done in parallel.

I believe that the theory of this crash is that one of the async operations does a second end-of-operation callback, that results in a second restart of the filter after OnEndExecution has been called. In addition to the search, the other async operations are done by a local folder (for folder parsing) and by the copy service (to move or copy messages). Parsing explicitly breaks the reference to the listener. In Search, OnEndExecution removes the listener. In the copy service, after OnStopCopy the copy request is deleted, which holds the back reference to the filter object. In general, it should be the responsibility of the object that does a terminating callback such as OnStopCopy to hold onto a reference to the listener during the callback, but free it afterwards.

In the case of a double callback to one of the listeners, in the current design that would result in double free. With the proposal in that patch, as long as the object doing the callback maintains a reference while doing the double callbacks, I believe the proposed changed could stop the crash.

Let me do some experimenting with that, unless someone can convince me why the multiple folder case breaks this.
Assignee: nobody → kent
Status: NEW → ASSIGNED
This bug is now obsolete since there have been major changes to the filterAfterTheFact code.
Assignee: rkent → nobody
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Comment on attachment 612033 [details] [diff] [review]
clean up ref-counting

Dropping obsolete review request.
Attachment #612033 - Flags: review?(neil)
(In reply to Kent James (:rkent) from comment #31)
> This bug is now obsolete since there have been major changes to the
> filterAfterTheFact code.

yeah, only 2 version 31.x crashes in the past 5 months. And also not same stack, and Mac-only.
eg bp-aa8ff9a4-7fc3-40de-82d3-97c402150409
Status: RESOLVED → VERIFIED
Whiteboard: [patchlove]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: