Closed Bug 471492 Opened 16 years ago Closed 12 years ago

crash [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] when moving/deleting/copying many messages

Categories

(MailNews Core :: Networking: IMAP, defect)

Hardware: x86, OS: All, Type: defect, Priority: Not set, Severity: critical

Tracking

(thunderbird-esr17 18+ fixed)

VERIFIED FIXED
Thunderbird 18.0
Tracking Status
thunderbird-esr17 18+ fixed

People

(Reporter: wsmwk, Assigned: rkent)

References

Details

(Keywords: crash, qawanted, topcrash, Whiteboard: [bulkoperations][regression action in 3.1?])

Crash Data

Attachments

(1 file)

from crash-stats, crash [@ nsImapMailFolder::CopyMessagesOffline]
#4 b1+b2pre crasher
does not appear in TB2/talkback

bp-1183313a-db9b-4bc0-a554-b3ba62081224
nsImapMailFolder::CopyMessagesOffline	nsImapMailFolder.cpp:6418
nsImapMailFolder::CopyMessages	nsImapMailFolder.cpp:6554
nsMsgCopyService::DoNextCopy	nsMsgCopyService.cpp:321
nsMsgCopyService::DoCopy	nsMsgCopyService.cpp:263
nsMsgCopyService::CopyMessages	nsMsgCopyService.cpp:519
nsImapMailFolder::DeleteMessages	nsImapMailFolder.cpp:2183
nsMsgDBView::DeleteMessages	nsMsgDBView.cpp:2731
nsMsgDBView::ApplyCommandToIndices	nsMsgDBView.cpp:2523
nsMsgDBView::DoCommand	nsMsgDBView.cpp:2280
NS_InvokeByIndex_P	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:101
XPCWrappedNative::CallMethod	js/src/xpconnect/src/xpcwrappednative.cpp:2422
XPC_WN_CallMethod	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1477
http://hg.mozilla.org/comm-central/annotate/59db95131052/mailnews/imap/src/nsImapMailFolder.cpp#l6418

6417  nsCOMPtr <nsIMsgOfflineImapOperation> destOp;
6418  mDatabase->GetOfflineOpForKey(fakeBase + sourceKeyIndex, PR_TRUE, getter_AddRefs(destOp));

... but mDatabase can't be null there.
Product: Core → MailNews Core
Looking at crash-stats, this also happens on Linux and on Mac OS X.
Flags: wanted-thunderbird3?
OS: Windows Vista → All
fix signature for crash-stats
Summary: crash [@ nsImapMailFolder::CopyMessagesOffline] → crash [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)]
this isn't anywhere near a topcrash in 3.1 or 3.0

3.1b1 crash
bp-666bd992-8ff9-4a29-b8c0-9673b2100329
0	thunderbird-bin	nsImapMailFolder::CopyMessagesOffline	 mailnews/imap/src/nsImapMailFolder.cpp:6970
1	thunderbird-bin	nsImapMailFolder::CopyMessages	mailnews/imap/src/nsImapMailFolder.cpp:7216
2	thunderbird-bin	nsMsgCopyService::DoNextCopy	mailnews/base/src/nsMsgCopyService.cpp:321
3	thunderbird-bin	nsMsgCopyService::DoCopy	mailnews/base/src/nsMsgCopyService.cpp:263
4	thunderbird-bin	nsMsgCopyService::CopyMessages	mailnews/base/src/nsMsgCopyService.cpp:531
5	thunderbird-bin	nsImapMailFolder::DeleteMessages	mailnews/imap/src/nsImapMailFolder.cpp:2355
6	thunderbird-bin	nsMsgDBView::DeleteMessages	mailnews/base/src/nsMsgDBView.cpp:3001
7	thunderbird-bin	nsMsgDBView::ApplyCommandToIndices	mailnews/base/src/nsMsgDBView.cpp:2727
8	thunderbird-bin	nsMsgDBView::DoCommand	mailnews/base/src/nsMsgDBView.cpp:2392
9	libxpcom_core.dylib	NS_InvokeByIndex_P	xpcom/reflect/xptcall/src/md/unix/xptcinvoke_unixish_x86.cpp:179
10	thunderbird-bin	XPCWrappedNative::CallMethod	js/src/xpconnect/src/xpcwrappednative.cpp:2721
11	thunderbird-bin	XPC_WN_CallMethod	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1740
12	libmozjs.dylib	js_Invoke	js/src/jsinterp.cpp:1360
13	libmozjs.dylib	js_Interpret	js/src/jsops.cpp:2240
14	libmozjs.dylib	js_Invoke	js/src/jsinterp.cpp:1368
15	libmozjs.dylib	js_fun_call	js/src/jsfun.cpp:1955
16	libmozjs.dylib	js_Interpret	js/src/jsops.cpp:2208
17	libmozjs.dylib	js_Invoke	js/src/jsinterp.cpp:1368
18	libmozjs.dylib	js_InternalInvoke	js/src/jsinterp.cpp:1423
19	libmozjs.dylib	JS_CallFunctionValue	js/src/jsapi.cpp:5112
20	thunderbird-bin	nsJSContext::CallEventHandler	dom/base/nsJSEnvironment.cpp:2134
Keywords: topcrash
#100 crash for v3.1.2
crash rate more than tripled as of version 3.1 ~in July, which suggests perhaps something regressive happened in the code.
tripled again with v3.1.2, in the past month, probably because of upgrades from 3.0
Keywords: qawanted
Whiteboard: regression action in 3.1?
#84 for v3.1.7 => topcrash.
related to Bug 617945?  crash [@ nsImapMailFolder::DeleteMessages(nsIArray*, nsIMsgWindow*, int, int, nsIMsgCopyServiceListener*, int)]

recent examples  

line 7090
bp-ee450aee-5f8d-4857-9344-934c52110109 (dipen)
EXCEPTION_ACCESS_VIOLATION_READ
0x0
0	thunderbird.exe	nsImapMailFolder::CopyMessagesOffline	mailnews/imap/src/nsImapMailFolder.cpp:7090
1	thunderbird.exe	nsImapMailFolder::CopyMessages	mailnews/imap/src/nsImapMailFolder.cpp:7336
2	thunderbird.exe	nsMsgCopyService::DoNextCopy	mailnews/base/src/nsMsgCopyService.cpp:321
3	thunderbird.exe	nsMsgCopyService::DoCopy	mailnews/base/src/nsMsgCopyService.cpp:263
4	thunderbird.exe	nsMsgCopyService::CopyMessages	mailnews/base/src/nsMsgCopyService.cpp:531
5	thunderbird.exe	nsMsgDBView::CopyMessages	mailnews/base/src/nsMsgDBView.cpp:2682
6	thunderbird.exe	nsMsgDBView::ApplyCommandToIndicesWithFolder	mailnews/base/src/nsMsgDBView.cpp:2697
7	thunderbird.exe	nsMsgDBView::DoCommandWithFolder	mailnews/base/src/nsMsgDBView.cpp:2352 

line 7074
bp-72cd82f5-9eb2-40c5-844e-4b0a62110110 (G.J.Perrin)
EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
0x0
Selected 2161 messages in an IMAP mailbox on an Exchange 2003 server. Used the menus (not drag and drop) to move to a different mailbox (Sent Items) on the same server. When the crash occurred I noticed that the deleted flag has appeared against visible items. 
0	thunderbird-bin	nsImapMailFolder::CopyMessagesOffline	mailnews/imap/src/nsImapMailFolder.cpp:7074
1	thunderbird-bin	nsImapMailFolder::CopyMessages	mailnews/imap/src/nsImapMailFolder.cpp:7336
2	thunderbird-bin	nsMsgCopyService::DoNextCopy	mailnews/base/src/nsMsgCopyService.cpp:321
3	thunderbird-bin	nsMsgCopyService::CopyMessages	mailnews/base/src/nsMsgCopyService.cpp:263
4	thunderbird-bin	nsMsgDBView::CopyMessages	mailnews/base/src/nsMsgDBView.cpp:2682
5	thunderbird-bin	nsMsgDBView::ApplyCommandToIndicesWithFolder	mailnews/base/src/nsMsgDBView.cpp:2702
6	thunderbird-bin	nsMsgDBView::DoCommandWithFolder	mailnews/base/src/nsMsgDBView.cpp:2352 


crash comments of the last month:
was trying to move 25000 messages to a folder using google account imap, and at the same time the mail headers were getting downloaded. about 49000 got downloaded of 81000 mails...
Was deleting ~360 emails from a Dovecot imap folder.
trying to switch from Mozilla Firefox to Thunderbird, and Thunderbird crashed and shut down on me, I have no idea why.
trying to delete over 9,000 email threads with somewhere around 23-25K in messages.
Selected 2161 messages in an IMAP mailbox on an Exchange 2003 server. Used the menus (not drag and drop) to move to a different mailbox (Sent Items) on the same server. When the crash occurred I noticed that the deleted flag has appeared against visible items.
Mass deletion fail.
incredibly slow on large folders
deleting 15k email from a mailbox
Keywords: topcrash
Summary: crash [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] → crash [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] when moving/deleting/copying many messages
Crash Signature: [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)]
bienvenu, do we need a protocol log for this? (thinking not)
(In reply to Wayne Mery (:wsmwk) from comment #7)
> bienvenu, do we need a protocol log for this? (thinking not)

No, looks like this is prone to happen when moving/deleting mass quantities of imap mail
(In reply to David :Bienvenu from comment #8)
> looks like this is prone to happen when moving/deleting mass quantities
> of imap mail

yeah, most comments are that.  except bp-0f62a013-32f7-4588-b5dc-bca282110722 "trying to check my email. said it had exceeded bandwidth "

line 7090 and 7074 are the most common - about 50-50 split. But one oddball is line 7184 for bp-131bc32a-8790-402e-a79a-416212110726
1 	thunderbird.exe 	nsImapMailFolder::CopyMessagesOffline 	mailnews/imap/src/nsImapMailFolder.cpp:7184
2 	thunderbird.exe 	nsImapMailFolder::CopyMessages 	mailnews/imap/src/nsImapMailFolder.cpp:7336
3 	thunderbird.exe 	nsMsgCopyService::DoNextCopy 	mailnews/base/src/nsMsgCopyService.cpp:321
4 	thunderbird.exe 	nsMsgCopyService::DoCopy 	mailnews/base/src/nsMsgCopyService.cpp:263
5 	thunderbird.exe 	nsMsgCopyService::CopyMessages 	mailnews/base/src/nsMsgCopyService.cpp:531
6 	thunderbird.exe 	nsImapMailFolder::DeleteMessages 	mailnews/imap/src/nsImapMailFolder.cpp:2405
7 	thunderbird.exe 	nsMsgSearchDBView::ProcessRequestsInOneFolder 	mailnews/base/src/nsMsgSearchDBView.cpp:1098
8 	thunderbird.exe 	nsMsgSearchDBView::DeleteMessages 	mailnews/base/src/nsMsgSearchDBView.cpp:909
9 	thunderbird.exe 	nsMsgDBView::ApplyCommandToIndices 	mailnews/base/src/nsMsgDBView.cpp:2726
10 	thunderbird.exe 	nsMsgDBView::DoCommand 	mailnews/base/src/nsMsgDBView.cpp:2393
Keywords: topcrash → topcrash-
[@ nsImapMailFolder::CopyMessagesOffline ] #14 crash for Mac TB6
Crash Signature: [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] → [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] [@ nsImapMailFolder::CopyMessagesOffline ]
bienvenu, 
if the crashing code isn't easily amenable to fixing, can some of these "bulkoperation" bugs help?  
https://bugzilla.mozilla.org/buglist.cgi?type1-0-0=substring&list_id=1292255&field0-0-0=short_desc&type0-0-1=substring&field0-0-1=keywords&type1-0-1=allwordssubstr&resolution=---&classification=Client%20Software&classification=Components&status_whiteboard_type=allwordssubstr&query_format=advanced&status_whiteboard=bulk&type0-0-0=anywordssubstr&field1-0-0=short_desc&product=MailNews%20Core&product=Thunderbird&field1-0-1=short_desc

Bug 296453 blocks several of the others.

crash settled in as #10 for TB6 (two week period).  And of 11 crashes...
6 end at nsImapMailFolder.cpp:7261 - bp-4020f5c3-9d73-4131-a4e4-87add2110913
4 end at nsImapMailFolder.cpp:7258 - bp-4020f5c3-9d73-4131-a4e4-87add2110913
1 end at nsImapMailFolder.cpp:7239 - bp-8419b1e1-a43c-4921-8d1c-8a3ee2110913
Keywords: topcrash- → topcrash
Whiteboard: regression action in 3.1? → [bulkoperations][regression action in 3.1?]
sorry, this fell off my radar. It's probably amenable to fixing but it would be great to be able to reproduce it.
mDatabase seems to be null because GetDatabase() returns error.

bienvenu, should we return error immediately when GetDatabase() returns error?
(In reply to Makoto Kato from comment #13)
> mDatabase seems to be null because GetDatabase() returns error.
> 
> bienvenu, should we return error immediately when GetDatabase() returns
> error?

If that were the case, we would have crashed much earlier. We're already inside an if block that checks if mDatabase is null. So someone must be indirectly nulling out mDatabase while inside the if (mDatabase) block. I'd really like to figure out why that is. Since we're on the UI thread, we don't yield, and afaik we don't pump a nested event loop, it can't be some random process. All of which is why reproducing it would be really helpful.
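To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not the Thunderbird code: `FakeDatabase`, `FakeFolder`, `GuardedWork` and `DemoMemberNulledInsideGuard` are hypothetical stand-ins, and `std::shared_ptr` is assumed here as a rough analogue of `nsCOMPtr`. It only shows that a member checked by `if (mDatabase)` can still be null later in the same block if something called from inside the block resets it.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-ins for illustration; not the real Mozilla classes.
struct FakeDatabase {
  int ops = 0;
};

struct FakeFolder {
  std::shared_ptr<FakeDatabase> mDatabase = std::make_shared<FakeDatabase>();

  // Returns true if mDatabase is still non-null after the side effect ran.
  bool GuardedWork(const std::function<void(FakeFolder&)>& sideEffect) {
    if (mDatabase) {
      assert(mDatabase);   // holds here: the guard has just passed
      mDatabase->ops++;    // safe dereference
      sideEffect(*this);   // a callee may indirectly reset the member
      // Any further mDatabase-> use is only safe if this is still true.
      return mDatabase != nullptr;
    }
    return false;
  }
};

// Simulates something reached indirectly that clears the folder's database.
inline bool DemoMemberNulledInsideGuard() {
  FakeFolder folder;
  return !folder.GuardedWork([](FakeFolder& f) { f.mDatabase.reset(); });
}
```

In this sketch `DemoMemberNulledInsideGuard()` returns true, i.e. the guard passed and the member was null afterwards. What (if anything) plays the role of `sideEffect` in the real method is exactly the open question in this bug.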
Crash Signature: [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] [@ nsImapMailFolder::CopyMessagesOffline ] → nsIMsgCopyServiceListener*)] [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*, int, nsIMsgWindow*, nsIMsgCopyServiceListener*)] [@ nsImapMailFolder::CopyMessagesOffline ] [@ nsImapMailFolder::CopyMessagesOffline(nsIMsgFolder*, nsIArray*…
This is basically the approach that I am proposing for all of the folder objects in bug 792915. The crash theory is that someone is doing a folder.msgdatabase = null.
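For readers unfamiliar with the pattern, a minimal sketch of "hold a local reference for the duration of the method", under the theory stated above that something clears the folder's database mid-operation. This is an illustration only, not the attached patch: `FakeDatabase`, `FakeFolder` and `WorkWithLocalCopy` are hypothetical names, and `std::shared_ptr` is assumed as a rough stand-in for `nsCOMPtr`.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-ins for illustration; not the real Mozilla classes.
struct FakeDatabase {
  int ops = 0;
};

struct FakeFolder {
  std::shared_ptr<FakeDatabase> mDatabase = std::make_shared<FakeDatabase>();

  // Returns the number of operations recorded on the database captured at
  // entry, or -1 if there was no database to begin with.
  int WorkWithLocalCopy(const std::function<void(FakeFolder&)>& sideEffect) {
    // Local strong reference: keeps the object alive and reachable even if
    // the member is reset while this method is running.
    std::shared_ptr<FakeDatabase> database = mDatabase;
    if (!database) return -1;

    database->ops++;
    sideEffect(*this);   // may reset mDatabase
    assert(database);    // the local reference cannot be cleared by the callee
    database->ops++;     // still a valid dereference
    return database->ops;
  }
};
```

The design choice is that every dereference goes through the local variable, so the method no longer depends on the member staying non-null between the check and the use. It does not explain who resets the member; it only makes the method robust to that happening.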
Assignee: nobody → kent
Status: NEW → ASSIGNED
Attachment #663578 - Flags: review?(mozilla)
Comment on attachment 663578 [details] [diff] [review]
Keep local copy of database object

OK, but everything happens on the UI thread, and this runs to completion, so it would have to be something called as a side effect of the code in this method, and I don't know what that would be. Or something this calls would have to pump a nested event queue, and that also seems unlikely.
Attachment #663578 - Flags: review?(mozilla) → review+
Checked in https://hg.mozilla.org/comm-central/rev/a98f810380b6

I agree in general with comment 18, so I think it is fair to call this patch an experiment. So I am not going to resolve this bug to fixed quite yet until we see some of the crash stats. I'll recommend it though for aurora to try to make sure we get that feedback. It's a very low risk patch.
Target Milestone: --- → Thunderbird 18.0
Kent, this isn't definitive, but I expect next week will continue to be good news ... TB18 beta 1 has no crashes after a week in the field. Compares favorably against TB17 betas, which had about 10 crashes per week. (I'm afraid that's the best I can offer based on the beta users we've got)

If this continues favorably next week, then I think we should request this for 17.0.1
kent, I think we can say this is gone in TB18, as best as we're going to be able to say.  Still no crashes for beta 18, and there were many (and still are) for beta 17.
Flags: needinfo?(kent)
Comment on attachment 663578 [details] [diff] [review]
Keep local copy of database object

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: 
Testing completed (on c-c, etc.): 
Risk to taking this patch (and alternatives if risky):

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: 
Fix Landed on Version:
Risk to taking this patch (and alternatives if risky): 
String or UUID changes made by this patch: 

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #663578 - Flags: approval-comm-esr17?
Comment on attachment 663578 [details] [diff] [review]
Keep local copy of database object

OK, let's take this forward to ESR and see if it helps.
Attachment #663578 - Flags: approval-comm-esr17? → approval-comm-esr17+
Let's at least resolve to fixed, since the experiment mentioned in comment 18 seems to be a success.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: needinfo?(kent)
Resolution: --- → FIXED
Flags: wanted-thunderbird3?
yay! virtually no crashes in TB17.0.2.  so topcrash v.dead.
thanks rkent
Status: RESOLVED → VERIFIED