Closed Bug 619358 Opened 14 years ago Closed 14 years ago

compact crash [@ nsFolderCompactState::OnStopRunningUrl(nsIURI*, unsigned int)]

Categories

(MailNews Core :: Backend, defect)

x86
All
defect
Not set
critical

Tracking

(blocking-thunderbird3.1 .8+, thunderbird3.1 .8-fixed)

VERIFIED FIXED
Thunderbird 3.3a2
Tracking Status
blocking-thunderbird3.1 --- .8+
thunderbird3.1 --- .8-fixed

People

(Reporter: wsmwk, Assigned: Bienvenu)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(1 file, 1 obsolete file)

compact crash [@ nsFolderCompactState::OnStopRunningUrl(nsIURI*, unsigned int)]

#2 crash for 3.1.7.  99.9% are windows

although the crash exists in prior releases it was never was in top 300. so  there is regression behavior in 3.1.7. most likely Bug 608449 - compact of local folders > 4GB fails ?

bp-0bf4cec5-11ce-4d40-8f40-927f62101212 (huub) 3.1.7
email-archive not on local drive but on NAS (Gb-connection with pc). Crashes are starting after upgrade to 3.1.7
EXCEPTION_STACK_OVERFLOW
0xa0fe40
0	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:356
1	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
2	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
3	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:359
4	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
5	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
6	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:359
7	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
8	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
9	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:359 

bp-d4b68c2d-d42c-4ce6-a80a-6361d2101109 3.1.6 (ben)
bp-f1ad55ad-5ca7-44a4-a4f2-7c9b62101018 3.1.4

bp-95c539bf-f1d8-43de-ae26-ec2a92101214 3.1.7  Mac
I can add, that in 90 days there was only one trunk crash.

Other compact sigs that don't yet have bugs

nsFolderCompactState::CompactCompleted(unsigned int) [1] huge uptick, strongly correlates to 3.1.7

nsFolderCompactState::FinishCompact() (there might be an uptick here too)

nsCOMPtr_base::assign_from_helper(nsCOMPtr_helper const&, nsID const&) | nsFolderCompactState::CompactCompleted(unsigned int)  ditto


[1] nsFolderCompactState::CompactCompleted(unsigned int)
bp-3c66cbb8-3479-4c91-b9c3-d91e72101214
EXCEPTION_STACK_OVERFLOW
0xa0ef5e
0	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
1	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
2	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:359
3	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
4	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
5	thunderbird.exe	nsFolderCompactState::OnStopRunningUrl	mailnews/base/src/nsMsgFolderCompactor.cpp:359
6	thunderbird.exe	nsFolderCompactState::CompactCompleted	mailnews/base/src/nsMsgFolderCompactor.cpp:528
7	thunderbird.exe	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:569
This might be caused by the fact that we clear m_parsingFolder earlier than we used to, which changes the code path we take in nsFolderCompactState::OnStopRunningUrl - but I don't see this crash here...
other crashes for this sig with reporters
bp-e778c5e5-bb99-45b8-82a0-d3aa72101211  (porsche)
bp-a3bafbc0-92cc-47e6-bdcf-8783c2101214  (john)
bp-69b1efa0-e8c5-48c9-81e0-b04582101209  (avtk)


nsFolderCompactState::FinishCompact() filed Bug 619366


also new in v3.1.7, same as this bug

nsCOMPtr_base::assign_from_helper(nsCOMPtr_helper const&, nsID const&) | nsFolderCompactState::CompactNextFolder() only one crash bp-baacbe7f-d9af-4e9c-aeaa-12ced2101212 (cy) 

-- same stack but not new --
nsFolderCompactState::CompactCompleted(unsigned int) filed as bug 619372


-- different stacks --

nsImapIncomingServer::GetCanCompactFoldersOnServer(int*)] different stack, filed Bug 619379

not new in v3.1.7 rare, only Mac, unclear if it needs bug
nsFolderCompactState::CompactNextFolder() 
bp-d4fea5a4-8ff3-4b26-a7ef-b614d2101212
EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
0xc
0	thunderbird-bin	nsFolderCompactState::CompactNextFolder	mailnews/base/src/nsMsgFolderCompactor.cpp:585
1	thunderbird-bin	nsMsgHeaderParser::ExtractHeaderAddressName	mailnews/mime/src/nsMsgHeaderParser.cpp:1255
2	thunderbird-bin	nsMsgDBView::FetchAuthor	mailnews/base/src/nsMsgDBView.cpp:395
3	thunderbird-bin	nsMsgDBView::GetCellText	mailnews/base/src/nsMsgDBView.cpp:1856
4	thunderbird-bin	nsMsgGroupView::GetCellText	mailnews/base/src/nsMsgGroupView.cpp:928
5	thunderbird-bin	nsTreeBodyFrame::PaintText	layout/xul/base/src/tree/src/nsTreeBodyFrame.cpp:3560
6	thunderbird-bin	nsTreeBodyFrame::PaintCell	layout/xul/base/src/tree/src/nsTreeBodyFrame.cpp:3305
OK, I figured out how to reproduce this - you need a folder with deleted messages, and a missing or out of date db.  The former, because we need to think there are deleted messages in order to compact a folder, and the latter to expose the bug. I think this only happens with local folders (i.e., pop3 or local folders account). Doing an advanced search on the account is one work-around for causing all the .msf files to get built.
Attached patch fix for crash (obsolete) — Splinter Review
this clears out the listener member var before notifying the listener, which fixes the infinite recursion. I'll try to come up with a unit test for this.

We might want a 3.1.8 for this, if it turns out to affect a lot of people. The work around is not obvious, since the user may not know which folder is has a missing or invalid .msf file...
Assignee: nobody → bienvenu
Attachment #497942 - Flags: review?(bugzilla)
trunk has this bug too...
still holding steady at 2.1 - 2.8% of crashes. 
but I can't tell without much digging whether it's trending down or up.

eyeballing reports of last two days sorted by date it looks like at least 1/3 are repeat crashes (reattempts) - not surprising. 

But I agree we need to think about whether to spin a 3.1.8
this alternative passes in the original url listener to the compact that gets called after reparsing the folder. This fixes the crash, and although it's the tail wagging the dog, I need to do this to write a unit test that works, because the original url listener was getting lost in the case of an invalid db.
Attachment #497942 - Attachment is obsolete: true
Attachment #498161 - Flags: review?(bugzilla)
Attachment #497942 - Flags: review?(bugzilla)
STR for reproducing the crash:

1. delete a message from a local folder (e.g., folder1).
2. Select a different folder (so we don't reselect folder1 on restart).
3. shutdown TB, delete folder1.msf
4. restart, do a file | compact all folders.

This will generate a stack overflow.
(In reply to comment #5)
>The work around is not obvious, since the user may not know which folder is has a
> missing or invalid .msf file...

This is definitely true in some cases - because some comments are about startup and expunge
http://getsatisfaction.com/mozilla_messaging/topics/please_help_inbox_crashing_thunderbird_after_latest_update is the only topic that's a sure match to this bug, though a few more topics I've just looked at might develop into matches.
Attachment #498161 - Flags: review?(bugzilla) → review+
I've checked this into trunk for today's nightlies:

http://hg.mozilla.org/comm-central/rev/0fdf675ff82f
blocking-thunderbird3.1: --- → .8+
Flags: in-testsuite+
Target Milestone: --- → Thunderbird 3.3a2
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #498161 - Flags: approval-thunderbird3.1.8+
the unit test fails on 3.1.x because the unit test fails to delete the "folder2.msf" file, which makes the compact code generate an assertion, which aborts tests in debug builds.  It fails because someone is holding onto the db, which is not happening on the trunk. I do have a fix for this (changing the folder compaction code not to use ParseFolder directly, but rather, call GetDatabaseWithReparse), which I'm going to propose for the trunk, because it'll make compaction maintain custom header attributes. But I have a Hobson's choice on the branch - don't add the unit test, which adds a bit more risk to the branch because we don't have test coverage, or take an additional fix for the branch just to make the unit test happy, which adds risk of its own. I'll spend a bit more time trying to figure out why the branch is holding onto the db more than the trunk, but I suspect that'll just result in an additional fix that we may or may not want for the branch.
never mind, my mistake - my tree was out of date 
changeset:   5915:f618bf426a15
tag:         tip
user:        David Bienvenu <bienvenu@nventure.com>
date:        Tue Dec 21 09:45:40 2010 -0800
summary:     fix crash compacting folders with out of date or missing summary fi
les, a=standard8, bug 619358
(In reply to comment #13)
> I have a Hobson's
> choice on the branch - don't add the unit test, which adds a bit more risk to
> the branch because we don't have test coverage, or take an additional fix for
> the branch just to make the unit test happy, which adds risk of its own.
> I'll spend a bit more time trying to figure out why the branch is holding onto the
> db more than the trunk, but I suspect that'll just result in an additional fix
> that we may or may not want for the branch.

bienvenu, have you flipped the coin yet? :)

This is top 5 crash, so would be great to get on branch.
nevermind, I missed comment 14 :(
awesome
Crash Signature: [@ nsFolderCompactState::OnStopRunningUrl(nsIURI*, unsigned int)]
crash does not exist in v5 and newer.
v.fixed

perhaps this also fixed Bug 484550, which does not exist for v5?
Status: RESOLVED → VERIFIED
Blocks: 484550
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: