Closed Bug 249754 Opened 18 years ago Closed 13 years ago
Unable to open all mail folders when there are large number of mail folders, after compact
When opening a mail account for the first time. Mozilla mail client will create a .msf for each mail folder in local directories. But Mozilla mail client never close these "*.msf" files afte creating. So when there are too many mail folders (more than the max number of files can be opened by a process e.g. 1024 on Linux and 256 on solaris). Mozilla mail client will fail to create these files because too many files opened (created). What user see is he can see the folders in the left side (tree) in the UI but he can't see any mails in these folders.
To reproduce the bug: If you have a mail account that has more than 256 mail folders, you can re-produce it on solaris platform. 1. launch mozilla 2. create a new profile 3. setup the mail account or you don't have so many folders then you can try: 1. open db/mork/src/morkFile.cpp 2. add printf statement after each "MORK_FILEOPN" and before each "MORK_FILECLOSE". 3. rebuild mozilla 4. launch mozilla mail client ( you had better dump the output to a file) 5. setup a mail account 6. let mozilla create all "*.msf" files i.e. wait untill there are no any communications between cient and server. 7. close mozilla mail. check the dump file, You can see near the end of file there are too many open files operations but no corresponding close() operations.
Comment on attachment 152325 [details] [diff] [review] close the *.msf after "get" and "set" ACL flags of a mail folder Can you give r? Thanks
Attachment #152325 - Flags: review?(bienvenu)
Please see the comments in bug 227770
Comment on attachment 152325 [details] [diff] [review] close the *.msf after "get" and "set" ACL flags of a mail folder no reason to force closed...I'll look at this real quick.
Attachment #152325 - Flags: review?(bienvenu) → review-
What build were you running? I think this was fixed a while ago. And I could not recreate it. We leave open the db's for the special folders (drafts, templates, sent, inbox) but not other folders, on initial folder discovery.
The mozilla I used is 1.8a2. I will checkout the latest build and check if the bug is still there
But the bug also exists in thunderbird. The version I used is version 0.6 (20040714)
The bug can be re-repoduced on solaris 8,9 and 10. Because the max number of files permited to open by a process is 256 instead of 1024.
OS: Linux → Solaris
Hi, Dave, Could you please tell me which patch (or bug ID) fixed this bug? Thanks
I have tried the latest build The bug is still there. Please see comments in bug 227770
are you saying that we get acl for each folder that we discover, i.e., that we issue protocol to the server to retrieve acl info? I don't see that happening when I try it, and the .msf files don't seem to be opened for all the folders. Or is it the case that we get the acl for all the folders from all the db's, thus opening them in the process? We don't want to close the db's after anyone gets the acl, because you'll close the db out from open folders in some situations.
I further investiaged the bug. Let me explain the bug in more details as following: Frist, suppose there is a user X, who has created some (many) mail folders say F1,F2 ... Fn. After creating a new profile, X sets up his mail account in mozilla mail client. After connecting to a IMAP mail server, Mozilla mail will call nsImapIncomingServer::AddFolderRights() at some point after it sees F1 and the following is the call stack: nsImapIncomingServer::AddFolderRights() nsImapMailFolder::AddFolderRights() nsMsgIMAPFolderACL::SetFolderRightsForUser() nsMsgIMAPFolderACL::UpdateACLCache() nsImapMailFolder::SetAclFlags() nsImapMailFolder::GetDatabase() nsMsgDatabase::OpenFolderDB() ... morkStdioFile::OpenStdio() a F1.msf fill will be opened with "rb+" mode. The same thing will happen on F2,F3 ... Fn And these files are never closed at least before mail client finish scanning all these folders. Actually, I found that these files are closed only when Mozilla mail client are terminated. If X has 300 mail folders, he will see the bug on solaris platform. Because the max number of files a process can open is 256. While on Linux X should have more than 1024 folders to see the bug, which is a rare case in practice. That's why my patch will try to close these files.
only close the db if we opened it - also, don't even try to open the db if the acl flags haven't changed.
Attachment #152325 - Attachment is obsolete: true
Brian, can you try this patch and let me know if it fixes the problem for you? thx!
Status: NEW → ASSIGNED
David, Yes. your patch does fix my problem.
Comment on attachment 155282 [details] [diff] [review] proposed fix excellent, Thx, Brian.
Attachment #155282 - Flags: superreview?(mscott)
David, Can you checkin the patch?
I will after I get a code review :-)
Attachment #155282 - Flags: superreview?(mscott) → superreview+
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
David, oho can give r to this patch?
the fix has been checked in, r/sr=mscott.
There is another place needed to be fixed. The following is the call stack: ... nsMsgFolderDataSource::GetTarget() nsMsgFolderDataSource::createFolderNode nsMsgFolderDataSource::createFolderOpenNode nsImapMailFolder::GetSubFolders() nsImapMailFolder::CreateSubFolders() nsImapMailFolder::AddSubfolderWithPath() nsMsgDBFolder::GetFlags() nsMsgDBFolder::ReadDBFolderInfo() nsImapMailFolder::GetDBFolderInfoAndDB nsImapMailFolder::GetDatabase() nsMsgDBService::OpenFolderDB() nsMailDatabase::Open() nsMsgDatabase::Open() nsMsgDatabase::OpenMDB() morkFactory::OpenOldFile() morkFile::OpenOldFile() morkStdioFile::OpenOldStdioFile() morkStdioFile::morkStdioFile() morkStdioFile::OpenStdio() I think we should do the same thing in nsImapMailFolder::GetDBFolderInfoAndDB() as the previous patch does for nsImapMailFolder::GetAclFlags() or we will have the same problem. I.e. the previous patch fixes only part of the problem.
No, the whole point of GetDBFolderInfoAndDB() is to open the db...your patch closes the db and returns the closed db pointer to the caller, if I'm reading it correctly. Has the problem re-appeared? In comment 16, you said that it was fixed at that point...
Yes, the problem re-appeared. Some of our Sun internal users saw this problem again and I found Mozilla open too many files in morkStdioFile::OpenStdio() so that it exceeds the max open file numbers.
mDatabase is still open after nsMsgDBFolder::ReadDBFolderInfo() and never has a chance to close it. So when there are too many .msf files, Mozilla will exhaust opening handler slots. So I think we need to close the db if it is not opened before entrying nsMsgDBFolder::ReadDBFolderInfo().
Comment on attachment 184474 [details] [diff] [review] a new fix again, no - you're closing the db out from underneath the db folder info. This needs to be fixed at a much higher level.
Attachment #184474 - Flags: review?(bienvenu) → review-
Boying Lu, comment 25 you said you still had trouble. Is that still true? Is your proposal (comment 28) still appropriate?
Comment on attachment 345243 [details] [diff] [review] another patch I'm sceptical that GetUidValidity is leaving databases open. We only call that when you select a folder, and if you select a folder, we're going to use the db after this call (and the db was probably open before this call anyway). And, for Compact, if we issue a compact, I suspect we're going to just open the db again right away as well. I can double check, but surprised that this made any difference...
> And, for Compact, if we issue a compact, I suspect we're going to just open the > db again right away as well. I can double check, but surprised that this made > any difference... Here is a call stack of "Compact()" which causes the error: libc.so.1`__open+0x4 libc.so.1`_endopen+0xc0 libc.so.1`fopen+0x1c libmork.so`unsigned morkStdioFile::AcquireBud(nsIMdbEnv*,nsIMdbHeap*,nsIMdbFile**)+0x144 libmork.so`void morkWriter::MakeWriterStream(morkEnv*)+0x184 libmork.so`unsigned char morkWriter::WriteMore(morkEnv*)+0x8c libmork.so`void morkThumb::DoMore_Commit(morkEnv*)+0x34 libmork.so`void morkThumb::DoMore_CompressCommit(morkEnv*)+0x18 libmork.so`void morkThumb::DoMore(morkEnv*,unsigned*,unsigned*,unsigned char*,unsigned char*)+0x120 libmork.so`unsigned morkThumb::DoMore(nsIMdbEnv*,unsigned*,unsigned*,unsigned char*,unsigned char*)+0x6c libmail.so`unsigned nsMsgDatabase::Commit(int)+0x2cc libmail.so`unsigned nsMsgDBFolder::SetStringProperty(const char*,const char*)+0x3d4 libmail.so`unsigned nsMsgDatabase::ApplyRetentionSettings(nsIMsgRetentionSettings*,int)+0x46c libmail.so`unsigned nsMsgDBFolder::ApplyRetentionSettings(int)+0x1ec libmail.so`unsigned nsMsgDBFolder::ApplyRetentionSettings()+0x14 libmail.so`unsigned nsImapMailFolder::Compact(nsIUrlListener*,nsIMsgWindow*)+0x7c libmail.so`unsigned nsFolderCompactState::Compact(nsIMsgFolder*,int,nsIMsgWindow*)+0xfc libmail.so`unsigned nsFolderCompactState::CompactNextFolder()+0x374 libmail.so`unsigned nsFolderCompactState::OnStopRunningUrl(nsIURI*,unsigned)+0x10c libmail.so`unsigned nsUrlListenerManager::BroadcastChange(nsIURI*,nsUrlNotifyType,unsigned)+0x2d8
Here is the call stack of "GetUidValidity()" which causes the problem: libc.so.1`__open+0x4 libc.so.1`_endopen+0xc0 libc.so.1`fopen+0x1c libmork.so`void morkStdioFile::OpenStdio(morkEnv*,const char*,const char*)+0xa8 libmork.so`morkStdioFile::morkStdioFile #Nvariant 1(morkEnv*,const morkUsage&,nsIMdbHeap*,nsIMdbHeap*,const char*,const char*)+0x7c libmork.so`morkStdioFile*morkStdioFile::OpenOldStdioFile(morkEnv*,nsIMdbHeap*,const char*,unsigned char)+0x90 libmork.so`unsigned morkFactory::OpenOldFile(nsIMdbEnv*,nsIMdbHeap*,const char*,unsigned char,nsIMdbFile**)+0x30 libmail.so`unsigned nsMsgDatabase::OpenMDB(const char*,int)+0xd0 libmail.so`unsigned nsMsgDatabase::Open(nsIFileSpec*,int,int)+0xc0 libmail.so`unsigned nsMailDatabase::Open(nsIFileSpec*,int,int)+0x48 libmail.so`unsigned nsMsgDBService::OpenFolderDB(nsIMsgFolder*,int,int,nsIMsgDatabase**)+0x29c libmail.so`unsigned nsImapMailFolder::GetDatabase(nsIMsgWindow*)+0x114 libmail.so`unsigned nsImapMailFolder::GetDBFolderInfoAndDB(nsIDBFolderInfo**,nsIMsgDatabase**)+0x3c libmail.so`unsigned nsImapMailFolder::GetUidValidity(int*)+0x54 libxpcom_core.so`XPTC_InvokeByIndex+0x48 libxpcom_core.so`void*EventHandler(PLEvent*)+0x28 libxpcom_core.so`PL_HandleEvent+0x14 libxpcom_core.so`PL_ProcessPendingEvents+0x78 libxpcom_core.so`unsigned nsEventQueueImpl::ProcessPendingEvents()+0x7c libwidget_gtk2.so`int event_processor_callback(_GIOChannel*,GIOCondition,void*)+0x18
bug 278208 related?
=> all since this is being pursed as a general solution
Severity: normal → critical
OS: Solaris → All
QA Contact: database
Hardware: PC → All
this makes it so we clear the folders' cached database pointer when we've finished compacting an imap folder in the queue. I've verified that w/ this patch, we no longer leave all imap db's open after doing a file | compact folders on an imap account.
I'm the guy who reported this as bug 420300 and noted that this bug also affects Search Folders. Is there a way to fix it once so that it fixes both Compact All Folders and Search All Folders? Thank you for taking this up...
Does the patch also fix the issue in Comment #33? If yes, I can provide a TB2 build with this patch to ask for the user to verify it.
Perhaps - it only affects users who do a file | compact folders. The stack in comment 33 isn't really sufficient to see what process is going on, other than that we're trying to do a folder select on the imap server, which will cause us to fetch imap uid validity, and which does happen during folder compact. Thx for having your user try to verify it.
Comment on attachment 345243 [details] [diff] [review] another patch I'm going to minus this - it will make us open, close and re-open the .msf file for each folder compacted, and will also leave the .msf file open if any messages need to be expunged. Releasing the folder db ref after the compact is the right time to do this...thx again for working on this.
changeset: 1150:a339e31bd145 marking fixed for the compact folders case...if there are remaining issues, we should open a new bug for single issues with steps to reproduce...
Status: REOPENED → RESOLVED
Closed: 18 years ago → 13 years ago
Resolution: --- → FIXED
Our user has confirmed that the bug has been fixed on TB188.8.131.52+patch. Can you put this patch back to TB2 code base?
(In reply to comment #45) > Our user has confirmed that the bug has been fixed on TB184.108.40.206+patch. > Can you put this patch back to TB2 code base? does this affect areas that are, or act like, dataloss? failed filters? junk message not moving? where # folders is not extremely large? If the answer is yes, then perhaps this simple patch should go into 1.8.x
Summary: Unable to open all mail folders when there are large number of mail folders → Unable to open all mail folders when there are large number of mail folders, after compact
it's basically a resource usage issue - since compacting imap folders wasn't closing db's, it was consuming file handles and memory. If you had a lot of folders, then you could use up a lot of file handles - that tends to be an issue if your OS has a reachable limit to the number of open files, which some varieties of Unix do. Not being able to open more files leads to all sorts of failures which you need to shutdown and restart to fix. Using a lot of memory can cause slowdowns and eventual failure if you run out of swap space, for example.
Comment on attachment 348647 [details] [diff] [review] proposed fix for imap compact all folders relatively simple fix that we've been running with on the trunk for quite a while.
Attachment #348647 - Flags: approval1.8.1.next?
Comment on attachment 348647 [details] [diff] [review] proposed fix for imap compact all folders Approved for 220.127.116.11. a=ss
Attachment #348647 - Flags: approval1.8.1.next? → approval1.8.1.next+
Checked fix into the 1.8 branch for Bienvenu Checking in mailnews/base/src/nsMsgFolderCompactor.cpp; /cvsroot/mozilla/mailnews/base/src/nsMsgFolderCompactor.cpp,v <-- nsMsgFolderCompactor.cpp new revision: 18.104.22.168; previous revision: 22.214.171.124
You need to log in before you can comment on or make changes to this bug.