Closed Bug 427614 Opened 17 years ago Closed 17 years ago

increased memory usage on TB startup of 100-120MB shortly after profile manager, and 100% cpu for several seconds

Categories

(MailNews Core :: Database, defect)

x86
Windows Vista
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wsmwk, Assigned: Bienvenu)

References

Details

(5 keywords, Whiteboard: fixed by backout of bug 418551)

I'm seeing 43MB for 20080331 build, which increases to 161MB for 20080404. And 100% usage during startup for 10-20 seconds. And, not surprising, in memory constrained system (with memory hog FF), general lack of TB UI responsiveness. likely regression of bug 418551. could be dupe of bug 426658. I have a copy of my profile if needed. reproducible: 100%
On the first startup when using the new sqlite system, I saw a similar period of high CPU usage (didn't check memory at the time) In subsequent startups, except for the circumstances in Bug 426658 I see memory usage at 28.8 MB with a short period of 100% CPU Reverting to an older build with Panacea.dat memory is at 27.7 MB with the same profile.
Here are my memory numbers for testing on Vista RTM (not SP1) on fresh profiles, immediately after startup: 20080331 Commit size: ~18,000Kb +/- 5%, no 100% lockup. 20080402 Commit size: ~18,000Kb +/- 5%, no 100% lockup. 20080407 Commit size: ~18,000Kb +/- 5%, no 100% lockup. After logging into an IMAP account with ~150 mails at the inbox: 20080331 Commit size: ~23,000Kb +/- 5%, no lockup. 20080402 Commit size: ~22,000Kb +/- 5%, no lockup. 20080407 Commit size: ~24,000Kb +/- 5%, no lockup. And switching to a folder with 2,000 mails: 20080331 Commit size: ~25,000Kb +/- 5%, no lockup. 20080402 Commit size: ~24,000Kb +/- 5%, no lockup. 20080407 Commit size: ~26,000Kb +/- 5%, no lockup. Random clickarounds, reading of mails, sorting by threading etc. in that 2,000 mail folder: 20080331 Commit size: ~25,000Kb +/- 5%, no lockup. 20080402 Commit size: ~25,000Kb +/- 5%, no lockup. 20080407 Commit size: ~27,000Kb +/- 5%, no lockup. .msf file sizes in profile at end of testing: 20080331 final ~150 mails folder size: ~76Kb, final ~2,000 mails folder size: ~717Kb 20080402 final ~150 mails folder size: ~73Kb, final ~2,000 mails folder size: ~680Kb 20080407 final ~150 mails folder size: ~73Kb, final ~2,000 mails folder size: ~680Kb Thus, I don't seem to see any form of 100% CPU lockup anywhere on Vista, though memory numbers are varying throughout the nightlies. Still needs more testing.
Keywords: qawanted
worse than I thought - high cpu use is 45 seconds. (versus 9 seconds 20080331 build). shutdown is also longer, ~20 seconds (versus 4 seconds 20080331 build). Highly repeatable. in this profile panacea.dat is 215k. msgFolderCache.sqlite is 872K. cpu usage and memory increase occurs between profile manager prompt and the prompt about do you want to make this executable the default application for mail, news and rss. another profile, 15K panacea.dat=>75K msgFolderCache.sqlite (100k after shutdown), 9 second startup to first account's password prompt.
Wayne, this is tricky, I can't seem to reproduce on a fresh profile, but my guess may be due to some migration issues? You could run a debug build to see if the issue can be detected and the relevant data obtained. (For Macs, hangs can be detected using SpinControl.app if I remember correctly, not so sure of Windows though.)
Gary, you won't see much perf impact on a fresh profile. (unless you copied over your mail accounts and newsgroups) The more accounts mail and newsgroups that you have the more the impact. If you stick with it through the new DB build process, it does settle down somewhat. But bug 426658 is a real PIA for everyday use.
My setup is: 11 news accounts, with probably 30 subscribed groups RSS ~ 40 feeds 2 pop accounts - 2 imap account - about 20 folders each + 11 virtual folders local folders - 394 folders (based on # of *.msf files)
I can runa windows debug build if needed, someone would have to supply it and walk me through.
Someone should put this in their mozconfig: --enable-dtrace And then run it through DTrace ("Instruments.app" on OS X") to see what's going on. If the memory increase is this huge as the summary suggests, it should be pretty obvious where things go wrong, and where to focus.
question, does marking thread as read (via "r") also take longer? (I need to investigate)
Keywords: footprint, perf
Surely what's going on is every .msf file is getting opened, because there's no folder cache, and every .msf file is left open, so all the memory allocated to open the dbs remains allocated. A tool that shows you all the open file handles belonging to TB will tell you if that's the case. Clicking the folders one after the other might actually cause the memory usage to go down because clicking away from a folder (other than the inbox) causes us to release the folder's cached reference to the db.
I'm skeptical that sqlite performs fast enough to be used in paint calls, unlike Mork, which would mean that some sort of additional caching would be needed. But I'm no sqlite expert - does Lightning issue sql queries from paint calls, or does it cache stuff in memory?
would speed of "rebuild index" also be impacted? I just did one and performance "stank" - on the order of 5-10x
yes, I suspect that storing data in sqlite is quite a bit slower than storing it in Mork, so if we're updating the total and unread message counts incrementally as we parse the folder, that's going to cause a lot of sql updates - if we're not doing transactions, or committing after each update, that would slow us down as well. Mork was designed to be fast for the things that we do a lot in the mail code (e.g., lots of small updates). Sqlite is not so much, so the mailnews code will need to be changed not to assume such a fast db.
So it sounds like what mailnews wants is async db operations, so as not to block UI when there are lots of small uncommited changes?
judging from the comments in the great sqlite async io bug, I would not jump to that conclusion. It would be far better to avoid changing the same piece of data many many times quickly (e.g., the total msgs count) if we know that's what we're doing. I'm just assuming that's why Wayne found rebuild-index to be slow, but I could be wrong. I would certainly try avoiding committing transactions frequently first before doing async io.
datapoint - unknown if related to "sql" - in the last hour I had two cases where a message was in my thread twice, which is what drove me to do reindex.
As per discussion in today's Tb meeting, in bug 418551, I've proposed that we back that patch out and take another run at landing it after we've got more concrete understanding of the various performance, memory, & disk footprint implications.
datapoint - shutdown just took 1.5 minutes, TB instance was running ~18 hrs (don't know if it's relevant, but moderately memory constrained) Also, for a few days now I've been flopping back and forth frequently between sqlized and non-sql version with no ill effects.
The backout of bug 418551 should have fixed this.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Product: Core → MailNews Core
Status: RESOLVED → VERIFIED
Whiteboard: fixed by backout of bug 418551
You need to log in before you can comment on or make changes to this bug.