Closed Bug 1305314 Opened 9 years ago Closed 5 years ago

Thunderbird 45.3.0 on Ubuntu 14.04.5 seems to hog CPU at 100% when mail.db.idle_limit is set to default 300000

Categories

(MailNews Core :: Backend, defect)

Unspecified
Linux
defect
Not set
major

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: cedric.bhihe, Unassigned)

References

Details

(Keywords: perf)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0 Build ID: 20160919213721 Steps to reproduce: reset mail.db.idle.limit to 300000 from my earlier modification to 30 000 000 per tip on http://rainbow.chard.org/2013/02/19/thunderbird-high-cpu/ . Actual results: TB almost instantly hogs CPU at 98%-100% and crawls to a halt.
We have a few other reports where changing mail.db.idle_limit helps. How many imap accounts, pop accounts, and (roughly) total number of folders?
Severity: normal → major
Flags: needinfo?(cedric.bhihe)
Keywords: perf
Hi Wayne, thanks for responding so quickly. Seen a great deal of informal reports in fora. Seen also yr repeated suggestions to report bugs. I have two POP3 accounts with approx. 8GB of received and sent emails Account A: 3.5GB in approx 260 folders, Account B: 4.5GB in approx 340 folders.
Component: Untriaged → Backend
Product: Thunderbird → MailNews Core
Summary: Thunderbird 45.3.0 on Ubuntu 14.04.5 x64 seems to hog CPU at 100% → Thunderbird 45.3.0 on Ubuntu 14.04.5 x64 seems to hog CPU at 100% when mail.db.idle.limit is set to default 300000
Version: 45 Branch → 45
(In reply to cedric.bhihe@gmail.com from comment #0) > Steps to reproduce: > reset mail.db.idle.limit to 300,000 from my earlier modification to 30,000,000 > per tip on http://rainbow.chard.org/2013/02/19/thunderbird-high-cpu/ . > Actual results: > TB almost instantly hogs CPU at 98%-100% and crawls to a halt. Why you do it intentionally even though you already know the tip... I guessed 300,000 microseconds(0.3sec) is internally used instead of 300,000 miliseconds(300secs==5minutes). (In reply to cedric.bhihe@gmail.com from comment #2) > I have two POP3 accounts with approx. 8GB of received and sent emails > Account A: 3.5GB in approx 260 folders, > Account B: 4.5GB in approx 340 folders. MsgDB open(and close) is not relevant to msgStore file size. MsgDB open is relevant to xxx.msf file size which is relevent to "number of messeages in a folder", because entire xxx.msf file is read from HDD and msgDB object is recreated from the xxx.msf file data. (Q1) What is average xxx.msf file size and total xxx.msf file size? Is there "file size=0" xxx.msf file? Is there "xxx exist but xxx.msf doesn't exist" state in your mail directory? Even if many folders(==many sets of xxx.msf/xxx) exist in your Thunderbird, IIUC, folder is not opened unless someone accesses to it. (Q2) Do you open such many folders(700 folders) at same time? (Q3) Do you define virtual folder(search folder) which has search target=many folders, or search target=almost all folders you have? If virtual folder(search folder), all search target folders are opened when restart. And, virtual folder(search folder) may watches search target folder state. So, folder(msgDB) may be re-opened immediately by virtual folder(search folder) when folder(msgDB) is closed by mail.db.idle_limit. If 700 folders are opened at same time, file handle shortage may occur. Even if file handle shortage happens and msgDB open fails, if it's on folder(msgDB) open, Thunderbird can be tolerant with it, because Rebuild-Index(Repair Folder) is internally invoked upon next folder(msgDB) open. (in contrast to it, if file handle shortage occurred on xxx.mab file=AddressBook, xxx.mab file won't be re-opened.) In this case, following can occur. (a) mail.db.idle.limit=30,000,000 msgDB close by mail.db.idle.limit, msgDB re-open after the msgDB close, Rebuild-Index(Repair Folder) due to something bad, and so on, are executed within far less than 100% CPU utilization. (b) mail.db.idle.limit=300,000 msgDB close by mail.db.idle.limit, msgDB re-open after the msgDB close, Rebuild-Index(Repair Folder) due to something bad, and so on, require greater than 100% CPU utilization. Another concern is Gloda(Global Indexer, and Global Search). Gloda may watch folder state, so Gloda may re-open folder just after folder close by mail.db.idle.limit. Further, contention between MsgDB close and MsgDB open may produce Rebuild-Index(Repair Folder). See Bug 905576, please. (note: pretty long) Because problem is relevant to msgDB close and msgDB open, please get MsgDB:5 log(NSPR log) first, and check log content by Text Editor. See "User Story" of Bug 402793 for getting NSPR log.
(In reply to WADA from comment #3) > (In reply to cedric.bhihe@gmail.com from comment #0) > > Steps to reproduce: > > reset mail.db.idle.limit to 300,000 from my earlier modification to 30,000,000 > > per tip on http://rainbow.chard.org/2013/02/19/thunderbird-high-cpu/ . > > Actual results: > > TB almost instantly hogs CPU at 98%-100% and crawls to a halt. > > Why you do it intentionally even though you already know the tip... > > I guessed 300,000 microseconds(0.3sec) is internally used instead of 300,000 > miliseconds(300secs==5minutes). 5 minutes in millisec per bug 793455. N.B. the pref name is mail.db.idle_limit, not mail.db.idle.limit > (In reply to cedric.bhihe@gmail.com from comment #2) > > I have two POP3 accounts with approx. 8GB of received and sent emails > > Account A: 3.5GB in approx 260 folders, > > Account B: 4.5GB in approx 340 folders. you might also try mail.db.max_open > 30 (the default) But first, look forward to your response to wada's questions! :)
See Also: → 1305207
Summary: Thunderbird 45.3.0 on Ubuntu 14.04.5 x64 seems to hog CPU at 100% when mail.db.idle.limit is set to default 300000 → Thunderbird 45.3.0 on Ubuntu 14.04.5 x64 seems to hog CPU at 100% when mail.db.idle_limit is set to default 300000
(In reply to cedric.bhihe@gmail.com from comment #2) > Hi Wayne, thanks for responding so quickly. Seen a great deal of informal > reports in fora. Seen also yr repeated suggestions to report bugs. > I have two POP3 accounts with approx. 8GB of received and sent emails > Account A: 3.5GB in approx 260 folders, > Account B: 4.5GB in approx 340 folders. Ok, and how many of those folders regularly get new email in those 5 minutes or you look into them? Also, do you think the speed of looking into folders has improved greatly after you increased the idle limit? Also, has TB memory use increased?
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #4) > (In reply to WADA from comment #3) > > I guessed 300,000 microseconds(0.3sec) is internally used instead of 300,000 miliseconds(300secs==5minutes). > 5 minutes in millisec per bug 793455. Change by the bug 793455 is following only. > -// How long should we leave idle db's open, in seconds. > -pref("mail.db.idle_limit", 3000); > +// How long should we leave idle db's open, in milliseconds. > +pref("mail.db.idle_limit", 300000); https://dxr.mozilla.org/comm-central/source/mailnews/base/util/msgDBCacheManager.js#105 > checkCachedDBs : function () > > let idleLimit = Services.prefs.getIntPref("mail.db.idle_limit"); > > // db.lastUseTime below is in microseconds while Date.now and idleLimit pref > // is in milliseconds. > let closeThreshold = (Date.now() - idleLimit) * 1000; > > let cachedDBs = gDbService.openDBs; > > for (let i = 0; i < cachedDBs.length; i++) { > > if (db.lastUseTime < closeThreshold) > { > log.debug("closing expired msgDatabase for folder: " + db.folder.name); > db.folder.msgDatabase = null; > } > else > numOpenDBs++; > } I thought "close msgDB" was executed using explicit msgDB close function. db.folder.msgDatabase = null is : Set null in msgFolder.msgDatabase property of msgFolder object for a mail folder which is currently referred via. db.folder. msgDB object, which is originally C++ Object, will be purged by garbage collector sooner or later when reference count will become ZERO. However, if the msgDB object is accessed via other than msgFolder.msgDabase property, the msgDB Object can be accessed until it's purged by garbage collector of both JavaScript and C++. This kind of timing issue may be a cause of "contention between msgDB close and msgDB open" which was observed in Bug 905576. 1. By checkCachedDBs, db.folder.msgDatabase = null is set. 2. Because the msgDB for the folder is removed from cachedDBs, when other task tries to open the msgFolder/msgDB, or when other task re-opens the msgDB due to access error to folder.msgDatabase, or when other task accesses the msgDB via. msgfolder.msgDatabase(null is set), if the msgDB object is not purged by garbage collector, it's referred again, and if the msgDB object is already purged by garbage collector, msgDB is opened from scratch and entire xxx.msf file is read from HDD. IIUC, microsecond resolution in PR_Now() is achieved by utilizing High Precision Event Timer(HPET) of chip set. On MS Windows, it's accessed via. QueryPerformanceCounter() of MS Windows. And, "Skew of timer in multi-CPU environment due to ACPI" is already hidden by HAL of MS Windows. On Linux, it's perhaps achieved using other function of Linux. Strategy on the "Skew of timer in multi-CPU environment due to ACPI" may depend on Linux version or Linux distro. IIRC, it's already resolved in Linux, but default setting may be different by distro. ACPI : Advanced Configuration and Power Interface This bug may be affected by such difference. If such timer skew is relevant, problem won't occur when ACPI is completely disabled on your PC. But, if "db.folder.msgDatabase=null" is done by checkCachedDBs() each 5 minutes, I can't think such CPU hog can occur merely by changing mail.db.idle_limit from 30,000,000 to 300,000. In Linux, resolution of high resolution timer may be larger than MS Windows. Or, wraparound of high resolution timer may occur on Linux frequently than we thought.
(In addition to comment #6 by me) Thunderbird is currently shippred with following setting already. > (a) pref("mail.db.idle_limit", 300000); I can't think all Linux/Thunderbird users do followiong tip. > (b) pref("mail.db.idle_limit", 30000000); (Q1) Does any Linux distro distribute Thunderbird for his distro with mail.db.idle_limit=30000000? If not, and if CPU hog occurs by mail.db.idle_limit=300000 on any Linux, CPU hog issue on Linux should be reported at many support forums by many Linux/Thunderbird users, and many bugs should be opened at B.M.O by many Linux/Thunderbird users. (Q2) why problem of CPU hog can be observed in your environment merely by changing mail.db.idle_limit from 30,000,000 to 300,000?
You can also help by restoring the value to default and the enabling debug output of the database cache manager. You set these 2 prefs to a value of "Debug" via the Advanced config method that you already know: +// Should we output dbcache log via dump? Set to "Debug" to show. +pref("mailnews.database.dbcache.logging.dump", "None"); +// Should we output dbcache log to the "error console"? Set to "Debug" to show. +pref("mailnews.database.dbcache.logging.console", "None"); Then restart TB and open Tools->Error console. Every minute you will see a database purge run. Please observe if there are any oddities. Like databases closed sooner than 5 minutes. Less databases left open than the default 30? Or do you have always so many databases/folders cached, that some are always closed and need to be reopened soon after. Please provide some stats from the output.
FYI. Bug 1305207 is report for similar phenomenon on Mac OS X when mail.db.idle_limit is chasnged from 30,000,000 to 300,000 which is default. In that bug, log by mailnews.database.dbcache.logging.console=Debug was obtained by bug opener.
See Also: → 1388696
Summary: Thunderbird 45.3.0 on Ubuntu 14.04.5 x64 seems to hog CPU at 100% when mail.db.idle_limit is set to default 300000 → Thunderbird 45.3.0 on Ubuntu 14.04.5 seems to hog CPU at 100% when mail.db.idle_limit is set to default 300000

Hard to believe this bug is 3 years old

pitrou, can you please run the performance profiler, using some modified value of mail.db.idle_limit (that doesn't make the problem entirely go away)?

  1. You must be using Thunderbird 68 or newer - betas from https://www.thunderbird.net/en-US/channel/ or current nightly build from https://archive.mozilla.org/pub/thunderbird/nightly/latest-comm-central/
  2. Install profiler add-on into thunderbird 68 (or newer https://www.thunderbird.net/en-US/channel/ ) - get the add-on file from https://github.com/firefox-devtools/Gecko-Profiler-Addon/blob/master/gecko_profiler.xpi?raw=true and in Tools > add-ons click the gear to install add-on from file
  3. Follow instructions at https://profiler.firefox.com/ (videos BASED ON FIREFOX at https://profiler.firefox.com/docs/#/./videos-intro )
  4. Create a profiler URL and post it here.
Flags: needinfo?(cedric.bhihe) → needinfo?(pitrou)

I've just tried this. The Gecko Profiler Addon doesn't seem to work correctly. When I install it at first, I can start and stop profiling, capture a profile and send it to profiler.firefox.com for display... But once I restart Thunderbird, the addon doesn't work anymore. The icon says "Gecko Profiler (off)", yet when I click on it it says "The profiler is recording", the "Discard & stop" button doesn't work and neither does the "Capture Profile" button. The only way to get it to work again is to uninstall and reinstall it (merely disabling then reenabling isn't enough). This is with Thunderbird 68.2.2, Ubuntu 18.04 version.

As a sidenote, I've been running with mail.db.idle_limit set to 30000 for ~one hour and haven't encountered the issue again, but it may need more time.

Flags: needinfo?(pitrou)

9 days after having set mail.db.idle_limit to 30000, I still haven't experienced the original issue again. I'm using Thunderbird on a daily basis. So perhaps the root cause disappeared...

The reporter, cedric, reported privately that this issue is gone. "changed platform to Archlinux and moved on to posterior versions of TB, rounding 68.2.2 right now (as you know). But thank you very much for following up. Really. It's much appreciated. I'd move to close this "bug report", as there is no supporting evidence other than hunches here and there, that it may have had to do with Ubuntu's handling of low memory situations. The discussion reignites regularly with people tearing their hair off whenever their mem-pager starts going loco. Don't know whether this is relevant, but in my case, changing box, distro and league (in terms of RAM) was a drastic change ... for the much better."

If anyone still sees this issue please obtain a profile using https://github.com/thunderbird-conversations/thunderbird-conversations/wiki/Profiling-Conversation's-Performance And if you are not the author of this bug report, please create a new bug report.

(In reply to pitrou from comment #14)

9 days after having set mail.db.idle_limit to 30000, I still haven't experienced the original issue again. I'm using Thunderbird on a daily basis. So perhaps the root cause disappeared...

Thanks for the update.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
OS: Unspecified → Linux
You need to log in before you can comment on or make changes to this bug.