Created attachment 705274 [details] Stack traces for threads present at startup This report, and the attached traces, relate to Thunderbird 17.0.2, specifically the thunderbird-17.0.2-1.fc18.x86_64 Fedora packaged version but I have been seeing this leak since at least Thunderbird 9. A previous downstream report is here: https://bugzilla.redhat.com/show_bug.cgi?id=784193 Basically, over a period of days the number of threads in use by Thunderbird will increase to the point that it is using hundreds of threads and other programs start being unable to fork or create threads because the per-user process limit has been reached. To illustrate I am attaching two dumps collected with gdb using "thread apply all bt full" to get a backtrace of all the threads. The first was collected shortly after startup, when it was using 36 threads, and the second 24 hours later when it was using 85 threads.
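For anyone wanting to reproduce the measurement without attaching gdb, the thread count can also be read directly from /proc. A minimal Python sketch, assuming a Linux system (the /proc/<pid>/task layout is Linux-specific):

```python
import os

def thread_count(pid):
    """Return the number of threads a process is using by counting the
    entries under /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))

# The current process always has at least one thread.
print(thread_count(os.getpid()))
```

Polling this for the Thunderbird PID once a minute is enough to see the monotonic growth described above.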
Hi Tom. Does it reproduce using a *mozilla* supplied package?
Summary: Thunderbird leaks threads → Thunderbird leaks process threads
The simple answer to that question is no. The complicated answer is that doing so caused me to switch from a 64-bit to a 32-bit build, which led to Lightning being disabled, which in turn led me to figure out that it is actually Lightning causing the leak, not Thunderbird itself, so I am going to close this.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → INVALID
Tom, thanks for the update. I'm throwing this over to calendar from the redhat bug...

Tom Hughes 2013-01-31 05:10:04 EST
After further investigation I have discovered that disabling the lightning extension stops the thread leak, so it seems that lightning is actually responsible and I am going to transfer this bug to the thunderbird-lightning component. I have also done some comparisons of the collected stack traces, and it seems that all the leaked threads look like this:

#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003916e23ab0 in PR_WaitCondVar (cvar=0x7f5d1d1b03c0, timeout=4294967295) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:385
#2  0x0000003916e23db5 in PR_Wait (mon=0x7f5d24431c80, timeout=<optimized out>) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:582
#3  0x00007f5d36155b9d in Wait (interval=4294967295, this=0x7f5d244fa390) at ../../dist/include/mozilla/ReentrantMonitor.h:89
#4  Wait (interval=4294967295, this=<synthetic pointer>) at ../../dist/include/mozilla/ReentrantMonitor.h:192
#5  nsEventQueue::GetEvent (this=0x7f5d244fa390, mayWait=true, result=0x7f5d1d0fedf8) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsEventQueue.cpp:51
#6  0x00007f5d36156878 in nsThread::ProcessNextEvent (this=0x7f5d244fa340, mayWait=true, result=0x7f5d1d0fee3f) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsThread.cpp:605
#7  0x00007f5d3612c82b in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at /usr/src/debug/thunderbird-17.0.2/comm-release/objdir/mozilla/xpcom/build/nsThreadUtils.cpp:220
#8  0x00007f5d36157080 in nsThread::ThreadFunc (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d244fa340) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsThread.cpp:257
#9  0x0000003916e28e23 in _pt_root (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d24867570) at ../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:156
#10 0x0000003909207d15 in start_thread (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d1d0ff700) at pthread_create.c:308
#11 0x0000003908af246d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Unfortunately that looks pretty generic to me and hence probably doesn't give much clue as to the cause of the leak.
Severity: normal → major
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INVALID → ---
Not entirely sure if that's the same leak, but I'm able to observe a similar one on Thunderbird 20.0a2 from 20130202042002 with a new profile with only Lightning installed. The number of threads seems to correlate very closely with the number of ics network calendars enabled and their refresh rate. For each calendar refresh one new thread appears. When it got to 322 Thunderbird became unusable, as in not opening windows any more and otherwise not functioning properly. Threads look similar:

Thread 61 (Thread 0xa63feb40 (LWP 7639)):
#0  0xb7710424 in __kernel_vsyscall ()
#1  0xb76e596b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb72e5d29 in PR_WaitCondVar (cvar=0xa70d3940, timeout=4294967295) at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:385
#3  0xb72e6107 in PR_Wait (mon=0xac5e2c80, timeout=4294967295) at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:582
#4  0xb532e2f8 in mozilla::ReentrantMonitor::Wait (this=0xa701ef6c, interval=4294967295) at ../../dist/include/mozilla/ReentrantMonitor.h:89
#5  0xb6146cff in Wait (interval=4294967295, this=<optimized out>) at ../../dist/include/mozilla/ReentrantMonitor.h:192
#6  nsEventQueue::GetEvent (this=0xa701ef6c, mayWait=true, result=0xa63fe250) at ../../../../mozilla/xpcom/threads/nsEventQueue.cpp:58
#7  0xb6148238 in nsThread::ProcessNextEvent (this=0xa701ef40, mayWait=true, result=0xa63fe29f) at ../../../../mozilla/xpcom/threads/nsThread.cpp:619
#8  0xb6116866 in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at nsThreadUtils.cpp:238
#9  0xb6147dd6 in nsThread::ThreadFunc (arg=0xa701ef40) at ../../../../mozilla/xpcom/threads/nsThread.cpp:265
#10 0xb72ebf77 in _pt_root (arg=0xac5cf9d0) at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:156
#11 0xb76e1d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#12 0xb74ead3e in clone () from /lib/i386-linux-gnu/libc.so.6

(In reply to Wayne Mery (:wsmwk) from comment #4)
> Tom, thanks for the update.
> I'm throwing this over to calendar

Didn't you mean to update component as well?
Pretty sure this is calendar's bug; here's my analysis from the newsgroup:

If I understand threads correctly, then when it is no longer required, Shutdown() must be called on the thread: http://hg.mozilla.org/mozilla-central/annotate/847e28c7ba67/xpcom/threads/nsIThread.idl#l30

From what I can tell, the calendar code does this for each call to ParseICSAsync:
- Creates a worker (nsIRunnable) to parse the ics value
- Creates a new thread and dispatches the worker to it
- When the thread gets the go-ahead, it runs the parser
- It then dispatches an event back to the main/original thread when it has completed.

At no stage is Shutdown called, and a reference to the thread isn't kept either. There are two potential ways to fix this:

1) calICSService creates only one thread per application run, and dispatches all ParseICSAsync requests to that thread. The thread can then be shut down once, when the application shuts down.
2) calICSService continues creating multiple threads but, probably by passing additional arguments/references, keeps track of when each thread completes and shuts them down appropriately.

The first option is probably slightly better than the second, if it's an acceptable trade-off that async ics requests are queued up when one comes in while another is still running.
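The real code here is XPCOM C++/JS, but the shape of fix option 2 is language-neutral. A minimal sketch in Python, where `join()` stands in for `nsIThread.Shutdown()` and a plain queue stands in for dispatching the completer event back to the main thread (all names here are illustrative, not from the patch):

```python
import queue
import threading

def parse_ics_async(ics_text, on_done, main_queue):
    """Sketch of fix option 2: spawn one worker thread per request, but
    hand a reference to that thread back to the main loop so it can be
    joined (the rough analogue of nsIThread.Shutdown()) after parsing."""
    def worker():
        result = ics_text.upper()  # stand-in for the real ICS parser
        # Dispatch a "completer" back to the main thread, carrying the
        # worker thread reference so the main thread can clean it up.
        main_queue.put((on_done, result, threading.current_thread()))

    threading.Thread(target=worker).start()

def main_loop_step(main_queue):
    """One main-loop iteration: deliver the result, then shut the
    finished worker down instead of leaking it."""
    on_done, result, worker_thread = main_queue.get()
    on_done(result)
    worker_thread.join()  # without this step, threads accumulate forever

# Usage:
events = queue.Queue()
results = []
parse_ics_async("begin:vcalendar", results.append, events)
main_loop_step(events)
print(results)  # ['BEGIN:VCALENDAR']
```

The leak in the bug corresponds to omitting the `join()` line: every refresh then leaves one parked event-loop thread behind, exactly the pattern the stack traces show.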
Component: Untriaged → General
Keywords: qawanted → mlk
Product: Thunderbird → Calendar
Version: 17 → Trunk
Not sure if this helps, but given that the above comment implies that this is related to ICS calendars I disabled my only ICS calendar (there are also two CalDAV ones) and it is still leaking threads afterwards.
(In reply to Tom Hughes from comment #7) > Not sure if this helps, but given that the above comment implies that this > is related to ICS calendars I disabled my only ICS calendar (there are also > two CalDAV ones) and it is still leaking threads afterwards. How did you disable it? You have to open properties and disable from there. Just hiding the calendar won't work. I converted all my ICS calendars into CalDAV ones and thread count stopped growing.
You're quite right... I had just unchecked it - now that I have disabled it properly in the properties the thread leak has stopped.
Confirmed on Windows 7 using Thunderbird 17 + Lightning 1.9 and Thunderbird 21.0a1 + Lightning 2.3a1.
Status: REOPENED → NEW
OS: Linux → All
Version: Trunk → Lightning 1.9
Is this a regression? From which version? And does it result in any crashes?
Summary: Thunderbird leaks process threads → ICS calendars cause process threads leak and high memory
Well, my original report in the Fedora bug tracker was a year ago, against Thunderbird 9, but I had originally observed it some time before that - my best guess was around the TB 7/8 time frame. Whether it was new then, or whether that was just when I first worked out what was going on, I really don't know.
Oh, and as for crashes: not in the classic sense, but once the number of threads gets too high, other programs start being unable to fork and/or start threads, as a typical Linux system will have a per-user limit of 1024 processes, which includes all threads. At that point random programs will start crashing, erroring and/or generally misbehaving, depending on how well they handle a failure to create a new thread/process.
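For reference, the limit in question can be inspected programmatically. A small sketch, assuming a Linux system where RLIMIT_NPROC is the relevant per-user process/thread limit:

```python
import resource

# RLIMIT_NPROC caps the number of processes a single user may own, and
# threads count against it; once a leaking process exhausts the limit,
# fork()/pthread_create() start failing with EAGAIN for every process
# running as that user.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("soft:", soft, "hard:", hard)
```

On many distributions the soft limit defaults to around 1024 (often set via /etc/security/limits.d/), which matches the failure mode described above.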
Created attachment 710938 [details] [diff] [review] shutdown thread after parsing A quick and probably ugly hack for testing, basically based on comment 6 option 2. A reference to the newly created worker thread is passed to |ParserWorker| and then on to |ParserWorkerCompleter|, which is dispatched back to the main thread once the parsing is completed. Back on the main thread we can shut down the worker thread.
You might want to release the reference to the thread within ParserWorker after having passed it to ParserWorkerCompleter; otherwise you might end up with reference cycles keeping the thread object alive.
Stefan, Mark, could one of you upload a new version of this patch? I'd love to see this go in soon :)
Yes, I'll provide an updated patch soon.
Created attachment 711411 [details] [diff] [review] shutdown thread after parsing v2
Comment on attachment 711411 [details] [diff] [review] shutdown thread after parsing v2 Moving review to Mark as he has already analysed the situation. I believe the kind of release needed is some sort of NS_RELEASE though, not sure.
Attachment #711411 - Flags: review?(philipp) → review?(mbanner)
Just for reference: With v1 or v2 of the patch applied I don't get an increase in thread count anymore. But I don't know how to test if the nsIThread objects will be freed or not.
Comment on attachment 711411 [details] [diff] [review] shutdown thread after parsing v2 Review of attachment 711411 [details] [diff] [review]: ----------------------------------------------------------------- I've not tested it, but it looks fine.
Attachment #711411 - Flags: review?(mbanner) → review+
Attachment #711411 - Flags: review?(philipp) → review+
I have only tested on the win32 platform and therefore hope the fix works on the other platforms too. Once this is verified, I think we should consider backporting the fix for the next Lightning 1.9esr release. Pushed to https://hg.mozilla.org/comm-central/rev/949ebc4e6f47
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago → 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.4
Created attachment 723228 [details] [diff] [review] patch for releases/comm-esr17 The original patch doesn't apply to releases/comm-esr17 but this one does.
Comment on attachment 711411 [details] [diff] [review] shutdown thread after parsing v2 Ok, lets do it.
Target Milestone: 2.4 → 2.3
Target Milestone: 2.3 → 2.2
Whiteboard: [wanted-1.9.x] → [fixed-1.9.2]
Hi, my name is Nathália and I'm from Brazil. I'm a master's student at UFU (Federal University of Uberlândia) and I'm doing a study based on the ICS Calendar/Lightning in Mozilla Thunderbird. Could you help me? I need to know how many calendar events you create daily, and how long the application runs. Thank you!