ICS calendars cause process threads leak and high memory

RESOLVED FIXED in 1.9.2

Status

Product: Calendar
Component: Provider: ICS/WebDAV
Priority: --
Severity: major
Status: RESOLVED FIXED
Reported: 5 years ago
Modified: 3 years ago

People

(Reporter: Tom Hughes, Assigned: Stefan Sitter)

Tracking

({memory-leak})

Version: Lightning 1.9
Target Milestone: 1.9.2
Hardware: x86_64
OS: All
Keywords: memory-leak

Details

(Whiteboard: [fixed-1.9.2])

Attachments

(4 attachments, 1 obsolete attachment)

(Reporter)

Description

5 years ago
Created attachment 705274 [details]
Stack traces for threads present at startup

This report, and the attached traces, relate to Thunderbird 17.0.2, specifically the thunderbird-17.0.2-1.fc18.x86_64 Fedora package, but I have been seeing this leak since at least Thunderbird 9. A previous downstream report is here:

  https://bugzilla.redhat.com/show_bug.cgi?id=784193

Basically, over a period of days the number of threads in use by Thunderbird will increase to the point that it is using hundreds of threads and other programs start being unable to fork or create threads because the per-user process limit has been reached.

To illustrate I am attaching two dumps collected with gdb using "thread apply all bt full" to get a backtrace of all the threads.

The first was collected shortly after startup, when it was using 36 threads, and the second 24 hours later when it was using 85 threads.
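
For anyone who wants to watch the growth without attaching gdb, the kernel's per-process thread count can be sampled from time to time. This is only a minimal sketch, assuming Linux and the "Threads:" field of /proc/<pid>/status; the function names are made up for illustration and are not part of Thunderbird or gdb.

```python
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read the current thread count of a running process (Linux only)."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())
```

Calling thread_count() with the Thunderbird pid once an hour and logging the result would show whether the count keeps climbing.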
(Reporter)

Comment 1

5 years ago
Created attachment 705275 [details]
Stack traces for threads present after 24 hours

Comment 2

5 years ago
Hi Tom.
Does it reproduce using a *mozilla* supplied package?
Flags: needinfo?(tom)
Summary: Thunderbird leaks threads → Thunderbird leaks process threads
(Reporter)

Comment 3

5 years ago
The simple answer to that question is no.

The complicated answer is that doing that caused me to switch from a 64-bit to a 32-bit build, which led to Lightning being disabled, which led me to figure out that it is actually Lightning causing the leak, not Thunderbird itself, so I am going to close this.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 5 years ago
Flags: needinfo?(tom)
Resolution: --- → INVALID

Comment 4

5 years ago
Tom, thanks for the update.
I'm throwing this over to calendar

from the redhat bug...

Tom Hughes 2013-01-31 05:10:04 EST

After further investigation I have discovered that disabling the lightning extension stops the thread leak, so it seems that lightning is actually responsible and I am going to transfer this bug to the thunderbird-lightning component.

I have also done some comparisons of the collected stack traces, and it seems that all the leaked threads look like this:

#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1  0x0000003916e23ab0 in PR_WaitCondVar (cvar=0x7f5d1d1b03c0, timeout=4294967295) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:385
#2  0x0000003916e23db5 in PR_Wait (mon=0x7f5d24431c80, timeout=<optimized out>) at ../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:582
#3  0x00007f5d36155b9d in Wait (interval=4294967295, this=0x7f5d244fa390) at ../../dist/include/mozilla/ReentrantMonitor.h:89
#4  Wait (interval=4294967295, this=<synthetic pointer>) at ../../dist/include/mozilla/ReentrantMonitor.h:192
#5  nsEventQueue::GetEvent (this=0x7f5d244fa390, mayWait=true, result=0x7f5d1d0fedf8) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsEventQueue.cpp:51
#6  0x00007f5d36156878 in nsThread::ProcessNextEvent (this=0x7f5d244fa340, mayWait=true, result=0x7f5d1d0fee3f) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsThread.cpp:605
#7  0x00007f5d3612c82b in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at /usr/src/debug/thunderbird-17.0.2/comm-release/objdir/mozilla/xpcom/build/nsThreadUtils.cpp:220
#8  0x00007f5d36157080 in nsThread::ThreadFunc (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d244fa340) at /usr/src/debug/thunderbird-17.0.2/comm-release/mozilla/xpcom/threads/nsThread.cpp:257
#9  0x0000003916e28e23 in _pt_root (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d24867570) at ../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:156
#10 0x0000003909207d15 in start_thread (arg=Python Exception <class 'gdb.error'> Attempt to dereference a generic pointer.: 0x7f5d1d0ff700) at pthread_create.c:308
#11 0x0000003908af246d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

Unfortunately that looks pretty generic to me and hence probably doesn't give much clue as to the cause of the leak.
Severity: normal → major
Status: RESOLVED → REOPENED
Ever confirmed: true
Keywords: qawanted
Resolution: INVALID → ---

Comment 5

5 years ago
Not entirely sure if that's the same leak, but I'm able to observe a similar one on Thunderbird 20.0a2 from 20130202042002 with a new profile with only Lightning installed. The number of threads seems to correlate very closely with the number of ICS network calendars enabled and their refresh rate. For each calendar refresh, one new thread appears. When it got to 322, Thunderbird became unusable: it stopped opening windows and otherwise did not function properly.

Threads look similar:
Thread 61 (Thread 0xa63feb40 (LWP 7639)):
#0  0xb7710424 in __kernel_vsyscall ()
#1  0xb76e596b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb72e5d29 in PR_WaitCondVar (cvar=0xa70d3940, timeout=4294967295)
    at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:385
#3  0xb72e6107 in PR_Wait (mon=0xac5e2c80, timeout=4294967295)
    at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:582
#4  0xb532e2f8 in mozilla::ReentrantMonitor::Wait (this=0xa701ef6c, interval=4294967295)
    at ../../dist/include/mozilla/ReentrantMonitor.h:89
#5  0xb6146cff in Wait (interval=4294967295, this=<optimized out>)
    at ../../dist/include/mozilla/ReentrantMonitor.h:192
#6  nsEventQueue::GetEvent (this=0xa701ef6c, mayWait=true, result=0xa63fe250)
    at ../../../../mozilla/xpcom/threads/nsEventQueue.cpp:58
#7  0xb6148238 in nsThread::ProcessNextEvent (this=0xa701ef40, mayWait=true, result=0xa63fe29f)
    at ../../../../mozilla/xpcom/threads/nsThread.cpp:619
#8  0xb6116866 in NS_ProcessNextEvent_P (thread=<optimized out>, mayWait=true) at nsThreadUtils.cpp:238
#9  0xb6147dd6 in nsThread::ThreadFunc (arg=0xa701ef40)
    at ../../../../mozilla/xpcom/threads/nsThread.cpp:265
#10 0xb72ebf77 in _pt_root (arg=0xac5cf9d0)
    at ../../../../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:156
#11 0xb76e1d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#12 0xb74ead3e in clone () from /lib/i386-linux-gnu/libc.so.6

(In reply to Wayne Mery (:wsmwk) from comment #4)
> Tom, thanks for the update.
> I'm throwing this over to calendar
Didn't you mean to update component as well?

Comment 6

5 years ago
Pretty sure this is calendar's bug, here's my analysis from the newsgroup:

If I understand threads correctly, then when it is no longer required, Shutdown() must be called on the thread:

http://hg.mozilla.org/mozilla-central/annotate/847e28c7ba67/xpcom/threads/nsIThread.idl#l30

From what I can tell, the calendar code does this, for each call to ParseICSAsync:

- Creates a worker (nsIRunnable) to parse the ics value
- Creates a new thread and dispatches the worker to it
- When the thread gets the go-ahead, it runs the parser
- It then dispatches an event back to the main/original thread when it has completed.

At no stage is Shutdown called, and a reference to the thread isn't kept either.
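
The pattern described above can be modelled outside of XPCOM. The following is only a toy sketch in Python, not the calendar code: EventThread is a hypothetical stand-in for an nsIThread, and parse_async_leaky and leaked_after are illustrative names. Each request gets its own thread, and because nothing ever tells that thread to stop, it stays blocked waiting for its next event, which matches the shape of the backtraces above.

```python
import queue
import threading


class EventThread:
    """Toy stand-in for an nsIThread: processes queued events until shut down."""

    _STOP = object()  # sentinel that ends the event loop

    def __init__(self):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            event = self._events.get()  # an idle thread blocks here indefinitely
            if event is self._STOP:
                return
            event()

    def dispatch(self, event):
        self._events.put(event)

    def shutdown(self):
        self._events.put(self._STOP)
        self._thread.join()

    def is_alive(self):
        return self._thread.is_alive()


def parse_async_leaky(text, on_done, created):
    """Start one new thread per request and never shut it down."""
    worker = EventThread()
    created.append(worker)  # kept only so the demo can count survivors
    worker.dispatch(lambda: on_done(text.splitlines()))


def leaked_after(requests):
    """Issue `requests` parses, wait for all of them, return threads still alive."""
    created = []
    done = threading.Semaphore(0)
    for _ in range(requests):
        parse_async_leaky("BEGIN:VCALENDAR\nEND:VCALENDAR",
                          lambda _lines: done.release(), created)
    for _ in range(requests):
        done.acquire()
    return [t for t in created if t.is_alive()]
```

In this model every completed request leaves one idle thread behind, and the threads only go away once shutdown() is called on each of them.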

There are two potential ways to fix this:

1) calICSService creates only one thread per application run, and dispatches all ParseICSAsync requests to that thread. The thread can then be shut down once, when the application shuts down.

2) calICSService continues creating multiple threads but, probably by passing additional arguments/references, keeps track of when each thread completes and shuts it down appropriately.

The first option is probably slightly better than the second if there's an acceptable trade-off that async ics requests are queued up if one is still running when another comes in.
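
As an illustration of the first option, here is a minimal sketch in Python rather than XPCOM. IcsParseService and _parse are hypothetical names, and a single-worker ThreadPoolExecutor stands in for the one long-lived parser thread. It shows the trade-off mentioned: requests queue up behind a parse that is still running, and there is exactly one shutdown, at the end.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def _parse(text):
    """Placeholder parser: returns the worker thread's name and the split lines."""
    return threading.current_thread().name, text.splitlines()


class IcsParseService:
    """Hypothetical service owning exactly one parser thread for its lifetime."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix="ics-parser")

    def parse_async(self, text):
        # Requests queue up behind any parse that is still running.
        return self._executor.submit(_parse, text)

    def shutdown(self):
        # Called once, at application shutdown.
        self._executor.shutdown(wait=True)
```

However many requests are submitted, they all run on the same worker thread, so the thread count stays flat.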
Component: Untriaged → General
Keywords: qawanted → mlk
Product: Thunderbird → Calendar
Version: 17 → Trunk

Updated

5 years ago
Component: General → Provider: ICS/WebDAV
(Reporter)

Comment 7

5 years ago
Not sure if this helps, but given that the above comment implies that this is related to ICS calendars I disabled my only ICS calendar (there are also two CalDAV ones) and it is still leaking threads afterwards.

Comment 8

5 years ago
(In reply to Tom Hughes from comment #7)
> Not sure if this helps, but given that the above comment implies that this
> is related to ICS calendars I disabled my only ICS calendar (there are also
> two CalDAV ones) and it is still leaking threads afterwards.

How did you disable it? You have to open properties and disable from there. Just hiding the calendar won't work.

I converted all my ICS calendars into CalDAV ones and thread count stopped growing.
(Reporter)

Comment 9

5 years ago
You're quite right... I had just unchecked it - now that I have disabled it properly in the properties the thread leak has stopped.
(Assignee)

Comment 10

5 years ago
Confirmed on Windows 7 using Thunderbird 17 + Lightning 1.9 and Thunderbird 21.0a1 + Lightning 2.3a1.
Status: REOPENED → NEW
OS: Linux → All
Version: Trunk → Lightning 1.9

Comment 11

5 years ago
Is this a regression? From which version?
And does it result in any crashes?
Summary: Thunderbird leaks process threads → ICS calendars cause process threads leak and high memory
(Reporter)

Comment 12

5 years ago
Well my original report in the Fedora bug tracker was a year ago, against Thunderbird 9, but I had originally observed it some time before that - my best guess was around the TB 7/8 time frame.

Whether it was new then, or whether that was just when I first worked out what was going on I really don't know.
(Reporter)

Comment 13

5 years ago
Oh, and as for crashes: not in the classic sense, but once the number of threads gets too high, other programs start being unable to fork and/or start threads, as a typical Linux system will have a per-user limit of 1024 processes, which includes all threads.

At that point random programs will start crashing, erroring and/or generally misbehaving depending on how well they handle a failure to create a new thread/process.
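
To put rough numbers on that, the figures from this report (36 threads at startup, 85 after 24 hours, a per-user limit of 1024) can be plugged into a simple linear estimate. This is a back-of-the-envelope sketch only: it assumes steady growth and that nothing else counts against the limit, so the real time to failure will be shorter. The function name is illustrative.

```python
def days_until_limit(start_threads, threads_after_one_day, limit):
    """Estimate days until a linearly growing thread count reaches `limit`."""
    growth_per_day = threads_after_one_day - start_threads
    if growth_per_day <= 0:
        raise ValueError("thread count is not growing")
    return (limit - start_threads) / growth_per_day
```

With the numbers above, days_until_limit(36, 85, 1024) comes out at about 20 days, which fits the observation that the problem shows up after Thunderbird has been left running for a long period.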
(Assignee)

Comment 14

5 years ago
Created attachment 710938 [details] [diff] [review]
shutdown thread after parsing

A quick and probably ugly hack for testing, basically based on comment 6 option 2. The newly created worker thread reference is passed to |ParserWorker| and then passed to |ParserWorkerCompleter|, which is dispatched back to the main thread once the parsing is completed. Back on the main thread we can shut down the worker thread.
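
The hand-off can be sketched in Python as a rough model of the idea, not of the actual patch. ParserWorker and ParserWorkerCompleter are the names used above; WorkerThread, main_queue, parse_ics_async and run_main_events are hypothetical stand-ins for the XPCOM thread, the main thread's event queue and the service entry point. The shutdown happens on the main thread because a thread cannot wait for its own exit.

```python
import queue
import threading

main_queue = queue.Queue()  # stand-in for the main thread's event queue


class WorkerThread:
    """Toy stand-in for an nsIThread with dispatch() and shutdown()."""

    _STOP = object()

    def __init__(self):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            event = self._events.get()
            if event is self._STOP:
                return
            event()

    def dispatch(self, event):
        self._events.put(event)

    def shutdown(self):
        self._events.put(self._STOP)
        self._thread.join()

    def is_alive(self):
        return self._thread.is_alive()


class ParserWorkerCompleter:
    """Runs on the main thread: shuts the worker down, then reports the result."""

    def __init__(self, thread, result, listener):
        self.thread, self.result, self.listener = thread, result, listener

    def run(self):
        self.thread.shutdown()
        self.thread = None
        self.listener(self.result)


class ParserWorker:
    """Runs on the worker thread: parses, then hands the thread to the completer."""

    def __init__(self, thread, text, listener):
        self.thread, self.text, self.listener = thread, text, listener

    def run(self):
        result = self.text.splitlines()  # placeholder for real ICS parsing
        completer = ParserWorkerCompleter(self.thread, result, self.listener)
        self.thread = None  # drop our own reference once it is handed over
        main_queue.put(completer.run)


def parse_ics_async(text, listener):
    thread = WorkerThread()
    thread.dispatch(ParserWorker(thread, text, listener).run)
    return thread  # returned only so a demo can inspect it


def run_main_events(count):
    """Process `count` events on the calling (main) thread."""
    for _ in range(count):
        main_queue.get()()
```

After run_main_events() has processed one completer per request, none of the worker threads created by parse_ics_async() are left running.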
Attachment #710938 - Flags: review?

Comment 15

5 years ago
You might want to release the reference to the thread within ParserWorker after having passed it to ParserWorkerCompleter, otherwise you might end up with reference cycle counts keeping the thread object alive.

Comment 16

5 years ago
Stefan, Mark, could one of you upload a new version of this patch? I'd love to see this go in soon :)
(Assignee)

Comment 17

5 years ago
Yes, I'll provide an updated patch soon.
(Assignee)

Comment 18

5 years ago
Created attachment 711411 [details] [diff] [review]
shutdown thread after parsing v2
Assignee: nobody → ssitter
Attachment #710938 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #710938 - Flags: review?
Attachment #711411 - Flags: review?(philipp)

Comment 19

5 years ago
Comment on attachment 711411 [details] [diff] [review]
shutdown thread after parsing v2

Moving review to Mark as he has already analysed the situation. I believe the kind of release needed is some sort of NS_RELEASE though, not sure.
Attachment #711411 - Flags: review?(philipp) → review?(mbanner)
(Assignee)

Comment 20

5 years ago
Just for reference: With v1 or v2 of the patch applied I don't get an increase in thread count anymore. But I don't know how to test if the nsIThread objects will be freed or not.
(Assignee)

Updated

5 years ago
Attachment #711411 - Flags: review?(philipp)

Comment 21

5 years ago
Comment on attachment 711411 [details] [diff] [review]
shutdown thread after parsing v2

Review of attachment 711411 [details] [diff] [review]:
-----------------------------------------------------------------

I've not tested it, but it looks fine.
Attachment #711411 - Flags: review?(mbanner) → review+
Attachment #711411 - Flags: review?(philipp) → review+
(Assignee)

Comment 22

5 years ago
I have only tested on the win32 platform and therefore hope the fix works on the other platforms too. Once this is verified I think we should consider backporting the fix for the next Lightning 1.9esr release.

Pushed to https://hg.mozilla.org/comm-central/rev/949ebc4e6f47
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.4
Whiteboard: [wanted-1.9.x]
(Assignee)

Updated

5 years ago
Attachment #711411 - Flags: approval-calendar-beta?(philipp)
Attachment #711411 - Flags: approval-calendar-aurora?(philipp)
(Assignee)

Comment 23

5 years ago
Created attachment 723228 [details] [diff] [review]
patch for releases/comm-esr17

The original patch doesn't apply to releases/comm-esr17 but this one does.
Attachment #723228 - Flags: review?(philipp)
Attachment #723228 - Flags: approval-calendar-release?(philipp)

Comment 24

5 years ago
Comment on attachment 711411 [details] [diff] [review]
shutdown thread after parsing v2

OK, let's do it.
Attachment #711411 - Flags: approval-calendar-beta?(philipp)
Attachment #711411 - Flags: approval-calendar-beta+
Attachment #711411 - Flags: approval-calendar-aurora?(philipp)
Attachment #711411 - Flags: approval-calendar-aurora+
Attachment #723228 - Flags: review?(philipp)
Attachment #723228 - Flags: review+
Attachment #723228 - Flags: approval-calendar-release?(philipp)
Attachment #723228 - Flags: approval-calendar-release+
(Assignee)

Comment 25

5 years ago
Pushed to https://hg.mozilla.org/releases/comm-aurora/rev/005bd2638d17
Target Milestone: 2.4 → 2.3
(Assignee)

Comment 26

5 years ago
Pushed to https://hg.mozilla.org/releases/comm-beta/rev/962a5701b92f
Target Milestone: 2.3 → 2.2
(Assignee)

Comment 27

5 years ago
Pushed to https://hg.mozilla.org/releases/comm-esr17/rev/867d9e2b51a6
Whiteboard: [wanted-1.9.x] → [fixed-1.9.2]
Target Milestone: 2.2 → 1.9.2
(Assignee)

Updated

5 years ago
Duplicate of this bug: 859176

Updated

5 years ago
See Also: → bug 854101
(Assignee)

Updated

5 years ago
Duplicate of this bug: 854101

Comment 30

3 years ago
Hi, my name is Nathália and I'm from Brazil.
I am studying for a master's degree at UFU (Federal University of Uberlandia) and I'm doing a project based on ICS Calendar/Lightning in Mozilla Thunderbird.

Could you help me? I need to know how many calendar events you create daily,
and how long the application is usually left running.
Thank you!