Closed Bug 854101 Opened 11 years ago Closed 11 years ago

Thunderbird leaks memory and slows to a crawl; nearly 200 threads are idle

Categories

(Calendar :: General, defect)

Hardware: x86_64
OS: Windows 7
Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 833720

People

(Reporter: benjamin.lerner, Unassigned)

Details

(Keywords: perf)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Build ID: 20130307023931

Steps to reproduce:

I leave TB running continually with various add-ons (primarily Lightning & Conversations), bringing my laptop from home (on wifi) to school (on LAN and wifi).  After a while, it gets terribly slow: moving or deleting even a single message takes over 30 seconds, and composing messages takes a while too.  Memory usage also climbs to 1.2GB.


Actual results:

I looked in Process Explorer and found 190+ threads, almost none of which showed signs of activity.  According to procexp, their stacks begin at MSVCR100.dll!_endthreadex+0x80, and their current call stacks are all:

wow64cpu.dll!TurboDispatchJumpAddressEnd+0x6c0
wow64cpu.dll!TurboDispatchJumpAddressEnd+0x4a8
wow64.dll!Wow64SystemServiceEx+0x1ce
wow64.dll!Wow64LdrpInitialize+0x429
ntdll.dll!RtlIsDosDeviceName_U+0x24c87
ntdll.dll!LdrInitializeThunk+0xe
ntdll.dll!ZwWaitForSingleObject+0x15
kernel32.dll!WaitForSingleObjectEx+0x43
kernel32.dll!WaitForSingleObject+0x12
nspr4.dll!_PR_MD_WAIT_CV+0x8d
nspr4.dll!_PR_GetPrimordialCPU+0x79

I restarted TB because it was unusable, so my current measurements are from a much shorter-running instance than before.  Now when I look, there are between 32 and 37 threads, many of which have stacks similar to the one above.  Some of them, though, do regain activity, with more stack frames below _PR_GetPrimordialCPU+0x79...
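A rough way to tally how many threads are parked in the wait shown above, assuming the per-thread stacks have been copied out by hand into a dict of frame lines. The `stacks` layout and the helper names are made up for illustration; only the frame names come from the stack listed in this report.

```python
from collections import Counter

def frame_symbol(line):
    """Drop the '+0x..' offset, e.g. 'nspr4.dll!_PR_MD_WAIT_CV+0x8d' -> 'nspr4.dll!_PR_MD_WAIT_CV'."""
    return line.strip().rsplit("+0x", 1)[0]

def looks_parked(frames):
    """Heuristic: the stack contains both the NSPR condvar wait and a Win32 wait call."""
    symbols = {frame_symbol(f) for f in frames}
    in_nspr_wait = "nspr4.dll!_PR_MD_WAIT_CV" in symbols
    in_win32_wait = any(s.startswith("kernel32.dll!WaitForSingleObject") for s in symbols)
    return in_nspr_wait and in_win32_wait

def summarize(stacks):
    """stacks: {thread_id: [frame line, ...]} -> Counter with 'parked' / 'other' totals."""
    return Counter("parked" if looks_parked(f) else "other" for f in stacks.values())

# Hypothetical input: thread 1 is the stack from this report (abridged);
# thread 2 is an invented stack that is not waiting on a condvar.
example = {
    1: [
        "ntdll.dll!ZwWaitForSingleObject+0x15",
        "kernel32.dll!WaitForSingleObjectEx+0x43",
        "kernel32.dll!WaitForSingleObject+0x12",
        "nspr4.dll!_PR_MD_WAIT_CV+0x8d",
        "nspr4.dll!_PR_GetPrimordialCPU+0x79",
    ],
    2: ["ntdll.dll!LdrInitializeThunk+0xe"],
}
print(summarize(example))
```

This only classifies stacks by frame name; it says nothing about why a thread is waiting or whether it will ever be woken.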

Obviously, threads aren't inherently expensive enough to explain the 1.2GB memory usage, but having 190+ idle threads doesn't seem right either.  My hunch is that some threads aren't being joined, possibly after network activity or gloda activity (I don't know how to tell which, since both forms of IO happen as part of checking mail...), and that they keep some refcount from dropping to zero, so stale memory is never freed.
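To make the hunch concrete, here is a minimal sketch in Python (not Thunderbird or NSPR code; every name in it is hypothetical). Each worker parks on a condition variable while holding a reference to a buffer, and the buffers stay alive until the workers are woken and joined.

```python
import gc
import threading
import weakref

class Payload:
    """Stand-in for per-request data kept alive by a thread (hypothetical)."""
    def __init__(self, size):
        self.data = bytearray(size)

def parked_worker(payload, cond, state):
    # The argument `payload` is a strong reference that lives as long as
    # this function is running, i.e. as long as the thread stays parked.
    with cond:
        state["parked"] += 1
        cond.notify_all()
        while not state["shutdown"]:
            cond.wait()

def demo(n_threads=4, size=1024):
    cond = threading.Condition()
    state = {"parked": 0, "shutdown": False}
    refs, threads = [], []
    for _ in range(n_threads):
        p = Payload(size)
        refs.append(weakref.ref(p))  # lets us observe liveness without owning p
        t = threading.Thread(target=parked_worker, args=(p, cond, state))
        t.start()
        threads.append(t)
        del p  # the main thread drops its own reference
    with cond:
        while state["parked"] < n_threads:
            cond.wait()
    alive_while_parked = sum(r() is not None for r in refs)
    with cond:
        state["shutdown"] = True
        cond.notify_all()
    for t in threads:
        t.join()
    gc.collect()  # not needed on CPython, harmless elsewhere
    alive_after_join = sum(r() is not None for r in refs)
    return alive_while_parked, alive_after_join

print(demo())
```

If nothing ever signals the shutdown and joins the workers, the first count is the one that persists, which is the shape of leak I'm guessing at here.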

I tried installing the Gecko Profiler, but that gave a Cu.import error.  I've just installed ViewAbout so I can get to about:memory; I'll report that when I can get a decent profile.  From a quick look, it's warning me that the sqlite-related values are negative and therefore bogus, which is a pity since those might well be some relevant values... see bug 805023 ;-)

Possibly related: bug 787751, bug 805405, bug 818448, bug 796989.
Seems likely to be a duplicate of bug 833720.
Severity: normal → major
Keywords: perf
Seems likely.  I have 16 calendars: 7 ICS, 8 GCal, and one Thunderbirthday calendar.  When I suspended and resumed my computer just now as an experiment, my thread count went from 35 to 51, then subsided to 42.  That's an initial jump of 16 threads,  with a net jump of 7 threads.  Not conclusive, but it does seem likely!
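The arithmetic behind that observation, as a small sketch (the helper name and the dict are only for illustration; the numbers are the ones quoted above):

```python
def thread_deltas(samples):
    """Return (peak jump, net change) relative to the first sample."""
    baseline = samples[0]
    return max(samples) - baseline, samples[-1] - baseline

# Calendar counts as listed above.
CALENDARS = {"ICS": 7, "GCal": 8, "Thunderbirthday": 1}

peak, net = thread_deltas([35, 51, 42])  # before resume, right after, after settling
print(peak, net, sum(CALENDARS.values()))
```

The peak jump matches the total number of calendars, which fits one extra thread per calendar on resume, though a single sample can't rule out coincidence.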
You would need to try to run in TB safe mode temporarily to disable the calendar to see if that is the problem.
(In reply to Wayne Mery (:wsmwk) from comment #1)
> seems likely to be a duplicate of bug 833720

Bug 833720 is fixed in Lightning 1.9.2 (but I don't know where to find it).

Ben, can you try http://ftp.mozilla.org/pub/mozilla.org/calendar/lightning/nightly/latest-comm-aurora/ for Earlybird, available from http://www.mozilla.org/en-US/thunderbird/channel/ ?
Flags: needinfo?(benjamin.lerner)
I have an installation from the Daily channel, rather than Aurora; should that work, or should I reinstall from Aurora channel?

Using Daily, and the corresponding Lightning and GData extensions, I added the same set of calendars I have in my normal, everyday profile.  FWIW, like my normal profile, I set most of them to have offline support enabled, if that makes a difference.  Since loading the calendars, TB has been almost 100% unresponsive.  It hovers around 1% CPU usage, about 2.4MB/s disk IO (almost entirely writing) to cache.sqlite-journal, and has done so for the past 20 minutes.  (I still have a dialog box asking for a password for authenticating to one of the calendars, and I can't even type in the password because it's too unresponsive...)

But!  It's doing all this whatever-it's-doing using only 28 threads :)  So perhaps the thread-leaking problem is fixed, but something else has gone wrong, indeed.
Flags: needinfo?(benjamin.lerner)
(In reply to Ben Lerner from comment #5)
> I have an installation from the Daily channel, rather than Aurora; should
> that work, or should I reinstall from Aurora channel?

Daily is fine.

Component: Untriaged → General
Keywords: qawanted
Product: Thunderbird → Calendar
See Also: → 833720
Version: 17 → Trunk
Followup: perhaps that initial unresponsiveness was just the initial calendar load.  It comes back occasionally, most often when I manually sync calendars, but sometimes on its own.  That's annoying, but it's probably a different problem from the one in this bug.

But I'm currently running both Tb17 and daily side-by-side, and they've been running the same length of time.  Tb17 is currently at 70+ threads and 700MB memory, while Daily is still at 30 threads and ~350MB memory.  (That is 2 more threads than before; I'll see if the number climbs again later...)
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE