Closed Bug 429929 Opened 16 years ago Closed 16 years ago

CPU consumption while inactive too high

Categories

(Thunderbird :: General, defect)

PowerPC
macOS
defect
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: traut, Assigned: mstange)

References

Details

(Keywords: perf)

Attachments

(3 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Build Identifier: version 3.0a1pre (2008041603)

TB takes 50 to 80% CPU load while being inactive.

Offline it's still 40-50%

There should not be that much need for cpu consumption while not fetching any mail or doing any processing.

Maybe TB does no longer mind about slow cpus and power consumption on portable devices? I feel that TB should be more aware or power efficiency.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Version: unspecified → Trunk
Confirming with TB trunk - Dan & I are seeing this as well, though percentage of activity varies (Dan's is about 20%, mine is about 5-10%), we both have IMAP accounts active.

Emre, is this likely to be a problem with IMAP runaway, that Dan thinks you may already be looking at?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Is this bug about a recent regression? At least for me seems something happened a few days back. Disabling lightning helped quite a bit, not sure if that's related.
I'm seeing 30% w/ lightning disabled.

marking as blocking tb3, although I think we'll want to block on a beta as well before then.
Flags: blocking-thunderbird3+
On a compiled debug trunk build version 3.0a1pre (2008042110),

With one IMAP folder at ~200 messages, running |top| in Terminal shows idle Thunderbird usage hovering at 8-10%.

On another IMAP folder at ~2,000 messages, usage is ~10%.
Attaching a sample from Activity Monitor.
Re comment 4: performance benchmarks are generally only useful when run against stuff built with options very very close to release builds.  Debugging code and compiler options make big differences here.
I am experiencing similar CPU usage numbers with IMAP accounts. Going to test IMAP with shark.

I am assigning this bug to myself.
Assignee: nobody → bugmil.ebirol
I've been noticing this too... Recent regression?
I don't think I'm seeing signifigant idle activity these days, but there was a perhaps similar bug 420254 a few weeks ago
Branch with 1 gmail imap account
Trunk with 1 gmail imap account

The major difference is the main thread of TB-Trunk makes calls to 
nsAppShell::ProcessGeckoEvents() method which is not the case for TB-Branch..

Is this related to Gecko 1.9 migration? Any idea if this could be the reason of this performance issue? 


---
Also, there is another noticeable difference in nsThread interface which looks like to me a major re-factoring after branching. Don't think it is related to this performance issue though.

- TB-Trunk calls nsThread::ThreadFunc(void*) and nsThread::ProcessNextEvent(int, int*) where TB-Branch calls nsThread::Main(void*) and nsXXXThread::Run() -- where XXX is a class implementing nsIThread interface
(In reply to comment #11)
> The major difference is the main thread of TB-Trunk makes calls to 
> nsAppShell::ProcessGeckoEvents() method which is not the case for TB-Branch..
> 
> Is this related to Gecko 1.9 migration?

http://mxr.mozilla.org/seamonkey/search?string=ProcessGeckoEvents&find=&findi=&filter=&hitlimit=&tree=seamonkey

Yes, ProcessGeckoEvents is a Cocoa thingy that is new in Gecko 1.9 and not in 1.8.1 (and consequently TB 2, which is a Carbon app)
Keywords: perf
Nick Kreeger has mentioned a bug that might be related to this one:
http://bugzilla.songbirdnest.com/show_bug.cgi?id=8711

We should find out that this bug is MAC specific or not.. Does anybody experience this bug on Windows or Linux?
(In reply to comment #13)
> We should find out that this bug is MAC specific or not.. Does anybody
> experience this bug on Windows or Linux?

Something I have noticed on Mac, which may or may not be related to this, is that with SeaMonkey browser open (i.e. haven't activated the mailnews part) and Chatzilla, I typically get a 5-10% CPU usage, similar to what I'm seeing on Thunderbird at the moment.

I have tried Linux and TB is at 0% when idle (I'll keep an eye on it for a few hours).

What I've also found is bug 341749 which indicates there is a problem with the Check For Updates window. I may be mis-analysing (as that bug is cross-platform), but I'm just wondering if they could be in any way related.
Keywords: relnote
We should add a release note about this issue as 3.0a1 suffers from it.
Couple tests on Mac OS 10.5.2
Initial profile state: No accounts

Start TB
No account just "Local Folders": CPU Usage ~1%
Add a News account
News Account without any subscription: CPU Usage ~1%
Subscribe to a news: CPU Usage ~8%
Restart TB
"Local Folders" is selected by default: CPU Usage ~1%
Select previously subscribed News channel: CPU Usage ~8%
Select "Local Folders": CPU Usage ~8% (deselecting News channel doesn't reduce CPU usage)
Restart TB
Add an IMAP account:  CPU usage ~1%
Select IMAP inbox: CPU usage ~15%
Restart TB
IMAP inbox is selected by default (password not entered): CPU usage ~15% (trying to make a connection to server)
Enter password, and get headers from the server. After all headers have downloaded : CPU usage ~8-9%
==

Timer events might be the suspect of this issue. The following is what I observe during a debugging session: 
After one of the mail folders (IMAP or News channel) is selected by the user - or at the app startup, some component (don't know which one yet) fires a timer event that triggers another one that triggers another one. This chain of events go on like that and makes Gecko event Q very busy. I have also observed that CPU cycle consumption goes up over time, slowly.

All timer events are fired/executed by TimerThread (TimerThread.cpp).Suspending this thread reduces the CPU usage down to <1%. 

Thoughts?
It can't be bug 304147, can it?
In order to test, try removing these lines from global.css:

progressmeter[mode="undetermined"] {
  -moz-binding: url("chrome://global/content/bindings/progressmeter.xml#progressmeter-periodic-redraw");
}
Commented out these lines and build mozilla/toolkit. The problem still exists. 

Please let me know if there are more steps I should do to include css changes into the binary.
Sorry, I have never built Thunderbird myself. When I want to include CSS changes into the Firefox binary I use:
make -C toolkit/ && make -C browser/
in my objdir, and when that doesn't work, I just do a full build.
I don't know to what extent this is applicable when building Thunderbird.
If you're on a Mac, you need to rebuild in mozilla/mail/app at least.
And our chrome packaging is less than perfect, so every now and then when you see no sign of your changes with a full build, you need to clobber :|
After I built mail/app also, TB stopped to use CPU intensively. Idle CPU usage is back to ~< 1%.

Markus, what do you recommend as a fix?  
Just FYI, I can confirm the that the test Markus suggests solves the CPU load on my build.  So there's likely two issues:

1) somehow Tb code isn't setting the progressmeter back to normal
2) having 10% of CPU dedicated to moving a progress meter seems like a perf bug in core.
Ok, here's a bit more analysis.

First, a test case.

If you use Fx3 and go to the XUL live editor (http://ted.mielczarek.org/code/mozilla/xuledit/xuledit.xul), and paste in the following XUL:

<?xml version="1.0"?>
<?xml-stylesheet href="chrome://global/skin/" type="text/css"?>
<window id="yourwindow" xmlns="http://www.mozilla.org/keymaster/gatekeeper /there.is.only.xul">
<progressmeter id="p" mode="undetermined"/>
<button label="determined" oncommand="document.getElementById('p').mode = 'determined'"/>
<button label="undetermined" oncommand="document.getElementById('p').mode = 'undetermined'"/>
</window>

then you can toggle the progressmeter between determined and undetermined.  Everytime you switch to undetermined, you'll notice an increase in idle CPU load.  That load does not go away when switching back to determined mode.

My guess is that the XBL binding for undetermined progressmeter (http://mxr.mozilla.org/seamonkey/source/toolkit/content/widgets/progressmeter.xml#57) doesn't clear its interval timer when the binding is destroyed.

I've tried to fix that binding to clean that up, but haven't figured it out yet.  I may still, but if someone else wants to beat me to it, that's fine.

As an different note, the interval timer on that progress bar is 20msec, which is unnecessarily fast, I suspect.  Something like 150 msec will likely work just fine and with much less CPU load.
(In reply to comment #24)
> My guess is that the XBL binding for undetermined progressmeter
> (http://mxr.mozilla.org/seamonkey/source/toolkit/content/widgets/progressmeter.xml#57)

You have to look at the binding named "progressmeter-periodic-redraw":
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/toolkit/content/widgets/progressmeter.xml&rev=1.12&mark=114-134#114

The binding "progressmeter-undetermined" is used on other platforms; it draws a circulating chunk, while the Mac binding only asks for a redraw.

> doesn't clear its interval timer when the binding is destroyed.

Since the non-Mac binding ("progressmeter-undetermined") apparently doesn't have the bug, I must have missed something when I copied that code...

... and now I notice these lines:
var width = stack.boxObject.width;
if (!width) {
  // Maybe we've been removed from the document.
  var currentNode = stack;
  while (currentNode && currentNode != stack.ownerDocument) {
    currentNode = currentNode.parentNode;
  }
  if (!currentNode) {
    clearInterval(interval);
    return;
  }
}
... which are from bug 420254: "thunderbird often uses ~10% cpu when idle for no apparent reason"

So we'll have to find a similar fix for the new binding.

> As an different note, the interval timer on that progress bar is 20msec, which
> is unnecessarily fast, I suspect.  Something like 150 msec will likely work
> just fine and with much less CPU load.

I have tested with different speeds and have found that only 30fps look really smooth. 150ms (= 7fps) are very jittery.
Assignee: bugmil.ebirol → mstange
Status: NEW → ASSIGNED
Attachment #319153 - Flags: review?(Olli.Pettay)
This is so weird.

(In reply to comment #24)
> My guess is that the XBL binding for undetermined progressmeter
> doesn't clear its interval timer when the binding is destroyed.

The funny thing is that the binding isn't destroyed when you switch from undetermined to normal, even though the CSS rule that sets the binding no longer applies. 
(Is that a bug, Olli? I'll try to create a testcase.)

However, the elements in the binding's content element are removed from the document; thus, the spacer collapses which can be detected by its width being 0.

And another fun fact: When you add the following lines to global.css:
progressmeter:not([mode="undetermined"]) {
  -moz-binding: none;
}
even this doesn't happen any more - the binding and the content elements stay.

I'll do some more experimenting.
(In reply to comment #27)
> The funny thing is that the binding isn't destroyed when you switch from
> undetermined to normal, even though the CSS rule that sets the binding no
> longer applies. 
> (Is that a bug, Olli? I'll try to create a testcase.)

Apparently it's not a bug.

http://developer.mozilla.org/en/docs/XBL:XBL_1.0_Reference:Binding_Attachment_and_Detachment says:
> If at any time a resolution of style on the element determines that a
> different binding should be attached, the old binding (and all bindings
> that it explicitly extends in its inheritance chain) will be detached.

But we're not attaching a different binding.
I noticed that the whole binding collapses when the element is removed. Since all binding <field>s become undefined, we can use that for detecting it.

We're apparently dealing with bug 83635. Once that's fixed, we can move clearInterval into the destructor.
Attachment #319153 - Attachment is obsolete: true
Attachment #319177 - Flags: review?(Olli.Pettay)
Attachment #319153 - Flags: review?(Olli.Pettay)
Markus, thanks for all that.

It'd be good to have another bug with the proper fix and mark that as blocked by bug 83635.  Also wondering what to do about this bug which while marked a Tb bug is probably dependent on a Core bug.

I'm going to file two Core bugs:
1. One bug with my patch, blocking this bug, and
2. the "proper fix" bug, blocking 83635.
Depends on: 432028
Attachment #319177 - Attachment is obsolete: true
Attachment #319177 - Flags: review?(Olli.Pettay)
(In reply to comment #31)
> I'm going to file two Core bugs:
> 1. One bug with my patch, blocking this bug, and
--> bug 432028
> 2. the "proper fix" bug, blocking 83635.
--> bug 432035
Isn't this bug now fixed?
Yup.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Keywords: relnote
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: