Closed Bug 420254 Opened 16 years ago Closed 16 years ago

thunderbird often uses ~10% cpu when idle for no apparent reason

Categories

(Thunderbird :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: tuukka.tolvanen, Assigned: smaug)

References

Details

(Keywords: perf, regression)

Attachments

(8 files, 1 obsolete file)

linux trunk, this probably started fairly recently (a week or three?)
Thunderbird often starts using some 7--20% cpu for long periods of time (minutes? forever?) when idle for no apparent reason. I suck, so oprofile and gdb just tell me "it's doing js" so far ;)
have you narrowed the regression?

version 3.0a1pre (2008022803) windows - I'm seeing sustained 5-11% usage.   This instance had been running for about 12 hours.

** I set thunderbird offline and cpu continued unchanged.
   back online, no change
   => timer?

I've restarted and after 20 minutes I am unable to recreate cpu usage after going through: AB, news, virtual, send, save draft, advanced search, SA window, ...

At 30 minutes I'm in a modestly sustained 1% cpu usage - perhaps some with cumulative effect is occuring?
Keywords: regression
I get a range of http://bonsai.mozilla.org/cvsquery.cgi?date=explicit&mindate=2008-02-14+12:45&maxdate=2008-02-14+13:00 -> bug 409111, which sounds surprising to me, but I'm finding it hard to argue with it after double-double-checking with local backout, even if the test may not be totally reliable...
Blocks: 409111
I did a jprof enabled debug build on linux and let it run for 1 cpu minute. The bzip2'd file is still pretty large, and I can send it to anyone who is interested. The following did seem interesting though:

index Count    Hits Function Name
               8114 NS_ProcessNextEvent_P(nsIThread*, int)
                  2 NS_ProcessPendingEvents_P(nsIThread*, unsigned int)
                  1 nsBaseAppShell::Run()
...

              4100 nsBaseAppShell::Run()
               3883 nsXULWindow::ShowModal()
                 98 nsThread::ThreadFunc(void*)
                 34 nsSocketTransportService::Run()
 18402   0     4198 NS_ProcessNextEvent_P(nsIThread*, int)
               8114 nsThread::ProcessNextEvent(int, int*)
                  1 nsCOMPtr<nsIThreadObserver>::operator->() const

These are the quietest minute of the first 5 to 6 for a build with bug 409111 backed out, and not backed out, respectively. The loop was, for reference, like

tag=with-409111; i=0; while true; do
    sudo opcontrol --reset; sleep 60; pgrep thunderbird || continue
    opreport --symbols ~/src/moz/obj/thunderbird-debug/dist/bin/*{,/*} \
        > $tag-$i
    i=$((i+1))
done
Smaug, any idea why bug 409111 would cause this?
Something in TB JS code uses boxObjects with elements which aren't in document?
In 1.8 and after fixing regression bug 409111, xulelement.boxObject doesn't 
throw, unlike before bug 409111. Are there some try-catch blocks in TB js
which check .boxObject validity? That way the exception doesn't show up in
error console.
One could perhaps debug this by adding some warnings to
nsDocument::GetBoxObjectFor to check if element is in document.
The problem seems to be that some "undetermined" progressmeter has the interval running. Writing a patch. I think progressmeter needs a fix.
Attached patch possible patchSplinter Review
This is a possible patch. Would have to ask Enn and/or Neil what they think 
about this, but if someone could first test this to ensure that this really fixes the problem.
no, a307547 doesn't seem to fix this issue for me.
no? At least it fixes this testcase.
Is there still some other similar problem...
Couldn't see any other problematic boxObject use cases
Attached file better test
Shows that putting elements back to dom works.
looks like a307579 helps :)
Attached patch v3Splinter Review
This is a bit faster than the previous. If someone could verify that this still fixes the problem.
I believe this fixes it for me. Not having this load on thunderbird certainly improves the user experience greatly. 
Attached patch test width, not width >> 2 (obsolete) — Splinter Review
Attachment #307759 - Flags: review?(enndeakin)
Attached patch v4Splinter Review
Attachment #307759 - Attachment is obsolete: true
Attachment #307760 - Flags: review?(enndeakin)
Attachment #307759 - Flags: review?(enndeakin)
Attachment #307760 - Flags: review?(enndeakin) → review+
Attachment #307760 - Flags: superreview?(neil)
Comment on attachment 307760 [details] [diff] [review]
v4

>               spacer.left = width * position;
It makes no sense for this width to be zero, as then nothing will animate.

Also is there a reason to leave in the try/catch?
Attachment #307760 - Flags: superreview?(neil) → superreview-
You want me to start fixing other things in this code?

I use the !width test as a hint that element is removed from document and then
check if that is the case. This way progressmeter still works if it is hidden
and then set back visible.

try/catch is possibly needed in case document's presshell is destroyed but scripts are still running.
ftr, a307585 and a307760 also wfm.
Comment on attachment 307760 [details] [diff] [review]
v4

OK, I see your point.

>+                // Maybe we've been removed from the document.
>+                var currentNode = stack;
>+                while (currentNode && currentNode != stack.ownerDocument) {
>+                  currentNode = currentNode.parentNode;
>+                }
>+                if (!currentNode) {
>+                  clearInterval(interval);
>+                  return;
>+                }
I don't know what you think about this:
for (var currentNode = stack; currentNode != stack.ownerDocument;
                              currentNode = currentNode.parentNode) {
  if (!currentNode) {
    clearInterval(interval);
    return;
  }
}

Or maybe you could try to select the stack in a range; I understand that these days that's invalid if the stack isn't in a document :-)
Attachment #307760 - Flags: superreview- → superreview+
(In reply to comment #22)
> Or maybe you could try to select the stack in a range; I understand that these
> days that's invalid if the stack isn't in a document :-)
It was invalid, but not anymore.

That for loop is just doing the same as the while, so I don't see big difference.
|while| is vertically longer, 'for' horizontally. Matter of taste.

Attachment #307760 - Flags: approval1.9?
Comment on attachment 307760 [details] [diff] [review]
v4

a1.9=beltzner
Attachment #307760 - Flags: approval1.9? → approval1.9+
Assignee: nobody → Olli.Pettay
Checked in, but please verify
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
v
Status: RESOLVED → VERIFIED
Keywords: perf
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: