Open Bug 686495 Opened 13 years ago Updated 7 months ago

(Laptop battery life 40-50% shorter) TB 10-15% cpu with main network disconnected, but "other adapters" enabled and not responding

Categories

(MailNews Core :: Networking, defect)

x86_64
Windows 7
defect

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: scovich, Unassigned, NeedInfo)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: [battery])

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0
Build ID: 20110811165603

Steps to reproduce:

While TB-v6.0.2 was running, unplugged from network (WIFI physically switched off) and started running off battery (not using or looking at TB). I have a win7-x64 laptop. 


Actual results:

TB constantly uses a small amount of CPU (5-10%), burning battery directly and preventing Windows from idling the CPU (which burns even more power indirectly).


Expected results:

TB CPU usage should be 0.0% when in background with no network access.
Summary: Laptop battery life 30-40% shorter with TB running → Laptop battery life 40-50% shorter with TB running
Is 'Work Offline' checked in the File menu when this happens (which indicates that Thunderbird notices that there's no network connection at all) ? And what happens if you check it manually ?
(In reply to Jo Hermans from comment #1)
> Is 'Work Offline' checked in the File menu when this happens (which
> indicates that Thunderbird notices that there's no network connection at
> all) ? And what happens if you check it manually ?

Interesting. I had assumed it was working offline because it always offers to send outbound messages when I plug back in. However, the icon bottom-left indeed indicates *online* and going offline manually seems to fix the problem. 

That begs the question, then... why does TB need so much CPU when idling in online mode? At any given time it should be (a) checking for and/or downloading messages (some CPU understandable, though not nearly as much as it seems to use), (b) waiting for a connection timeout (should use no CPU) or (c) truly idle (again, no CPU). Given that no messages are arriving over my online-but-physically-disconnected network, that leaves only (b) or (c).
Ryan are you network drivers up to date ? (we seem to not detect properly that you are offline/online)
Keywords: perf
I'm on a ThinkPad and Lenovo's tools at least think I'm up to date. I haven't tried to figure out what wireless card I have to go hunting down a bleeding edge driver from the manufacturer. 

Also, even when officially offline, compacting a folder (which I've set to happen automatically) seems to cause the same sort of long-term dribble of CPU usage. 

In any case, this sidesteps the real issue: why does TB need so much CPU to wait on the network? I suspect the answer lies inside code shared with FF, since it exhibits the same behavior (see comments in bug #683651). Right now TB/FF drain battery faster than running with wireless enabled.
Ryan, 

What model thinkpad?

Does this reproduce if you have thunderbird started in safe mode?

As a variation to comment 0, does it reproduce if you start thunderbird with network enabled, then disable network after startup is complete?

As for a connection to bug 683651, it's a leap to make a connection without having more symptoms. Especially since, I think, you imply the CPU is high there only while it's actually getting mail. (did I read it correctly?)

----

here is a list of "battery" bugs to give you a flavor (and there are more related bugs that won't mention the word battery) -- https://bugzilla.mozilla.org/buglist.cgi?type1-0-0=substring&list_id=1479360&field0-0-0=short_desc&type0-0-1=substring&field0-0-1=status_whiteboard&type1-0-1=allwordssubstr&resolution=---&resolution=FIXED&resolution=INVALID&resolution=DUPLICATE&value0-0-2=network&classification=Client%20Software&classification=Components&chfieldto=Now&query_format=advanced&chfieldfrom=2009-01-01&field0-0-2=component&longdesc=battery%20batteries&value0-0-1=battery&type0-0-0=anywordssubstr&value0-0-0=batter%20power%20network%20cpu%20processor&field1-0-0=short_desc&type0-0-2=substring&product=Core&product=Firefox&product=MailNews%20Core&product=Thunderbird&product=Toolkit&longdesc_type=anywordssubstr&field1-0-1=short_desc

less expansive list -- https://bugzilla.mozilla.org/buglist.cgi?type1-0-0=substring&list_id=1478830&field0-0-0=short_desc&type0-0-1=substring&field0-0-1=status_whiteboard&type1-0-1=allwordssubstr&resolution=---&classification=Client%20Software&classification=Components&query_format=advanced&value0-0-1=battery&type0-0-0=anywordssubstr&value0-0-0=batter&field1-0-0=short_desc&product=Core&product=Firefox&product=MailNews%20Core&product=Thunderbird&field1-0-1=short_desc
Summary: Laptop battery life 40-50% shorter with TB running → (Laptop battery life 40-50% shorter) TB 10-15% cpu with network disconnected
(In reply to Wayne Mery (:wsmwk) from comment #5)
>
> What model thinkpad?
T410s

>
> Does this reproduce if you have thunderbird started in safe mode?
Yes.

> 
> As a variation to comment 0, does it reproduce if you start thunderbird with
> network enabled, then disable network after startup is complete?
Yes. In fact, that's my normal usage pattern: fire up TB, download emails to read on the commute, go offline, notice battery draining fast...

As a more precise use case: 
1. Start TB in safe mode
2. Wait until everything loads and settles down
3. Physically switch off wireless
4. Click on a folder in a different account

This time the connection timed out after about 30 seconds, with CPU sitting at 8-10% the entire time. Sometimes it times out faster (or the CPU stops, at least), but usually it still hasn't given up after 30 minutes. Manually sending TB offline drops the CPU a lot but leaves dwm.exe busy. 

> 
> As for a connection to bug 683651, it's a leap to make a connection without
> having more symptoms. Especially since, I think, you imply the CPU is high
> there only while it's actually getting mail. (did I read it correctly?)
Now that I've been paying close attention to the CPU meter whenever on battery, I've identified several problems; I think the first two are related. 

1. a broken/timeout connection potentially sucks up CPU forever (this bug)
2. slow connections suck up CPU until the download completes (bug 683651)
3. often, keypresses register *very* slowly (300-700ms each) when composing an email in should-be-offline-mode.
4. even in offline mode, TB and FF both cause non-negligible activity in dwm.exe (not previously reported by me, but Mac users seem to see a similar issue, e.g. bugs 682693 and 254884)

Overall, waiting on network seems to consume considerable CPU regardless of bytes transferred. Based on the above, I came to the conclusion that the two bugs are related. I also suspect the CPU spike seen when loading messages over a fast connection is unnecessarily high. 

I guess, the times when the connection never times out is probably a different bug, because even if Windows tells TB there's Internet access it should eventually give up and close unresponsive connections. Minor by itself, but a real problem if those unresponsive connections burn battery in the meantime.
There are reports of issues like #1 - bug 554898 (unconfirmed as of now).  

If none of the bugs in [1] match your issue, then I guess it would be useful to have a protocol log showing any activity that might be occurring/attempted after the connection goes down. If you have ONLY imap accounts, please use instructions at https://wiki.mozilla.org/MailNews:Logging and setting of 
 imap:5,ImapAutoSync:5,nsHostResolver:5,nsSocketTransport:5,timestamp

take note of the time (compared to the log entries) when the network connection changed, where cpu changed, etc

you had indicated safe mode makes no difference. But to be absolutely certain there is no impact from direct2d problems, please first
do the following
- tools | options | advanced | general | config editor
- paste gfx.direct2d.disabled into filter and change it to "true"
- paste layers.acceleration.disabled preference and change it to "true" (create the boolean if necessary with right click)
- restart Thunderbird 

[1] connection performance bugs (does not include core bugs): https://bugzilla.mozilla.org/buglist.cgi?type1-0-0=nowordssubstr&keywords=perf%2C%20&keywords_type=allwords&list_id=1480498&field0-0-0=short_desc&bug_severity=normal&bug_severity=minor&type0-0-1=substring&field0-0-1=status_whiteboard&type1-0-1=allwordssubstr&resolution=---&resolution=FIXED&resolution=INVALID&resolution=DUPLICATE&value0-0-2=network&classification=Client%20Software&classification=Components&chfieldto=Now&query_format=advanced&chfieldfrom=2010-01-01&value1-0-0=news&field0-0-2=component&value0-0-1=battery&type0-0-0=anywordssubstr&value0-0-0=network&component=Backend&component=Database&component=Folder%20and%20Message%20Lists&component=General&component=Networking%3A%20IMAP&component=Networking%3A%20POP&component=Networking%3A%20SMTP&component=Search&field1-0-0=component&type0-0-2=substring&product=MailNews%20Core&product=Thunderbird&field1-0-1=short_desc
(In reply to Wayne Mery (:wsmwk) from comment #7)
> you had indicated safe mode makes no difference. But to be absolutely
> certain there is no impact from direct2d problems, please first
> do the following
> - tools | options | advanced | general | config editor
> - paste gfx.direct2d.disabled into filter and change it to "true"
> - paste layers.acceleration.disabled preference and change it to "true"
> (create the boolean if necessary with right click)
> - restart Thunderbird 

This seems to nail #4: Offline, safe mode, with those settings disabled, TB burns essentially zero CPU (task manager reports unchanging accumulated CPU time). If I go online, then TB accumulates 1 cpu-sec every couple of minutes. Normal mode (including a calendar extension), the amounts go up somewhat, but not much.

Meanwhile, I can't say I notice any performance difference. What would I give up by disabling d2d permanently?
(In reply to Wayne Mery (:wsmwk) from comment #7)
> useful to have a protocol log showing any activity that might be
> occurring/attempted after the connection goes down. If you have ONLY imap
> accounts, please use instructions at
> https://wiki.mozilla.org/MailNews:Logging and setting of 
>  imap:5,ImapAutoSync:5,nsHostResolver:5,nsSocketTransport:5,timestamp
> 
> take note of the time (compared to the log entries) when the network
> connection changed, where cpu changed, etc

OK, I fired up TB, killed the wireless in the middle of it looking up servers, and poked around a bit in the trace that resulted during the following minute. 

There are 725 calls to PR_Poll; at first there are six active sockets (strange, I have only four accounts!), but eventually that number drops to zero. The distribution of timeouts is fairly uniform over time, with the following value histogram:
 11    > 750ms
 48    350-750ms
  6    30-350ms
 42    1-30ms
618    0ms

Messages seen:
MSG_TIMEOUT_CHANGED 120
MSG_ENSURE_CONNECT 111
MSG_DNS_LOOKUP_COMPLETE 69

STS poll iter never gets above [1].

Functions called:
nsSocketTransport::PostEvent 311
nsSocketTransport::OnSocketEvent 311
nsSocketTransport::SendStatus 191
nsSocketOutputStream::CloseWithStatus 186
nsSocketInputStream::CloseWithStatus 186
nsSocketInputStream::Read 146
nsSocketOutputStream::AsyncWait 129
nsSocketTransport::OnSocketReady 110
nsSocketInputStream::OnSocketReady 105
nsSocketOutputStream::OnSocketReady 89
nsSocketInputStream::AsyncWait 82
nsSocketTransportService::RemoveFromPollList 75
nsSocketTransportService::DetachSocket 75
nsSocketTransport::RecoverFromError 75
nsSocketTransport::OnSocketDetached 75
nsSocketTransportService::RemoveFromIdleList 69
nsSocketTransportService::AttachSocket 69
nsSocketTransportService::AddToPollList 69
nsSocketTransportService::AddToIdleList 69
nsSocketTransport::InitiateSocket 69
nsSocketTransport::BuildSocket 69
nsSocketTransport::ResolveHost 56
nsSocketTransport::OpenOutputStream 56
nsSocketTransport::OpenInputStream 56
nsSocketTransport::Init 56
nsHostResolver::ResolveHost 56
nsSocketOutputStream::Write 16
nsSocketTransport::OnMsgInputClosed 6

Is there anything else in the log that might be useful?
Update: I tried again, this time killing wifi during the download of an email with five 3MB photos. CPU usage hovered at 40-50% as usual. Then I parsed the log entries for the minute that followed.

Lines of log data parsed: 167k
Active sockets: 16 (!)
PR_Poll calls: 4802
PR_Poll timeout histo: 
      9    > 100ms
     13    10-100ms
    472    2-10ms
    542    1ms
   3766    0ms

Messages seen:
   MSG_TIMEOUT_CHANGED 18
   MSG_ENSURE_CONNECT 4
   MSG_DNS_LOOKUP_COMPLETE 2

Hot functions (others account for only 215 calls):
    4464    nsSocketInputStream::Read
    3192    nsSocketTransport::OnSocketReady
    3181    nsSocketInputStream::OnSocketReady
    3170    nsSocketInputStream::AsyncWait
    1304    nsSocketTransport::SendStatus
(In reply to Ryan Johnson from comment #8)
> Meanwhile, I can't say I notice any performance difference. What would I
> give up by disabling d2d permanently?

nothing. and in fact as of version 9 d2d/hwa is disabled by default in thunderbird.

might you bel able to test with the "daily" development build?
ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/
Component: General → Networking
Product: Thunderbird → MailNews Core
QA Contact: general → networking
This problem continues unabated with TB 10.0. Please find attached a screenshot showing Thunderbird busily working away while in offline mode. 

I'm trying to get debug symbols and xperf profiling up and running, but hopefully this will at least confirm that there is a bug.
Oh, forgot to add: Task Manager reports 20% utilization of *four* cores, so one CPU is actually 80% utilized by whatever TB is doing.
Profiler results are now in. More than 75% of samples are due to obviously graphics-related functions like:
xul.dll!nsBoxFrame::BuildDisplayList
xul.dll!_moz_cairo_fill_preserve
xul.dll!nsDisplayList::ComputeVisibilityForSublist
xul.dll!nsIFrame::BuildDisplayListForChild
xul.dll!nsStyleContext::DoGetStyleDisplay
xul.dll!nsRuleNode::GetParentDisplay
xul.dll!pixman_blt_sse2
xul.dll!nsRuleNode::GetStyleDisplay
xul.dll!nsWindow::OnPaint
xul.dll!nsRect::ScaleToNearestPixels
xul.dll!nsViewManager::UpdateWidgetArea
xul.dll!mozilla::gfx::BaseRect<int;nsRect;nsPoint;nsSize;nsMargin>::Intersect
xul.dll!pixman_fill_sse2

The remaining functions are mostly memory management, nsAttribute-related, or other generic helpers that are probably called by graphics functions...

In the syscall department we have:
win32k.sys!NtUserBeginPaint
win32k.sys!hbmSelectBitmap

That's one expensive spinning circle...
can you try the "daily" (development) build ?
from ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/

it's generally stable, but you should backup your profile first
Whiteboard: [battery]
(In reply to Wayne Mery (:wsmwk) from comment #15)
> can you try the "daily" (development) build ?
> from
> ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-
> central/
> 
> it's generally stable, but you should backup your profile first

Ryan ?
See Also: → 683651
(In reply to Wayne Mery (:wsmwk) from comment #16)
> (In reply to Wayne Mery (:wsmwk) from comment #15)
> > can you try the "daily" (development) build ?
> > from
> > ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-
> > central/
> > 
> > it's generally stable, but you should backup your profile first
> 
> Ryan ?

Sorry, Teh Busy hit and I haven't had a chance to try this out. Is it still the dev build, or is there some beta I should now try?
> Is it still the dev build, or is there some beta I should now try?
I forget why I suggested you try it - so I think just go for ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-central/
(In reply to Wayne Mery (:wsmwk) from comment #18)
> > Is it still the dev build, or is there some beta I should now try?
> I forget why I suggested you try it - so I think just go for
> ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-
> central/

The only difference I can tell with Daily is that the idle CPU util is slightly lower (which is good), and that it takes the laptop 30-60s to suspend when Daily is running (which is bad). CPU util while attempting to use nonexistent network connections remains the same.
As requested, the profile data for Daily while it attempts to connect with no network connection available.
Here's a screenshot of Daily hogging CPU at the time I profiled it. Note that nearly two full CPU are busy, half that systime. To be fair, I had clicked on each of my accounts several times to make sure the profiler had plenty to chew on, but this is a recurring problem (I've taken to closing TB before suspending, because it does this on wakeup even if network *is* available).
Attachment #628170 - Attachment mime type: text/plain → application/x-bzip2
(In reply to Ludovic Hirlimann [:Usul] from comment #3)
> Ryan are you network drivers up to date ? (we seem to not detect properly
> that you are offline/online)

I just figured this one out: TB correctly notices that I'm offline when *all* network adapters are offline/disconnected. Problem is, I have several virtual adapters (two for VMWare, e.g.) that are always connected but never provide an internet connection.

Is there a way to make TB check explicitly for an Internet connection, rather than assuming there's internet access whenever any adapter is connected?
Not that I know of.
Ryan, can you instrument this using the modern Gecko profiler?

Download https://github.com/bgirard/Gecko-Profiler-Addon/blob/master/geckoprofiler.xpi?raw=true 

Install as addon to thunderbird, in bottom right status bar click "disabled", do the test work for a minute, then click "dump profile", wait for profile to build, click "upload" on the left, get the URL.
Flags: needinfo?(scovich)
Summary: (Laptop battery life 40-50% shorter) TB 10-15% cpu with network disconnected → (Laptop battery life 40-50% shorter) TB 10-15% cpu with main network disconnected, but "other adapters" enabled and not responding
See Also: → 678947
Removing myslef on all the bugs I'm cced on. Please NI me if you need something on MailNews Core bugs from me.
See Also: → 1249945
Blocks: 683651
See Also: 683651
See Also: → 1752641
Severity: normal → S3
Depends on: 1752641
See Also: 1752641
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: