Closed Bug 1753195 Opened 3 years ago Closed 1 year ago

Network progress bar causes CPU load even in offline mode (news)

Categories

(MailNews Core :: Networking, defect)

Thunderbird 91
defect

Tracking

(thunderbird_esr91 wontfix, thunderbird_esr115 unaffected)

RESOLVED FIXED
98 Branch
Tracking Status
thunderbird_esr91 --- wontfix
thunderbird_esr115 --- unaffected

People

(Reporter: manikulin, Assigned: rnons)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(3 files)

Steps to reproduce:

Annoying bug causing increased fan noise appeared after recent update of system thunderbird package from 78 to 91.5 in Kubuntu-20.04 focal LTS.

Animated network progress bar in the status bar consumes enough CPU to make fan noisy. With my main profile (IMAP, RSS, NNTP) it is enough to leave thunderbird running for several hours. I suspect transient network problems.

It seems, the following should be enough for newly crated test profile:

  • Create news account for news.gmane.io NNTP server
  • Subscribe to some news group, e.g. gmane.comp.mozilla.thunderbird.user
  • Click on some message "1" to check that everything work
  • Add DROP firewall rule for 119 port (DROP is better than REJECT)
  • Click on some more messages "2", "3", "4", again e.g. "3"
  • Remove firewall rule. For thunderbird-91 it can be done almost instantly, for 97.0b3 it is better to wait for half of minute.

Another branch: instead of last step switch to offline regime.

Actual results:

  • Progress bar appears in status bar running back and forth (maybe taken from KDE theme).
  • CPU load increases, after some time cooler becomes more noisy.
  • Progress bar persists even after switching thunderbird to offline

For thunderbird-91 progress bar persists after clicking on any message, even on that ones that may be successfully fetched.

For thunderbird-97.0b3 the issue is more subtle, progress bar disappears when any message is successfully fetched, but on click on a message (likely "3" in the steps above) that is not downloaded, it appears again.

Expected results:

  • Progress bar should not cause CPU load.
  • Progress bar should disappear after switching to offline.

Reminiscent of bug 683651 where progress bar is also a suspect.
Max, what made you arrive at the conclusion that it's the progress bar element itself which is causing this, as opposed to other processes like network access, and the progress bar just being peripheral?

Flags: needinfo?(manikulin)
See Also: → 683651
Component: Untriaged → Networking
Product: Thunderbird → MailNews Core

Thanks for reporting, will make a fix to stop the progress bar when failed to fetch (in offline mode).

Assignee: nobody → remotenonsense
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Is a similar solution needed for bug 742697?

Flags: needinfo?(remotenonsense)
Target Milestone: --- → 98 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/7725d30c7aaf
Stop progress bar when fetching news article failed. r=mkmelin

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

(In reply to Thomas D. (:thomas8) from comment #1)

Reminiscent of bug 683651 where progress bar is also a suspect.

Yes, I have seen that bug yesterday. I decided to report new one since I had steps to reproduce. More:

Max, what made you arrive at the conclusion that it's the progress bar element itself which is causing this, as opposed to other processes like network access, and the progress bar just being peripheral?

  • I believe in people. Continuous polling in a tight loop is rarely used in JavaScript, I have a hope that select or poll is used for network operations in C++ code. (OK, there is @mozilla.org/network/sync-stream-listener;1 or something similar...)
  • If I remember correctly, thunderbird-78 did not have such feature as animated progress bar and cooler in my laptop was significantly more quiet. So I suspected designers and fancy progress bar in contribution to global warming.
  • I just confirmed by setting display: none in developer tools for the progress bar that CPU load generated by thunderbird process is decreased from ~30% to ~3-6% in idle (so there is some room for further improvements, maybe namely in network libraries as well).
Flags: needinfo?(manikulin)

Thank you for the prompt fix.

Can backport to version 91 be reconsidered? It seems, 91 will be packaged for Ubuntu-22.04 "jammy" long time support release: https://packages.ubuntu.com/search?keywords=thunderbird Having Ubuntu-20.04 focal LTS installing, I am suffering from new thunderbird-91 bugs just for a week. Before it was more stable 78 with not so annoying bugs.

Guys, do not you think that 30% CPU load due to just progress bar is rather high even for 7 years old laptop with i5? There are plenty of use cases when it can be shown for a long time, e.g. user is distracted for an hour just after password prompt appeared. I can create another issue to discuss it.

Yesterday I was quite disappointed when just invented workaround with offline mode did not help. It seems, I figure out what can be done:

Workaround

Unsure if a special theme with still progress bar may be created.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Thunderbird-91 almost certainly has a similar problem with either IMAP or RSS. In my main profile I have "Check for new messages every" disabled for the news account, but progress bar may appear while I work with other applications.

Sorry the patch only fixes nntp-js, I don't know how to fix 91.

Will take a look at bug 742697.

Flags: needinfo?(remotenonsense)

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/48afdbedd7e2
Stop progress bar when fetching news article failed. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED

I am sorry, but I still see the issue with 98.0a1 (2022-02-06)

I added DROP iptables rule, clicked on several messages. At first there were 2 connections in ESTAB state, later one of it became LAST-ACK and disappeared from ss output. After ~15 min I switched to offline mode, second network connection was closed, but progress bar did not disappear.

The following message appeared in console

mailnews.nntp: Failed to send "QUIT" because socket state is closed 2 NntpClient.jsm:205:20

A had a hope to get some logs using MOZ_LOG="IMAP:5,NNTP:5,timestamp", but files are empty. The same variables work for thunderbird-91.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

You can set mailnews.nntp.loglevel to All to see logs in DevTools Console.

I only tested by switching to offline mode, I don't see progress bar, and the logs are like

mailnews.nntp: Connecting to news://news.mozilla.org:119 NntpClient.jsm:100:20
mailnews.nntp: 
error { target: TCPSocket, isTrusted: true, name: "ConnectionRefusedError", message: "Network", errorCode: 2152398861, srcElement: TCPSocket, currentTarget: TCPSocket, eventPhase: 2, bubbles: false, cancelable: false, … }
 ConnectionRefusedError Network 2152398861 NntpClient.jsm:175:18
mailnews.nntp: Failed to send "QUIT" because socket state is closed NntpClient.jsm:208:20
mailnews.nntp: Done with status=2152398861 NntpClient.jsm:784:18

The key here is ConnectionRefusedError, which should dismiss the progress bar.

I have tried a couple of times more.

  • Click on some message to check that everything works.
  • Insert a DROP rule to iptables for port 119
  • Click on 3 seen messages
10:24:42.178 mailnews.nntp: C: ARTICLE 143554 NntpClient.jsm:220:20
10:24:42.930 mailnews.nntp: Connecting to news://news.gmane.io:119 NntpClient.jsm:100:20
  • switch to offline, download "later"
10:24:59.197 mailnews.nntp: C: QUIT NntpClient.jsm:220:20
10:24:59.198 mailnews.nntp: Failed to send "QUIT" because socket state is connecting NntpClient.jsm:208:20
10:24:59.272 mailnews.nntp: Connection closed. NntpClient.jsm:184:18
10:24:59.273
mailnews.nntp: 
error { target: TCPSocket, isTrusted: true, name: "ConnectionRefusedError", message: "Network", errorCode: 2152398861, srcElement: TCPSocket, currentTarget: TCPSocket, eventPhase: 2, bubbles: false, cancelable: false, … }
 ConnectionRefusedError Network 2152398861 NntpClient.jsm:175:18
10:24:59.275 mailnews.nntp: Failed to send "QUIT" because socket state is closed NntpClient.jsm:208:20
10:24:59.276 mailnews.nntp: Done with status=2152398861 NntpClient.jsm:784:18
10:24:59.278 mailnews.nntp: Connecting to news://news.gmane.io:119 NntpClient.jsm:100:20
10:24:59.287
mailnews.nntp: 
error { target: TCPSocket, isTrusted: true, name: "ConnectionRefusedError", message: "Network", errorCode: 2152398861, srcElement: TCPSocket, currentTarget: TCPSocket, eventPhase: 2, bubbles: false, cancelable: false, … }
 ConnectionRefusedError Network 2152398861 NntpClient.jsm:175:18
10:24:59.288 mailnews.nntp: Failed to send "QUIT" because socket state is closed NntpClient.jsm:208:20
10:24:59.289 mailnews.nntp: Done with status=2152398861
  • Delete the iptables rule

Progress bar persists

  • Close thunderbird

thunderbird-bin process does not exit immediately, it crashes a half minute later.

(In reply to max from comment #14)

I have tried a couple of times more.

  • Click on some message to check that everything works.
  • Insert a DROP rule to iptables for port 119
  • Click on 3 seen messages
10:24:42.178 mailnews.nntp: C: ARTICLE 143554 NntpClient.jsm:220:20
10:24:42.930 mailnews.nntp: Connecting to news://news.gmane.io:119 NntpClient.jsm:100:20

I don't know how DROP works, seems it doesn't close the socket. If you wait more than 2 minutes here, is there a timeout error?

Tried another time without deleting of the iptables rule. I waited longer, but progress bar did not disappear. After that I removed the rule, enabled online mode again. Messages were loaded, progress bar persisted and was causing CPU load. There were messages concerning addon updates, but I did not noticed ConnectionRefusedError.

DROP silently blocks IP packages, Originally I tried it to simulate unreliable connection, e.g. problems with wifi (I suspect that thunderbird has problems when wifi adapter can not recover from e.g. power save regime or after automatic change of channel). Unlike for REJECT rule there is not ICMP "Connection refused" response, so it is harder to detect connection lost. That is why I suspect that either "ConnectionRefusedError" exception name either does not reflect all possible causes or it is not thrown at all due to silent lost of packets.

See Also: → 1754158
See Also: 1754158

OK, I can reproduce with these steps:

  1. insert a DROP rule, sudo iptables -A OUTPUT -p tcp --dport 119 -j DROP
  2. start TB, click on a news article
  3. wait 10 seconds, remove the DROP rule sudo iptables -D OUTPUT 1
  4. wait a few seconds, the article should load, but the progress bar doesn't disappear

The problem is nsMsgStatusFeedback::OnStateChange not called https://searchfox.org/comm-central/rev/823d3ee3bee3d0b4c9e051cfd78a3834f0649518/mailnews/base/src/nsMsgStatusFeedback.cpp#67,81,123, don't know how to fix it.

I do not think that nsMsgStatusFeedback is the source of problem. It looks like a proxy between implementations of network protocols and JS code. So the bug is more likely resides in a component that this class interconnects. In particular, I do not see any timeouts here.

On the other hand I see some code related to displaying of messages. So another old and annoying bug might be somewhere around. Sometimes (not really frequent, so hard to reproduce) I get blank pane instead of news message. It does not help to switch to another message and back. If I remember correctly, even thunderbird restart does not help. It is possible to read such message a couple of days later. Sometimes view source or even reply works correctly for such blank message panes. I am not familiar with thunderbird code close enough. I am just speculating that if signal to hide progress bar is missed, the signal to update cache and to display message may be missed as well.

It's hard to say what's the root cause. I added some printf to OnStateChange, what I observed was with normal connection, STATE_STOP is called correctly to dismiss the progress bar. But not called when DROP rule is applied and removed. OnStateChange is called from nsDocShell or nsDocLoader, and related to nsLoadGroup.cpp, I got lost after a while.

Ping Chen, you mentioned 2 minutes timeout, could you, please, a link to the code setting this timeout?

I have not looked into the sources more closely, but it seems I found another way to reproduce the issue. I have seen it several times with thunderbird-91, but was not sure that it is unrelated to IMAP since despite disabled "check new mail every..." option I have seen logs from IMAP rather close to NNTP logs.

So thuderbird-98 daily build, no firewall rules.

  • click on some message
  • wait 10 minutes.
  • Click on another seen message not accessed during the latest session.
    Message is displayed, progress bar persists
10:20:04.757 mailnews.nntp: Connection closed. NntpClient.jsm:184:18
10:41:14.872 mailnews.nntp: Connecting to news://news.gmane.io:119 NntpClient.jsm:100:20
10:41:15.349 mailnews.nntp: Connected NntpClient.jsm:122:18
10:41:16.286 mailnews.nntp: S: 200 news.gmane.io InterNetNews NNRP server INN 2.6.3 ready (posting ok) NntpClient.jsm:146:20
10:41:16.286 mailnews.nntp: C: MODE READER NntpClient.jsm:220:20
10:41:16.383 mailnews.nntp: S: 200 news.gmane.io InterNetNews NNRP server INN 2.6.3 ready (posting ok) NntpClient.jsm:146:20
10:41:16.383 mailnews.nntp: C: GROUP gmane.emacs.orgmode NntpClient.jsm:220:20
10:41:16.476 mailnews.nntp: S: 211 143511 1 143614 gmane.emacs.orgmode NntpClient.jsm:146:20
10:41:16.477 mailnews.nntp: C: ARTICLE 143554 NntpClient.jsm:220:20
10:41:16.604 mailnews.nntp: S: 220 143554 <93059fe2-12ff-46aa-9f09-01056d24df8b@gmail.com> article NntpClient.jsm:146:20
10:41:16.605 mailnews.nntp: Receiving 8688 bytes of data NntpClient.jsm:153:22
10:41:16.686 mailnews.nntp: Receiving 136 bytes of data NntpClient.jsm:153:22
10:41:16.687 mailnews.nntp: Done with status=0 NntpClient.jsm:784:18

A have one more question concerning logging. Version 91 shows state change lines repeating a lot of times

2022-02-10 08:13:58.018681 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: SETUP_NEWS_STREAM
2022-02-10 08:13:58.018687 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_BEGIN_ARTICLE
2022-02-10 08:13:58.026935 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.026990 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.026999 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.027006 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.027012 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.027018 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE
2022-02-10 08:13:58.027025 UTC - [Parent 90048: Main Thread]: I/NNTP (7fdfa6f39c00) Next state: NNTP_READ_ARTICLE

Does it mean that logging in 98 was improved or I failed to enable verbose logging?

(In reply to max from comment #20)

you mentioned 2 minutes timeout, could you, please, a link to the code setting this timeout?

Just my impression, sometimes it takes 5 minutes. By default, we have 2 cached nntp connections, if a connection is idle for >2 minutes, I usually see the socket timed out.

mailnews.nntp: S: 503 time out 
mailnews.nntp: Connection closed.

So thuderbird-98 daily build, no firewall rules.

  • click on some message
  • wait 10 minutes.
  • Click on another seen message not accessed during the latest session.
    Message is displayed, progress bar persists

Good catch, I don't need to wait 10 minutes, as long as a cached connection timed out, I can reproduce. Seems 91 also has this problem.

Does it mean that logging in 98 was improved or I failed to enable verbose logging?

The nntp module in 91 was in C++, in 98 was rewritten in JS, so the logging code is totally different, let me know if you think something should be logged. The state changes logged by 91 are mostly internal processing states, not network protocol state changes.

My new patch should fix the two cases you found. Please help me test after it land to the nightly, thanks.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/f12037141f5b
Remove nsILoadGroup.addRequest calls in NntpChannel to prevent persistent progress bar. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED

I still can reproduce this issue using steps from comment #14 (DROP rule and switch to offline). The only positive difference is that 99a1 does not crash on quit.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

OK, I didn't test going offline last time, made a new patch.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/0c7804ab5303
Call _actionDone when going offline to dismiss progress bar in NntpClient.jsm. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Keywords: perf

Sorry, but I have to reopen this bug again.

I added DROP rule to iptables, clicked on several messages, switched to another group, clicked on some messages there and waited till connections disappeared from ss -tnp output. Progress bar was active. I removed the blocking firewall rule, but I was unable get previously unseen message till thunderbird restart.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Since it is hard to even fix all buggy code paths, I have created Bug #1757544 to consider another design of progress bar. After all, there are legitimate cases when thunderbird is waiting for some event.

(In reply to max from comment #28)
Sorry, but I have to reopen this bug again.

I added DROP rule to iptables, clicked on several messages, switched to another group, clicked on some messages there and waited till connections disappeared from ss -tnp output. Progress bar was active.

You can file new bugs so I can pretend to not see it ;-) It's not clear to me from which step the progress bar starts to appear.

I removed the blocking firewall rule, but I was unable get previously unseen message till thunderbird restart.

This sounds like a separate bug not related to progress bar?

(In reply to Ping Chen (:rnons) from comment #30)

(In reply to max from comment #28)
Sorry, but I have to reopen this bug again.

I added DROP rule to iptables, clicked on several messages, switched to another group, clicked on some messages there and waited till connections disappeared from ss -tnp output. Progress bar was active.

You can file new bugs so I can pretend to not see it ;-)

Please, fix at first the case from the original description. I have enough bugs to report in my backlog.

It's not clear to me from which step the progress bar starts to appear.

It usually appears in response to first click on another message after I block connection to the server. I do not try to dispute it appearance since thunderbird attempts to connect to the server.

Today I got immortal progress bar in scenario rather close to the original description.

  • Initial state: one "ESTAB" connection to the port 119, likely due to previous activity and playing with firewall rules.
  • Add the DROP rule
  • Click on message "1"
  • Progress bar appeared
  • Click on messages "2", "3", and "2" again
  • Second connection appeared in "SYN-SENT" state
  • Wait till connection timeout (no entries in ss output)
  • Remove the DROP rule
  • The progress bar persists.

I removed the blocking firewall rule, but I was unable get previously unseen message till thunderbird restart.

This sounds like a separate bug not related to progress bar?

Yes, it is similar to an old bug that survived NNTP reimplementation in JS. Or there is a bunch of bugs that makes some messages unreadable till restart or even longer.

Today I noticed bounced point in the tab header when thunderbird should be in idle state. There was no progress bar at this moment.

(In reply to Ping Chen (:rnons) from comment #30)

(In reply to max from comment #28)
I removed the blocking firewall rule, but I was unable get previously unseen message till thunderbird restart.

This sounds like a separate bug not related to progress bar?

I have file the Bug #1758100 for blank message pane however the source of problem may be close to this one: incorrect tracking of network connection state.

See Also: → 1758100

A few issues were fixed in this bug, so let's keep it closed. Bug 1758100 can continue with further problems.

Status: REOPENED → RESOLVED
Closed: 3 years ago1 year ago
Resolution: --- → FIXED
Blocks: 1918961
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: