Closed
Bug 1143480
Opened 10 years ago
Closed 8 years ago
TB hangs on network change
Categories
(MailNews Core :: Networking, defect)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: jhaar, Unassigned)
Details
(Whiteboard: [closeme 2016-10-01])
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:36.0) Gecko/20100101 Firefox/36.0
Build ID: 20150306140254
Steps to reproduce:
I run Ubuntu 14.04 with TB 31.5 (the standard package) on a laptop, with two IMAP accounts. At home and at work we have "split mode" DNS: i.e. inside each network, "imap.domain.name" resolves to a private address (192.168.X at home, 10.X at work), and on the Internet the same names resolve to different Internet addresses.
So when I am at home, one IMAP account resolves to my home 192.168 IMAP server, while my work account resolves to its Internet address. Then I sleep my laptop, go to work, open the lid and connect to the work network; DNS now resolves my home IMAP server to an Internet address and my work server to a 10 address.
Actual results:
What happens is that TB consistently hangs for up to 5 minutes. "lsof -ni|grep thunderbird" shows TB has kept IMAP connections open to the wrong addresses, i.e. it hasn't realised there's been a network change and so is still trying to connect to my home 192.168 address from work, where it should be using the Internet address. Similarly, it is still trying to exchange packets with my work Internet address instead of its 10 address.
Using nslookup I can see my local DNS resolution is correct, but TB has not actually re-done a DNS lookup. Either it's running from its own cache, or it is insisting on continuing to use these old, out-of-date ESTABLISHED connections.
Just an FYI, but "lsof -ni" shows TB has ESTABLISHED connections from its old, non-existent 192.168 address even when it's on the 10 network and no longer has that address. That will be the "fault" of the Linux OS, not TB, but it is what it is.
Expected results:
I think TB should keep an eye on hanging IMAP (POP too) sessions, and when it decides they aren't working, tear them down and re-initialize - including DNS lookups. Maybe it already does that, in which case all I have a problem with is that it takes 5+ minutes instead of (say) 30 seconds?
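A minimal sketch, in Python, of the behaviour asked for above: treat a session that stalls past a deadline as dead, tear it down, and reconnect with a fresh DNS lookup. This is not Thunderbird's code; the names `connect_fresh` and `send_with_watchdog` and the 30 second deadline are assumptions made for illustration.

```python
import socket


def connect_fresh(host, port, timeout=30.0):
    """Resolve `host` again via the system resolver and open a new TCP connection."""
    last_err = None
    for family, socktype, proto, _canon, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(addr)
            return sock
        except OSError as err:
            # Remember the failure and try the next resolved address.
            last_err = err
            sock.close()
    raise last_err or OSError("no addresses found for %s" % host)


def send_with_watchdog(sock, host, port, payload, timeout=30.0):
    """Send `payload` and wait for a reply.

    Returns (sock, reply) on success. If the session stalls or errors out,
    the old socket is closed and (new_sock, None) is returned, where
    new_sock was opened after a new name lookup.
    """
    try:
        sock.settimeout(timeout)
        sock.sendall(payload)
        reply = sock.recv(4096)
        if reply:
            return sock, reply
    except OSError:
        # Covers timeouts as well as resets and unreachable-network errors.
        pass
    sock.close()
    return connect_fresh(host, port, timeout), None
```

The point of the sketch is only that the deadline is chosen by the application (30 seconds here) instead of being left to TCP's own, much longer, timeouts.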
Here's what lsof shows 10+ minutes after TB was un-slept on my work network. You will see that even now it still has "open" connections from its old 192.168.8.11 address (which isn't assigned to any network card). They are now in a CLOSE_WAIT state and TB is now working correctly, but they shouldn't be there...
thunderbi 10464 jhaar 43u IPv4 3573819 0t0 TCP 192.168.8.11:54112->12.3.11.5:pop3s (CLOSE_WAIT)
thunderbi 10464 jhaar 44u IPv4 3573820 0t0 TCP 192.168.8.11:54113->12.3.11.5:pop3s (CLOSE_WAIT)
thunderbi 10464 jhaar 62u IPv4 3569181 0t0 TCP 192.168.8.11:54114->12.3.11.5:pop3s (CLOSE_WAIT)
thunderbi 10464 jhaar 64u IPv4 104193 0t0 TCP 192.168.8.11:48894->192.168.8.3:imaps (CLOSE_WAIT)
thunderbi 10464 jhaar 68u IPv4 121185 0t0 TCP 192.168.8.11:49022->192.168.8.3:imaps (CLOSE_WAIT)
thunderbi 10464 jhaar 69u IPv4 3912428 0t0 TCP 192.168.8.11:48080->198.84.60.198:http (CLOSE_WAIT)
thunderbi 10464 jhaar 77u IPv4 121353 0t0 TCP 192.168.8.11:49026->192.168.8.3:imaps (CLOSE_WAIT)
thunderbi 10464 jhaar 78u IPv4 3613703 0t0 TCP 192.168.8.11:54404->12.3.11.5:pop3s (CLOSE_WAIT)
thunderbi 10464 jhaar 82u IPv4 3911672 0t0 UDP 127.0.0.1:49500->127.0.1.1:domain
thunderbi 10464 jhaar 84u IPv4 2974525 0t0 TCP 192.168.8.11:41317->192.168.8.3:imaps (CLOSE_WAIT)
thunderbi 10464 jhaar 92u IPv4 27850665 0t0 TCP 10.8.2.21:46182->10.8.254.3:pop3s (ESTABLISHED)
thunderbi 10464 jhaar 104u IPv4 3913513 0t0 TCP 192.168.8.11:34083->65.54.226.151:http (CLOSE_WAIT)
thunderbi 10464 jhaar 117u IPv4 3913517 0t0 TCP 192.168.8.11:37632->92.52.96.89:http (CLOSE_WAIT)
thunderbi 10464 jhaar 118u IPv4 3914187 0t0 TCP 192.168.8.11:39166->107.6.106.82:http (CLOSE_WAIT)
thunderbi 10464 jhaar 130u IPv4 3914219 0t0 UDP 127.0.0.1:45928->127.0.1.1:domain
thunderbi 10464 jhaar 134u IPv4 27850668 0t0 TCP 172.16.12.4:51474->21.114.246.214:imaps (ESTABLISHED)
thunderbi 10464 jhaar 148u IPv4 27854245 0t0 TCP 10.8.2.21:46204->10.8.254.3:pop3s (ESTABLISHED)
thunderbi 10464 jhaar 162u IPv4 27851252 0t0 TCP 10.8.2.21:46175->10.8.254.3:pop3s (ESTABLISHED)
thunderbi 10464 jhaar 171u IPv4 27940241 0t0 TCP 172.16.12.4:52194->21.114.246.214:imaps (ESTABLISHED)
thunderbi 10464 jhaar 173u IPv4 27856007 0t0 TCP 10.8.2.21:46205->10.8.254.3:pop3s (ESTABLISHED)
thunderbi 10464 jhaar 179u IPv4 27848342 0t0 TCP 10.8.2.21:46152->10.8.254.3:pop3s (ESTABLISHED)
thunderbi 10464 jhaar 183u IPv4 27939607 0t0 TCP 172.16.12.4:52195->21.114.246.214:imaps (ESTABLISHED)
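For anyone who wants to pick the stale entries out of a listing like the one above, here is a rough Python sketch. It assumes the `lsof -ni` line layout shown in this report (connection name right after the `TCP` column, state in parentheses); the helper names and the set of "currently assigned" addresses are hypothetical inputs, not something lsof provides.

```python
def parse_lsof_tcp(line):
    """Return (local_ip, remote, state) for a connected TCP line, else None."""
    fields = line.split()
    if "TCP" not in fields:
        return None  # UDP or other entry
    i = fields.index("TCP")
    if len(fields) <= i + 1 or "->" not in fields[i + 1]:
        return None  # listening socket or truncated line
    local, remote = fields[i + 1].split("->", 1)
    state = fields[i + 2].strip("()") if len(fields) > i + 2 else ""
    # Drop the port; works for "ip:port" and also for "[v6]:port".
    local_ip = local.rsplit(":", 1)[0]
    return local_ip, remote, state


def stale_connections(lines, assigned_ips):
    """TCP connections whose local address is not in `assigned_ips`."""
    parsed = (parse_lsof_tcp(line) for line in lines)
    return [p for p in parsed if p is not None and p[0] not in assigned_ips]


# Four lines copied from the listing in this report.
SAMPLE = [
    "thunderbi 10464 jhaar 64u IPv4 104193 0t0 TCP 192.168.8.11:48894->192.168.8.3:imaps (CLOSE_WAIT)",
    "thunderbi 10464 jhaar 82u IPv4 3911672 0t0 UDP 127.0.0.1:49500->127.0.1.1:domain",
    "thunderbi 10464 jhaar 92u IPv4 27850665 0t0 TCP 10.8.2.21:46182->10.8.254.3:pop3s (ESTABLISHED)",
    "thunderbi 10464 jhaar 134u IPv4 27850668 0t0 TCP 172.16.12.4:51474->21.114.246.214:imaps (ESTABLISHED)",
]

if __name__ == "__main__":
    for row in stale_connections(SAMPLE, {"10.8.2.21", "172.16.12.4"}):
        print(row)
```

With the two addresses that appear as ESTABLISHED sources treated as assigned, only the 192.168.8.11 entry is reported.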
Updated•10 years ago
Component: Untriaged → Networking
Product: Thunderbird → MailNews Core
Comment 1•10 years ago
CLOSE_WAIT is a normal state here, because from this PC's point of view the server suddenly disappeared. The timeout in TCP is approximately 10 minutes.
Tb has a problem around DNS caching, so if the server's IP address suddenly changes on your side, Tb can't follow quickly.
A "sudden server IP address change" looks like a sudden server outage from the perspective of the client on the PC. It takes a long time to go through error detection, error recovery, clean-up after a permanent error, retry from scratch, and so on.
Do you still see your problem with the following procedure?
1. Before the network change, go to "Work Offline" mode in Tb.
2. Change network.
3. When the new network is usable, go to "Work Online" mode. Do some network access, such as IMAP folder access.
A reason why it takes long to switch to the new network environment:
If IDLE is used, the cached IMAP connection for Inbox goes into a "receive" state after IDLE.
Tb has a problem with "connection loss while idling": it does nothing when the connection is lost while idling.
"Network change while Tb is running" == forcing this "connection loss while idling" on the cached connection used for Inbox.
Because the idle timeout is 29 minutes, the next "DONE/IDLE cycle" is initiated only after 30 minutes.
"Go Work Offline followed by Go Work Online" forces Tb to log out and close the connection. Because the connection is closed normally, the next access to the server is initiated normally. That is the trick.
Reporter
Comment 2•10 years ago
Sorry I failed to mention it before but I already tested with offline/online and that definitely helps. My real concern is that kind of trick is OK for me and you - but our fathers couldn't do it :-)
I've been seeing this problem for 10 years with Firefox and Thunderbird - I work around it myself - but think it's enough of a problem for "normal" people that TB should do something extra to reduce the impact
Your comment about IMAP IDLE makes a lot of sense. However, I have mail.server.default.use_idle=true and yet within Advanced settings have "Use IDLE command if the server supports it" unchecked... Which one wins? It sounds like IDLE isn't used - in which case your comments about IDLE cannot be happening?
Also, about:config shows network.tcp.keepalive.idle_time=600, which is the 10 minutes I'm seeing before it starts clearing up. If I reduced that to 300, should that speed up reconnections too?
Thanks!
Comment 3•10 years ago
(In reply to Jason Haar from comment #2)
> However, I have mail.server.default.use_idle=true
> and yet within Advanced settings have "Use IDLE command if the server supports it" unchecked...
> Which one wins?
The Advanced setting is saved as mail.server.server#.use_idle=false.
mail.server.default.use_idle is the default used when mail.server.server#.use_idle is not defined, and/or the default applied upon IMAP account creation.
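The lookup order described above can be sketched in a few lines of Python. This only illustrates "per-server value wins, default branch is the fallback"; it is not Thunderbird's preference code, and the function name and the `server1`/`server2` keys are made up for the example.

```python
def effective_use_idle(prefs, server_key):
    """Per-server pref wins; fall back to the default branch when it is unset."""
    specific = "mail.server.%s.use_idle" % server_key
    if specific in prefs:
        return prefs[specific]
    return prefs["mail.server.default.use_idle"]


# The situation from comment 2: default is true, the account has IDLE unchecked.
prefs = {
    "mail.server.default.use_idle": True,
    "mail.server.server1.use_idle": False,
}

if __name__ == "__main__":
    print(effective_use_idle(prefs, "server1"))  # per-server value is used
    print(effective_use_idle(prefs, "server2"))  # no per-server value, default is used
```

So for the account in question the unchecked box wins, and IDLE is not used on it.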
TCP keepalive : http://en.wikipedia.org/wiki/Keepalive
I also think a small TCP keepalive is useful for quick network error detection, especially in a WiFi environment.
On a reliable network and connection, a small TCP keepalive is merely an annoyance for the server. The default in TCP as a transport protocol is 2 hours.
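To make the keepalive discussion concrete, here is a small Python sketch of what a shorter keepalive looks like at the socket level. It is a generic illustration, not how Thunderbird applies network.tcp.keepalive.idle_time; the function name and the values 300/30/4 are assumptions, and the `TCP_KEEP*` options are platform-specific (they exist on Linux), so the sketch guards each with `hasattr`.

```python
import socket


def enable_keepalive(sock, idle=300, interval=30, count=4):
    """Turn on TCP keepalive and, where the platform allows, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of silence before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between unanswered probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes before the connection is declared dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With values like these, a silent peer would be noticed after roughly idle + interval × count seconds, and every idle connection sends probes, which is the cost to the server mentioned above.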
A small IMAP idle timeout is another way.
A short idle timeout (DONE/IDLE cycle on a cached IMAP connection) is better than TCP keepalive, because the DONE/IDLE cycle runs on only 5 connections in Tb, whereas TCP keepalive applies to every TCP session.
A short idle timeout (DONE/IDLE cycle on a cached IMAP connection) is also better for network error detection than a small new-mail check interval across many folders.
Another simple way:
Before going home from the office, quit Tb, then suspend the PC.
At home, after resuming the PC, restart Tb.
Before going to the office from home, quit Tb, then suspend the PC.
At the office, after resuming the PC, restart Tb.
The reason the problem occurs only with IMAP in Tb is that an IMAP connection is kept established all the time.
With POP3, SMTP or NNTP, the connection is closed after each access and established again upon the next access.
So "forcing the IMAP connection to close" at some point is needed for quick and certain recovery from a network error.
"Forcing the IMAP connection to close" is done by quitting Tb or by Go Work Offline.
Comment 4•8 years ago
Jason,
Much has changed in two years.
Do you still see this problem when using the current version, 45?
In a rather generic bug query https://mzl.la/2cIkgBW I don't see anything that definitely sounds like your issue (except yours of course)
I do not use offline/online and I do not have any problems moving across networks. But I am not using Linux.
Flags: needinfo?(jhaar)
Whiteboard: [closeme 2016-10-01]
Reporter
Comment 5•8 years ago
Yeah, this was 18 months ago, so everything's changed. I moved off Ubuntu onto Fedora for starters, along with several OS updates and TB doesn't show this issue any more
Let's close this one :-)
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jhaar)
Resolution: --- → INVALID