Closed Bug 737241 Opened 11 years ago Closed 4 years ago

"Get mail" always fails (waits forever) first time after reconnecting to network

Categories

(MailNews Core :: Networking, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: matteosistisette, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [dupeme?])

User Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.79 Safari/535.11

Steps to reproduce:

Make sure you're not connected to the network (in my case it's wifi; I don't know if this is relevant)
Open Thunderbird
Connect to network.
Check that the network connection works by opening a web page in a browser
Go to Thunderbird and click "get mail"


Actual results:

"looked up gmail.com"  (or whatever is the mail server domain) appears on the status bar and stays there for a while. Then it disappears.
Mail is not checked, and no error message is shown.
This happens systematically, 100% of the time.

Then I click "get mail" again, and it ALWAYS works the second time.


Expected results:

Thunderbird should have connected to the mail server and checked for new messages the first time.

This issue has existed ever since I started using Thunderbird, back to version 3 or so. On older versions it was only _slightly_ better, in that you would get a balloon notification (or is it called a toast?) saying "unable to connect" and a (wrong) message on the status bar saying "no messages to download" on the first try.
(In reply to matteo sisti sette from comment #0)
> Actual results:
> "looked up gmail.com"  (or whatever is the mail server domain) appears on
> the status bar and stays there for a while. Then it disappears.
> Mail is not checked, and no error message is shown.
> This happens systematically, 100% of the times.
> Then I click "get mail" again, and it ALWAYS works the second time.

This phenomenon can occur if an IPv6-related problem exists in your environment.
- The first DNS lookup (IPv6 address resolution) takes very long
  => timeout, Tb is unable to log in.
- Upon the second DNS lookup by Tb, IPv6 address resolution has already been done by
  the first request, so the second DNS lookup ends within a short period.
This started to occur from Tb 3, because the default of network.dns.disableIPv6 was changed from true to false in Tb 3.

What happens if network.dns.disableIPv6=true is set?
(restart Tb after setting change to avoid needless problems)
Because this is Wifi, the wireless connection is established upon first network request from the PC. It may take long, and the first DNS lookup by Tb may time out.

How long does "first ping(or tracert) imap.gmail.com after re-boot" take in your environment?
(1) Re-boot PC, (2) First "ping imap.gmail.com", (3) Second "ping imap.gmail.com", (4) Start Tb with network.dns.disableIPv6=false(default) or true, (5) Gmail IMAP folder access, Get Msgs etc.
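The first-versus-second lookup comparison suggested in the comments above can also be scripted. The following is only a rough sketch: it times two consecutive lookups per address family through the operating system's resolver, not through Thunderbird's own resolver or DNS cache, so it can only show whether the system itself is slow on the first lookup. The function names `time_lookup` and `compare_lookups` and the port 993 are illustrative assumptions, not anything taken from Thunderbird.

```python
import socket
import time


def time_lookup(host, family=socket.AF_UNSPEC, resolver=socket.getaddrinfo):
    """Return (seconds, ok) for one name lookup of `host`.

    `resolver` is injectable (hypothetical parameter) so the timing
    logic can be exercised without a network connection.
    """
    start = time.monotonic()
    try:
        # 993 is the usual IMAPS port; it does not affect the DNS query.
        resolver(host, 993, family, socket.SOCK_STREAM)
        ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        ok = False
    return time.monotonic() - start, ok


def compare_lookups(host, resolver=socket.getaddrinfo):
    """Time two consecutive lookups each for IPv4 and for IPv6."""
    results = {}
    for name, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        first = time_lookup(host, family, resolver)
        second = time_lookup(host, family, resolver)
        results[name] = (first, second)
    return results
```

Calling `compare_lookups("imap.gmail.com")` right after reconnecting, before touching Tb, returns a dict of `(seconds, ok)` pairs. A slow or failing first IPv6 lookup followed by a fast second one would be consistent with the IPv6 hypothesis; equal timings would point elsewhere.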
I can confirm this is an issue in Thunderbird 12 on Ubuntu 12.04.
Status: UNCONFIRMED → NEW
Ever confirmed: true
matteo, can you reply to comment 2 and comment 1?
Flags: needinfo?(matteosistisette)
Vincent, Steve, can you reproduce?
Flags: needinfo?(el.cameleon.1)
Reply to comment 1: Exact same behavior after setting network.dns.disableIPv6=true (and restarting Tb) (tested twice)

Regarding comment 2: too much work to test, however the premise "connection is established upon first network request" doesn't make any sense to me. Also note the fourth step in "steps to reproduce".
Flags: needinfo?(matteosistisette)
Yes, I can reproduce with Thunderbird 17.0.2 on Ubuntu 12.04.
This needs more analysis. 

It would not surprise me if this is already described in a hard to find (duplicate) bug.
Keywords: qawanted
Summary: "Get mail" always fails (waits forever) first time after connecting to network → "Get mail" always fails (waits forever) first time after reconnecting to network
Whiteboard: [dupeme?]
Flags: needinfo?(el.cameleon.1) → needinfo?
matteo, isn't the precise wording of what you see "Failed to connect to server"?
(*not* "unable to connect")
Component: General → Networking
Flags: needinfo? → needinfo?(matteosistisette)
Product: Thunderbird → MailNews Core
Wayne, neither of the two!!!!
As I mentioned in my report, what I see is "Looked up gmail.com..." (which is complete nonsense) and then nothing. It doesn't show any error message at all.

IN SOME PREVIOUS VERSION it did show an error message, but I can't check now the exact wording; it may well have been "Failed to connect to server", as you say.
Flags: needinfo?(matteosistisette)
(In reply to matteo sisti sette from comment #10)
> As I mentioned in my report, what I see is "Looked up gmail.com..."
> (which is complete nonsense)

This message/situation is the same as "when the FQDN is not found in DNS", and is easily seen with a POP3 account defined with a dummy/non-existent server such as x.x.x.
The message means "DNS was looked up".
What is the basis for calling a message related to DNS lookup "nonsense"?

> and then nothing. It doesn't show any error message at all.
> IN SOME PREVIOUS VERSION it did show an error message,
> but I can't check now the exact wording;
> it may well have been "Failed to connect to server", as you say.

With a POP3 account defined with a dummy/non-existent server such as x.x.x, old Tb showed a connection error message when the server was not found in DNS.
Recent Tb stopped showing the annoying connection error message.
  While looking up DNS, "Looking up".
  After the end of the DNS lookup, "Looked up".
  If the server is not found, stop further action,
  because trying to connect to a non-existent server is nonsense.

Because the phenomenon occurs with Wifi, and occurs when the Wifi router doesn't have a server connection yet, it may be the following.
- On the PC, the DNS server is defined as 192.168.0.1.
- The Wifi router's local IP address is 192.168.0.1,
  i.e. the Wifi router behaves as a proxy server to DNS, or as a DNS server.
- The Wifi router gets the actual DNS address of its provider upon connection
  establishment with its server.
- The Wifi router returns "not found" if the actual IP address of its
  provider's DNS is not known yet.

If you can reproduce your problem consistently:
(A) Before you try to connect to the server from Tb,
connect to 192.168.0.1 from a browser and check the Wifi router's WAN-side status. Is the DNS address always set? If a DNS IP address is shown, do "tracert or ping the-ip-address-of-DNS" at a terminal. Is a response returned within a reasonable period?
Write down the actual DNS server's IP address.
(B) The actual DNS address is usually not changed so frequently.
Before you try to connect to the server from Tb, do "tracert written-down-ip-address-of-DNS" at a terminal. Is a response returned within a reasonable period?
(C) How about "tracert or ping ???.gmail.com"?
    IIRC, I already requested this from you in comment #2...
This has nothing to do with the wifi router. 
I see the exact same issue when I use a USB mobile broadband modem instead.
(whoops, I thought I had already mentioned that, but I hadn't).


Also note (again) my steps to reproduce the issue:

 Steps to reproduce:
 
 Make sure you're not connected to network
 Open Thunderbird
 Connect to network.
 CHECK THAT THE NETWORK CONNECTION WORKS BY OPENING A WEB PAGE IN A BROWSER
 Go to Thunderbird and click "get mail"

So, between connecting to the network and having TB check mail, I check that everything else works by surfing the web with a browser. No matter how much time I wait, the FIRST time TB tries to fetch mail it systematically fails, even if it is several minutes after connecting to network; the SECOND time it succeeds.

Frankly, I wouldn't look for the cause of this outside TB.

By the way, my DNS is NOT defined as 192.168.1.1, it is 8.8.4.4 (Google's DNS).


Regarding the other issue, the NONSENSE is that:
1. when connection to the server fails for whatever reason, no matter whether it is because the host is unreachable or because the name can't be resolved, it must show a message telling you what the failure is (such as "couldn't resolve domain name", or whatever), not a message that tells you what was the last thing it did and which doesn't even tell you whether it was a success or failure! ("looked up xxxx", as in "I looked up the domain. Don't ask me whether I found it or not").  When you tell a program to do something it must end in either of two ways: (a) "I'm finished doing what you asked me: the result is xxxx", or (b) "I couldn't complete the task because something went wrong: the error was xxxxx".
2. Even if you are not connected to the internet at all, it still says "looked up gmail.com"!!!! That's complete nonsense.
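To make the two-outcome contract described in the list above concrete, here is a minimal sketch of a lookup step whose status text always states either the result or the error. It is purely illustrative and is not Thunderbird's code; the function name `lookup_status`, the message wording, and the injectable `resolver` parameter are all assumptions made for this example.

```python
import socket


def lookup_status(host, resolver=socket.getaddrinfo):
    """Return a status string that states the outcome of the lookup.

    Hypothetical helper: either "Resolved ..." (case a, with the result)
    or "Could not resolve ..." (case b, with the error).
    """
    try:
        infos = resolver(host, None)
    except socket.gaierror as exc:
        # gaierror normally carries (errno, message); fall back to str(exc).
        return f"Could not resolve {host}: {exc.strerror or exc}"
    if not infos:
        return f"Could not resolve {host}: no addresses returned"
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return f"Resolved {host} to {infos[0][4][0]}"
```

With a status line built this way, a failed lookup while offline would read as a failure rather than as a neutral "looked up ...".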
(In reply to matteo sisti sette from comment #12)
> I see the exact same issue when I use a USB mobile broadband modem instead.
> By the way, my DNS is NOT defined as 192.168.1.1, it is 8.8.4.4 (Google's DNS).

If the problem happens when there is no route to the internet (phone cable is pulled out of the modem), it's perhaps one of the following:
- timeout in DNS server access
- Tb's problem when the PC's IP address is changed by DHCP retention,
  DNS cache is not cleared, ...
If the problem happens when other software like a browser can access the internet normally, other components of Tb like SMTP can send mail, other mail clients can access ...gmail.com with no problem even though Tb fails with "looked up gmail.com", ..., etc.,
it's perhaps one of the following:
- timeout in DNS server access in Tb
- Tb's problem when the PC's IP address is changed by DHCP retention,
  DNS cache is not cleared, ... (known issues)

If you can reproduce your problem consistently, do following, please.
(i) Before trying to access the server from Tb, go Work Offline, then go Work Online. This disconnects from the server and restarts the network connection from scratch. IIRC, DNS cache related issues and IP address change related issues are bypassed by this.
(ii) Get DNS server access log with timestamp.
     See bug 402793 comment #28.
> Win example :
>  SET NSPR_LOG_MODULES=timestamp,nsHostResolver:5,imap:5,pop3:5
Timeout in DNS server access?
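Since the reporter is on Linux, the Windows `SET` line quoted above does not apply directly. As a sketch only, the same logging variables could be passed through the process environment when starting Tb. `NSPR_LOG_FILE` is the NSPR variable that redirects log output to a file; the helper name `nspr_log_env`, the log path, and the `thunderbird` binary name in the usage note are assumptions.

```python
import os


def nspr_log_env(log_file,
                 modules="timestamp,nsHostResolver:5,imap:5,pop3:5",
                 base_env=None):
    """Return a copy of the environment with NSPR logging variables set.

    `modules` defaults to the module list quoted in this bug;
    `base_env` (hypothetical parameter) defaults to the current environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = modules
    env["NSPR_LOG_FILE"] = log_file
    return env
```

For example, `subprocess.Popen(["thunderbird"], env=nspr_log_env("/tmp/tb-dns.log"))` would start Tb with logging enabled, assuming the binary is on the PATH under that name. The timestamps in the resulting log should then show whether the first host lookup times out.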
(In reply to matteo sisti sette from comment #12)
> 2. Even if you are not connected to the internet at all,
> it still says "looked up gmail.com"!!!! That's complete nonsense.
(i) Open a separate bug for error message improvement.
    Keep "one problem per bug" at B.M.O, please.
(ii) Until the message is improved, please read "looked up ..." as "looked up ..., but it failed", adding a complement string like ", but it failed" or ", but it's not found", as you like, in your head.
(In reply to WADA from comment #14)
> (In reply to matteo sisti sette from comment #12)
> > 2. Even if you are not connected to the internet at all,
> > it still says "looked up gmail.com"!!!! That's complete nonsense.
> (i) Open separate bug for error message improvement.
>     Keep "one proble per a bug" at B.M.O, please.
> (ii) Until message will be improved, read "looked up ..." as "looked up ...,
> but it failed", with complement string like ",but it failed", ",but it's not
> found", ..., as you like, in your brain, please.

I saw this too. I'm not sure if a bug exists for it - I started to look a few days ago but did not finish.

wada, thanks for all the ideas. I hope to check some of them. However, my sense is the problem is in Thunderbird, because I do not have trouble with Firefox.

matteo, please be sure to use exact wording of what you see in messages and dialogs, not approximation.  exact wording allows us to better search bugzilla and source code.
Blocks: 678947
Sorry I missed this comment:

> matteo, please be sure to use exact wording of what you see in messages and dialogs, 
> not approximation.  exact wording allows us to better search bugzilla and source code.

I always try my best to do that. However:
- in this case I did use the exact wording of the relevant message being shown in the current version. Where I used an approximate wording it was a message that used to be shown in an age-old version that I didn't have any more, so the best I could do was to mention the approximate message as I remembered it (I certainly won't reinstall an older version just for the sake of figuring out the exact wording of a message)
- in some cases messages are displayed and then vanish. When I'm reporting a bug, I don't always have the time to repeat the steps _again_ in order to see the exact message

and most importantly:
- if you expect users reporting bugs to use the exact wordings, please make sure that all displayed errors, warnings and messages are copy-pastable.

Wayne, I don't see this problem doing these steps:

  1. Make sure ff and tb both working OK, network wise
  2. Switch off wifi and verify ff does not access anything.
  3. Shutdown tb (all windows)
  4. Start tb with wifi off
  5. Switch wifi on
  6. Verify ff working again, can access a page
  7. In tb click "Get Mail". Tb checks mail: no waiting, no messages, and I see immediate connections occur in wireshark.

Tried this on 2 systems with same result using 60.*.

Thanks for the test

Status: NEW → RESOLVED
Closed: 4 years ago
Keywords: qawanted
Resolution: --- → WORKSFORME