Bug 173924 - Doesn't fetch new mail after a random period
Status: RESOLVED FIXED
Whiteboard: [asaP2]
Keywords: fixed1.8
Product: MailNews Core
Classification: Components
Component: Networking: POP
Version: Trunk
Platform: All All
Importance: -- major with 6 votes
Target Milestone: ---
Assigned To: David :Bienvenu
QA Contact:
Mentors:
Duplicates: 206301 227617 234552 234567 236128 254258 254735 278283 279897 317558
Depends on:
Blocks:
Reported: 2002-10-11 04:44 PDT by boris marechal
Modified: 2009-01-22 10:17 PST
CC List: 23 users
mscott: blocking‑aviary1.0-
asa: blocking‑aviary1.5-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
add more logging to nsHostResolver (1.51 KB, patch)
2005-10-28 14:08 PDT, David :Bienvenu
darin.moz: review+
darin.moz: superreview+
proposed fix (3.12 KB, patch)
2005-11-04 13:24 PST, David :Bienvenu
no flags
oops, this is the fix (1.11 KB, patch)
2005-11-04 16:15 PST, David :Bienvenu
mscott: superreview+

Description boris marechal 2002-10-11 04:44:23 PDT
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.1) Gecko/20020826
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.1) Gecko/20020826

After a random period, Mozilla only reports "no new mail". If I restart it (the
whole suite, not just the mail reader), everything is fine again: my new mail is
correctly detected and I can receive it.

Note that I check for mail every 2 minutes and use two POP3 mail accounts (one
internal and one external). When Mozilla stops fetching, both accounts are
affected at the same time.

Currently I don't know why this happens or how to reproduce the problem, but it
is a blocker for putting Mozilla into a "production" environment (in order to
replace Outlook).

Reproducible: Sometimes

Steps to Reproduce:
1. Let Mozilla fetch mail :(

Actual Results:  
Mozilla doesn't see new mail on the server

Expected Results:  
Mozilla can receive mail all the time.
Comment 1 Ere Maijala (slow) 2002-10-11 06:49:38 PDT
Confirming. I have seen the same problem at times with an IMAPS server and quite
recent builds. No new messages are received automatically, and pressing Get Msgs
doesn't work either.
Comment 2 Navin Gupta 2002-10-30 18:37:12 PST
Is your profile stored on a network drive?
Comment 3 Kousik Nandy 2002-11-06 03:57:58 PST
Seeing this with the profile on a local disk.

I'm not sure if this matters: every time this happens, I have to quit
Mozilla and restart. Upon restart, it REBUILDS the Inbox summary first
before doing anything else. So I suspect that the Inbox summary is somehow
getting corrupted, and that makes Mozilla stop fetching new mail. It
could be unrelated, but it can be explored.

Sometimes, compacting Inbox may help. But, sometimes, it doesn't.
Comment 4 Kousik Nandy 2002-11-08 01:06:14 PST
Most likely this is a duplicate of bug 135318. Can anyone verify that, please?
Comment 5 boris marechal 2003-04-25 03:08:44 PDT
Two notes.
First, my profile is on a local disk, but I have a lot of network outages which
"freeze" my Win2k system during NetBIOS connection timeouts (due to my network
share).
Second, my Mozilla doesn't rebuild its index after it stops fetching.

I still have the problem in Mozilla 1.1, 1.2.1 and 1.3.

As I said, it is a blocker for putting Mozilla into a "production" environment
(in order to replace Outlook).
Comment 6 (not reading, please use seth@sspitzer.org instead) 2003-05-08 09:23:29 PDT
mass re-assign.
Comment 7 Max Alekseyev 2003-06-23 17:06:35 PDT
I see the problem on OS/2 builds.
OS -> All
Comment 8 Christian Eyrich 2004-03-02 08:27:39 PST
*** Bug 236128 has been marked as a duplicate of this bug. ***
Comment 9 Asa Dotzler [:asa] 2004-03-09 17:00:33 PST
*** Bug 206301 has been marked as a duplicate of this bug. ***
Comment 10 Max Alekseyev 2004-08-04 22:52:36 PDT
*** Bug 254258 has been marked as a duplicate of this bug. ***
Comment 11 boris marechal 2004-08-04 23:19:00 PDT
This seems to be linked to an unclean disconnection, like a crash of a proxy
between Mozilla (I currently use Thunderbird) and the SMTP server.
Comment 12 Max Alekseyev 2004-08-23 23:12:51 PDT
*** Bug 227617 has been marked as a duplicate of this bug. ***
Comment 13 Max Alekseyev 2004-08-23 23:13:53 PDT
*** Bug 254735 has been marked as a duplicate of this bug. ***
Comment 14 Scott MacGregor 2004-09-29 21:10:16 PDT
minusing pending better steps to consistently reproduce
Comment 15 Larry Cook 2004-09-30 05:56:25 PDT
The symptoms mentioned in some of the comments might be bug 127461, a fix for
which got into 1.8a4.  See bug 127461, comment 34 for instructions on
determining if you are experiencing that bug.  Or just give 1.8a4 a try to see
if your problem still occurs.  Please report any results.
Comment 16 boris marechal 2004-10-06 14:44:54 PDT
Sorry, I can't test Mozilla 1.8a4 because I switched my account to Thunderbird
0.8, but I have the same problem with it.

I suspect it appears when my corporate firewall shuts down all connections for
the night, so it could be an abrupt interruption during a POP3 fetch that
freezes it, as in bug 127461, comment 34.
Comment 17 boris marechal 2004-11-11 03:56:37 PST
After a short test period it seems to be OK in Thunderbird 0.9,
so it could be the same problem as bug 127461.
Comment 18 Andrew 2004-11-14 14:05:35 PST
(In reply to comment #17)
Unfortunately, I can confirm this bug still exists in Thunderbird 0.9.

I'm using Thunderbird 0.9 build 20041103, Windows XP SP2.

Going offline then online again "fixes" it.

The manifestation is most noticeable when trying to send mail. After the
aforementioned random amount of time, Thunderbird refuses to connect to the SMTP
server. The logs generated are unhelpful, and only say "connecting to..."
Comment 19 Colin 2004-11-27 09:21:13 PST
(In reply to comment #18)
I can confirm that this bug is not limited to Andrew.

Running Thunderbird 0.9, build 20041103, WinXP Pro SP1.

Exact same issues: after some random amount of time, hitting Get Mail produces
no activity in the status bar at the bottom, and sending mail is rejected.
Automatic mail checking is a total failure. There are no error messages for
getting mail, although sending mail gives the same error as in comment 18.
It can be easily fixed by restarting Thunderbird. It's incredibly annoying though.
Comment 20 David :Bienvenu 2004-11-27 09:27:50 PST
If sending mail also fails, it sounds like you're unable to make new
connections, for whatever reason. It sounds like necko is failing but never
timing out. I think the OS can cause this if it never responds to the
connection attempt, because necko doesn't have a connection timeout, and the
connection thread blocks, disallowing other connections. Perhaps going offline
and online again clears things so that necko can make new connections again.
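
The blocking behaviour described in this comment can be illustrated outside of necko. The following is a minimal Python sketch, not Mozilla code: `try_connect` is a hypothetical helper and the 10-second default is an arbitrary assumption. It only shows how bounding a connection attempt with a timeout lets the caller give up and carry on instead of waiting on the OS indefinitely.

```python
import socket


def try_connect(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True if a TCP connection can be opened within timeout_s seconds.

    Hypothetical helper for illustration only; it is not part of necko.
    """
    try:
        # create_connection() raises an OSError subclass (timeout, connection
        # refused, ...) instead of blocking forever.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Demo against a throwaway local listener so no real mail server is needed.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(try_connect("127.0.0.1", port, timeout_s=2.0))
    listener.close()
```

With a bound like this, a server that never answers costs one timeout per attempt rather than stalling every later connection.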
Comment 21 boris marechal 2004-11-28 03:28:06 PST
I declared victory too quickly :(

It seems better (adding a POP timeout should be a good thing), but it's only a
partial fix. Doesn't SMTP have this kind of protection against dropped
connections?

Note that this bug has been rejected as a blocker for Thunderbird 1.0. For me it
is still really annoying because I can't use Thunderbird at work (email is too
valuable...)
Comment 22 :Gavin Sharp [email: gavin@gavinsharp.com] 2004-12-23 21:04:13 PST
*** Bug 234567 has been marked as a duplicate of this bug. ***
Comment 23 :Gavin Sharp [email: gavin@gavinsharp.com] 2004-12-23 21:04:18 PST
*** Bug 234552 has been marked as a duplicate of this bug. ***
Comment 24 Colin 2004-12-30 09:49:15 PST
Just to bump this up, the bug is still in Mozilla 1.0 and did not magically
disappear as some might have hoped.
Comment 25 Ben1265 2005-01-10 12:11:46 PST
I can reproduce this bug w/o fail. I'm a software tester, if you want my
assistance in reproducing/troubleshooting it, I can help.
Currently, I'm running Thunderbird 1.0 on XP with SP2.
Comment 26 James 2005-02-01 11:42:01 PST
(In reply to comment #25)
I am having this same issue with Thunderbird 1.0 on a Windows XP SP2 machine;
the machine has an uptime of 3 weeks and this bug rears its head every couple
of nights (unexpectedly). I close Thunderbird but a ghost remains (it's still
running as a process even when closing it properly); I kill the process and
reload Thunderbird and I receive my mail properly.

I have all POP accounts, 7 of them. I don't really see a reason for this error,
as it affects all of my mail accounts (and they're on different servers and
through different providers).

I'd love to get this bug squashed, as I work for a webhost so emails are
critical. I'd hate to have to go back to Outlook.

Comment 27 Brian Carpenter [:geeknik] 2005-02-13 23:09:37 PST
I am having this problem on a 1 GHz Celeron machine with 512 MB SDRAM, running
Windows XP SP2. Closing and re-opening Thunderbird does nothing; I have to
reboot the computer in order to fetch new mail. I'm running Norton AntiVirus
2004 (all latest updates applied) and only the Windows firewall, on a 56k
dial-up connection.
Comment 28 James 2005-03-25 14:10:42 PST
(In reply to comment #27)
I'm still having this error. I've been able to "crash" it by ending the task in
Windows Task Manager, then reload it and fetch my mail. However, just clicking
the X or using File/Exit seems to close it but leaves an idle process (which
must be killed via Task Manager or a reboot).

Same situation as this guy, only I'm using Norton Corporate Virus Scan/WinXP SP2
and ZoneAlarm.
Comment 29 James 2005-04-16 16:45:02 PDT
Still an issue as of Thunderbird 1.0.2
Comment 30 Yugo Nakai 2005-07-12 09:18:12 PDT
This is becoming ridiculous; the bug reports on this issue (see duplicates) go
back to version 0.3 and prior!  Can we up the Severity and/or set the target
milestone for the next release?

That said, I did not notice this problem until I started running multiple email
accounts.  Previously, it was just one IMAP; now it is one IMAP plus maybe 8 POP
accounts, each with their own separate Inbox.  I am using TB version 1.2, and I
have been using it since at least 0.9.

My machine is Windows XP Home SP2, on a cable modem via Linksys WRT54G router.
Comment 31 (not reading, please use seth@sspitzer.org instead) 2005-10-24 10:13:56 PDT
re-assigning to david (and cc mscott)
Comment 32 David :Bienvenu 2005-10-25 13:48:42 PDT
If you're running a 1.5 beta 2 build or later (*not* 1.0x), and this happens to all your accounts, not just one, such that you can't get new mail or send new mail at all w/o restarting, and you're not running a virus checker, etc, you could try to generate some logs.

In particular, host lookups and/or socket transport logs. Follow these instructions, but substitute nsHostResolver or nsSocketTransport for "protocol". (or nsHostResolver:5,nsSocketTransport:5 for both)

http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap

What would be interesting is to see if after you get in this situation, are we still trying to do host lookups, and are they succeeding? You can actually load the log file in a browser and perform an operation in thunderbird, like trying to send a message, and then reload the log in the browser and see if anything happened. 

My theory is that the socket transport thread is getting blocked for one reason or another. Once this happens, we can't do any network activity.
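
As a rough illustration of the logging setup requested in this comment, here is a small Python sketch. It assumes the troubleshooting page's mechanism is the NSPR_LOG_MODULES and NSPR_LOG_FILE environment variables; `build_log_env`, the default log file name, and the executable path in the comment are hypothetical and should be adapted to the local install.

```python
import os


def build_log_env(modules, level=5, log_file="mail.log", base_env=None):
    """Return a copy of the environment with NSPR logging variables set.

    modules  -- log module names, e.g. ["nsHostResolver", "nsSocketTransport"]
    level    -- verbosity appended to each module name (5 is used in this bug)
    log_file -- where the log should be written (hypothetical default)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = ",".join(f"{name}:{level}" for name in modules)
    env["NSPR_LOG_FILE"] = log_file
    return env


if __name__ == "__main__":
    env = build_log_env(["nsHostResolver", "nsSocketTransport"])
    print(env["NSPR_LOG_MODULES"])
    # To start the mail client with logging enabled, something like the
    # following could be used; the executable path is an assumption:
    #   import subprocess
    #   subprocess.Popen(["/path/to/thunderbird"], env=env)
```

The log file can then be reloaded in a browser after each operation, as suggested in this comment, to see whether any new lines were written.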
Comment 33 David :Bienvenu 2005-10-28 14:08:28 PDT
Created attachment 201192
add more logging to nsHostResolver

this adds logging for when a lookup completes, and when a lookup has to wait for a thread to get free to run. This should help us see if this is where we're getting blocked.
Comment 34 David :Bienvenu 2005-10-28 14:50:36 PDT
additional nsHostResolver logging patch checked in.
Comment 35 David :Bienvenu 2005-11-04 09:50:50 PST
Michal has patiently generated a bunch of different logs for me, and I'm starting to get an idea about what's going on. I believe the socket transport service thread is eventually getting stuck, and no events get processed. What we see from a combination of pop3, nsHostResolver, and nsSocketTransport logging is that the host resolver keeps resolving hosts and posting events to the nsSocketTransport but it never receives them.

Looking at the log, we see that we're getting into NotifyWhenCanAttachSocket, which means that we think there are more than 50 idle or active sockets. This seems unlikely since Thunderbird doesn't cache pop3 connections. So something is getting confused; perhaps sockets aren't cleaned up correctly for some reason. I'll poke around the log some more.

860[40c5160]: nsHostResolver::ThreadFunc entering
860[40c5160]: resolving pop3-2.xxx ...
860[40c5160]: nsSocketTransport::PostEvent [this=41b4c90 type=1 status=0 param=41d1df8]
860[40c5160]: nsSocketTransportService::PostEvent [event=41cd718]
592[15f9538]: nsSocketTransport::OnSocketEvent [this=41b4c90 type=1 status=0 param=41d1df8]
592[15f9538]:   MSG_DNS_LOOKUP_COMPLETE
592[15f9538]: nsSocketTransport::InitiateSocket [this=41b4c90]
592[15f9538]: nsSocketTransportService::NotifyWhenCanAttachSocket
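
For anyone sifting through similar logs, a small script can make the pattern easier to spot. This is a hedged Python sketch: it assumes every record has the shape seen in the excerpt above (a thread number, a bracketed hexadecimal value, a colon, then the message), and `summarize` is a hypothetical helper, not an existing Mozilla tool. It counts records per thread and how often NotifyWhenCanAttachSocket shows up.

```python
import re
from collections import Counter

# Assumed record shape, taken from the excerpt: "860[40c5160]: message"
LINE_RE = re.compile(r"^(\d+)\[([0-9a-fA-F]+)\]:\s+(.*)$")


def summarize(log_text):
    """Return (records per thread, number of NotifyWhenCanAttachSocket hits)."""
    per_thread = Counter()
    stalled = 0
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a log record
        thread_id, _tag, message = match.groups()
        per_thread[thread_id] += 1
        if "NotifyWhenCanAttachSocket" in message:
            stalled += 1
    return per_thread, stalled


SAMPLE = """\
860[40c5160]: nsHostResolver::ThreadFunc entering
860[40c5160]: resolving pop3-2.xxx ...
860[40c5160]: nsSocketTransport::PostEvent [this=41b4c90 type=1 status=0 param=41d1df8]
860[40c5160]: nsSocketTransportService::PostEvent [event=41cd718]
592[15f9538]: nsSocketTransport::OnSocketEvent [this=41b4c90 type=1 status=0 param=41d1df8]
592[15f9538]:   MSG_DNS_LOOKUP_COMPLETE
592[15f9538]: nsSocketTransport::InitiateSocket [this=41b4c90]
592[15f9538]: nsSocketTransportService::NotifyWhenCanAttachSocket
"""

if __name__ == "__main__":
    threads, stalled = summarize(SAMPLE)
    print(dict(threads), stalled)
```

A rising NotifyWhenCanAttachSocket count with no matching socket activity afterwards would be consistent with the stuck state described in this comment.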
Comment 36 David :Bienvenu 2005-11-04 09:55:45 PST
We definitely have 50 idle sockets on our hands, from the log. Unfortunately, the part of the log I have starts with us having 49 idle sockets, so I can't tell how we ended up with so many idle sockets. I'll try to get the top of the log.
Comment 37 Larry Cook 2005-11-04 11:52:06 PST
David, I'm just throwing out an idea here based on my experience with bug 127461.  Could the problem be that an earlier host lookup failed to get a response?  I mention this because bug 127461 was also leaving sockets around. This occurred if a POP3 response was never received.  Since the transport layer does not have any timeouts, I fixed the bug by adding a timeout to the POP3 layer.  And the number 50 sounds familiar.  Is there a limit of 50 open sockets?  Each occurrence of the problem may be causing a socket to get left open, but it's not evident to the end user until the socket count reaches 50, at which point nothing can proceed since none of those 50 will ever be freed up.
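
The protocol-layer timeout mentioned in this comment can be sketched in a few lines of Python. This is an illustration of the general idea only, not the actual bug 127461 patch: `read_response`, the 5-second default, and the buffer size are assumptions. If the server never answers, the socket is closed rather than left open.

```python
import socket


def read_response(sock, timeout_s=5.0, max_bytes=512):
    """Read one server response, or close the socket if none arrives in time.

    Returns the bytes received, or None if the wait timed out.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(max_bytes)
    except socket.timeout:
        # No response: release the socket instead of leaving it open forever.
        sock.close()
        return None


if __name__ == "__main__":
    # A connected pair stands in for client and server; the peer stays silent.
    client, silent_server = socket.socketpair()
    print(read_response(client, timeout_s=0.2))
    silent_server.close()
```

Closing on timeout matters as much as the timeout itself: it is what keeps each silent server from permanently consuming one of the limited socket slots.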
Comment 38 David :Bienvenu 2005-11-04 13:24:26 PST
Created attachment 201881
proposed fix

From the nsSocketTransport log, it looks like the server is dropping the connection right after we send the password (perhaps a bad password, or the server is kicking us off for some reason). In any case, that causes necko to close the input stream and leave the socket idle. Eventually, we end up with 50 idle sockets, and things grind to a halt. The fix is to call CloseSocket if we have an open socket when OnStopRequest gets called. Darin pointed out that we weren't closing the socket, and that OnStopRequest should be getting called. This fix will affect news, pop3, and smtp, but not IMAP. IMAP uses blocking reads, so it should get an error on the read.
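
The failure mode and the fix described in this comment can be modelled with a toy simulation. The Python sketch below is not the patch itself: `SocketSlots` and `checks_until_stall` are hypothetical names, and the only number taken from this bug is the limit of 50 sockets. It contrasts leaving a socket attached after the server drops the connection with releasing it when the stop notification arrives.

```python
MAX_SOCKETS = 50  # limit taken from the discussion in this bug


class SocketSlots:
    """Toy stand-in for the socket transport service's bookkeeping."""

    def __init__(self, limit=MAX_SOCKETS):
        self.limit = limit
        self.attached = 0

    def attach(self):
        """Try to take a slot; False means the caller would have to wait."""
        if self.attached >= self.limit:
            return False
        self.attached += 1
        return True

    def close_socket(self):
        """Give the slot back."""
        self.attached -= 1


def checks_until_stall(close_on_stop, attempts=200):
    """Simulate mail checks where the server drops us after the password.

    Returns the 1-based number of the first check that could not get a
    socket, or None if every check succeeded.
    """
    slots = SocketSlots()
    for attempt in range(1, attempts + 1):
        if not slots.attach():
            return attempt
        # The server drops the connection; the stop notification fires here.
        if close_on_stop:
            slots.close_socket()  # the fix: release the socket on stop
    return None


if __name__ == "__main__":
    print("without close:", checks_until_stall(close_on_stop=False))
    print("with close:", checks_until_stall(close_on_stop=True))
```

In this model the leak is invisible for the first 50 checks, which matches the "random period" users reported: how long it takes depends on how often the server drops the connection and how frequently mail is checked.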
Comment 39 Christian :Biesinger (don't email me, ping me on IRC) 2005-11-04 13:53:28 PST
Comment on attachment 201881
proposed fix

this doesn't look like the right patch...
Comment 40 David :Bienvenu 2005-11-04 16:15:44 PST
Created attachment 201890
oops, this is the fix
Comment 41 David :Bienvenu 2005-11-04 20:48:37 PST
fix checked in to trunk, not 1.5 branch. I don't know if this will fix all instances of get new mail ceasing to work, but let's see if it fixes Michal's case, and anyone else. This fix will be in tomorrow's trunk build.
Comment 42 Wayne Mery (:wsmwk, NI for questions) 2005-11-10 12:44:46 PST
Perhaps this shouldn't have blocked 208741? 208741 was marked WFM a year ago but this only recently got fixed.
Comment 43 David :Bienvenu 2005-11-10 12:47:22 PST
that blocking status seems wrong, clearing. The bugs might be related or the same, but there's no blocking relationship in any case.
Comment 44 Peter Weilbacher 2005-11-17 04:05:17 PST
Comment on attachment 201890
oops, this is the fix

Any chance of getting this into the branch? Under OS/2 I have a user who experienced the same problem with SeaMonkey 1.0alpha+, and when I created a build with this fix compiled in he reported that it didn't happen again.
Comment 45 Scott MacGregor 2005-11-17 08:30:57 PST
Comment on attachment 201890
oops, this is the fix

This should be 1.8.1, not 1.8.0.1 (which is for security bugs only).
Comment 46 David :Bienvenu 2005-11-20 19:31:42 PST
*** Bug 278283 has been marked as a duplicate of this bug. ***
Comment 47 David :Bienvenu 2005-11-23 08:03:39 PST
*** Bug 317558 has been marked as a duplicate of this bug. ***
Comment 48 David :Bienvenu 2005-11-30 15:46:27 PST
we're going to put this in 1.5
Comment 49 Andrew 2006-03-25 10:38:29 PST
I'm using TB 1.5, 20051201 and I'm still seeing this behavior. Windows XP, SP2. The symptoms are exactly how the others describe. After a random period, check mail indicates no new mail, even though there are new messages. Going offline and then online redownloads all the headers as well as the new one. Restarting also seems to fix the problem.
Comment 50 Phil Ringnalda (:philor) 2008-08-06 21:54:37 PDT
*** Bug 279897 has been marked as a duplicate of this bug. ***
