Closed Bug 352530 Opened 18 years ago Closed 18 years ago

alert: exceeded maximum IMAP connections

Categories

(Thunderbird :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bob.lord, Assigned: Bienvenu)

Details

(Keywords: fixed1.8.1.1)

Attachments

(2 files)

On recent builds, I'm getting an alert saying that I have exceeded the maximum number of IMAP connections.

I didn't get this warning before, and I only get it at startup.

Is this a client, or a server problem?
Attached image screenshot of alert
I'm getting this also.
This alert is our best guess at the cause of the error when the server drops the connection right after it has been established (which is what the Courier server does when you've exceeded the max number of connections). I can't think of anything that has changed in the IMAP code that could likely cause this. Some unlikely suspects: https://bugzilla.mozilla.org/show_bug.cgi?id=333877 (checked in 8/18); https://bugzilla.mozilla.org/show_bug.cgi?id=332626 (8/25), which should only affect appending messages to an IMAP folder; https://bugzilla.mozilla.org/show_bug.cgi?id=338243 (checked in 9/11), which should only affect changing the connection type in the account settings UI; and https://bugzilla.mozilla.org/show_bug.cgi?id=349895 (9/10), which changed the line buffering code, but I've been running with that patch for months and months.
I also get this error at home with other IMAP servers, so it's not just my work IMAP server.

Oddly, I have not seen this problem today at all.
version 2 beta 1 (20060916)

I'm seeing this again.  I changed the number of cached connections from 4 to 2, and I still see it on launch. 

I don't see anything on the Error Console.

Might this have something to do with operations that take place early in the life of the process?  Maybe spam processing?  I don't use client-side filters (the mail server does that for me).
It seems to come in bursts, and I've seen it reported for several accounts on different servers at once, so I tend to suspect something in TB (although it doesn't totally eliminate a network issue)
Is SSL the common denominator? I say that because I just heard another report about this, using an SSL connection, and I bet you use SSL for your work server, Bob :-)

that error comes up right after we connect to the server, so we haven't gotten a chance to do any kind of processing, spam or otherwise :-)
(In reply to comment #7)
> is SSL the common denominator? I say that because I just heard another report
> about this, using an SSL connection, and I bet you use SSL for your work
> server, Bob :-)

I use TLS here and don't see this issue very often.  However, after switching to SSL, I saw it twice tonight.  Hope that helps...
version 1.5.0.7 (20060909)

I want to confirm that I am also having this problem a great deal with the latest public release.  Using the previous public release, I used to get it occasionally but only when switching from one account to the next.  Now, with this new version, I get it on a regular basis -- at startup and when switching just to different folders within the same account.  It seems to happen regardless of my cache number settings.  All accounts are using SSL.
an imap protocol log might help:

http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap

though we might be getting the error so early in the connection process that nothing much would be logged. Perhaps an error code will be logged (2.0 might have more interesting log information...)
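
For anyone trying to capture such a log, here is a minimal Python sketch of starting Thunderbird with NSPR protocol logging turned on. It assumes the environment variables `NSPR_LOG_MODULES` and `NSPR_LOG_FILE` with the value `IMAP:5`, as described on the troubleshooting page for builds of this era; the binary name `thunderbird` and the log path are placeholders, not values taken from this bug.

```python
import os
import subprocess


def imap_logging_env(log_path, base_env=None):
    """Return a copy of the environment with NSPR IMAP logging switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = "IMAP:5"  # assumed module name and verbosity level
    env["NSPR_LOG_FILE"] = log_path     # file the protocol log is written to
    return env


def launch_with_imap_log(binary="thunderbird", log_path="/tmp/imap.log"):
    # Both arguments are placeholders; adjust them for the local install.
    return subprocess.Popen([binary], env=imap_logging_env(log_path))
```

If the connection is dropped before any IMAP traffic is exchanged, the resulting file may still be empty, which would itself be a useful data point.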
I cannot reproduce this when connecting to an IMAP SSL server with "check for messages on startup" enabled.
Not with Thunderbird 1.5.0.7, and not with the latest pre-v2 Thunderbird.
This is on Linux.
Just to clarify, I do not get this all the time.  But I get it almost daily.  Today, I got it first thing when I opened Thunderbird -- before I even did anything.  That was the first time that has ever happened.  Note that I installed the latest update only a few days ago.

It used to happen, with the previous public release, only when switching between accounts.
To the people who see this problem: are you running in FIPS mode?  

To check to see if you are running in FIPS mode:
Tools -> Options... -> Privacy -> Security -> Security Devices

If you have a button on the right which says "Disable FIPS", then you are in FIPS mode.  Disable it, restart and report back one way or another.
Just to note it (because it was asked for), I do NOT have FIPS enabled and have this problem now and then (although rarely lately, I admit).
(In reply to comment #14)
> Just to note it (because it was asked for), I do NOT have FIPS enabled and have
> this problem now and then (although rarely lately, I admit).

<aol>Me too, although I see it very rarely</aol>

There is *no* limit on the server.  Period.  (I run the server, I have never hit the predefined connection limit except when stress testing, and there is no per-account limit)
No FIPS for me. I don't get it very often, only occasionally. But the first I saw was about the time this bug was opened. (I do use SSL.)
I see this on 1.5.0.7. Seems to be a very generic message though, so could be anything connection related? I get this on a courier server, on specific folders. 

I have tried deleting and re-adding this IMAP-account but that only switched which folders get this error.
(In reply to comment #17)
> I see this on 1.5.0.7. Seems to be a very generic message though, so could be
> anything connection related? I get this on a courier server, on specific
> folders. 
> 
> I have tried deleting and re-adding this IMAP-account but that only switched
> which folders get this error.
> 

If I restart TB, I can click exactly 4 folders until I get this error. The fifth folder, no matter which it is, fails. Seems like connection caching to me: are connections not being torn down properly? This is with "1" for the mail.imap.max_cached_connections value. I tried 200 and 10 as well, same thing: always the fifth folder failing.
(In reply to comment #18)

I tried to enable IMAP logging as described on the page linked above, but it only created an empty log.

(In reply to comment #19)

Sorry for comment-spamming a bit:

My solution was to set the mail.server.serverX.max_cached_connections (server2, server3 in my case) to more than the number of folders I have.

I still wonder why this behaviour started just now though, any ideas? 
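
For others who want to try the workaround described in this comment, the following Python sketch generates the matching `user.js` lines. The pref name is the one quoted in the comment; the server keys `server2` and `server3` and the limit of 20 are only illustrative values, and the right ones depend on the profile.

```python
def max_cached_connections_prefs(server_keys, limit):
    """Build user.js lines raising the per-server cached-connection limit."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [
        f'user_pref("mail.server.{key}.max_cached_connections", {limit});'
        for key in server_keys
    ]


# Hypothetical values: two accounts, with a limit above the folder count.
for line in max_cached_connections_prefs(["server2", "server3"], 20):
    print(line)
```

The same values can be set by hand in the config editor; the script only saves typing when several accounts are involved.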

> Sorry for comment-spamming a bit:

Once again...
 
> My solution was to set the mail.server.serverX.max_cached_connections (server2,
> server3 in my case) to more than the number of folders I have.
> 
> I still wonder why this behaviour started just now though, any ideas? 

Actually, this solution stopped working for me! Out of ideas.
Anyone else have a workaround?
I think I have some new insight.

There are two errors that can happen at the networking level, which are not fatal.
- PR_CONNECT_ABORTED_ERROR
- PR_CONNECT_RESET_ERROR

Those get mapped by Necko to our XPCOM error code
  NS_ERROR_NET_RESET

These failures seem to be common enough that Necko's default action for HTTP is to retry such connections:
http://lxr.mozilla.org/seamonkey/source/netwerk/protocol/http/src/nsHttpTransaction.cpp#555
(on fresh connections, if nothing got read or written, it will be retried)

When I just saw this code, I was surprised. From my earlier reading of that code, I had remembered that this automatic retry is not limited to HTTP, but is done by Socket Transport on all kinds of connections. (I probably was wrong, but maybe that got changed?)

So, the first reason why you'd see this error message: Somehow a connection got connected, but it got closed before any data was exchanged.

Recently, when I was travelling and was using an unstable wireless connection, I got this error message with the SSL IMAP server I'm using - for the first time ever. At that time I concluded the error message got triggered by a bad connection.

Please note the exact wording of the error message. It says "unable to connect". In fact, the error dialog is indeed coming up whenever the server connection is closing unexpectedly, and I'd say the text "exceeded number of connections" is just one of many possible reasons why the connection might have failed.

Having said that, I must mention there is one more situation where we'll get a PR_CONNECT_RESET_ERROR error, and that is directly related to SSL.

Whenever we connect to a broken SSL server, which is not conforming to the standards, we might fail to establish a connection. On several kinds of failures our SSL networking code will conclude the server is "broken" (called TLS intolerant) and will attempt to trigger an automatic retry with an older handshake mechanism, which supports fewer security features, but works more often. (We'll also remember the fact that the server appears broken for the remainder of the session.)

We trigger the retry by signaling a PR_CONNECT_RESET_ERROR, knowing that the networking code will close the socket and do one retry.

So, when using a secure connection to such a broken / TLS intolerant mail server, because of the fact that our mail code does not retry on PR_CONNECT_RESET_ERROR, you will get the reported error message for the first connection attempt in your session. And all future connections might work, because the mail server works in the fallback SSL communication mode.
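
The sequence described above can be modelled with a small Python sketch. It is a toy model and not the NSS or mail code: `SslLayer`, `ConnectReset`, and the `server_accepts_tls_hello` flag are hypothetical names used only to show why a caller that does not retry sees exactly one failure per session.

```python
class ConnectReset(Exception):
    """Stand-in for PR_CONNECT_RESET_ERROR."""


class SslLayer:
    def __init__(self, server_accepts_tls_hello):
        # Hypothetical flag modelling how the server reacts to the TLS hello.
        self.server_accepts_tls_hello = server_accepts_tls_hello
        self.intolerant_hosts = set()  # remembered for the rest of the session

    def connect(self, host):
        if host in self.intolerant_hosts:
            return "ssl3-fallback"     # older handshake, fewer features
        if self.server_accepts_tls_hello:
            return "tls"
        # Handshake failed: remember the host and ask the caller to reconnect.
        self.intolerant_hosts.add(host)
        raise ConnectReset(host)


def open_without_retry(layer, host):
    """Models mail code that shows an alert instead of retrying."""
    try:
        return layer.connect(host)
    except ConnectReset:
        return "alert"


layer = SslLayer(server_accepts_tls_hello=False)
print([open_without_retry(layer, "imap.example.org") for _ in range(3)])
# prints ['alert', 'ssl3-fallback', 'ssl3-fallback']
```

Only the first attempt in the session surfaces the alert, which matches the "fails at startup, then works" reports but not failures that appear hours into a session.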

What could we do?

The mail protocol code could implement an automatic retry on NS_ERROR_NET_RESET, if no data was ever transmitted over a connection.
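
As a rough illustration of that proposal, here is a Python sketch of a single silent retry. It is not the patch attached to this bug: `NetError`, `open_with_single_retry`, and the `bytes_transferred` field are hypothetical, and the mapping table only covers the two NSPR errors named in this comment (anything else is labelled "other" as a placeholder).

```python
NS_ERROR_NET_RESET = "NS_ERROR_NET_RESET"

# The two NSPR errors named above, both reported as NS_ERROR_NET_RESET.
NSPR_TO_XPCOM = {
    "PR_CONNECT_ABORTED_ERROR": NS_ERROR_NET_RESET,
    "PR_CONNECT_RESET_ERROR": NS_ERROR_NET_RESET,
}


class NetError(Exception):
    def __init__(self, nspr_code, bytes_transferred):
        super().__init__(nspr_code)
        self.status = NSPR_TO_XPCOM.get(nspr_code, "other")  # placeholder default
        self.bytes_transferred = bytes_transferred


def open_with_single_retry(connect):
    """Call connect(); retry once if it was reset before any data moved."""
    try:
        return connect()
    except NetError as err:
        if err.status == NS_ERROR_NET_RESET and err.bytes_transferred == 0:
            return connect()  # a second failure propagates to the caller
        raise
```

Limiting the retry to one attempt on a connection that carried no data keeps a genuine per-IP limit from turning into a reconnect loop.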
Just some reference about the mail code location where this message is brought up:
code IMAP_SERVER_DROPPED_CONNECTION
File nsImapProtocol.cpp
Thx, Kai - that's very interesting.  We do normally retry after most connection errors, but not this one, because up until now, it was always caused by a problem for which retry wouldn't help at all (using up the max per ip connection limit on courier servers).

We could try a single silent retry. I'll look at the code.
If it helps any, I've got a pretty reproducible test case I can use to help debug this. In particular, I'm a victim of the mismatched domain host name certificate dialog when I connect to my dreamhost imap server over SSL. If I don't dismiss the domain mismatch dialog within a few seconds, I end up getting this alert dialog in thunderbird 2 and the trunk. 1.5 doesn't show this problem.
here's a patch to try - I think I got the logic right :-)
Assignee: mscott → bienvenu
Status: NEW → ASSIGNED
Attachment #247289 - Flags: superreview?(mscott)
Comment on attachment 247289 [details] [diff] [review]
proposed fix (untested)

so far this patch has fixed the problem I was seeing. Thanks David.
Attachment #247289 - Flags: superreview?(mscott) → superreview+
fixed on trunk and branch.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Keywords: fixed1.8.1.1
Resolution: --- → FIXED
David, to us, this bug is a regression.  Do you know how the regression
was introduced during Thunderbird 2 development?
Not everyone is only experiencing this on Thunderbird 2.  I have it and I'm running 1.5.0.8 (20061025).  This is not a TBird 2 introduced problem.  I was running 1.5.0.7 (20060909) when I first noticed it and joined for updates on this bug.  At least one other person reported seeing it in 1.5.0.7 first also.

 - John...
(In reply to comment #30)
> David, to us, this bug is a regression.  Do you know how the regression
> was introduced during Thunderbird 2 development?
> 

This bug has evidently always existed.  An update to NSS, which is the module that Mozilla products use to handle TLS and SSL, uses an additional TLS extension during the initial handshake.  Therefore, servers that don't really handle TLS correctly could suddenly be rejecting the initial connection and falling back to doing SSL where they previously did not, thus running into this condition.

The NSS change which triggered this was made sometime last May, so John Goggan is quite correct in pointing out that more recent 1.5 versions were also affected by this.
Thanks David, in my testcase it suppresses the occurrence of that dialog, too.

(My testcase is: Configure an IMAP SSL server to kuix.de port 8444, which is actually an HTTP server requiring a special SSL client cert. This will bring up error 12227 when not having a cert. To my surprise, the error does not come up on the first connection attempt, but the mail engine now seems to silently give up. The expected testcase error comes up on all further connection attempts, though.)
(In reply to comment #32)

Bill, do you know what's the additional TLS extension that the new NSS sends?
Is it the Server Name Indication or Supported Elliptic Curves Extension?

We turned on the Supported Elliptic Curves Extension on the Mozilla trunk on
2006/06/22.
(In reply to comment #32)

Here are the dates of the NSS updates for the Mozilla trunk related to TLS
extensions.

We turned on the Server Name Indication Extension on the Mozilla trunk
on 2006/05/12 (mozilla/client.mk, rev. 1.276: upgrade from NSS 3.11 + fixes
to NSS 3.11.1 + fixes).  Note: Server Name Indication is a new feature of
NSS 3.11.1.

We turned on the Supported Elliptic Curves Extension on the Mozilla trunk
on 2006/06/22 (mozilla/security/manager/Makefile.in, rev. 1.68: compile
NSS with ECC enabled).

I'm interested in the TLS connection failures caused by these TLS
extensions (especially the Server Name Indication), and the servers that
don't handle TLS extensions correctly.  Thanks!
I really don't know when this regression manifested itself in TB 2.0 - sometime before Bob filed this bug, but I don't know how long before. I didn't notice at first since this is a common behavior with Courier servers with a MAX_PERIP of 4.
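
The "fifth folder fails" pattern reported earlier is what a per-IP cap of 4 would produce if each folder ends up on its own connection. The Python sketch below is a toy model under that assumption; `PerIpLimitedServer` is a hypothetical class, not Courier code, and the address is a documentation placeholder.

```python
class PerIpLimitedServer:
    """Toy server that drops connections beyond a per-IP cap."""

    def __init__(self, max_per_ip=4):
        self.max_per_ip = max_per_ip
        self.open_by_ip = {}

    def connect(self, ip):
        count = self.open_by_ip.get(ip, 0)
        if count >= self.max_per_ip:
            return False  # connection dropped right after it is accepted
        self.open_by_ip[ip] = count + 1
        return True

    def close(self, ip):
        if self.open_by_ip.get(ip, 0) > 0:
            self.open_by_ip[ip] -= 1


server = PerIpLimitedServer(max_per_ip=4)
results = [server.connect("192.0.2.1") for _ in range(5)]
print(results)  # prints [True, True, True, True, False]
```

In this model a retry cannot help until one of the four connections is closed, which is why the alert text originally pointed at the connection limit.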
A TLS intolerant server is intolerant of ALL TLS connection requests, not just
every 4th one.  So, how does the TLS intolerance hypothesis explain the 
repeatedly-reported behavior of being a problem when the fourth connection is
made (e.g. comment 18 and comment 36) ?

Second question: how does one disable TLS (leaving SSL3 enabled) in TB?
(In reply to comment #37)
> A TLS intolerant server is intolerant of ALL TLS connection requests, not just
> every 4th one.  So, how does the TLS intolerance hypothesis explain the 
> repeatedly-reported behavior of being a problem when the fourth connection is
> made (e.g. comment 18 and comment 36) ?

Right. The TLS hypothesis explains only those reports where the connection fails on the first attempt and works on all remaining connections in the session.

The explanation for other occurrences during a session is an unstable network.


> Second question: how does one disable TLS (leaving SSL3 enabled) in TB?

preferences / advanced / general / config editor
type tls
disable

This will turn off tls, and will leave ssl 3 as is (enabled by default)

> The explanation for other occurrences during a session is an unstable network.

So you're suggesting that a bunch of people suddenly started having unstable networks at about the same time at about the same TB version?  Sorry, but I don't buy it.

My connection problems are not due to an unstable network.  And, TB will work for hours or days before, suddenly, I might get these errors.  Therefore, I don't believe it is the TLS hypothesis either -- since it doesn't fail on the first attempts and then works later -- it fails, unexpectedly, in the middle of usage (usually when switching among folders and/or accounts in my case).

In any case, I guess I just don't believe that this problem is due to network stability.  And the TLS hypothesis doesn't make sense to me if it is really supposed to be a "fails first time and then works later" situation -- since I commonly see the opposite of that.

 - John...
If there are additional causes for this error situation, I am not aware of them.

After you obtain a build that contains this fix, could you please give us feedback whether you still get the error message?
Hi:

Yesterday I upgraded my TB to version 1.5.0.9 and I have this problem trying to retrieve my mail. I don't believe that's because of an unstable network either, since I was very happy with my very old version of TB until yesterday.

I'm just another user of TB, I don't know what to do with the code you guys posted here, so, what do you recommend? Should I wait until version 2 comes out, or do I need to downgrade my TB?

Brenda
Xal. Mex.



it's up to you - 2.0 beta 2 is coming out in a few days, and the final release should follow a month or so after.  You could even try running the 2.0 beta - it's very usable and we'd love feedback. Here's a link to our beta 2 candidate build.

ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/2.0b2-candidates/rc1

If you're uncomfortable running a beta build, then your choices are going back to 1.5.0.8 or waiting for 2.0 (I'm surprised that this is new to 1.5.0.9, however, unless the underlying security libraries that handle SSL changed between 1.5.0.8 and 1.5.0.9)
I am an end user who is coming across this issue as well.  I am using TB 2.0.0.0 and am wondering whether this fix (if it is in fact a fix) has been incorporated into this version and, if not, how I can patch it myself using the file in the above post (if that method DOES work).

Any help would be greatly appreciated.  A client of mine thinks this is due to an issue with my server.
Yes, the fix for this issue is in tb 2.0. You might also want to try comment 20.
Edited the subject so this old bug won't look like a new one.  :)
Summary: New alert: exceeded maximum IMAP connections → alert: exceeded maximum IMAP connections
Dave, would it be reasonable to change the error message text to say 
something like: "The server immediately dropped the connection." 
in addition to the bit about maximum connections?  

In the cases relevant to TLS, maximum connections is not relevant, IIRC.
The relevant issue is that the server drops the connection immediately.
Diagnosis would be improved with a more accurate description of the 
actual problem.
Nelson - we could definitely do that. I'll try to get to that.
I can confirm this bug with version 2.0.0.6 (20070802) on Linux.

I've just upgraded from Ubuntu Feisty to Gutsy (Tribe 5), and the new Thunderbird doesn't work with any of my IMAP accounts; each one reports this error.

Limiting the number of cached connections to 1 or raising to 60 doesn't help. TLS/SSL changes don't help.