Closed Bug 202575 Opened 22 years ago Closed 15 years ago

hang and timeout when authenticating with LOGIN mechanism on comcast server

Product: MailNews Core
Component: Networking: POP
Type: defect
Hardware: x86
OS: All
Priority: Not set
Severity: major
Tracking: Not tracked
Status: RESOLVED WORKSFORME
Reporter: stephen.moehle
Assignee: Unassigned
Attachments: 7 files

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030418
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030418

Recently, POP3 over SSL has stopped working for me.  I never get the
authentication dialog, and Mozilla seems to stay stuck spinning the throbber
until I hit Stop.  If I turn off SSL, I can then log in, and if I then turn SSL
back on, I can successfully do further Get Messages invocations.  I will attach
my POP3 log for the session.

Reproducible: Always

Steps to Reproduce:
Attached file: POP3 log
More observations: I can get it to sort of work in a very convoluted way. 
Starting with a new instance of Mozilla:

1) Click Get Msgs.  No authentication dialog.  Mozilla just spins forever.
2) Click Stop.
3) Change account to _not_ use SSL.
4) Click Get Msgs.  Get authentication dialog.
5) Click Cancel on authentication dialog.
6) Change account to use SSL again.
7) Click Get Msgs.  Get authentication dialog!  Netstat shows me that Mozilla
really is using port 995.
8) After this, everything works OK.

I will attach an annotated POP3 log.
I've had this happen recently with POP3 over SSL connections to mail.attbi.com,
both on Windows and on Linux. With SSL enabled I don't even get to the password
dialog box. If I turn SSL off, it connects just fine. With a different e-mail
account (IMAP over SSL, so a different port) everything is working correctly.

Both Eudora 5.1.1 (Windows) and Evolution 1.2.2 (Linux) can successfully connect
to mail.attbi.com's POP3 service over SSL, so it seems to be specific to
Mozilla. It's possible something has changed at the server end (what with the
Comcast takeover of ATTBI), but since the other programs are still successful it
does seem to be a Mozilla issue.

I was using Mozilla 1.3 originally, but uninstalled it and installed 1.2.1 - the
same problem occurs.

The systems I've observed this behavior on are: Mozilla 1.2.1/1.3 on Windows
98SE; Mozilla 1.2.1/1.3 on Red Hat 9 Linux.
Interestingly, while POP3 and SSL against mail.attbi.com works in Evolution, I
do get an error about a bad certificate.  This is the error:

Bad certificate for mail.attbi.com.
Self-signed certificate in chain.

I wonder if this certificate is what is causing Mozilla problems.  If it is,
Mozilla should behave like Evolution and warn the user and give the option of
using the certificate anyhow.
Confirming this bug due to the excellent reporting from Stephen Moehle, two
apparent duplicates in Bugzilla, and also two threads in 
  netscape.public.mozilla.mail-news
    "ATTBI/Comcast Email Problem Moz 1.3"
    "POP3 over SSL mail fetching problem"
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 202685 has been marked as a duplicate of this bug. ***
*** Bug 200155 has been marked as a duplicate of this bug. ***
I experienced this bug in Mozilla 1.3 under Linux, too.
With debug build options I get the message
"WARNING: no registered socket provider, file nsSocketTransport2.cpp, line 698
Exception : In mail commands"

SSL works under Mozilla 1.3 for Windows, though.
I don't know if this is related to this bug, but POP over SSL doesn't work for
me either. It says "the PASS did not succeed. Mail server port995.com said: login
failed." I tried without SSL; that doesn't work either. I tried with a clean profile,
same thing. I tried with myrealbox.com accounts and it works! That's odd. It
only fails with port995.com accounts... The pass is correct, trust me :)
Yet another problem handling ssl error codes?
David,  

There is no SSL error here.  It is a POP3 protocol problem that occurs
occasionally with this particular server farm on port 995, and not with 
the server farm on port 110.

According to the log in comment 3, when you send the "AUTH LOGIN" 
command to the server on port 110, it responds with 

-ERR authorization not enabled

but when you send the same command to the server on port 995, it sends 
no response at all.  

Using a telnet-over-SSL program and the ssltap program with the server(s) on 
port 995, I found that after sending "AUTH LOGIN\r\n" there was no response.
After waiting a short while, I tried sending other commands, e.g. another
"AUTH\r\n" command, and the server responded to it immediately, as if it 
had already sent a response to the "AUTH LOGIN" command and had been waiting 
for the next command.  I did this test repeatedly and experienced the same
results each time.  I also found that the server number (in the initial +OK
message) changes each time, but the results were the same in all my tests 
done today with my test programs. 

I am also an attbi customer, and have experienced this same problem at home
using mozilla 1.3.  However, occasionally it works for me.  Perhaps it is 
the case that some of the servers in their server farm do respond to the 
"AUTH LOGIN" command, and others in the farm do not.  This might explain 
why it sometimes seems to work, but sometimes does not.  

The behavior described in the log with Comment 2 above seems to have the 
following explanation.  The first connection, to port 995, completed the 
AUTH command, but did not complete the "AUTH LOGIN" command.  When the user
changed the preferences to not use SSL, this did not cause the POP3 code to
reset its information about this server.  So, when the user attempted again
without SSL on port 110, mozilla remembered that it had completed the AUTH
command, and so did not send it again, but proceeded immediately to the 
AUTH LOGIN command, which succeeded (with a negative response).  The user
then reenabled SSL, and tried a third time.  This time, mozilla remembered
that it had previously completed both the AUTH and AUTH LOGIN commands, 
and that the server was incapable of AUTH LOGIN, so it proceeded immediately 
with the USER command, which succeeded.  It succeeded because it avoided 
sending the "AUTH LOGIN" command to the server on port 995.

Some suggestions: 
a) when the user changes the prefs for a server, reset all state info for
that server back to the beginning.  It's not reasonable to assume that 
the information previously learned about the server on port 995 also is 
true of the server on port 110, and vice versa.

b) timeout these commands after a short timeout.  If the server doesn't
appear to respond in, say, 30 seconds, give up and throw an error.
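
Suggestion b) could look roughly like the following. It is a minimal Python sketch under stated assumptions, not a patch for Mozilla: `send_command` is a hypothetical helper, the 30 second default is simply the figure mentioned above, and a local socket pair stands in for the real client and server.

```python
import socket


def send_command(sock: socket.socket, line: bytes, timeout: float = 30.0):
    """Send one command; return the first reply line, or None if the server stays silent."""
    sock.settimeout(timeout)
    sock.sendall(line + b"\r\n")
    buf = b""
    try:
        while b"\r\n" not in buf:
            chunk = sock.recv(4096)
            if not chunk:              # peer closed the connection
                break
            buf += chunk
    except socket.timeout:
        return None                    # the caller decides how to treat the silence
    return buf.split(b"\r\n", 1)[0].decode("ascii", "replace")


# Demonstration: a local socket pair plays client and server.
client, server = socket.socketpair()

# The "server" says nothing, like port 995 after AUTH LOGIN.
silent = send_command(client, b"AUTH LOGIN", timeout=0.2)

# Now queue a reply before the next command, like the responsive port 110.
server.sendall(b"-ERR authorization not enabled\r\n")
answered = send_command(client, b"AUTH", timeout=0.2)

client.close()
server.close()
```

Returning `None` rather than raising keeps the decision about what a timeout means (error, retry, or fallback to another mechanism) in the caller.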
If I understand correctly, you are saying that the problem lies with the
attbi POP3 servers and not with Mozilla.  I tried a little experiment.  I tried
to connect 5 or 6 times each with Mozilla and with Evolution, both using SSL. 
Mozilla failed every time.  Evolution worked every time.  If it was really a
server-side problem, I would not expect these results.  I will attach the POP3
log from Evolution in the hopes that it may illuminate why Mozilla is failing
unnecessarily.
The newest log file attachment confirms my hypothesis.  The other client does
not send the "AUTH LOGIN" command to the server, and hence does not experience
the hang.  

Yes, the POP3 server appears to not be responding properly to the AUTH LOGIN
command, and it should be fixed.  But mozilla's POP client can also learn to 
adapt to this problem by "timing out" the AUTH LOGIN command, recording that
that command failed, and then going on.  mozilla's POP client already has 
logic to remember that the command failed and to proceed to use the USER 
command because of that.  But mozilla doesn't seem to detect that the command
has failed, not even after it hangs a LONG time and then you press stop.  
It does seem to detect the failure when you change the configuration from 
SSL to non-SSL, but without that change, it doesn't appear to detect that.
changing bug summary to reflect diagnosed problem.
Summary: No authentication dialog for POP3 over SSL → hang attempting POP3 over SSL with attbi server
Am I right in assuming that I can only access mail.attbi.com if I'm on their
network or one of their customers?
I'd really like to help on this bug, but every attempt to connect to the server times out.
First, please be aware that with the implementation of CRAM-MD5 authentication on
20030423, I changed the login processing slightly, but perhaps decisively in this
case.
Builds since 20030426 (don't use older ones because of another POP3 bug) try AUTH
CRAM-MD5 first when the server announces it (which the attbi server does,
according to the log in attachment 121016).
If the server answers an AUTH CRAM-MD5 attempt correctly, getting mail will
succeed even though the AUTH LOGIN failure remains.

Ok, now for the actual work.
As Nelson wrote in comment #15, Mozilla doesn't recognize a timeout
(more on this later) and doesn't discard AUTH LOGIN, but it does discard it after
switching from SSL (port 995) to non-SSL (port 110).

But it's not the switching itself that convinces Mozilla to forget AUTH LOGIN
and try USER/PASS. The thing is, that the server answers "-ERR authorization not
enabled" to AUTH LOGIN (weird enough - after it announces "I support LOGIN" in
answer to AUTH) when communicating on port 110 and Mozilla then discards LOGIN.

Because Mailnews keeps the server's capabilities until restart, it goes on with
USER/PASS after switching back to SSL.
While this behaviour saves our asses by enabling a workaround in this case, I'd
agree with suggestion a) in comment #12 in general.

I also agree with suggestion b), although I don't know how to achieve it at the
moment. Nor do I know when we should fall back: if the connection times out
at any stage of the communication, or only in the authorization state; and if the
latter, only before exchanging username/password (the point where it hangs now) or
in the whole authorization state, and so on.


Stephen, you wrote in comment #13 that Evolution worked every time. That's not
remarkable: at least in the third log, it looks like it only tried
USER/PASS as the authorization mechanism, although many others are advertised.
So it really is a server-side problem, but we should be able to work around it.

It would be interesting to see how other mail applications handle this problem
if they run into it (i.e. try LOGIN). Could anyone try this with KMail? (There you
can configure which authentication mechanism to use.)
I just upgraded to 2003-04-26-05 Linux trunk, and Mozilla is no longer hanging.
For whatever reason, that problem seems to be fixed.

Unfortunately, the CRAM-MD5 authentication very rarely works.  Perhaps only 1
out of 10 attempts to login succeeds.

Looking at the log file, which I will attach, I see that Mozilla is always
sending "AUTH CRAM-MD5".  If it gets back "+
PDExMzkwLjEwNTEzODY1NTBAYXR0YmkuY29tPg==", authentication will succeed.  If it
gets back "+ PDQ4NDAuMTA1MTM4NjU1MkBhdHRiaS5jb20+", authentication will fail.
As I wrote, newer Mozilla builds try CRAM-MD5 before LOGIN if it is advertised. The
LOGIN problem isn't gone, but you don't reach it anymore.

I can't see what's wrong now. The challenges from the log can be decoded by
Mozilla and match the strings in angle brackets from the greeting banner.
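
For reference, the two challenges quoted from the log can be decoded as follows. This is a small Python sketch using only the standard library; `decode_challenge` and `cram_md5_response` are illustrative helper names, and the response format (user name, a space, then the hex HMAC-MD5 of the challenge keyed with the password, all base64-encoded) is the one defined in RFC 2195. It says nothing about why one of the two attempts failed in Mozilla.

```python
import base64
import hashlib
import hmac


def decode_challenge(server_line: str) -> str:
    """Strip the leading '+ ' and base64-decode the CRAM-MD5 challenge."""
    return base64.b64decode(server_line[2:]).decode("ascii")


def cram_md5_response(user: str, password: str, challenge: str) -> str:
    """Build the base64-encoded client reply described in RFC 2195."""
    digest = hmac.new(password.encode(), challenge.encode(), hashlib.md5).hexdigest()
    return base64.b64encode(f"{user} {digest}".encode()).decode("ascii")


ok_challenge = decode_challenge("+ PDExMzkwLjEwNTEzODY1NTBAYXR0YmkuY29tPg==")
bad_challenge = decode_challenge("+ PDQ4NDAuMTA1MTM4NjU1MkBhdHRiaS5jb20+")
print(ok_challenge)   # <11390.1051386550@attbi.com>
print(bad_challenge)  # <4840.1051386552@attbi.com>
```

Both strings decode to well-formed challenges of the same shape, so the difference between the succeeding and the failing attempt is not visible in the challenge text itself.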

Could you send me a log with more tries (failed ones and one or two that
succeeded)? Or could you make a log with ethereal? I have an idea but can't
confirm it with Mozilla's log.
Re: comment 17 :

You need to be on attbi's network to access port 110, but not to access port 995.

Re: comment 18, and the remark that it's odd that the server would respond -ERR
to AUTH LOGIN after AUTH claims that it does LOGIN:

The server on port 110 returns the message "-ERR authorization not enabled"
to both AUTH and AUTH LOGIN.  This is further evidence for my suggestion a in
comment 12.  

As for handling of timeouts, I believe a timeout to the first request sent
(whatever it is) on a TCP connection should be handled differently than 
timeouts to subsequent requests.  On some systems, the server system can 
complete the TCP handshake but the server may not learn of the new 
connection for a long time thereafter.  So, a timeout to the first request
should not be interpreted as an inability of the server to handle that 
request,  but timeouts to subsequent requests may be interpreted that way, 
IMO.

Even if implementing a timeout is difficult, for some reason, it would be
nice if pressing stop, then retrying had a similar effect to a timeout.
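
The timeout policy proposed in this comment can be written down as a single decision function. This is a hedged Python sketch, not existing Mozilla logic: the function name and the `AUTH_VERBS` set are hypothetical, and treating USER, PASS and APOP as "authentication commands" alongside AUTH is an assumption made for the example.

```python
# Verbs treated as authentication commands (an assumption for this sketch).
AUTH_VERBS = {"AUTH", "USER", "PASS", "APOP"}


def timeout_means_command_failed(command: str, is_first_request: bool) -> bool:
    """Decide whether a timed-out command should be recorded as unsupported."""
    if is_first_request:
        # The server may not yet have noticed the new connection, so the
        # silence says nothing about this particular command.
        return False
    verb = command.split()[0].upper()
    return verb in AUTH_VERBS


first = timeout_means_command_failed("AUTH", is_first_request=True)
login = timeout_means_command_failed("AUTH LOGIN", is_first_request=False)
retr = timeout_means_command_failed("RETR 1", is_first_request=False)
```

With such a rule, a silent AUTH LOGIN would be recorded as a failed mechanism and the client could move on, while a stalled RETR would still be reported as an ordinary error.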
The problem with CRAM-MD5 authentication failing for POP3 has been fixed in bug
203219 and should be in tomorrow's build.
This patch resets the server's capabilities stored in m_capabilityFlags if its
name or port has changed since last use. This is a) from comment #12.

For this I decided to build a "serverstamp" hostname:port and store it in the
nsPop3IncomingServer object. Each time a get mail starts, it compares the
current stamp to the previous one. If they differ, the capabilities are reset and
the saved serverstamp is set to the current one.

It works, but I'm unsure about the string classes used and whether there is a
nicer way to get the port into the string.
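
The idea of the patch, reduced to its logic, might look like this. It is a Python sketch of the approach as described in this comment, not the attached patch itself: `Pop3IncomingServer` here is a hypothetical stand-in for nsPop3IncomingServer, `capability_flags` plays the role of m_capabilityFlags, and the flag name used in the usage lines is made up.

```python
class Pop3IncomingServer:
    """Hypothetical stand-in for nsPop3IncomingServer; not the real class."""

    def __init__(self, hostname: str, port: int):
        self.hostname = hostname
        self.port = port
        self.capability_flags = set()   # plays the role of m_capabilityFlags
        self._server_stamp = None       # "hostname:port" seen on the last fetch

    def _current_stamp(self) -> str:
        return f"{self.hostname}:{self.port}"

    def begin_get_mail(self) -> None:
        """Called when a fetch starts; forget capabilities of a different server."""
        stamp = self._current_stamp()
        if stamp != self._server_stamp:
            self.capability_flags.clear()
            self._server_stamp = stamp


server = Pop3IncomingServer("mail.attbi.com", 995)
server.begin_get_mail()
server.capability_flags.add("AUTH_LOGIN_FAILED")   # learned during this session

server.begin_get_mail()                            # same host and port: flags kept
kept = set(server.capability_flags)

server.port = 110                                  # user switches SSL off
server.begin_get_mail()                            # stamp differs: flags reset
after_change = set(server.capability_flags)
```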
actually, the preferred way to do something like this is to register a listener
for the pref changing, and when the pref changes, re-init. 

I can't tell for sure, but it looks like you've put your code in GetNewMessages
- that's not sufficient, because biff could also fire (::PerformBiff) - for your
code to work, you'd have to check in multiple places. That's why we tend to use
pref change listeners, even though it's kind of a pain.

Look at some sample code from mailnews/base/src/nsMsgDBView.cpp

nsresult nsMsgDBView::AddLabelPrefObservers()
nsresult nsMsgDBView::RemoveLabelPrefObservers()
NS_IMETHODIMP nsMsgDBView::Observe(nsISupports *aSubject, const char *aTopic,
const PRUnichar *someData)

Also, you could just save the port, and add an override of

nsMsgIncomingServer::OnUserOrHostNameChanged to nsPop3IncomingServer(), 

that calls the base class and also clears the capability flags. Then you'd just
need to check if the port changed and not have to do the string stuff...
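
The second alternative (override the change hook, and track only the port separately) has roughly this shape. The sketch is in Python for brevity and is purely illustrative: `IncomingServer`, `Pop3Server`, `set_hostname` and `set_port` are hypothetical names, and only `on_user_or_hostname_changed` mirrors a method named in this comment.

```python
class IncomingServer:
    """Hypothetical base class; stands in for nsMsgIncomingServer."""

    def __init__(self, hostname: str, port: int):
        self.hostname = hostname
        self.port = port

    def set_hostname(self, hostname: str) -> None:
        changed = hostname != self.hostname
        self.hostname = hostname
        if changed:
            self.on_user_or_hostname_changed()

    def on_user_or_hostname_changed(self) -> None:
        """Hook invoked after the host name changes; the base class does nothing here."""


class Pop3Server(IncomingServer):
    def __init__(self, hostname: str, port: int):
        super().__init__(hostname, port)
        self.capability_flags = set()

    def on_user_or_hostname_changed(self) -> None:
        super().on_user_or_hostname_changed()   # keep the base-class behaviour
        self.capability_flags.clear()           # forget what the old host told us

    def set_port(self, port: int) -> None:
        if port != self.port:                   # the extra check for the port
            self.port = port
            self.capability_flags.clear()


server = Pop3Server("mail.attbi.com", 995)
server.capability_flags.add("AUTH_LOGIN_FAILED")
server.set_hostname("mail.comcast.net")
after_rename = set(server.capability_flags)

server.capability_flags.add("AUTH_LOGIN_FAILED")
server.set_port(110)
after_port_change = set(server.capability_flags)
```

Because the reset hangs off the setters rather than off one entry point, it applies no matter which code path starts the next fetch.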

mass re-assign.
Assignee: naving → sspitzer
Ack!  This has regressed and is no longer working.  The last build in which
using SSL against mail.attbi.com worked was 051308 Linux trunk.  In 051322 it is broken
again.  Looking at the pop3 logs (both of which I will attach), they are
identical up to the point where the working version sends "AUTH CRAM-MD5" and
the non-working 051322 sends "AUTH LOGIN".
This is from build 051308 Linux trunk.
This is from build 051322, Linux trunk.
Oh, we forgot you, guys.
As I wrote in comment #21, this bug was never "fixed" but got hidden by
implementing CRAM-MD5 authentication.
But because so many faulty servers on the net in turn have problems with CRAM-MD5,
we switched it off by default from 20030513 on (see bug 205003). And so you're
experiencing the old problems again.

So, how to work around it?
Either switch mail.server.default.useSecAuth to true or figure out which server
number stands for the attbi server in your config and create an entry
mail.server.server*.useSecAuth with value true, where server* is server1,
server2 or so.

After 1.4final we'll implement a UI for this.
That works.  I hope all this gets prominent billing in the release notes for 1.4
because no one is ever going to figure this out on their own.
In comment 15, I observed that mozilla's POP3 code has logic to remember that an
AUTH command failed and so not to repeat it.  But a long timeout isn't treated
as a command failure.  Seems like that should be fixed.  Then mozilla would
learn how to recover from and deal with this server, and users would not need
to hack preferences, and perhaps no new preference UI would be needed.  
Doesn't that seem better?  

Oh, one more question: Has suggestion a in comment 12 been implemented? or not?
> But a long timeout isn't treated as a command failure.  Seems like that should
> be fixed.  Then mozilla would learn how to recover from and deal with this
> server

We should react in some way, yes. But interpreting a timeout as a failure of a
particular command? No, I don't agree.

The new pref and UI have nothing to do with this bug (except that their
check-in made this bug visible again).

> Oh, one more question: Has suggestion a in comment 12 been implemented? or not?

No. The patch I submitted wasn't that good, as bienvenu pointed out. The task
of making a better one is too big for me at the moment.
On interpreting timeouts as command failures: yes, for authentication commands,
not for message retrieval commands.  
Additional comments about this bug:

1. It has come to my attention that "AUTH LOGIN" is a non-standard POP3 
protocol extension that was invented at Netscape.  It was formerly 
documented on Netscape's web site, and Netscape's email servers supported 
it.  It is NEVER necessary for the successful completion of a POP3 login, 
and there are flawed implementations of it out there "in the wild".

I therefore recommend that mozilla control this feature with a preference
(even if there is no UI for it), so that use of this troublesome protocol
extension can be disabled by the users for whom it is not satisfactory.

2. According to the notices I've been receiving in the mail from comcast,
on or about July 1, 2003, the attbi mail accounts will become comcast.net
accounts.  They will be served by a different set of mail servers, the
comcast.net domain mail servers.  So, the issues related to the attbi.com
mail servers may come to an end at that time.

3. As of today, it appears to me that the pop3.comcast.net mail server is 
not accepting TCP connections on port 995, so I surmise that pop3-over-SSL 
will not be available after June 30, 2003. 
Okay, we can officially say that (perhaps unfortunately for the owner of this
bug!) mail.comcast.net does indeed support SSL mail. So this bug still needs to
be dealt with.

Comcast states it supports POP3 over SSL at
http://online.comcast.net/faqs/faq-detail.asp?intFaqID=47

I can report that the behavior still looks the same. I reconfigured my wife's
Eudora mailer to her new comcast.net e-mail address using SSL (just like
before), and it worked fine.

It has been reported in usenet newsgroups that _today_, mail.comcast.net 
resolves to the exact same IP addresses that yesterday were mail.attbi.com.
So, it is not surprising to find that mail.comcast.net port 995 today behaves
the same as previously reported for mail.attbi.com port 995.  How much 
longer that will continue is unknown.  
if you turn on pop3 secure authentication, does this work for anyone? That
should make us go through cram-md5 instead of auth-login, and there seem to be
reports that does work. Edit | Mail & News Account Settings, pop3 account,
server settings, use secure authentication. It's off by default now.
Nope, just tried it. You immediately get a pop up that states "this server does
not support secure authentication".
It would be great if this bug could get fixed.  I just switched to a Mac, and while Mail can do this I 
find I still prefer Thunderbird (warts and all).

If it would help the Mozilla folks working on the bug if they had an actual account to work with, I'd 
be willing to set up a comcast.net e-mail address for the temporary use OF THE DEVELOPERS 
ONLY.  As long as the activity level was low, I don't think it'd be a problem.
Sure, either Christian or I can look at it if we get a test account. Christian
would be better at it, but I'm willing to try as well.
Okay, just let me know how best to get the account info to you.  I know that normally all 
transactions go through this interface, but posting account login info here probably isn't optimal.  
:-)
I'll test this bug with an account (see mail).
But I think I won't be able to implement suggestions a) or b) from comment 12. I
already tried a) (see comment #24) and failed.

And for b) I have no idea, because the POP code has no access to the time since the last command.

Maybe there's another way.
David and Christian,

I've replied to Christian's e-mail directly (just commenting here for the sake
of completeness). Thank you for the quick responses!
Ok, I tried to log in mail.comcast.net with the data Travis gave me. And it
works flawlessly.

Firstly, the server admin is quite dumb. If the connection is unencrypted, the
server doesn't provide any secure authentication mechanisms, only clear text
username/password. 
If the connection is encrypted, the server provides secure authentication
mechanisms. As some RFC and common sense say, it should be the other way round.

But ok, because it uses USER/PASS it is no problem to log in via an unencrypted
connection (port 110).
If SSL is activated (port 995), the server provides CRAM-MD5, PLAIN and LOGIN.
I've tested that LOGIN still has the same flaw - the server just answers nothing
and so we wait and wait.

But the workaround David wrote in comment #38, switching on "Use secure
authentication" (and thus use CRAM-MD5) works.
And because Mozilla supports the PLAIN mechanism since 20030913 (see bug 218766)
it works even without "Use secure authentication".

Travis, with what version of Mozilla did you test it the last time?

If one configures Outlook (Express) to use secure password authentication (SPA),
it will use the LOGIN mechanism and should have the same problem as (previous)
Mozillas.
Unfortunately, both versions I tested (OE6 and Outlook 6) have another problem
with the server's AUTH list and stop before login.
I can confirm that accessing mail.comcast.net port 995 now works with 1.5rc2, if
I select "use secure authentication" from the Account Server Settings panel.
Christian, I'm sorry but with hopping around from computer to computer the last
several months I have no idea what version I was using when I reported that
CRAM-MD5 didn't work for me. However I can tell you that until now I'd
misunderstood what comment #18 was suggesting. I had previously only tried "use
secure authentication" with SSL turned off. With both "use secure connection
(SSL)" and "use secure authentication" selected, I can now make a secure
connection to mail.comcast.net.  Wonderful!

I realize I should probably be testing this with the stock Mozilla, but I'm
currently using Thunderbird 20031008 RC3.
Hm, I don't know if TB RC3 already knows PLAIN for POP3. Just try SSL without
secure authentication. If it works it has PLAIN.
Nope it doesn't. I'd tried it already, but figured that is just a case where TB
will catch up once it's using post-20030913 Mozilla code.

But anyway my remark in #39 was only referring to non-SSL connections, which
makes sense given the #45 observations. I didn't make the mental connection that
the server might offer different authentication options for SSL and non-SSL.
Ok, so nobody should currently have a problem using mail.comcast.net.

I know that this is no solution, but it makes me feel better because I still have no
way to get around this LOGIN hang. 
Summary: hang attempting POP3 over SSL with attbi server → hang and timeout when authenticating with LOGIN mechanism on comcast server
*** Bug 228933 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
Just an update on the current status of this bug for me:

I unfortunately wasted a lot of time messing around with Thunderbird and the
Comcast servers before finding this bug and applying the "Use secure
authentication" workaround, but this workaround does still work.

The only thing I wanted to mention was that the situation on the Comcast servers
with the broken AUTH LOGIN has not only not changed within the past
year-and-a-half, but it has gotten *worse*.  In the latest version of the
Maillennium servers that Comcast is using (V05.00c++) not only is AUTH LOGIN
broken, but so is AUTH PLAIN.  AUTH PLAIN *is* sending a response to the client,
but it is rejecting perfectly valid credentials.  Thus, if the client is not set
to "use secure authentication", it tries AUTH PLAIN and gets an "-ERR" message
for no good reason, and then it falls back to AUTH LOGIN, which leads to the
same old hang as before.  So far, I have not had any problems using AUTH CRAM-MD5.
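
The fallback chain described above can be summarised in a few lines. This Python sketch is inferred only from the behaviour reported in this bug (CRAM-MD5 when "use secure authentication" is set; otherwise PLAIN, then LOGIN, then USER/PASS); it is not taken from the client's source code, and `pick_mechanism` is a hypothetical name.

```python
def pick_mechanism(advertised: set, failed: set, use_secure_auth: bool):
    """Return the next mechanism to try, or None if nothing acceptable is left."""
    usable = advertised - failed
    if use_secure_auth:
        # With the pref set, only CRAM-MD5 is acceptable in this model.
        return "CRAM-MD5" if "CRAM-MD5" in usable else None
    for mechanism in ("PLAIN", "LOGIN"):
        if mechanism in usable:
            return mechanism
    return "USER/PASS"


comcast_ssl = {"CRAM-MD5", "PLAIN", "LOGIN"}       # advertised on port 995

first_try = pick_mechanism(comcast_ssl, set(), use_secure_auth=False)
second_try = pick_mechanism(comcast_ssl, {"PLAIN"}, use_secure_auth=False)
with_pref = pick_mechanism(comcast_ssl, set(), use_secure_auth=True)
print(first_try, second_try, with_pref)  # PLAIN LOGIN CRAM-MD5
```

In this model the hang is reached on the second try, after PLAIN has been rejected; with the pref set, neither PLAIN nor LOGIN is ever attempted.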

I'm using Thunderbird 1.0.6 on Windows XP SP2.
We run our own Exim server. I was just looking at some packets with Etherpeek, and got the following:

Line  1 :  -ERR Unknown AUTHORIZATION state command <CR> <LF> .. 100

this is followed by:

TCP Checksum: 0x4877 Checksum invalid. Should be:  0x0724


Any idea if this is the right bug? Mail seems slow, but does come through. Thanks.
sorry for the spam.  making bugzilla reflect reality as I'm not working on these bugs.  filter on FOOBARCHEESE to remove these in bulk.
Assignee: sspitzer → nobody
Filter on "Nobody_NScomTLD_20080620"
QA Contact: esther → networking.pop
Product: Core → MailNews Core
is this still a hang?
Severity: normal → major
OS: Linux → All
Whiteboard: closeme 2009-06-28
WFM per comments in the vicinity of comment 45, and others.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
Whiteboard: closeme 2009-06-28