Closed Bug 459716 Opened 17 years ago Closed 9 years ago

SMTP connection timeout to smtpout.secureserver.net almost every time

Categories

(MailNews Core :: Networking: SMTP, defect)

Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: nelson, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(3 files)

Using SM trunk nightly from 20080930003201

This bug is filed against component "networking" because there IS NO SMTP component in "core mailnews". :(

I've recently switched to a new mail service provider, godaddy, whose outgoing SMTP server is smtpout.secureserver.net. I use SMTP-over-SSL on port 465. Almost every time I go to send an email, after I finish going through the spell checker, when I click the "Send" button, I get a pop-up error dialog that reads:

    Send Message Error
    Sending of message failed. The message could not be sent because the connection to SMTP server smtpout.secureserver.net timed out. Try again or contact your network administrator.

Retrying immediately always succeeds.

I wonder:
a) Is the SMTP connection being opened BEFORE the spell checker runs, so that the connection times out while the spell checker is running?
b) Couldn't this retry be automated?
Ahem.
Component: Networking → Networking: SMTP
Product: MailNews Core → Core
QA Contact: mailnews.networking → networking.smtp
Thanks, Joshua. The help text in b.m.o recommends using the SMTP component, but doesn't explain that it's a component of another product. Note that I have accounts with numerous SMTP servers, and I use SMTP-over-SSL with all of them, but only one exhibits this problem.
Certainly point b is most likely going to be fixed by bug 440794.
Depends on: 440794
You could try to generate an SMTP log: https://wiki.mozilla.org/MailNews:Logging
Here is the requested log
I've just spent some time with our smtp fake server looking at this.

(In reply to comment #0)
> I wonder:
> a) Is the SMTP connection being opened BEFORE the spell checker runs, so
> that the connection times out while the spell checker is running?

I can confirm this does not occur - no connection is attempted until after spell check has been run ("SMTP Connecting to: localhost" also does not appear until afterwards).

Additionally, if I kill the server and attempt to send the message, then I get the same message as you do in comment 0, and I only get the "SMTP Connecting to: localhost" message.

This leads me to think the problem could be at your service provider end. Perhaps they are doing things like denying so many connections in a set time?

I think there may be a way to get more in-depth debugging of sockets via nspr logging, Bienvenu probably knows that debug better than I do.
There's an nsSocketTransport log module you could try. I don't know if it's related, but I see something similar - I'm afflicted by the accept cert dialog issue with AT&T. If I'm not really quick to accept that dialog, I see the same timeout - it's much less than the 60 second timeout we have on connection.
Based on comment 7, I found http://mxr.mozilla.org/mozilla/source/netwerk/base/src/nsSocketTransportService2.h#53 I'll try that. I am using SMTP-over-SSL, but I'm not having any cert issues.
I tried to get a log using set NSPR_LOG_MODULES=smtp:5;nsSocketTransport:5 and the log file was created but is empty. Perhaps I need a special build that defines PR_LOGGING when building the code in netwerk/base ? Or maybe it doesn't work to specify logging of two modules at the same time? I'll try without smtp:5
The logging problem evidently is that trying to combine multiple log module specifications together (e.g. smtp:5;nsSocketTransport:5 ) as suggested in https://wiki.mozilla.org/MailNews:Logging#Environment_Variables_to_set doesn't work. Specifying either one by itself works, but specifying both together does not (at least, on Windows). So I have produced a log with nsSocketTransport:5 but it's enormous. There is evidently a lot of background socket activity going on, even when it appears idle. I will see if I can find just the relevant part, and attach it.
This log file appears to contain just the part relevant to the SMTP connection that failed (the one and only SMTP connection attempt in the log).
Googling makes it look like multiple modules are supposed to be separated by a comma, not a semicolon.
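To make the separator point concrete, here is a minimal sketch of a comma-separated "module:level" parser. It is not NSPR's actual parser, and `ParseLogModules` is a hypothetical name introduced only for illustration; the strictness rule (rejecting trailing characters after the level) is an assumption chosen so that a semicolon-joined value yields no usable entries.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, NOT NSPR's real parser: splits a value such as
// "smtp:5,nsSocketTransport:5" into (module, level) pairs on commas.
std::vector<std::pair<std::string, int>> ParseLogModules(const std::string& value) {
  std::vector<std::pair<std::string, int>> modules;
  std::istringstream stream(value);
  std::string entry;
  while (std::getline(stream, entry, ',')) {
    const std::size_t colon = entry.find(':');
    if (colon == std::string::npos || colon == 0) {
      continue;  // Not of the form "name:level".
    }
    const std::string levelText = entry.substr(colon + 1);
    try {
      std::size_t used = 0;
      const int level = std::stoi(levelText, &used);
      if (used != levelText.size()) {
        continue;  // Trailing characters (e.g. ";other:5") make the entry invalid here.
      }
      modules.emplace_back(entry.substr(0, colon), level);
    } catch (const std::exception&) {
      continue;  // Level was not a number.
    }
  }
  return modules;
}
```

Under these assumptions, `ParseLogModules("smtp:5,nsSocketTransport:5")` returns two entries, while the semicolon form `"smtp:5;nsSocketTransport:5"` returns none, because the whole string is treated as a single malformed entry.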
I used a packet sniffer. The sequence of events is:

       client                      server
       ------                      ------
    1. TCP SYN ->
         includes TCP Maximum Segment Size option
         and TCP SACK Permitted option
    2.                          <- SYN, ACK
    3. ACK ->
    4. ACK, Data ->
         SSL 3.0 client hello in SSL v2 format
    5.                          <- RST
                                   Server resets connection

This is definitely a server bug. The question is: Do we want to try to work around it in the client, via something like automatic retry?

From the perspective of the SMTP protocol, no application protocol (SMTP) data was ever sent or received on the connection. It is as if the connection was reset immediately following the completion of the TCP connect.
Another facet of this bug. The error message being displayed is wrong. It reports timeout, but no timeout occurs. The problem reported by NSS (as seen in the log file) is that the connection was reset by the peer. There is (or was) a separate error message for that problem. Thunderbird does everyone (especially its developers) a disservice by misreporting the error. If the error message reported had said that the connection was reset, I wouldn't have even filed this bug.
wrt the automatic retry, I think that would be covered by bug 440794 though ensuring we have an enhancement bug dependent on that one would be a good idea.

wrt the error message, I've just had a look in nsMsgSend.cpp:

    case NS_ERROR_NET_INTERRUPT:
      aExitCode = NS_ERROR_SMTP_SEND_FAILED_INTERRUPTED;
      break;
    case NS_ERROR_NET_TIMEOUT:
    case NS_ERROR_NET_RESET:
      aExitCode = NS_ERROR_SMTP_SEND_FAILED_TIMEOUT;
      break;

So we're mapping reset to timeout. Which does seem wrong.

NS_ERROR_SMTP_SEND_FAILED_INTERRUPTED currently is displayed as:

    The message could not be sent because the connection to SMTP server %S was lost in the middle of the transaction. Try again or contact your network administrator.

whereas NS_ERROR_SMTP_SEND_FAILED_TIMEOUT as we know is:

    The message could not be sent because the connection to SMTP server %S timed out. Try again or contact your network administrator.

Looking at appstrings.properties the string for netReset there is "The document contains no data." IMHO this isn't so useful either.

I would be very tempted therefore to make NS_ERROR_NET_RESET (I'm assuming this could happen almost at any time during a connection) equate to NS_ERROR_SMTP_SEND_FAILED_INTERRUPTED and re-use the same string there. Thoughts? If you don't think they are close enough to be the same, then a user-description of what the net reset one is would be useful.
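The mapping discussed in this comment can be sketched side by side as follows. This is a self-contained illustration, not the code in nsMsgSend.cpp: the enums `NetError` and `SmtpSendError` are stand-ins for the real nsresult codes, and the `Other`/`Unmapped` values are hypothetical placeholders for cases the quoted snippet does not show.

```cpp
#include <cassert>

// Stand-ins for the real nsresult codes; names are illustrative only.
enum class NetError { Interrupt, Timeout, Reset, Other };
enum class SmtpSendError { FailedInterrupted, FailedTimeout, Unmapped };

// Behaviour as quoted in the comment: a reset is reported as a timeout.
SmtpSendError MapCurrent(NetError error) {
  switch (error) {
    case NetError::Interrupt:
      return SmtpSendError::FailedInterrupted;
    case NetError::Timeout:
    case NetError::Reset:
      return SmtpSendError::FailedTimeout;
    case NetError::Other:
      break;
  }
  return SmtpSendError::Unmapped;  // Hypothetical: not covered by the quoted snippet.
}

// Suggested change: a reset shares the "interrupted" message instead.
SmtpSendError MapProposed(NetError error) {
  switch (error) {
    case NetError::Interrupt:
    case NetError::Reset:
      return SmtpSendError::FailedInterrupted;
    case NetError::Timeout:
      return SmtpSendError::FailedTimeout;
    case NetError::Other:
      break;
  }
  return SmtpSendError::Unmapped;  // Hypothetical: not covered by the quoted snippet.
}
```

The only difference between the two functions is which case label `Reset` is grouped with, which is the whole of the change being floated here.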
I agree that the string for FAILED_INTERRUPTED is better for the RESET case than the timeout string, much better. But I have a (smaller) problem with the FAILED_INTERRUPTED string too. It says "in the middle of the transaction". That's an inappropriate thing to say, IMO, if we haven't even logged in yet. The old message, which apparently is no longer used even though it is still retained in the code, was NS_ERROR_COULD_NOT_LOGIN_TO_SMTP_SERVER which wasn't too helpful, either, but it did convey the important (or at least useful for diagnostic purposes) message that the problem happened at the beginning of the connection, not "in the middle". Still, even if that is not fixed, I think the FAILED_INTERRUPTED string is better for RESET than the FAILED_TIMEOUT string.
Version: 1.9.0 Branch → Trunk
I've seen this problem, too. Retrying immediately worked as well. FWIW, I was not using SMTP-over-SSL. Also, Apple Mail did not have this problem with the same configuration.
Attached image screen shot
one more data point, in case it helps: from http://products.secureserver.net/email/email_thunderbird.htm NOTE: "smtpout.secureserver.net" is an SMTP relay server. In order to use this server to send e-mails, you must first activate SMTP relay on your e-mail account. Log on to your Manage Email Accounts page to set up SMTP relay. If you do not have SMTP relay set up and your Internet Service Provider (ISP) allows it, you can use the outgoing mail server for your Internet Service Provider. Contact your Internet Service Provider to get this setting.
I have numerous SMTP servers to which I connect regularly, and only one of them exhibits this problem. The one that does, smtpout.secureserver.net, has this problem almost every time. That is, I nearly always must try at least twice in a row to successfully send an outgoing email. Today, I had to try 7 times before succeeding for one message. I do not have this problem for any other outgoing SMTP server. I'm quite sure my account is set up correctly. While it might be nice if TBird could automatically retry for certain failures like this, except for the error message being somewhat wrong, I really cannot fault TBird here. This is really a server problem. :-/ Seth, I am considering switching to PolarisMail.com as my mail service provider.
Nelson, I have the same problem as you: "I nearly always must try at least twice in a row to successfully send an outgoing email." Thunderbird's IMAP protocol code appears to retry in this scenario; see http://mxr.mozilla.org/comm-central/source/mailnews/imap/src/nsImapProtocol.cpp#4538 and https://bugzilla.mozilla.org/show_bug.cgi?id=196095. Could we try a similar trick and silently retry on NS_ERROR_NET_RESET?
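A silent retry of the kind suggested here might look roughly like the sketch below. It does not reflect how nsImapProtocol.cpp or the SMTP code is actually structured: `SendResult`, `SendWithRetry`, and the attempt limit are all hypothetical names and parameters. It retries only when the connection was reset, which matches the earlier observation in this bug that no SMTP data had been exchanged when the reset arrived.

```cpp
#include <cassert>
#include <functional>

// Hypothetical outcome of a single send attempt; not a Mozilla type.
enum class SendResult { Ok, ConnectionReset, Timeout, OtherFailure };

// Calls attemptSend up to maxAttempts times, retrying only after a reset.
// Any other outcome (success or a different failure) is returned at once.
SendResult SendWithRetry(const std::function<SendResult()>& attemptSend,
                         int maxAttempts) {
  assert(maxAttempts > 0);
  SendResult result = SendResult::OtherFailure;
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    result = attemptSend();
    if (result != SendResult::ConnectionReset) {
      break;  // Success, or a failure that a silent retry should not hide.
    }
  }
  return result;
}
```

Capping the number of attempts keeps a persistently failing server from causing an endless loop; if every attempt is reset, the caller still sees `ConnectionReset` and can report it to the user.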
OS: Windows XP → All
Hardware: PC → All
Summary: SMTP connection timeout almost every time → SMTP connection timeout to smtpout.secureserver.net almost every time
Product: Core → MailNews Core
No longer depends on: 440794
The problem with smtpout.secureserver.net seems to have stopped a few weeks (maybe months) ago. I think it would still be a good idea to see Mozilla's SMTP mail sending be made more robust in this area, but now it will probably not be possible to test it with smtpout.secureserver.net.
WFM per comment 22. Feel free to file followup bugs.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME