Closed Bug 62836 Opened 24 years ago Closed 18 years ago

Mozilla should be kinder to mail servers in the event of a failure (disconnects without sending QUIT to SMTP server)

Categories

(MailNews Core :: Networking: SMTP, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: teilo+bugzilla, Assigned: ch.ey)

References

Details

(Keywords: fixed1.8.1)

Attachments

(3 files)

If you send a mail to a local user who does not exist mozilla terminates the
connection.  Mozilla should be gentler to the SMTP server and issue a QUIT
command instead of just exiting.

e.g just now,

Out: 220 mailhost.teilo.net ESMTP Postfix
 In:  EHLO teilo.net
 Out: 250-mailhost.teilo.net
 Out: 250-PIPELINING
 Out: 250-SIZE 10240000
 Out: 250-ETRN
 Out: 250 8BITMIME
 In:  MAIL FROM:<teilo@teilo.net>
 Out: 250 Ok
 In:  RCPT TO:<sefsdf@teilo.net>
 Out: 550 <sefsdf@teilo.net>: User unknown
note that this session ended prematurely due to a lost connection

would like to change this so that it becomes,

Out: 220 mailhost.teilo.net ESMTP Postfix
 In:  EHLO teilo.net
 Out: 250-mailhost.teilo.net
 Out: 250-PIPELINING
 Out: 250-SIZE 10240000
 Out: 250-ETRN
 Out: 250 8BITMIME
 In:  MAIL FROM:<teilo@teilo.net>
 Out: 250 Ok
 In:  RCPT TO:<sefsdfsdf@teilo.net>
 Out: 550 <sefsdf@teilo.net>: User unknown
 In:  QUIT
 Out: 221 Bye

Note that this exists cleanly and would be kinder to the SMTP server.
Marking NEW. This is a good idea in my opinion, always pays to be nice.
Status: UNCONFIRMED → NEW
Ever confirmed: true
See also bug 93210 for the same problem wiht IMAP (logout). This causes problems
on my company's mailserver, because connections aren't claused correctly on the
server (TIME_WAIT etc ...)

*** This bug has been marked as a duplicate of 93430 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Not a dupe re-opening.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Attached file Mail log
Log showing a succesfull mail transfer with local user existing and a QUIT
being sent, and a transfer where the user doesn't exist *without* the QUIT
[RFE] is deprecated in favor of severity: enhancement.  They have the same meaning.
Severity: normal → enhancement
tested with argosoft - server log with patch applied
1/20/2003 1:25:18 PM - Requested SMTP connection from 127.0.0.1
1/20/2003 1:25:18 PM - (     8) 220 ArGoSoft Mail Server, Version 1.8 (1.8.1.2)

1/20/2003 1:25:18 PM - (     8) EHLO 127.0.0.1
1/20/2003 1:25:18 PM - (     8) 250-Welcome [127.0.0.1], pleased to meet you
1/20/2003 1:25:18 PM - (     8) 250-SIZE 5242880
1/20/2003 1:25:18 PM - (     8) 250 HELP
1/20/2003 1:25:19 PM - (     8) MAIL FROM:<luke@127.0.0.1>
1/20/2003 1:25:19 PM - (     8) 250 Sender "luke@127.0.0.1" OK...
1/20/2003 1:25:19 PM - (     8) RCPT TO:<test@127.0.0.1>
1/20/2003 1:25:19 PM - (     8) 550 User unknown
1/20/2003 1:25:20 PM - (     8) QUIT
1/20/2003 1:25:20 PM - (     8) 221 Aba he
1/20/2003 1:25:20 PM - SMTP connection with 127.0.0.1 ended. ID=8
Product: MailNews → Core
*** Bug 317466 has been marked as a duplicate of this bug. ***
A quick note from my duplicate report in <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=317466">bug 317466</a>, this happens on all kinds of failures - see bug 317466 for some more debug information I put together.

I'd love to see this get fixed, and I consider it more than just an enhancement - this behavior is in violation of RFC 2821, section 4.1.1.10, which states:

   The receiver MUST NOT intentionally close the transmission channel
   until it receives and replies to a QUIT command (even if there was an
   error).  The sender MUST NOT intentionally close the transmission
   channel until it sends a QUIT command and SHOULD wait until it
   receives the reply (even if there was an error response to a previous
   command).

I'd love to see Mozilla get even more RFC-compliant!
Assignee: mscott → nobody
Status: REOPENED → NEW
OS: Windows 2000 → All
QA Contact: esther
Hardware: PC → All
Summary: [RFE] Mozilla should be kinder to mail servers in the event of a failure. → Mozilla should be kinder to mail servers in the event of a failure (disconnects without sending QUIT to SMTP server)
yes, definitely, thx for bringing this to my attention.
Assignee: nobody → bienvenu
(In reply to comment #10)
> yes, definitely, thx for bringing this to my attention.
> 

Wow, I can't believe this was reported back in 2000, the last comment was in 2005, and we still haven't addressed this RFC compliancy issue.

Folks, speaking as a SMTP Server developer, SMTP compliancy is becoming more important these days for many reasons, especially in the era of addressing the highly abusive and malicious spammer market.

SMTP Clients acting in rogue and legancy manners will become scrunitized more and more and this will TBIRD in a category no one will be comfortable, including putting pressure of SMTP vendors to support their customer when CLIENTS do not behave as expect.  This is reason I found myself in the world of BUGZILLA.

As I noted in a comment for BUG 360118 why a QUIT is a RFC requirement:

The reason is straight forward.  While SMTP servers are required (a MUST) to
have a IDLE timeout (a minimum of 5 minutes) at each STATE, the concept of
having a DROP detector is an IMPLEMENTATION design issue, not a PROTOCOL
requirement.

Hence, when clients DROP (or not DROP) a connection, it is quite possible for servers to have hanging sessions that will not time out until 5 minutes later.  This can have an effect on scalability and load balancing issues at servers.

All SMTP Clients MUST issue a QUIT in their client/server transactions.

Christian Eyrich is|has addressed the EHLO/HELO negative response issue in BUG 360118.  This bug needs to be addressed as well as BUG 244030.

Thanks 

Hector Santos, CTO
http://www.santronics.com
Attached patch proposed patchSplinter Review
This patch changes ProcessProtocolState()'s error handler.
With this, we always try sending out QUIT on error. Only if that SendData() fails, we simply disconnect.

I was only able to test this in few error conditions, so please test who's able to.
Assignee: bienvenu → ch.ey
Comment on attachment 246013 [details] [diff] [review]
proposed patch

thx for the patch - this looks good. I'd just want to make sure we don't get here if we never connected to the server, or if the server disconnected in mid-stream...
Attachment #246013 - Flags: superreview?(bienvenu)
(In reply to comment #12)
> Created an attachment (id=246013) [edit]
> proposed patch
> 
> This patch changes ProcessProtocolState()'s error handler.
> With this, we always try sending out QUIT on error. Only if that SendData()
> fails, we simply disconnect.
> 
> I was only able to test this in few error conditions, so please 
> test who's able to.

Is there a compiled beta we can test?  Does this patch apply to TB 2.0? or TB 1.5x?

Looking at the patch, this patch would only be correct if the status variable is less than error.  Again, I am not familar with mozilla coding, but a quick compile test run of a simple code to display the error numbers, shows that NS errors are in the range of 0x8xxxxxxx hence are less than zero in a 32bit enviroment.

So this looks ok.  But I wish to emphasize:

With the SMTP PROTOCOL, an QUIT command MUST *always* be issued regardless of SUCCESS or FAILURE statements.   The only time, of course, it will not be sent is when the connection has been dropped for whatever reason.

Also, I might success that this nsSmtpProtocol.cpp logic is too much locked in on specific response values.  It should use ranges.

This is an example, but it is common in the code, in the SenderReceipientResponse(), you have:

 if(m_responseCode != 250 && m_responseCode != 251)
 {
   rv = nsExplainErrorDetails(m_runningURL,
    (m_responseCode == 452) ? NS_ERROR_SMTP_TEMP_SIZE_EXCEEDED :
   ((m_responseCode == 552) ? NS_ERROR_SMTP_PERM_SIZE_EXCEEDED_2 :
                              NS_ERROR_SENDING_FROM_COMMAND),
                              m_responseText.get());

   NS_ASSERTION(NS_SUCCEEDED(rv), "failed to explain SMTP error");

   m_urlErrorState = NS_ERROR_BUT_DONT_SHOW_ALERT;
   return(NS_ERROR_SENDING_RCPT_COMMAND);
 }

What should be added is a function:

/*
   return the range for a given SMTP response:
   range should be either 10 or 100
   examples:
      ResponseSet(252,10) --> 250
      ResponseSet(551,10) --> 550
      ResponseSet(553,100) --> 500

 */
int ResponseSet(int code, int range)
{
    if (range == 0) return code;
    return (code/range)*range;
}

and now the above codes becomes:

 if(ResponseSet(m_responseCode,10) != 250)
 {
   rv = nsExplainErrorDetails(m_runningURL,
    (ResponseSet(m_responseCode,10)==450)?NS_ERROR_SMTP_TEMP_SIZE_EXCEEDED :
   ((ResponseSet(m_responseCode,10)==550)?NS_ERROR_SMTP_PERM_SIZE_EXCEEDED_2:
                              NS_ERROR_SENDING_FROM_COMMAND),
                              m_responseText.get());

   NS_ASSERTION(NS_SUCCEEDED(rv), "failed to explain SMTP error");

   m_urlErrorState = NS_ERROR_BUT_DONT_SHOW_ALERT;
   return(NS_ERROR_SENDING_RCPT_COMMAND);
 }

In other words, you should look for response categories, and not specific values since response values are just RFC guideliens and there is no guarantee to have specific meanings to be the same across all systems.  However, what is consistent across the board is the RESPONSE CATEGORY that all SERVERS and CLIENTS must follow:

   25x -->  State OK for any x reason
   55x -->  permanent REJECTION for any x reasons 
   45x -->  temporary REJECTION for any x reasons

Finally, one last thing:

According to what I see in your state machine for handling the DATA stage,  the QUIT command is issued in the SendMessageResponse() if a 354 or 250 is not issued.  What's with the 354?  That is not expected at this point.

The only thing expected is a 25x, 45x or 55x.

The response code check should change to:

 if(((ResponseSet(m_responseCode,10) != 250)) 
     ||  CHECK_SIMULATED_ERROR(SIMULATED_SEND_ERROR_12)) {
   ....
 }

I hope I didn't go overboard. That is not my intent.  But I can help. You got my attention here and if I can get a compiling environment going for my Windows development here (I don't wish to install CYGWIN or whatever else I need),  I would definitely fix this code up 100%.

In any case, specific to this BUG report, Chris, it does seem that your patch will do the job for now.

Hector Santos
(In reply to comment #13)
> I'd just want to make sure we don't get here if we never connected to the
> server,

Here I don't know what you mean with connected. The TCP/IP level or the SMTP level (i.e. receive of the initial greeting).
ProcessProtocolState() is called by two functions in nsMsgProtocol and none looks (to me) it can call it if there's no connection.

> or if the server disconnected in mid-stream...

We don't get there if the connection goes away while we're outside of nsSmtpProtocol, i.e. if we're waiting for data. Or if we leave nsSmtpProtocol to wait for data after it goes away.
But even if it goes away right between a function returns with error and we enter the new code, nothing bad will happen.
(In reply to comment #14)

Hector, I don't wanna muzzle you and you may have some good points. But please don't mix different issues or bring new in one bug. That makes it complex and difficult to follow.

And no, there aren't any binaries provided with patches applied. Developers do that for themselves..
(In reply to comment #16)
> (In reply to comment #14)
> 
> Hector, I don't wanna muzzle you and you may have some good points. But please
> don't mix different issues or bring new in one bug. That makes it complex and
> difficult to follow.
> 
> And no, there aren't any binaries provided with patches applied. Developers do
> that for themselves..
> 

You already did try to muzzle me by writing me privately and I personally don't appreciate it.  

Look, I really don't give a hoot.  What you have is KLUDGY code amd if you want to clean it up, clean  it up correctly.  If you don't want to, don't.  This spagetti code corrections just makes it harder down the road for the next guy.

TB is the one hurting by not properly following SMTP specs.  Not us.  But now that I have dug deeper into it, at the very least I have the proof  to explain whats wrong TB when it doesn't behave correctly. But I really don't care.  I'm trying to help here and soon I'll be gone. My time is too vulnerable for this. I'm not going to be jumping all over the place to help address these TB SMTP issues.

Hector Santos
Comment on attachment 246013 [details] [diff] [review]
proposed patch

Thx, Christian, I'll land this patch, along with the handling of 502 return code.

Hector, we'd appreciate the benefit of your expertise, but in a project this large, we have various processes that we follow to keep things relatively sane. For example, limiting comments in a bug to the issue in the bug helps us find comments after the fact - it's not an effort to muzzle anyone.

I'd be happy to help you set up a build environment, but you do need cygwin on windows - I'm not sure why that would be a show-stopper - I don't have it in my default path; I just run a batch file in a dos prompt to include cygwin in the path when I'm using the dos prompt for building TB.
Attachment #246013 - Flags: superreview?(bienvenu) → superreview+
fix landed on trunk and branch, Thx!
Status: NEW → RESOLVED
Closed: 22 years ago18 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
(In reply to comment #18)

> I'd be happy to help you set up a build environment, but you 
> do need cygwin on windows - I'm not sure why that would be a show-stopper 
> - I don't have it in my default path; I just run a batch file in a 
> dos prompt to include cygwin in the path when I'm using the dos prompt 
> for building TB.

Well, 90% of the problem is I havn't done "*nix" since my Sun Workstation days in the 80s and I tried to follow the WIN32 website instructions.  I installed CYGWIN with the recommended components, but beyond that it isnt' very clear.  I'm have not gotten past the ".MOZCONFIG" file(s?) setup.

Once I open a CYGWIN window box (with the CYGWIN.BAT modified per instructions),  I run this and this is what I get:

Administrator@hdev1 /cygdrive/y/mozilla
$ make -f client.mk build
: command not foundonf/mozconfig2client-mk: line 39:
: command not foundonf/mozconfig2client-mk: line 44:
mozilla/build/autoconf/mozconfig2client-mk: line 45: syntax error near unexpecte
' token `{
'ozilla/build/autoconf/mozconfig2client-mk: line 45: `print_header() {
client.mk:356: /cygdrive/y/mozilla/.mozconfig.mk: No such file or directory
make: *** No rule to make target `/cygdrive/y/mozilla/.mozconfig.mk'.  Stop.

Administrator@hdev1 /cygdrive/y/mozilla
$

Anyway, if you would like to move to private and explain what I need to do here to get going, that would be wonderful.  PS: I did create the .MOZCONFIG file in the /cygrive/y/mozilla folder.

# This file specifies the build flags for Thunderbird.  You can use 
# it by adding:
#  . $topsrcdir/mail/config/mozconfig
# to the top of your mozconfig file.

. $topsrcdir/mail/config/mozconfig
# mk_add_options MOZ_CO_PROJECT=mail
# ac_add_options --enable-application=mail

I even tried the EXPORT method described in the web site.

---
HLS
backing up a little, have you been able to pull the source code, other than mozilla/client.mk and mozilla/mail/config? E.g., if you do

make -f client.mk 

does it try to pull the source code?

You don't need to be in a bash shell to run make (I'm guessing you are since you have a $ in your prompt) - a plain old dos prompt works fine, as long as cygwin and friends like moztools are in the path.

No unix expertise is required, I promise - I haven't used Unix since then either, for the most part :-)
(In reply to comment #21)
> backing up a little, have you been able to pull the source code, other than
> mozilla/client.mk and mozilla/mail/config? E.g., if you do
> 
> make -f client.mk 
> 
> does it try to pull the source code?
> 
> You don't need to be in a bash shell to run make (I'm guessing you are since
> you have a $ in your prompt) - a plain old dos prompt works fine, as long as
> cygwin and friends like moztools are in the path.
> 
> No unix expertise is required, I promise - I haven't used Unix since then
> either, for the most part :-)

I tried your suggestions and I still get the same result. I don't think I have this "MOZCONFIG" or ".MOZCONFIG" right.

I would really like to be able to recompile TB. :-)

---
HLS
Ok, It worked!!!  Bug fix verified.

**************************************************************************
Wildcat! SMTP Server v6.1.451.9
SMTP log started at Tue, 05 Dec 2006  23:56:37
Connection Time: 20061205 23:56:37  cid: 00002F4E
SSL Enabled: NO
Client IP: 65.11.245.77 (unknown)
23:56:37 S: 220 winserver.com Wildcat! ESMTP Server v6.1.451.9 ready
23:56:37 C: EHLO [192.168.1.102]
23:56:37 S: 501 Invalid EHLO client address.
23:56:41 C: QUIT
23:56:41 S: 221 closing connection
23:56:42 ** Completed

It immediatly respected the negative (50x) response and issued a proper QUIT.  For some odd reseason it waited 4 seconds to issue the QUIT, but I guess that is minor.  

This BUG FIX will help eliminate any false erroneous "appearance" of a SPAMMING client trying to get in by ignoring all negative responses.

Good Job Christian!  TB QUALITY has been enhanced. :-)  Now to fix the HELO/EHLO issue and my job is done with you guys. :-)

--
HLS

Hector Santos wrote:
> Christian Eyrich wrote:
>>> Once I get a quick and simple answer, I will be to test right way.
>>
>> Sorry it took some time but I'm quite busy at the moment.
> 
> No problem.  I'm trying to end this "TB project" myself and move on. :-)
> 
> In fact, once we get a version that addresses the HELO/EHLO invalid 
> address issue, I will direct our field testers to you or whoever TB 
> programmer is handling this.  They will give you immediately feedback. 
> I'm sure there were will many others that will be giving you feedback on 
> this as well.   This is the biggest issue with TB right now because it 
> limits TB usage unfortunately.  In short, this alone will increase user 
> usage of TB.  Yes, Yes, I know. One thing at a time. :-)
> 
>> Just extract the whole archive to some new place and start the
>> thunderbird.exe there. Replacing only an exe or dll in your current
>> setup might work but the chance to crash or even corrupt something
>> increases the older your current one is (and in terms of this, I'd say
>> 20060724 is quite old).
>>
>> Using the whole new folder is the easiest. It will use your current
>> profiles.
> 
> Got it. I will try this now and get back to you very soon. I have to 
> swap our WCSMTP server because the currentm gamma version has a new 
> option to ignore the HELO/EHLO issue based on the SUBMIT PROTOCOL (PORT 
> 587) authentication login requirements. Hence since this fixes this 
> issue, this new version will not see the failed QUIT flow issues. 
> Reverting will allow me to see how TB 2.01b handles the QUIT.
> 
> Be back soon.
> 
> ---
> HLS
> It immediatly respected the negative (50x) response and issued a proper QUIT.

Great it worked for you, thanks for the feedback.

> For some odd reseason it waited 4 seconds to issue the QUIT, but I guess that
> is minor.

I guess that's the time you needed to click away the alert. The alert function is blocking - it stops the protocol handling. That can be positive but is unfortunate here. There currently is no non-blocking alert available.
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: