Closed Bug 425652 Opened 16 years ago Closed 3 years ago

Bad HELO argument (due to RFC2821-disallowed characters in hostname)

Categories

(MailNews Core :: Networking: SMTP, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: iane, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us) AppleWebKit/526.1+ (KHTML, like Gecko) Version/3.1 Safari/525.13
Build Identifier: User-Agent: Thunderbird 2.0.0.12 (Macintosh/20080213)

On a Mac connected to a wireless network, the hostname command yields "unknown001ec2aceb9c.Luke's House"

Thunderbird is using this as the argument to the HELO command, and it clearly isn't RFC 2821 compliant. Both the apostrophe and the space are illegal.

I'm not sure if this problem occurs in the latest developer build - I'm reporting the experience of one of my users. His claims are supported by our SMTP log entries.

Reproducible: Always

Steps to Reproduce:
1. Connect a mac to a network with a name that isn't a valid HELO string argument.
2. Send email to an Exim server with default configuration.
Actual Results:  
Email is rejected because the HELO argument is invalid.

Expected Results:  
The hostname should be used to construct a valid argument. Failing that, the IP address should be used.
A similar case involving an Exim server was reported on Fedora Core 5 in this forum thread: http://forums.mozillazine.org/viewtopic.php?t=579151
Apparently the host name is taken as the greeting argument even though it is not a valid fully qualified domain name.

See bug 68877 and nsSmtpProtocol::AppendHelloArgument() in nsSmtpProtocol.cpp
There is also bug 279525, which may be relevant.
Component: Preferences → Networking: SMTP
Product: Thunderbird → Core
QA Contact: preferences → networking.smtp
The space part got fixed in bug 411132. 
Summary: Bad HELO argument → Bad HELO argument (due to apostrophe ' in hostname)
Could the fix in attachment 296709 [details] [diff] [review] be extended to include other special characters not allowed in domain names by the standard? I'm thinking of a positive list rather than a negative list. While this would resolve the issue of disallowed characters, the question remains whether a hostname that is not registered in DNS should be used as the greeting argument at all.
Yes, it needs to be. RFC2821 permits only letters, digits, hyphens and dots in the argument string. The string must not contain a hyphen at the start, or at the end, or adjacent to any dot.
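
As a rough illustration of the rule just described, here is a minimal sketch of a syntax check. `IsValidHeloDomain` and `IsAsciiLetterOrDigit` are hypothetical names, not functions from nsSmtpProtocol.cpp, and rejecting empty labels (a leading, trailing, or doubled dot) is an added assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: ASCII letters and digits only.
static bool IsAsciiLetterOrDigit(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9');
}

// Hypothetical check, not taken from nsSmtpProtocol.cpp: accepts only
// letters, digits, hyphens and dots, and rejects a hyphen at the start,
// at the end, or next to a dot. Empty labels are rejected as well
// (an assumption of this sketch).
bool IsValidHeloDomain(const std::string& name) {
  if (name.empty()) return false;
  std::size_t labelStart = 0;
  for (std::size_t i = 0; i <= name.size(); ++i) {
    if (i == name.size() || name[i] == '.') {
      // End of a label: check that it is non-empty and has clean edges.
      if (i == labelStart) return false;
      if (name[labelStart] == '-' || name[i - 1] == '-') return false;
      labelStart = i + 1;
      continue;
    }
    if (!IsAsciiLetterOrDigit(name[i]) && name[i] != '-') return false;
  }
  return true;
}
```

With this check, the hostname from the original report is rejected because of both the apostrophe and the space.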

This is more restrictive than the general definition for Internet domain names. So, even a domain name with a DNS entry might not be syntactically correct for the HELO argument.

It's easy to determine whether a domain name is syntactically correct, and to fix it - even if you have to fall back on a default like example.com.

However, given that Thunderbird is often used on private networks with all sorts of weird names, it's probably unrealistic to expect to find a domain name that is technically correct.
So, what is the correct order then to get the greeting right?

1. Check for hello_argument; this overrides anything else.
   Nevertheless, invalid characters should be removed.

2. Check whether the IP address is in a local RFC1918 range. If so,
   use the IP address, or the local host name (made valid) instead, if set?
   The same applies to 127.0.0.0/8 (localhost) connections.

3. Look up DNS entry for the IP address. If there is a match in
   both directions, use this as a start, but then also remove
   any invalid combination of characters.

4. If IP address is not properly registered, treat like case #2.
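
The order proposed above could be sketched roughly as follows. This is only one reading of the list: all names are hypothetical, the open question in step 2 is resolved in favour of the IP address, the DNS lookup is assumed to have happened elsewhere, and the character clean-up mentioned in steps 1 and 3 is left out. The fallback uses the bracketed IPv4 address-literal form.

```cpp
#include <cstdint>
#include <string>

// All names here are hypothetical; none come from nsSmtpProtocol.cpp.
struct GreetingInputs {
  std::string helloPref;        // user-set override (case 1), empty if unset
  std::uint32_t ip;             // local IPv4 address, host byte order
  std::string verifiedDnsName;  // non-empty only if forward and reverse DNS agree
};

// True for the RFC1918 ranges and for loopback.
bool IsPrivateOrLoopback(std::uint32_t ip) {
  return (ip >> 24) == 10        // 10.0.0.0/8
      || (ip >> 20) == 0xAC1     // 172.16.0.0/12
      || (ip >> 16) == 0xC0A8    // 192.168.0.0/16
      || (ip >> 24) == 127;      // 127.0.0.0/8
}

// Formats an IPv4 address as a bracketed address literal, e.g. [192.0.2.1].
std::string ToAddressLiteral(std::uint32_t ip) {
  return "[" + std::to_string((ip >> 24) & 0xFF) + "." +
         std::to_string((ip >> 16) & 0xFF) + "." +
         std::to_string((ip >> 8) & 0xFF) + "." +
         std::to_string(ip & 0xFF) + "]";
}

std::string ChooseGreeting(const GreetingInputs& in) {
  if (!in.helloPref.empty()) return in.helloPref;                  // case 1
  if (IsPrivateOrLoopback(in.ip)) return ToAddressLiteral(in.ip);  // case 2
  if (!in.verifiedDnsName.empty()) return in.verifiedDnsName;      // case 3
  return ToAddressLiteral(in.ip);                                  // case 4
}
```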

There is also substantial discussion in bug 244030, which introduced the explicit preference for case #1.
I tried to reproduce this on Windows XP, but it wouldn't let me specify an apostrophe or dot within the host name. This would explain why it is only seen with OSX or certain configurations of Linux, as referred to in comment #1.

The relevant nsSmtpProtocol::AppendHelloArgument() function uses PR_GetSystemInfo(PR_SI_HOSTNAME_UNTRUNCATED) to determine the host name.
In NSPR this maps to gethostname() on Windows and Linux/Unix, and to OTInetGetInterfaceInfo() on OSX. All of these simply return whatever was specified as the host name during system setup, which may be entirely unrelated to the registered DNS entry. As long as the name is not empty and contains a dot, it is considered an FQDN and used as the greeting argument (this applies to both reported cases). If the name has no dot, the IP address is used instead; that is always the case on Windows and usually on standard Linux setups, which is why I couldn't reproduce the problem. To implement case #3 correctly, gethostname() would have to be replaced by gethostbyaddr() based on the IP address of the interface.
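
To make the dot heuristic concrete, here is a simplified sketch of the decision as described in this comment. `DotHeuristicGreeting` is a hypothetical stand-in, not the actual AppendHelloArgument() code, and it leaves out any other handling the real function performs.

```cpp
#include <string>

// Hypothetical stand-in for the decision described above: a non-empty
// host name containing a dot is treated as an FQDN and used verbatim;
// otherwise the IP address literal is used.
std::string DotHeuristicGreeting(const std::string& hostName,
                                 const std::string& ipLiteral) {
  if (!hostName.empty() && hostName.find('.') != std::string::npos) {
    return hostName;  // no syntax check is applied to the name
  }
  return ipLiteral;
}
```

Under this logic a name such as "unknown001ec2aceb9c.Luke's House" passes straight through because it contains a dot, while a dotless Windows-style name falls back to the IP address.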

> This is more restrictive than the general definition for Internet domain names.
> So, even a domain name with a DNS entry might not be syntactically correct
> for the HELO argument.

This makes me doubt whether all the effort of mapping against DNS would be worth it. Most likely, there will always be some spam filter whose configuration penalizes whatever argument is provided. Modifying a correct DNS name to satisfy the RFC2821 restrictions may give you a higher score in the end than just reverting to the IP address as EHLO argument.

A feasible "lite" fix for this bug is probably to extend the patch for bug 411132 with a whitelist of allowed characters and additional checks for invalid hyphen/dot combinations, to avoid syntactically incorrect greeting arguments. This sanitization should preferably also be applied to any given hello_argument.
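
One possible shape for such a "lite" clean-up is sketched below. It is not the patch from bug 411132; `SanitizeGreeting` is a hypothetical name, and the choices to drop disallowed characters, trim hyphens from label edges, discard empty labels, and fall back when no dot remains are all assumptions of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sanitizer: keeps only letters, digits, hyphens and dots,
// trims hyphens at the edges of each dot-separated label, and drops
// labels that end up empty. Returns `fallback` (for example an IP
// address literal) if the cleaned name no longer contains a dot.
std::string SanitizeGreeting(const std::string& raw,
                             const std::string& fallback) {
  std::string out, label;
  auto flushLabel = [&]() {
    std::size_t first = label.find_first_not_of('-');
    if (first != std::string::npos) {
      std::size_t last = label.find_last_not_of('-');
      if (!out.empty()) out += '.';
      out += label.substr(first, last - first + 1);
    }
    label.clear();
  };
  for (char c : raw) {
    bool allowed = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                   (c >= '0' && c <= '9') || c == '-';
    if (c == '.') {
      flushLabel();
    } else if (allowed) {
      label += c;
    }
    // Any other character (space, apostrophe, ...) is dropped.
  }
  flushLabel();
  return out.find('.') == std::string::npos ? fallback : out;
}
```

Whether a rewritten name is actually preferable to the plain IP address is the open question raised in the paragraph above; the sketch only shows that the syntactic part is mechanical.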
I'm a bit surprised to find this still unconfirmed, so I'm confirming it now based on the statements here (for Mac), forum reports (two for Linux), my own code review, and the discussion in bug 411132. This doesn't apply to Windows, which won't allow a malformed host name to be specified.

While one may argue that it is the operating system's responsibility to ensure that the host name is correct, at least by DNS standards, Thunderbird shouldn't generate a HELO/EHLO greeting argument which is syntactically incorrect. I have generalized the summary accordingly.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac OS X → All
Hardware: Macintosh → All
Summary: Bad HELO argument (due to apostrophe ' in hostname) → Bad HELO argument (due to RFC2821-disallowed characters in hostname)
Version: unspecified → Trunk
Product: Core → MailNews Core
(In reply to comment #0)
> User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us)
> AppleWebKit/526.1+ (KHTML, like Gecko) Version/3.1 Safari/525.13
> Build Identifier: User-Agent: Thunderbird 2.0.0.12 (Macintosh/20080213)
> 
> On a Mac connected to a wireless network, the hostname command yields
> "unknown001ec2aceb9c.Luke's House"
> 
> Thunderbird is using this as the argument to the HELO command, and it clearly
> isn't RFC 2821 compliant. Both the apostrophe and the space are illegal.

The hostname returned is required to be RFC-1066 compliant, which this isn't.

The issue is your DHCP configuration.
Many platforms may not even consult DHCP (if active) to get the hostname and just use whatever was entered when the machine was set up. This seems to be the case with the Mac OSX system used by the reporter here.

Regardless of how the hostname was determined, it should/must be verified that it is in compliance with the applicable RFC(s), and if not, has to be either sanitized or replaced with the IP address as fallback for the HELO greeting.
(In reply to comment #10)
> Many platforms may not even consult the DHCP (if active) to get the hostname
> and just use whatever was entered when the machine was set up. This seems to be
> the case with Mac OSX used by the reporter here.
> 
> Regardless of how the hostname was determined, it should/must be verified that
> it is in compliance with the applicable RFC(s), and if not, has to be either
> sanitized or replaced with the IP address as fallback for the HELO greeting.

My point is GIGO: if the hostname is set to something bogus, then strange things will happen in general.

It probably makes more sense to modify the sethostname() system call to enforce a DNS-friendly name.

TB in this case is the symptom, not the cause.
Frankly, I don't think the user cares about GIGO; they just need things to work. Sure, you can blame the operating system for allowing non-compliant host names, or blame Exim for being overly picky about the greeting, but neither helps the user who gets cryptic error messages and has no clue about the standards. Thus, even though TB is not on the "cause" side, we should be able to identify garbage and avoid using it, or at least avoid using it literally as is done now.
Philip: There's an old adage in programming wrt protocols... "be liberal in what you receive and conservative in what you send" :)
(In reply to comment #13)
> Philip: There's an old adage in programming wrt protocols... "be liberal in
> what you receive and conservative in what you send" :)

Yup, Jon told me that many a time when I was revising RFCs.

Version 91 has all-new SMTP backend code. If you can still reproduce this issue, please file a new bug report at https://bugzilla.mozilla.org/enter_bug.cgi?product=MailNews%20Core&component=Networking%3A%20SMTP

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE

(In reply to Philip Prindeville from comment #14)
> (In reply to comment #13)
> > Philip: There's an old adage in programming wrt protocols... "be liberal in
> > what you receive and conservative in what you send" :)
> 
> Yup, Jon told me that many a time when I was revising RFC's.

He also said it before the proliferation of malware that exploits inadequate input validation...
