Closed Bug 127399 Opened 23 years ago Closed 12 years ago

Allow sending emails with IDN based email addresses

Categories

(MailNews Core :: Internationalization, defect)

Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 20.0

People

(Reporter: wil, Assigned: mnyromyr)

References

(Depends on 2 open bugs)

Details

(Keywords: intl, Whiteboard: [gs])

Attachments

(3 files, 6 obsolete files)

Now that IDN support is available in necko, we can start thinking about how to
enable IDN email addresses. There is only one draft on IDN email so far, and it
is apparently not being updated. Just keeping this in view.
Keywords: intl
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.2
Target Milestone: mozilla1.2alpha → Future
Lots of things have changed since then: there are some Internet Drafts and an
active mailing list on it. Please see http://www.imc.org/ietf-imaa/index.html

William, the mailing list and IDs are about non-ASCII characters in the local
part of email addresses. That's not what IDN is about.
Should this bug be treated as meta-bug for IDN in MailNews?

I propose to split the task of implementing IDN support into three steps with at
least three bugs:

1. Enable the user to send messages to users with IDN in their email addresses.
   - The envelope To should get ACE coded when sending the mail.

2. IDNs in email addresses in outgoing messages get ACE coded, and mail servers
(SMTP, POP, IMAP) with IDN become usable.
   - Server names and mail addresses in identities should be saved in UTF-8
     (not currently done, as I've noted to my surprise).
   - Testing for valid server names (currently done at least for SMTP servers)
     should be relaxed or done after the server name is ACE coded.

3. Email addresses and any other header information to be displayed should be
presented unencoded as IDN to the user.
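
A minimal Python sketch of the "validate after ACE coding" idea from step 2. The function name `is_valid_hostname` and the label pattern are illustrative assumptions, not the MailNews code, and Python's built-in `idna` codec follows the older IDNA 2003 rules.

```python
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def is_valid_hostname(hostname: str) -> bool:
    """Convert to ACE first, then apply the ASCII-only hostname check."""
    try:
        ace = hostname.encode("idna").decode("ascii")
    except UnicodeError:
        # The codec refuses names it cannot convert (e.g. over-long labels).
        return False
    return all(_LABEL.match(label) for label in ace.split("."))
```

Checking after conversion means a name such as internetdomänen.com passes the same ASCII test that an ordinary hostname does, so the existing validation does not have to be relaxed.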
Another issue is that MailNews and Thunderbird STRIP non-ASCII characters in the
domain name before contacting the SMTP server.

Here's the output from the mail log when attempting to send to
stale@blåbærsyltetøy.no:


Feb 17 10:57:12 lakepoint sm-mta[2554]: i1H9vCda002552: to=<stale@blbrsyltety.no>, delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=30424, relay=blbrsyltety.no, dsn=5.1.2, stat=Host unknown (Name server: blbrsyltety.no: host not found)


Oops.


I suggest that until IDN support is present, MailNews/Thunderbird should respond
with an immediate error message to the user when non-ASCII characters are entered.


As it is now, the user will not get an error message in case the stripped name
is a valid domain name ("røed.net" -> "red.net" is a concrete example) and the
local part is valid at that domain.
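
The failure mode described above can be reproduced in a few lines of Python. This is only a sketch: `strip_non_ascii` mimics the observed behaviour, and `require_ascii_domain` is a hypothetical guard illustrating the suggested immediate error, not existing MailNews code.

```python
def strip_non_ascii(text: str) -> str:
    """Reproduce the faulty behaviour: silently drop non-ASCII characters."""
    return text.encode("ascii", "ignore").decode("ascii")


def require_ascii_domain(address: str) -> str:
    """Return the address unchanged, or raise instead of stripping."""
    domain = address.rpartition("@")[2]
    if not domain.isascii():
        raise ValueError(f"non-ASCII domain not supported yet: {domain!r}")
    return address


print(strip_non_ascii("stale@blåbærsyltetøy.no"))  # stale@blbrsyltety.no
print(strip_non_ascii("røed.net"))                  # red.net
```

Raising early keeps the message from being delivered to an unrelated but valid domain.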
Christian: IMAA tries to internationalize the entire email address, which
includes IDN on the right-hand side of the '@' sign. The effort is currently
focused on the local part because it is quite a pain in the a**.

But since the title of this bug is 'IDN in email addresses', I suppose you are
right that we should focus on just that, ignoring the local part.


Jan: I agree we should warn the user when a non-ASCII address is used until a
proper IDN patch is in.
Need someone familiar with MailNews to lead this...

I've shuffled Christian's points into the following bugs, does this make sense?

1. Agreed, this bug stays as meta-bug for IDN in email addresses.

2. IDN support for mail server hostnames used in account settings. This includes
storage (utf-8 or unichar) and validation as outlined in Christian's comment (but
this does not deal with IDNs in message handling).

3. Message sending - preparation of 822 headers, envelope sender and recipients.

4. Message display - (Christian's point 3) rendering of IDN within email
addresses (IDNA ToUnicode would be necessary).
I split envelope To from the rest (envelope From and 822 headers) because the
former is quite easy to do and would instantly (I already have such a patch)
enable our users to send messages to IDN email addresses.

I don't think we'll have full support until 1.7, but enabling sending should IMHO
be available starting with 1.7b (.com and .net domains are already available
with IDN and .de, .at and .info will start next month).

But your split-up makes sense, yes.
Christian: why don't you post your patch?

I have not followed mozilla development for quite some time, don't know if
things still work the same way now..
Is Naoki still actively working on the project?
Attached patch: send ACE encoded envelope to (obsolete)
It's a working patch to make mail sending work right now.

The encoding part (with find, convert, truncate and append) should be usable
for further work (with one or two modifications). But it might go into its own
helper function to make encoding email addresses easier.

I think the place where this conversion happens will move as soon as email
addresses and server names are saved as UTF-8.
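
A hedged sketch of what such a helper could look like, written in Python rather than the C++ of the actual patch. The name `to_ace_address` is hypothetical, and the conversion relies on Python's built-in `idna` codec (IDNA 2003).

```python
def to_ace_address(address: str) -> str:
    """ACE-encode only the domain part; the local part is left untouched."""
    at = address.rfind("@")                      # find
    if at < 0:
        return address                           # no domain part to convert
    domain = address[at + 1:]
    ace = domain.encode("idna").decode("ascii")  # convert
    return address[:at + 1] + ace                # truncate and append
```

For example, `to_ace_address("lieschen@müller.de")` yields `lieschen@xn--mller-kva.de`, while a plain ASCII address comes back unchanged.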
Taking. Christian, do you think it's a realistic goal to make this work before
1.7beta (perhaps 3 weeks from now)? I'm setting the target milestone to 1.8alpha
for now, but we might make it 1.7final (if not 1.7beta).
Assignee: nhottanscp → jshin
Status: ASSIGNED → NEW
Target Milestone: Future → mozilla1.8alpha
I think this will work before 1.7b.
I'm about to finish step 2 (see comment #6 for what it includes) and step 3
shouldn't be that hard.
I'll open a new bug for my patch to step 2 later today to discuss the code I wrote.
(In reply to comment #2)
> William, the mailing list and IDs are about non-ASCII characters in the local
> part of email addresses. That's not what IDN is about.

Are we sure about this? Is there unanimity of opinion about using IDN on the
right side of the "@"? Is that what J. Klensin's proposal allows?

http://www.ietf.org/internet-drafts/draft-klensin-emailaddr-i18n-02.txt

As I recall, there was a big discussion about Hofmann vs Klensin proposals at
the IETF meeting last Fall. Does anyone know what was settled there and if IDN
is the agreed upon format for the domain part?

Use of IDN on the right side of email addresses does not seem to be undisputed in
that document.
And having different encodings on the left and the right side of email addresses
is something I dislike too. But IDN on the right side works right now, and UTF-8
or other 8-bit encodings are unsafe to use.

In either case, the bulk of my changes (making UTF-8 usable internally for mail
addresses and hostnames) is necessary.

I vote for using IDN on right side of email addresses now as proposed. If there
will be another standard for encoding the left side, or even the email address
as whole, in the future, we can simply replace the ACE encoding/decoding
function used in my patch by another one (or remove it, if plain UTF-8 will be
the standard).
I agree with Christian, don't see any reason why not. Of course, this will be
considered an experimental feature.
Depends on: 235312
Let me play devil's advocate. Is there a strong reason to implement this now?
Why should we be creating an experimental e-mail address format and potentially
have these addresses archived and kept for some time even before the standard is
established? Why is there this rush? Unless the standard question is settled one
way or the other, ISPs are not going to allow such mail addresses requiring an
IDN or UTF-8 solution.
As I wrote before, more and more TLD registrars allow IDN domains, and users
start wanting to use them.

As stated, step one is needed anyways and enables our user to send mails to
users with IDN domains in their emails and to connect to incoming and outgoing
servers with IDN names.

To apply ACE coding to domains in mail headers isn't really needed (besides the
fact, 8 bit characters aren't allowed in mail headers), right.

But although I haven't found an RFC on IDN in email addresses, I didn't doubt that
ACE encoding the domain part is the right way, because it's a domain.
And please see RFC 3490, 2. "Examples of domain name slots include: [...] the
part of an email address following the at-sign (@) in the From: field of an
email message header".
So I think it's safe enough to use ACE encoding for domainparts of To:, From:,
Cc: and Bcc fields in mail headers.
(In reply to comment #16)
> As I wrote before, more and more TLD registrars allow IDN domains, and users
> start wanting to use them.

For web page URLs, not for e-mail addresses. ISP's should not start mail
services without the standard question settled. Do you know any mail services
now offering e-mail addresses using IDN?

Does IE now support IDN directly? (At my last check, they didn't. A plug-in
exists but no direct support to the best of my knowledge.)
How about Outlook Express or any other mailers? Will Mozilla be the only
mailer(s) supporting IDN mail address? 
> For web page URLs, not for e-mail addresses.

Who forbid this?
I know two IDN domains reachable by email. My attached patch is enough to enable
Mozilla to send a message to these addresses (already tested). The mail
headers are illegal (because they are 8-bit), but it works.

> Will Mozilla be the only mailer(s) supporting IDN mail address? 

I read that Mutt is IDN enabled, but couldn't find anything concrete.
Maybe Mozilla will be the first mailer, so what?
I'm not talking about some proprietary extension for Mozilla so mail will only
work between Mozilla users. As far as I can read (see RFC cited) I don't even
make crude assumptions.
And to say it again, it works with the current infrastructure.
(In reply to comment #18)
> > For web page URLs, not for e-mail addresses.
> 
> Who forbid this?

All the standards for IDN in web domain names are now in place. It makes sense
to officially support it. The e-mail address part is under discussion now. No
credible mail service will be implementing it. I'm saying that we should be very
careful before we start generating msgs with such a temporary status of the
proposal(s). Given the current state of the use of non-ASCII e-mail addresses, I
see no reason to rush. Perhaps it is not a bad idea to put in some codes now to
eventually deal with it, but I would like to caution against generating e-mail
messages without more progress on the standards for it.
I have to agree with Kat more or less. I shouldn't have been too excited about
this :-). Although using 'punycode' at the right-hand side of '@' (the domain
part) would work in most cases (MTAs wouldn't have trouble dealing with it
because DNS servers would be able to resolve them transparently), reading the
draft referred to in comment #12 (see section 7; the author is clearly reserved
about using 'punycode' as well as 'utf-8') convinced me that it's premature to
go ahead at this point. Besides, non-Mozilla recipients wouldn't be fond of
their email addresses (domain part) in punycode. Although not in their native
scripts, ASCII-only domain names make some sense, but domain names in punycode
must be very cryptic to them. Well, you can tell them to switch to
thunderbird/mozilla, then :-) However, I guess we shouldn't take that road
(implementing something that's not yet completely agreed upon and using it as
our selling point).

We can still work on it and prepare ourselves 'ready' but even if the patch goes
in, it should go in blocked with '#ifdef 0' (or blocked by a preference entry
disabling it by default). I'm still excited about this possibility and glad that
you came up with a patch. However, working on it is one thing and enabling it
now seems to be another.
Using IDN in email addresses is possible. If someone wants to use them now, one
has to type the mail address e.g.
webmaster@xn--internetdomnen-gib.com.
This will be the envelope and the To: header field.

Using the server internetdomänen.com requires the user to enter
xn--internetdomnen-gib.com in host fields.

Why should we be wary of generating such headers and envelopes automatically
and doing what IDNA is about: hiding ACE/puny coded domains from the user?

The changes as outlined in comment #6 won't create non-standard behaviour, not
for now and not for the future.
webmaster@xn--internetdomnen-gib.com is and will stay a valid address, may the
cited draft become a standard or not.

If you don't want to see ACE coded domains in mail headers, you should check a
patch in that makes it impossible for the user to create messages with these
addresses even by hand.
I pondered over the arguments, and read Klensin's draft again. 

If Klensin's draft were accepted (a consensus will probably not be reached for a
long time to come), there is probably no need to ACE encode the domain since it
might be UTF-8 (comment #20). However, the MUA obviously has to be modified to
support it, regardless of the outcome from the IMAA group.

In the meantime, there are already a number of mail services and client plugins
(Verisign's iNav: http://www.idnnow.com/).

I am definitely for the standpoint that we start working towards the goal of
supporting IDN in email addresses, keeping an eye on the IMAA development in
parallel. After all, I don't think it is a small job, right?

I would also vote for checking in code with a preference item disabled by
default. This would encourage people to test the feature without having to
compile their own. That's what we did for IDN, _way_ before the IDN RFCs were
published. Also, it is less likely for other changes to break the build if it
was merely commented out by #ifdef's.

By abstracting the encoding/decoding of internationalized email addresses into
separate functions, it would be easy to switch the implementation when the
standards are finally out.
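
One way to picture this design, as a Python sketch only: the pref name `mail.idn.enable_ace` and every function name here are invented for illustration, and the two codec functions are the swappable part.

```python
from typing import Callable

# Hypothetical pref store; the pref name is invented for this sketch.
prefs = {"mail.idn.enable_ace": False}


def _idna_encode(domain: str) -> str:
    return domain.encode("idna").decode("ascii")


def _idna_decode(domain: str) -> str:
    return domain.encode("ascii").decode("idna")


# Swappable implementations: replace these two when a standard settles.
encode_domain: Callable[[str], str] = _idna_encode
decode_domain: Callable[[str], str] = _idna_decode


def domain_for_wire(domain: str) -> str:
    """Encode only when the pref is on; otherwise pass through unchanged."""
    return encode_domain(domain) if prefs["mail.idn.enable_ace"] else domain


def domain_for_display(domain: str) -> str:
    """Decode only when the pref is on; otherwise pass through unchanged."""
    return decode_domain(domain) if prefs["mail.idn.enable_ace"] else domain
```

With the pref off by default, callers behave exactly as before; flipping it enables the feature for testers without a custom build.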

I don't see the urgency of getting it out before 1.7b though. If we make it
that's nice. If we don't, we don't.
Today I finished patches to enable on-the-fly conversion from/to ACE when sending
(3.) and displaying (4.) to support IDN.

The en-/decoding is done in separate functions; making them short-circuit input
and output depending on a pref is no big thing.

But I'm still stuck on the folder/path thing (see bug 235312).
So it's not possible to use IDN in the incoming server if the incoming server
was specified when creating the account (the workaround is to create an account
with an ASCII dummy and then rename the server).

Enabling the use of UTF-8 for hostnames made it necessary to change the parameter
type of GetRealHostName(), GetHostName() and GetEmail() (resp. the type of the
attributes in IDL). ifdef'ing all calls to these functions is a huge job and the
result is confusing and unnecessary. The changes are done once and won't
influence any other code.
Blocks: IDN
Depends on: 238016
Just to let you know, Opera's mail part is IDN enabled since at least 7.23 (the
version I tested).
Ok. I agree that using #ifdef is not such a good idea. Why don't you add a pref.
entry which is off by default? 
Requesting blocking1.7 because Mozilla1.7 is the next long-lived milestone
branch. Although some parts of this bug (like support of IDN-mailservers) may
not be so important, sending E-Mails to wrong domains is (see comment #4).


As Mozilla is silently stripping all special characters, it is impossible to
properly (re)encode the e-mail-address at the local mail relay. You can do this
to enable many other mail-clients like Forte Agent.
Flags: blocking1.7?
I certainly don't think we're going to block on adding IDN support to mail. The
sending mail to a modified (stripped of non-ascii chars) domain is a bit
concerning. I've asked mscott to look at that aspect of this bug (it should
probably be filed separately from this if we are going to talk about blocking a
release on it). 
*** Bug 241649 has been marked as a duplicate of this bug. ***
Flags: blocking1.7? → blocking1.7-
Flags: blocking1.8a?
Flags: blocking1.8a? → blocking1.8a-
Flags: blocking1.8a2+
only drivers can set (+) blocking flags.  you can request (?) them
Flags: blocking1.8a2+
oh sorry! made a mistake

*** Bug 256395 has been marked as a duplicate of this bug. ***
I would appreciate adding this kind of functionality, and I don't see that it
raises any objections.

Resolving/Processing the domainpart in a browser or in an mua should be the
same? Don't you agree? (view from the user)

on the serverside the admins should take responsibility to add the punycode
domainpart to the mailservers and the world would be fine - again :)
Product: MailNews → Core
Yes, I know, this might be considered spam, but anyhow. More and more clients
support IDN domain names via punycode now and more and more domains appear with
such a name.

Any ideas about when this patch might be landed on trunk?
Flags: blocking1.8a6?
Flags: blocking1.7.6?
Flags: blocking1.8a6?
Flags: blocking1.8a6-
Flags: blocking1.7.6?
Flags: blocking1.7.6-
Nearly one year ago I demonstrated how to enable users to send mails to
addresses like test@müller.de without the need to enter the domain part ACE encoded.

The patch only affects code in the SMTP protocol handler to ACE encode the
envelope. The patch is simple; the only drawback would be that the header of the
sent mail contains the recipient's address as a UTF-8 string.

It would be nice to hear if this is not wanted or wanted in general. Do we want
to wait for all specs eventually adopted and a full blown solution implemented -
or does that patch have a chance? Yes or no?
mail headers are only allowed to be ASCII, aren't they?
(In reply to comment #35)
> mail headers are only allowed to be ASCII, aren't they?

Yes, because of that I mentioned that the header of the sent mail would contain
the recipients address as UTF-8 string.
(In reply to comment #36)

Sorry for the delay. I definitely want Mozilla mail/TB to support IDNs in email.

> (In reply to comment #35)
> > mail headers are only allowed to be ASCII, aren't they?
> 
> Yes, because of that I mentioned that the header of the sent mail would contain
> the recipients address as UTF-8 string.

 I guess that's certainly a blocker. We can't break RFC (2)822/STD 11 in what
send out on the wire. What's the status of bug 235312 and bug 238016? We may
have to land all of them at once (comment #6)

Assignee: jshin → ch.ey
(In reply to comment #37)
>  I guess that's certainly a blocker. We can't break RFC (2)822/STD 11 in what
> send out on the wire. What's the status of bug 235312 and bug 238016? We may
> have to land all of them at once (comment #6)

I was nearly done with patches last April. But I discontinued it, since from the
comments here it didn't look like the work is welcome until IDN specs for mail
have been adopted.
*** Bug 307674 has been marked as a duplicate of this bug. ***
>  I guess that's certainly a blocker. We can't break RFC (2)822/STD 11 in what
> send out on the wire. 
Shouldn't it just be encoded like e.g. subjects are? Like this:
To: =?ISO-8859-1?Q?stale=40bl=E5b=E6rsyltet=F8y=2Eno?=
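
For context, RFC 2047 encoded-words are allowed in the display name of an address header but must not appear inside the addr-spec, so an encoded-word cannot carry the address itself. A small Python sketch with the standard library's `email.utils.formataddr`; the display name `Ståle` is an assumption for illustration, and the ACE domain is computed by Python's `idna` codec rather than taken from this thread.

```python
from email.utils import formataddr

ace_domain = "blåbærsyltetøy.no".encode("idna").decode("ascii")
address = f"stale@{ace_domain}"

# The display name is RFC 2047 encoded; the addr-spec stays plain ASCII.
header_value = formataddr(("Ståle", address))
print("To:", header_value)

# A raw non-ASCII addr-spec is rejected rather than encoded.
try:
    formataddr(("Ståle", "stale@blåbærsyltetøy.no"))
except UnicodeError as exc:
    print("rejected:", type(exc).__name__)
```

The resulting header is pure ASCII: an encoded-word for the name followed by the ACE form of the address in angle brackets.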
*** Bug 332259 has been marked as a duplicate of this bug. ***
Copied from bug 332259 comment #1:

IMA overview :
http://www.ietf.org/internet-drafts/draft-klensin-ima-framework-01.txt

SMTP extensions :
http://www.ietf.org/internet-drafts/draft-yao-ima-smtpext-02.txt

Most documents aren't published yet, but you can find back pointers at the end
of these 2. Several of them have already expired though.
https://www1.ietf.org/mailman/listinfo/ima : the mailing list for IMA / IEE. (a resurrected version of the one in comment #1 ??)

Updating the new IETF working group charter in the URL. The new working group's name is EAI - Email Address Internationalization.
QA Contact: ji → localization
So do I understand that in more than 5 (five!) years since opening this bug, this has not been resolved!? This shouldn't be complicated to implement (see Bug 387903 for how this should be implemented, Gmail is implementing this too...).
Product: Core → MailNews Core
I can't believe this bug is still unresolved.

It's a SERIOUS bug, guys!! There are thousands of such domains now,
and NONE of them can be addressed with Thunderbird.

I've just installed another email program, and I'm sure many more people will do once people or maybe some press people realize that their problems addressing certain domains are deriving from the email client they used!
The Internet standard document for this (RFC 5336/5337) only were published in September 2008, so even though we knew for five years we want something there, we couldn't even know what to implement until 2 months ago. That's not the fault of anyone at Mozilla but of the IETF working group that needed that long to get something defined - and if you read those RFCs (linked at the bottom of the page referred to by this bug's URL), you'll notice that it's not that simple to do.
I can't speak for those doing the actual coding (I'm a mere project manager in the volunteer SeaMonkey team), but I wouldn't expect this to be included in Thunderbird 3 or SeaMonkey 2 - unless someone of the people interested here comes up with a patch. Our developers are currently busy with higher priority issues, but surely would be happy to help anyone who's actively working on this.
Robert Kaiser wrote : "The Internet standard document for this (RFC 5336/5337) only were published in September 2008, so even though we knew for five years we want something there, we couldn't even know what to implement until 2 months ago."

I wonder if there is not a bit of confusion between the domain name (IDN in email) and fully internationalized email addresses (which include the LHS of the email, the local part before the "@").

IDN in email addresses could be supported several years ago I believe (they were certainly by other products).
I think you are both right. The Problem is, that AFAIK using IDN-Domains in SMTP wasn't explicitly specified. None the less it worked fine to enter the domain in puny code. IMHO a simple puny code converter would have done the job for the domain of the email (which is one option suggested by the RFC if I understood right). This is probably the way other clients went.
QA Contact: localization → i18n
This is still an issue. In addition, the error message in my case wasn't exactly helpful. All non-IDN characters were stripped from the error message and thus the address in the mail and the error message didn't match.
Flags: blocking1.9.0.17?
Certainly not blocking any 1.9.0.x, since we're not shipping mailnews off
1.9.0. Nor will it block (or be likely to be accepted for) any security release
of any other already-shipping branch, nor will it block Tb3 days before it
ships, so the only flag to twiddle is Tb3.1.
Flags: blocking1.9.0.17? → blocking-thunderbird3.1?
not blocking 3.1 - marking wanted.
Flags: wanted-thunderbird+
Flags: blocking-thunderbird3.1?
Flags: blocking-thunderbird3.1-
Just curious, any idea when/if this will be implemented? The RFC is no longer "just released", and, at least for us non-native English speakers, it would be a very welcome addition. And since it is part of what could be called "core functionality", ie. the sending of email to a legitimate address, perhaps it could be done reasonably quickly instead of RSN?
(In reply to comment #59)
> Just curious, any idea when/if this will be implemented? The RFC is no longer
> "just released"

Actually, all the RFCs seem to be in the "experimental, this isn't yet a standard state". I think as there's probably a lot of work to do, it could probably be worked towards, but there may be issues with servers not supporting it for quite a while.

At the moment it also needs someone to step up and work on it. I'm changing some flags to make that more likely to happen. I'm assuming that Christian isn't actively working on this as well.

Note that for this to happen it isn't just a matter of updating the send mechanisms. We also need to do various things, a few of which are:

- Ensure hostnames can cope with idn (maybe covered by bug 235312, and actually probably separate to this bug anyway)
- Ensure account manager, saving passwords etc can cope with idn (bug 235312)
- Make sure address book email addresses can cope with idn.
- Make sure the various parsing algorithms for sending/reading message headers also cope with idn.
- Make sure the protocol implementations cope with idn.

This list isn't meant to discourage people from working on this bug, but I'd suggest that if people want to, then this bug is broken down into those areas and each bit fixed at a time, as I suspect there will need to be quite a few areas touched to fix this bug.
Assignee: ch.ey → nobody
Keywords: helpwanted
Target Milestone: mozilla1.8alpha1 → ---
I think Mark is wrong in seeing so many problems:

IDN is a pure front-end mapping of domain names beginning with 'xn--...' to the human readable version with the non-ASCII characters, and vice versa. It needs _no_ modification in the transport mechanism.
The _real_ address of
    lieschen@müller.de
is


The only things to do in the first step are:

  1. Every address entered in the FROM, TO, CC or BCC field, containing
     non-ASCII characters in the domain part, has to be converted according
     to the Punycode rules.

  2. Every address from the FROM, TO, CC or BCC fields of the message,
     beginning with 'xn--', should be converted to the human readable form
     with non-ASCII characters.

Everything else ()
Sorry,
I accidentally hit the wrong button. So my unfinished posting was sent. :-(

I wanted to clarify that the _real_ address of
    lieschen@müller.de
is
    lieschen@xn--mller-kva.de

and only the MUA has to deal with the conversion and any non-ASCII-parts of the domain name.

And finally there may be some more subtleties, and you may add some goodies and comfort to the MUA concerning IDN, but the two things (converting to and from the address fields) are what people need to write and receive mail to/from IDN domains.
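
The display direction (ACE back to readable form) might be sketched like this in Python. `to_display_address` is a hypothetical name, and the built-in `idna` codec used here implements IDNA 2003.

```python
def to_display_address(address: str) -> str:
    """Show ACE ('xn--') labels in the domain part as Unicode."""
    local, at, domain = address.rpartition("@")
    if not at:
        return address
    try:
        readable = domain.encode("ascii").decode("idna")
    except UnicodeError:
        # Not valid ACE (or not ASCII at all): show the address as received.
        return address
    return f"{local}@{readable}"
```

Called with `lieschen@xn--mller-kva.de` it returns `lieschen@müller.de`; addresses without ACE labels pass through unchanged.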

I'm not a great friend of IDN-domains, but the IDN was released more than seven years ago and it's poor that Thunderbird doesn't yet support this standard.
Not really, you have to deal not only with IDN domains but also with IDN local-part of email and this require SMTP server support to understand envelope address because it contain non ASCII symbols.
(In reply to comment #63)
> Not really, you have to deal not only with IDN domains but also with IDN
> local-part of email and this require SMTP server support to understand envelope
> address because it contain non ASCII symbols.

I was talking about IDN (Internationalized _Domain_ Name) according to RFC 3492 and nothing else. A domain name has no local part and as far as I know there is no acknowledged RFC for the local part of an email address.

But RFC 3492 dates from March _2003_. You have been able to get domains with IDN names since at least 2004, and in the meantime there is a growing number of people who don't even know that there is anything special about the domain 'müller.de'.

If some fools even think, that the address 'peter.müller@müller.de' would be nice, then they have to wait some more years until there is a released standard to handle this wish.
But I think it should be possible now (7 years after the released RFC) to write an email to 'info@müller.de' without messing around with an external calculator to get the real address 'info@xn--mller-kva.de'.
(In reply to comment #64)
> If some fools even think, that the address 'peter.müller@müller.de' would be
> nice, then they have to wait some more years until there is a released standard
> to handle this wish.

Yeah, I was talking about RFC 5336 and the UTF8SMTP extension. Sorry for the confusion about IDN.
Incidentally, there is a new version of IDNA : IDNA2008.

The draft IDNA2008 specification is defined by a cluster of IETF RFCs:

    * Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework
      http://tools.ietf.org/html/rfc5890
    * Internationalized Domain Names in Applications (IDNA) Protocol
      http://tools.ietf.org/html/rfc5891
    * The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)
      http://tools.ietf.org/html/rfc5892
    * Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)
      http://tools.ietf.org/html/rfc5893

There is also an informative document:

    * Internationalized Domain Names for Applications (IDNA): Background, Explanation, and Rationale
      http://tools.ietf.org/html/rfc5894

Also check UTS 46 "Unicode IDNA Compatibility Processing", http://www.unicode.org/reports/tr46/.

I'm a bit hesitant to help code this (is the code easy to build and understand ?), otherwise I'm willing to help come up with the code.
Depends on: 614930
This bug has been open quite a few years now. Is there any progress here? Most top-level domain registries have allowed registering IDN domain names for years already, and some IDN top-level domains like .egypt in Arabic script and .rf in Cyrillic script opened in 2010. Apple Mail and Outlook support sending email to email addresses on IDN domains, although some of Microsoft's and Apple's other email products don't yet support IDN email addresses. It's now 2011 and it would be nice if Thunderbird could support IDN domain names.
Here I found a list with the programs that support IDN:

http://www.denic.de/domains/internationalized-domain-names/idns-und-sicherheit/idn-faehige-programme.html

Unfortunately, Thunderbird is not here yet ...
Thank you Rainer, this is a good overview!
what is the status on IDN addresses?  even after adding (each of: .pe, xn--77h, and  xn--77h.pe) to network.IDN.whitelist.<x>, thunderbird strips off the domain and leaves just the .tld which of course is rejected by the server as invalid.
Firefox can handle IDN. Where is the problem?
(In reply to comment #72)
> Firefox can handle IDN. Where is the problem?

Do you really know the problem?
We are talking about e-mail addresses like info@jürgen.example and not about
info@xn--jrgen.....!
(In reply to comment #72)
> Firefox can handle IDN. Where is the problem?

We are talking about Thunderbird, not Firefox.
See my link above.
Supporting punycode as a first step would make many people happy. That is better than waiting years.
Sorry for the dupe, I did not find this bug. Sorry for the following me-too rant:

I am a little confused that this quite old bug is listed with "help wanted" but it seems unclear which person or which part of Thunderbird needs improving.

I totally agree that hostname encoding is not trivial and might have security problems or functional problems. But I don't get why not even the problem scope has been distilled.

If there is a way to help, write a test suite or do anything to improve (paypal someone some USD), please post it here!

from last year:
"Actually, all the RFCs seem to be in the "experimental, this isn't yet a standard state". I think as there's probably a lot of work to do, it could probably be worked towards, but there may be issues with servers not supporting it for quite a while.

At the moment it also needs someone to step up and work on it. I'm changing some flags to make that more likely to happen. I'm assuming that Christian isn't actively working on this as well."

Microsoft Outlook supports it. Not that I suggest taking their software quality as great inspiration. But if there would be problems with it, it would ring some bells.

The server part: It is not a problem. Really. If you use the status-quo 7-bit encoding (xn-...) everything is fine. If not that's a problem of the user/administrator not of the client software. Their setup would be borked. 
With this argument you end up that nobody is using it because the clients don't work...
To make the 7bit thing clearer. It would be enough to convert those foreign language signs automatically to the punycode (which is easy), but VIEW the eMail with the matching UTF-8 representation. 

E.g. someone enters -> info@xn--jrgen-winter-dlb.de

firefox checks the address book while typing anyway. you would view it as info@jürgen-winter.de but you would still use info@xn--jrgen-winter-dlb.de if you send the message.

This is not as hard as it sounds. You need an hook if you view the email message to check for IDN hostnames.

Note: this covers only the hostname part, as defined in RFC 5890 (formerly 3490).
In addition, there are the (upcoming) RFCs that will cover the local part with UTF-8 characters:

http://datatracker.ietf.org/wg/eai/ (esp. 3629, rfc5336bis (via UTF8SMTPbis))

The page above names a bunch of RFCs that would affect Thunderbird (UTF-8 for IMAP, POP3, etc.).
Trying to get some traction on this.

There are several separate issues in the current mailnews backend code:
1. We don't allow UTF-8 in the localpart of email addresses.
   This is covered by RFC 5336, and almost no server in the wild supports it
   (AFAICT; I did not make an extensive survey).
2. We don't allow UTF-8 in the domainpart of email addresses.
   This is technically a non-issue, since the DNS doesn't support UTF-8 either,
   and nobody is going to change that any time soon. Instead, IDN-aware software
   replaces non-ASCII names by their ACE representation to funnel it through
   existing (and legacy) systems. This is what this bug here is about.
3. We need to take care of the display of IDN.
4. We need to make sure that no headers contain unencoded characters.
   Again, no technical hurdle as such, we already handle MIME quite decently.

While we probably can still ignore (1) safely, we can handle (2) with little effort here and without breaking anything that currently isn't broken anyway. You can even enter email addresses containing UTF-8 characters without any complaints by the UI (neither by SM nor by TB) currently!
(3)+(4) are rather independent of each other and especially of (2).

I'm willing to work on (2) under these guidelines to get the train going.
Assignee: nobody → mnyromyr
(In reply to Karsten Düsterloh from comment #80)
> I'm willing to work on (2) under these guidelines to get the train going.

Fantastic! This has been itching me, but I haven't had the free cycles to look at it.
Attached patch prepare ACE for RCPT TO: (obsolete) — Splinter Review
This doesn't try to fix the world, just the specific IDN problem.

The most notable change here is that if an email address contains invalid characters, it's now either dropped completely or fixed. (The old code just removed invalid characters and then sent your mail to innocent eavesdroppers!)
OTOH: Is it really okay to silently drop addresses or shouldn't we error out in this case?

(Also, my Mozilla string foo is weak these days, please double check. ;-) )
Attachment #141810 - Attachment is obsolete: true
Attachment #576607 - Flags: review?(dbienvenu)
(In reply to Karsten Düsterloh from comment #80)
> I'm willing to work on (2) under these guidelines to get the train going.

Thanks for looking at this.

(In reply to Karsten Düsterloh from comment #82)
> The most notable change here is that if an email address contains invalid
> characters, it's now either dropped completely or fixed. (The old code just
> removed invalid characters and then sent your mail to innocent
> eavesdroppers!)
> OTOH: Is it really okay to silently drop addresses or shouldn't we error out
> in this case?

I don't think it's ever acceptable to silently drop addresses - something should be checking that - probably the UI code before you hit send.


This patch will only allow sending to email addresses with IDN domains. It won't cover the fact that identities can't specify an IDN based email address, nor can you save correctly an IDN based email address in the address book (if you can save it, then there's probably circumstances when it won't be read correctly, see nsIAbCard.primaryEmail).

Are you planning on dealing with those in this bug or will this need to be follow-up bugs? I think we need to be clear on what we're landing here.
(In reply to Mark Banner (:standard8) from comment #83)
> This patch will only allow sending to email addresses with IDN domains. It
> won't cover the fact that identities can't specify an IDN based email
> address, nor can you save correctly an IDN based email address in the
> address book (if you can save it, then there's probably circumstances when
> it won't be read correctly, see nsIAbCard.primaryEmail).

So I just realised that the identities issue is covered by bug 235312, but I still think we may want to re-title this depending on what we're actually covering here.
(In reply to Mark Banner (:standard8) from comment #83)
> > OTOH: Is it really okay to silently drop addresses or shouldn't we error out
> > in this case?
> 
> I don't think it's ever acceptable to silently drop addresses - something
> should be checking that - probably the UI code before you hit send.

UI *before* might be hard, dunno.
I'll look into it - there are other error situations along the way which already stop sending.

> This patch will only allow sending to email addresses with IDN domains. It
> won't cover the fact that identities can't specify an IDN based email
> address, nor can you save correctly an IDN based email address in the
> address book (if you can save it, then there's probably circumstances when
> it won't be read correctly, see nsIAbCard.primaryEmail).
> 
> Are you planning on dealing with those in this bug or will this need to be
> follow-up bugs?

Anything else (like "don't send email headers with invalid characters") should go into follow-up bugs.
(In reply to Karsten Düsterloh from comment #85)
> Anything else (like "don't send email headers with invalid characters")
> should go into follow-up bugs.

Ok, thanks, updating the title.
Summary: IDN in email addresses → Allow sending emails with IDN based email addresses
Comment on attachment 576607 [details] [diff] [review]
prepare ACE for RCPT TO:

can you revert this part? The line should be wrapped since it's > 80 chars.

-        parser->ParseHeaderAddresses(addrs1.get(), nsnull, &addrs2,
-                                     &m_addressesLeft);
+        parser->ParseHeaderAddresses(addrs1.get(), nsnull, &addrs2, &m_addressesLeft);

I think we should have an assertion or warning if we drop an address, since that's pretty unexpected. Other than that, this looks like it does what it wants to do...
Attachment #576607 - Flags: review?(dbienvenu) → review+
It's amazing to watch the progress! Does TB99 solve this problem?
Yes.
Attached file TB add-on
Take a look at this, this did work for TB 1.5 to TB 2.0.
Attached image screenshot
international_e-mail_addresses-1.0.0-tb.xpi on TB2
Attached image always the same... (obsolete) —
Basically the same as before, but now it will throw an error box as soon as an illegal localpart is found on an address (v1 would have silently dropped this address!). This error box tells which address is affected.

Unfortunately, this error is recognized deep down in nsSmtpProtocol.cpp where we don't (always) (seem to) have a reference to the compose window, hence the error box (like other SMTP error boxes) is tied to the mailnews main window. :-/
OTOH, I could bring up an error box in the compose window context, but that wouldn't know which address is affected. ("One of the 200 recipients given has an illegal localpart. Happy searching!" probably isn't very helpful, thus I went the other way.)

The patch includes needed locale stuff for both SM and TB. 
Better wording proposals are welcome. ;-)
Attachment #576607 - Attachment is obsolete: true
Attachment #639947 - Attachment is obsolete: true
Attachment #652177 - Flags: superreview?(neil)
Attachment #652177 - Flags: review?(mbanner)
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

>-      // Extract just the mailboxes from the full RFC822 address list.
>-      // This means that people can post to mailto: URLs which contain
>-      // full RFC822 address specs, and we will still send the right
>-      // thing in the SMTP RCPT command.
>       if (!addrs1.IsEmpty())
>-        parser->ParseHeaderAddresses(addrs1.get(), nullptr, &addrs2,
>-                                     &m_addressesLeft);
>+        parser->ParseHeaderAddresses(addrs1.get(), nullptr, &addrs2, &m_addressesLeft);
So what happens if I try to send to mailto:"Karsten%20D%C3%BCsterloh"%20%3Bmnyromyr@tprac.de%3D (for example)? I haven't debugged the code but it looks as if this might possibly check for nonprintable characters before parsing the address out :-(
I think you mean URIs like 
    <mailto:%22Karsten%20D%C3%BCsterloh%22%20%3Cmnyromyr%40tprac.de%3E>?

If you run such URIs, they'll end up in the compose window, translated into proper Unicode. You might as well enter 
    Karsten Düsterloh <mnyromyr@tprac.de>
into the addressing field of the compose window.

Anyway, ExtractHeaderAddressMailboxes will extract only the mailbox part of the address and only that will be checked for illegal characters before the @.
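
As a rough illustration of the order of operations described above (decode the URI, extract the mailbox, then look only at the part before the @), here is a small sketch in plain JavaScript. It does not call ExtractHeaderAddressMailboxes or any other Mozilla code; decodeMailto, extractMailbox and hasNonAsciiLocalPart are hypothetical helpers, and the mailbox extraction is a naive regular expression rather than a real RFC 5322 parser.

```javascript
// Decode the percent-encoded part of a mailto: URI into Unicode text.
// Assumes the URI has no "?subject=..." style query part.
function decodeMailto(uri) {
  return decodeURIComponent(uri.replace(/^mailto:/i, ""));
}

// Pull the bare mailbox out of 'Display Name <mailbox>'.
// A bare address without angle brackets is returned as-is.
function extractMailbox(address) {
  const match = address.match(/<([^<>]+)>\s*$/);
  return (match ? match[1] : address).trim();
}

// Only the part before the last "@" is checked for non-ASCII characters.
// Assumes the mailbox contains an "@".
function hasNonAsciiLocalPart(mailbox) {
  const local = mailbox.slice(0, mailbox.lastIndexOf("@"));
  return /[^\x00-\x7f]/.test(local);
}

const uri = "mailto:%22Karsten%20D%C3%BCsterloh%22%20%3Cmnyromyr%40tprac.de%3E";
const decoded = decodeMailto(uri);
const mailbox = extractMailbox(decoded);
console.log(decoded);
console.log(mailbox, hasNonAsciiLocalPart(mailbox));
```

Because the display name is stripped before the check, the ü in the name does not count against the address.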

BTW: If anyone needs a valid testing address with IDN, feel free to use <bug127399@düsterloh.eu">bug127399@düsterloh.eu>. ;-)
(In reply to Karsten Düsterloh from comment #97)
> <bug127399@düsterloh.eu">bug127399@düsterloh.eu">bug127399@düsterloh.eu">bug127399@düsterloh.eu>. ;-)

Whatever Bugzilla did try here, it wasn't sane. ;-)
(In reply to Karsten Düsterloh from comment #98)
> Whatever Bugzilla did try here, it wasn't sane. ;-)

You might actually want to file a Bugzilla bug on that. :)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #99)
> You might actually want to file a Bugzilla bug on that. :)

Not while sleeping. :-P
→ bug 783771.
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

Your code just looks a bit unusual because of all the reparsing that's going on; first to extract the mailboxes, then to check for IDN, then to deduplicate, then to generate the final null-separated list.

Not that anyone's going to write a new header parser any time soon...
Attachment #652177 - Flags: superreview?(neil) → superreview+
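
For readers who want to picture the four passes mentioned in the review (extract mailboxes, handle IDN, de-duplicate, build a null-separated list), the following is a loose sketch. It assumes Node.js and its built-in url.domainToASCII; the function names are invented for this example, and the comma splitting is deliberately naive (a quoted display name containing a comma would break it).

```javascript
// Sketch only: Node.js built-ins, not the mailnews header parser.
const { domainToASCII } = require("url");

// Pass 1: reduce each 'Display Name <mailbox>' entry to the bare mailbox.
function extractMailboxes(header) {
  return header
    .split(",")
    .map((entry) => {
      const match = entry.match(/<([^<>]+)>/);
      return (match ? match[1] : entry).trim();
    })
    .filter((mailbox) => mailbox !== "");
}

// Pass 2: convert the domain part to ACE, leaving the local part alone.
function encodeDomain(mailbox) {
  const at = mailbox.lastIndexOf("@");
  return mailbox.slice(0, at + 1) + domainToASCII(mailbox.slice(at + 1));
}

// Passes 3 and 4: drop duplicates, then join with NUL separators.
function buildRecipientList(header) {
  const unique = [...new Set(extractMailboxes(header).map(encodeDomain))];
  return unique.join("\0");
}

const list = buildRecipientList(
  "Karsten <bug127399@düsterloh.eu>, bug127399@düsterloh.eu, a@example.com"
);
console.log(list.split("\0"));
```

De-duplicating after the ACE conversion means a Unicode spelling and its ACE spelling of the same domain collapse into one recipient.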
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

Trying David instead, as with v1.
Attachment #652177 - Flags: review?(mbanner) → review?(mozilla)
Attachment #652177 - Flags: review?(mbanner)
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

*sigh* Okay, Irving, you've won, then. ;-)
Attachment #652177 - Flags: review?(mozilla) → review?(irving)
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

Review of attachment 652177 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/locales/en-US/chrome/messenger/messengercompose/composeMsgs.properties
@@ +232,5 @@
>  12600=Unable to authenticate to SMTP server %S. It does not support authentication (SMTP-AUTH) but you have chosen to use authentication. Please change the 'Authentication method' to 'None' in the 'Account Settings | Outgoing Server (SMTP)' or contact your email service provider for instructions.
>  
> +## @name NS_ERROR_ILLEGAL_LOCALPART
> +# LOCALIZATION NOTE (12601): %s is an email address with an illegal localpart
> +12601=The recipient %s contains illegal characters in the localpart. Please correct the address and try again.

The RFC has local part as two words. How about

There are non-ASCII characters in the local part of the recipient address %s. This is not yet supported. Please change this address and try again.
(In reply to Magnus Melin from comment #104)
> There are non-ASCII characters in the local part of the recipient address
> %s. This is not yet supported. Please change this address and try again.

Fine by me …
I'd like to emphasize, btw, that the alert boxes here are only the last defense against illegal addresses. Optimally, callers won't let such addresses get down here anyway. 
(See comments #83 and #85.)
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

Review of attachment 652177 [details] [diff] [review]:
-----------------------------------------------------------------

Fix (or justify leaving off) the error handling, everything else looks good.

::: mailnews/compose/src/nsSmtpProtocol.cpp
@@ +1748,5 @@
> +      const char *start = i;           // first character of the current address
> +      const char *lastAt = nullptr;    // last @ character in the current address
> +      const char *firstEvil = nullptr; // first illegal character in the current address
> +      bool done = !*i;
> +      while (!done)      

trailing whitespace

@@ +1751,5 @@
> +      bool done = !*i;
> +      while (!done)      
> +      {
> +        done = !*i; // eos?
> +        if (done || *i == ',') 

trailing whitespace

@@ +1765,5 @@
> +            {
> +              // illegal char in the domain part, hence use ACE
> +              nsCAutoString domain;
> +              domain.Assign(lastAt + 1, i - lastAt - 1);
> +              converter->ConvertUTF8toACE(domain, domain);

nsresult rv = converter->ConvertUTF8toACE(...) and handle failures

@@ +1769,5 @@
> +              converter->ConvertUTF8toACE(domain, domain);
> +              addresses.Append(start, lastAt - start + 1);
> +              addresses.Append(domain);
> +              if (!done)
> +                addresses.Append(',');              

trailing whitespace
Attachment #652177 - Flags: review?(irving) → review-
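
The loop under review tracks the last @ and the first illegal character of each address, then decides between passing the address through, converting the domain, or rejecting it. A rough JavaScript rendering of that decision, with the requested failure handling added, might look like the sketch below. It is not the patch: it assumes Node.js, substitutes url.domainToASCII (which signals failure by returning an empty string) for the converter call, and prepareRecipients is a made-up name.

```javascript
// Sketch only: Node.js stand-in for the C++ loop in nsSmtpProtocol.cpp.
const { domainToASCII } = require("url");

// Takes a comma-separated list of bare mailboxes and returns an array in
// which IDN domains are replaced by their ACE form. Throws on anything it
// cannot fix instead of silently dropping the address.
function prepareRecipients(mailboxes) {
  const result = [];
  for (const address of mailboxes.split(",")) {
    const lastAt = address.lastIndexOf("@");

    // Index of the first character outside printable 7-bit ASCII, or -1.
    let firstEvil = -1;
    for (let i = 0; i < address.length; i++) {
      const code = address.charCodeAt(i);
      if (code < 0x21 || code > 0x7e) {
        firstEvil = i;
        break;
      }
    }

    if (firstEvil === -1) {
      result.push(address); // plain ASCII, nothing to do
    } else if (lastAt === -1 || firstEvil < lastAt) {
      throw new Error("illegal character in local part: " + address);
    } else {
      const ace = domainToASCII(address.slice(lastAt + 1));
      if (ace === "") {
        throw new Error("cannot convert domain to ACE: " + address);
      }
      result.push(address.slice(0, lastAt + 1) + ace);
    }
  }
  return result;
}

console.log(prepareRecipients("a@example.com,bug127399@düsterloh.eu"));
```

Checking the return value of the conversion is the part the review asks for; without it, a failed conversion would put a broken address into the recipient list.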
Comment on attachment 652177 [details] [diff] [review]
ACE for RCPT TO: and error boxes for illegal localparts

Review of attachment 652177 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/locales/en-US/chrome/messenger/messengercompose/composeMsgs.properties
@@ +232,5 @@
>  12600=Unable to authenticate to SMTP server %S. It does not support authentication (SMTP-AUTH) but you have chosen to use authentication. Please change the 'Authentication method' to 'None' in the 'Account Settings | Outgoing Server (SMTP)' or contact your email service provider for instructions.
>  
> +## @name NS_ERROR_ILLEGAL_LOCALPART
> +# LOCALIZATION NOTE (12601): %s is an email address with an illegal localpart
> +12601=The recipient %s contains illegal characters in the localpart. Please correct the address and try again.

Just a quick l10n fly-by: Please try and avoid using a number as the id here. We have open bugs on removing these, and l10n don't like them. If it means that you need to fix the other bug to not use the number, then I'd be ok with landing it, but please take a look at avoiding it first.
Attachment #652177 - Flags: review?(mbanner)
Attached patch more error checking (obsolete) — Splinter Review
Changes:
- checking for conversion errors UTF-8 → ACE
- string id for alert text (I took the liberty to use correct indentation already since the code around it will have to change anyway when killing the other numeric ids)
- alert text as per Magnus' suggestion
- minor comment and whitespace fixup
Carrying over sr+.
Attachment #652177 - Attachment is obsolete: true
Attachment #679873 - Flags: superreview+
Attachment #679873 - Flags: review?(irving)
Attachment #679873 - Flags: review?(mbanner)
Comment on attachment 679873 [details] [diff] [review]
more error checking

Review of attachment 679873 [details] [diff] [review]:
-----------------------------------------------------------------

I'd love to see an xpcshell test case for this, but it would probably involve the SMTP fakeserver unless there's a way to get at the envelope recipient from JavaScript.

r=me on the patch itself in any case.
Attachment #679873 - Flags: review?(irving) → review+
Comment on attachment 679873 [details] [diff] [review]
more error checking

Review of attachment 679873 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsSmtpProtocol.cpp
@@ +94,5 @@
> +    case NS_ERROR_ILLEGAL_LOCALPART:
> +      bundle->GetStringFromName(
> +        NS_LITERAL_STRING("errorIllegalLocalPart").get(),
> +        getter_Copies(eMsg));
> +      msg = nsTextFormatter::vsmprintf(eMsg.get(), args);

Ok, this should probably be using formatStringFromName on the bundle, but seeing as the rest of the function is done this way, I think using vsmprintf is fine.

r=me for the string parts.
Attachment #679873 - Flags: review?(mbanner) → review+
(In reply to Irving Reid (:irving) from comment #109)
> I'd love to see an xpcshell test case for this

I do have a test for the convert-the-domain-part-to-ACE part, but I'm stuck at testing the invalid-character-in-the-local-part case: the backend wants to show a message box window, uncatchably crashing the test:

WARNING: NS_ENSURE_TRUE(m_callbacks) failed: file /home/kd/projects/mozilla/mozilla.org/src/trunk/mailnews/compose/src/nsSmtpUrl.cpp, line 845
###!!! ASSERTION: attempted to open a new window with no WindowCreator: 'mWindowCreator', file /home/kd/projects/mozilla/mozilla.org/src/trunk/mozilla/embedding/components/windowwatcher/src/nsWindowWatcher.cpp, line 707

Any suggestions?
If it is one of the "standard" alert dialogs, then you should be able to override it with something from this:

http://mxr.mozilla.org/comm-central/source/mailnews/test/resources/alertTestUtils.js

There's a few different objects we use in there in different places, you'll find examples throughout the xpcshell tests.
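
The general idea of overriding a dialog in a test can be sketched without any Mozilla code. The example below is not the alertTestUtils.js API; makeRecordingPrompt and checkRecipient are hypothetical names, and the point is only that the code under test reports through an object the test controls, so no window is ever opened.

```javascript
// Hypothetical stand-in for a prompt service: it records alerts instead of
// opening a window.
function makeRecordingPrompt() {
  const alerts = [];
  return {
    alerts,
    alert(title, text) {
      alerts.push({ title, text });
    },
  };
}

// Hypothetical code under test: reports a non-ASCII local part through
// whatever prompt object it is handed and refuses the address.
function checkRecipient(address, prompt) {
  const local = address.slice(0, address.lastIndexOf("@"));
  if (/[^\x00-\x7f]/.test(local)) {
    prompt.alert(
      "Send Message Error",
      "There are non-ASCII characters in the local part of the recipient address " +
        address + "."
    );
    return false;
  }
  return true;
}

const prompt = makeRecordingPrompt();
console.log(checkRecipient("düsterloh@example.com", prompt), prompt.alerts.length);
```

A test can then assert on the recorded alerts rather than on a visible message box.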
Attached patch xpcshell test (obsolete) — Splinter Review
(In reply to Mark Banner (:standard8) from comment #112)
> There's a few different objects we use in there in different places, you'll
> find examples throughout the xpcshell tests.

Thanks!
Attachment #682929 - Flags: review?(irving)
Comment on attachment 682929 [details] [diff] [review]
xpcshell test

Review of attachment 682929 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/test/unit/test_bug127399.js
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */

Please use a more descriptive file name than the bug number.

@@ +175,5 @@
> +  type = "bug127399";
> +  registerAlertTestUtils();
> +
> +  var strBundleService = Cc["@mozilla.org/intl/stringbundle;1"]
> +                           .getService(Ci.nsIStringBundleService);

Services.strings

@@ +183,5 @@
> +
> +  // Ensure we have at least one mail account
> +  loadLocalMailAccount();
> +  var acctMgr = Cc["@mozilla.org/messenger/account-manager;1"]
> +                  .getService(Ci.nsIMsgAccountManager);

MailServices.accounts
Comment on attachment 682929 [details] [diff] [review]
xpcshell test

Review of attachment 682929 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the extra review Magnus. I agree about the test name; while we have many existing test cases named after bugs, we're trying to move away from that.

I pushed a try build just to make sure there aren't any platform specific timing issues; as long as that's OK, r=irving after you make Magnus' changes and fix the nits below.

::: mailnews/compose/test/unit/test_bug127399.js
@@ +169,5 @@
> +  do_check_eq(exceptionCaught, aExceptionExpected);
> +}
> +
> +
> +function run_test() 

Trailing white space

@@ +192,5 @@
> +  // Plain ASCII recipient address.
> +  test = kToASCII;
> +  DoSendTest(kToASCII, kToASCII, 0);
> +
> +  // Test 2: 

Trailing white space

@@ +202,5 @@
> +  // transaction (only), i.e. the To: header will stay as stated by the sender.
> +  test = kToValid;
> +  DoSendTest(kToValid, kToValidACE, 0);
> +
> +  // Test 3: 

Trailing white space

@@ +208,5 @@
> +  // allowed with unextended SMTP.
> +  // The old code would just strip the invalid character and try to send the
> +  // message to the remaining - wrong! - address.
> +  // The new code will present an informational message box and deny sending,
> +  // i.e. the test here will crash because we can't open windows from here.

Is this line of the comment still correct?
Attachment #682929 - Flags: review?(irving) → review+
Try run for 623c9f75fca9 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=623c9f75fca9
Results (out of 24 total builds):
    exception: 2
    success: 13
    warnings: 6
    failure: 3
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ireid@mozilla.com-623c9f75fca9
Comment on attachment 682929 [details] [diff] [review]
xpcshell test

Try build tests on Windows failed because of in-line UTF-8 text in the .js tests
Attachment #682929 - Flags: review+ → review-
Contrary to comment #118, the actual cause of the Windows test failures is not the xpcshell test code but the Windows implementation of the isprint function in the original patch. (Kudos to mcsmurf for letting me use him as a debugger!)

This patch combines a strict RFC 5322 conformance check (instead of using the isprint function) with an xpcshell test updated to address the review comments.
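
For illustration, a strict check that does not depend on isprint (whose result varies with the C locale) can be written as an explicit character whitelist. The sketch below tests a local part against the dot-atom form built from the atext characters listed in RFC 5322, section 3.2.3. It is an assumption about what "strict" could mean here, not a copy of the landed check, and it deliberately ignores quoted-string local parts; isDotAtomLocalPart is a made-up name.

```javascript
// atext characters as listed in RFC 5322, section 3.2.3.
const ATEXT = "[A-Za-z0-9!#$%&'*+\\-/=?^_`{|}~]";

// dot-atom-text: one or more atext runs separated by single dots.
const DOT_ATOM = new RegExp("^" + ATEXT + "+(?:\\." + ATEXT + "+)*$");

// True if the local part is a plain dot-atom made of 7-bit atext only.
function isDotAtomLocalPart(localPart) {
  return DOT_ATOM.test(localPart);
}

console.log(isDotAtomLocalPart("bug127399"));
console.log(isDotAtomLocalPart("düsterloh"));
```

Because every accepted character is spelled out, the result is the same on every platform and locale.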

Try-server build logs under https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=6c313f632acd show that this makes the xpcshell test pass on Windows as well (the test failures are not related to this patch and test).

Carrying over Neil's sr+ and Mark's string r+.
Attachment #679873 - Attachment is obsolete: true
Attachment #682929 - Attachment is obsolete: true
Attachment #684530 - Flags: superreview+
Attachment #684530 - Flags: review?(irving)
Comment on attachment 684530 [details] [diff] [review]
combined patch + xpcshell test

Review of attachment 684530 [details] [diff] [review]:
-----------------------------------------------------------------

It's a fine patch.
Attachment #684530 - Flags: review?(irving) → review+
<http://hg.mozilla.org/comm-central/rev/5a397838e33e>
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: helpwanted
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 20.0
Depends on: 856506