Support internationalized <input type="email">

RESOLVED FIXED in mozilla13

Status

Core
DOM: Core & HTML
RESOLVED FIXED
7 years ago
5 years ago

People

(Reporter: Daniel.S, Assigned: mounir)

Tracking

(Blocks: 1 bug, {intl})

Trunk
mozilla13
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [parity-opera])

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

7 years ago
I noticed that input@type=email doesn't seem to support IDNs. For example, the address "Max.Müller@example.org" isn't accepted as a valid mail address because of the u umlaut.
However, that string is a valid e-mail address and addresses like this get used more and more.

I'm sorry if this is too similar to issues like bug 127399 or bug 410763 (which are Thunderbird issues).
Note that this is purely a UI issue; "Max.Müller@example.org" is not an acceptable value for an email input. However, we should show "Max.Müller@example.org" while keeping "Max.xn--mller-kva@example.org" as the actual value.

Hence, sending to Firefox. Mounir, do move it to a better place if you know one.
Component: DOM: Core & HTML → General
Product: Core → Firefox
QA Contact: general → general
(Assignee)

Comment 2

7 years ago
I'm not sure where this should be done but probably not in the Firefox component. I guess this could be done by the content so every time a value is set, it will be converted to the correct IDN string. But the editor should be able to understand that too.

Jonas, Ehsan, opinions?
Component: General → General
Product: Firefox → Core
QA Contact: general → general
Mounir, what does the spec say about this?  We can go through hoops to make the underlying value of the editor to be the punycode and the displayed value being the Unicode value, but I think this is something that the spec needs to address.  Also, the same thing goes with <input type=url>, right?
Component: General → DOM: Core & HTML
QA Contact: general → general
Summary: <input type="email"> doesn't support IDNs → <input type="email"> and <input type="url"> don't support IDNs
(In reply to comment #3)
> Mounir, what does the spec say about this?  We can go through hoops to make the
> underlying value of the editor to be the punycode and the displayed value being
> the Unicode value, but I think this is something that the spec needs to
> address.  Also, the same thing goes with <input type=url>, right?

That's what the spec calls for.
(In reply to comment #4)
> (In reply to comment #3)
> > Mounir, what does the spec say about this?  We can go through hoops to make the
> > underlying value of the editor to be the punycode and the displayed value being
> > the Unicode value, but I think this is something that the spec needs to
> > address.  Also, the same thing goes with <input type=url>, right?
> 
> That's what the spec calls for.

Could you please post a link?
> User agents may transform the value for display and editing (e.g. converting
> punycode in the value to IDN in the display and vice versa).

<http://www.whatwg.org/html/#e-mail-state>
Duplicate of this bug: 623120
So, if I'm reading the spec correctly, we should return the punycode from nsHTMLInputElement::GetValue, and we can continue using the IDN variation for display and editing.  What do you think, Mounir?
Also, we should convert to punycode for validation...  It seems to me that all of the punycode conversion needs to happen in the content, and it doesn't need to change anything on the editor side.
Could we simply add two functions like:

GetDisplayValue/SetDisplayValue, which puny-decode and puny-encode respectively. Then whenever the editor picks up the value from the input element it uses GetDisplayValue, and whenever the editor wants to poke the value back into the input it uses SetDisplayValue.

That way the content side of things always deals with the value which the DOM and submission code uses. And editor always sees a user-friendly value. And those two functions handle the conversion in between.
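
A minimal sketch of the two-accessor idea described above, assuming the conversions are supplied from outside. The class name EmailFieldValue and the Converter type are hypothetical and are not Gecko APIs; in Gecko the conversion would presumably be delegated to something like nsIIDNService.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical converter type: maps one string form to the other.
using Converter = std::function<std::string(const std::string&)>;

// Hypothetical holder for an email field's value (not a Gecko class).
class EmailFieldValue {
 public:
  EmailFieldValue(Converter aToAscii, Converter aToUnicode)
      : mToAscii(std::move(aToAscii)), mToUnicode(std::move(aToUnicode)) {
    assert(mToAscii && mToUnicode);
  }

  // Content side: the DOM value and form submission see the encoded form.
  const std::string& GetValue() const { return mValue; }
  void SetValue(const std::string& aValue) { mValue = aValue; }

  // Editor side: the user sees and edits the decoded form.
  std::string GetDisplayValue() const { return mToUnicode(mValue); }
  void SetDisplayValue(const std::string& aDisplay) {
    mValue = mToAscii(aDisplay);
  }

 private:
  Converter mToAscii;
  Converter mToUnicode;
  std::string mValue;  // always stored in the encoded form
};
```

Only the encoded form is stored, so the content side never has to know which form the editor is showing.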

Later we can expand those functions to deal with comma-separation issues for multiple email addresses etc.
(Assignee)

Comment 11

6 years ago
I agree with Jonas: it would be better to have the content only dealing with puny-encoded values and have the editor requesting puny-decoded value and setting puny-encoded ones.

Do we want to fix this for Gecko 2.0?
(Assignee)

Comment 12

6 years ago
(In reply to comment #10)
> Later we can expand those functions to deal with comma-separation issues for
> multiple email addresses etc.

I don't think we would be able to use those functions for comma-separation issues. Punycode creates a unique and reversible code but the comma-separation doesn't.
IOW: "foo@bar.com, bar@bar.com" AND "   foo@bar.com   , bar@bar.com   " will have the exact same DOM value: "foo@bar.com,bar@bar.com".
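
To illustrate why the comma-separated case is not reversible, here is a rough sketch of such a normalization. NormalizeEmailList and Trim are hypothetical helper names, and the exact whitespace set is an assumption; the point is only that distinct inputs collapse to one value, so the original spacing cannot be recovered.

```cpp
#include <string>

// Hypothetical helper: strip leading and trailing whitespace.
std::string Trim(const std::string& aText) {
  const char* kWhitespace = " \t\n\f\r";
  const std::string::size_type begin = aText.find_first_not_of(kWhitespace);
  if (begin == std::string::npos) {
    return "";
  }
  const std::string::size_type end = aText.find_last_not_of(kWhitespace);
  return aText.substr(begin, end - begin + 1);
}

// Hypothetical helper: split on commas, trim each entry, re-join with ",".
std::string NormalizeEmailList(const std::string& aInput) {
  std::string result;
  std::string::size_type start = 0;
  bool first = true;
  while (true) {
    const std::string::size_type comma = aInput.find(',', start);
    const std::string::size_type length =
        comma == std::string::npos ? std::string::npos : comma - start;
    if (!first) {
      result += ',';
    }
    result += Trim(aInput.substr(start, length));
    first = false;
    if (comma == std::string::npos) {
      break;
    }
    start = comma + 1;
  }
  return result;
}
```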
(In reply to comment #11)
> I agree with Jonas: it would be better to have the content only dealing with
> puny-encoded values and have the editor requesting puny-decoded value and
> setting puny-encoded ones.
> 
> Do we want to fix this for Gecko 2.0?

I'd say yes.  In their current form, these input fields are pretty broken for international users.
blocking2.0: --- → ?
Keywords: intl
Sicking convinced me that this shouldn't block.  Honestly, at this point, it doesn't take a lot to convince me that _any_ bug shouldn't block!  ;-)
blocking2.0: ? → ---
(Assignee)

Comment 15

6 years ago
Do we have generic puny-encode / puny-decode methods? I didn't see anything except GetASCIIOrigin and GetUTFOrigin which are far from being generic.
nsIIDNService should have useful things on it.
(Reporter)

Updated

6 years ago
Blocks: 344614
Whiteboard: [parity-opera]
(Assignee)

Comment 17

6 years ago
Created attachment 543018 [details] [diff] [review]
Proof of Concept

I think we over-complicated this in the comments. The easiest solution is to transform to punycode only when we want to validate the value or submit it.

A few comments about this patch:
- I think it would be better to add convertUTF8toACE and convertUTF16toACE methods taking an inout argument to nsIIDNService, but I'm not sure if there are specific reasons why it wasn't done that way initially;
- nsIIOService::NewURI might punyencode the value already. At least, <input type='url'> accepts values with UTF8 characters;
- Tests are missing.
Assignee: nobody → mounir
Status: NEW → ASSIGNED
Attachment #543018 - Flags: feedback?(jonas)
Comment on attachment 543018 [details] [diff] [review]
Proof of Concept

Review of attachment 543018 [details] [diff] [review]:
-----------------------------------------------------------------

Don't you also need to return punycode for .value? I think you can add that conversion to G/SetValue though. But I like the general approach.

::: content/base/public/nsContentUtils.h
@@ +1723,5 @@
>     */
>    static void InitializeTouchEventTable();
> +
> +  static void TransformToPunycode(nsAString& aValue);
> +  static void TransformToPunycode(nsACString& aValue);

I'd rather this didn't use in-out parameters and instead used separate in and out arguments.
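
For illustration only, a self-contained sketch of what a conversion with separate in and out arguments could look like. PunycodeEncodeLabel is a hypothetical name, not the function from the patch. It encodes a single label with the RFC 3492 algorithm and skips the normalization (nameprep) steps a real IDN conversion performs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace {
// Punycode parameters from RFC 3492, section 5.
const uint32_t kBase = 36, kTMin = 1, kTMax = 26, kSkew = 38, kDamp = 700;
const uint32_t kInitialBias = 72, kInitialN = 128;

// Digits 0..25 map to 'a'..'z', digits 26..35 map to '0'..'9'.
char EncodeDigit(uint32_t aDigit) {
  return aDigit < 26 ? char('a' + aDigit) : char('0' + (aDigit - 26));
}

// Bias adaptation from RFC 3492, section 6.1.
uint32_t Adapt(uint32_t aDelta, uint32_t aNumPoints, bool aFirst) {
  assert(aNumPoints > 0);
  aDelta = aFirst ? aDelta / kDamp : aDelta / 2;
  aDelta += aDelta / aNumPoints;
  uint32_t k = 0;
  while (aDelta > ((kBase - kTMin) * kTMax) / 2) {
    aDelta /= (kBase - kTMin);
    k += kBase;
  }
  return k + ((kBase - kTMin + 1) * aDelta) / (aDelta + kSkew);
}
}  // namespace

// Hypothetical signature: input and output are separate arguments.
// Returns false on arithmetic overflow; aOutput is only written on success.
bool PunycodeEncodeLabel(const std::u32string& aInput, std::string& aOutput) {
  std::string out;
  for (char32_t c : aInput) {
    if (c < 0x80) out.push_back(char(c));  // copy basic code points first
  }
  const uint32_t basic = static_cast<uint32_t>(out.size());
  if (basic == aInput.size()) {  // pure ASCII labels are left unchanged
    aOutput = out;
    return true;
  }
  if (basic > 0) out.push_back('-');
  uint32_t n = kInitialN, delta = 0, bias = kInitialBias, handled = basic;
  while (handled < aInput.size()) {
    uint32_t m = UINT32_MAX;  // smallest unhandled code point >= n
    for (char32_t c : aInput) {
      if (c >= n && c < m) m = c;
    }
    if (m - n > (UINT32_MAX - delta) / (handled + 1)) return false;
    delta += (m - n) * (handled + 1);
    n = m;
    for (char32_t c : aInput) {
      if (c < n && ++delta == 0) return false;
      if (c == n) {
        uint32_t q = delta;  // emit delta as a variable-length integer
        for (uint32_t k = kBase;; k += kBase) {
          const uint32_t t =
              k <= bias ? kTMin : (k >= bias + kTMax ? kTMax : k - bias);
          if (q < t) break;
          out.push_back(EncodeDigit(t + (q - t) % (kBase - t)));
          q = (q - t) / (kBase - t);
        }
        out.push_back(EncodeDigit(q));
        bias = Adapt(delta, handled + 1, handled == basic);
        delta = 0;
        ++handled;
      }
    }
    ++delta;
    ++n;
  }
  aOutput = "xn--" + out;
  return true;
}
```

With this shape the caller's input is never modified, and a failed conversion leaves the output untouched.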
Attachment #543018 - Flags: feedback?(jonas) → feedback+
(Assignee)

Comment 19

6 years ago
This patch is a bit wrong: the spec doesn't ask us to submit punycoded values for url fields, and these fields actually handle UTF-8 values very well.
I've opened bug 670883 to test UTF-8 values for <input type=url>.
Summary: <input type="email"> and <input type="url"> don't support IDNs → Support internationalized <input type="email">
(Assignee)

Comment 20

6 years ago
I reopened the W3 bug because I do not agree with the resolution: there is no reason to submit a punycoded value for <input type='email'> when we submit UTF-8 values for <input type='text'>. Even if SMTP servers might not accept UTF-8 email addresses, websites are already used to handling that situation. No need to be over-protective, I believe.

We should only allow/validate UTF-8 values.
(Assignee)

Comment 21

6 years ago
Created attachment 545359 [details] [diff] [review]
Patch v1

Puny-encode value before validating.

I will have to open a follow-up because if an email address is longer than 63 characters without "." and contains UTF-8 characters, it will not be validated because of an nsIIDNService implementation limitation related to DNS.
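
As a rough illustration of the limitation mentioned above: DNS labels are limited to 63 octets, so any dot-free run longer than that cannot pass a DNS-oriented conversion. HasOverlongLabel is a hypothetical helper, and how nsIIDNService actually applies the limit is an assumption here.

```cpp
#include <cstddef>
#include <string>

// DNS limits each label (the text between dots) to 63 octets.
constexpr std::size_t kMaxDnsLabelLength = 63;

// Hypothetical helper: true if any dot-separated segment exceeds the limit.
bool HasOverlongLabel(const std::string& aText) {
  std::size_t start = 0;
  while (true) {
    const std::size_t dot = aText.find('.', start);
    const std::size_t end = dot == std::string::npos ? aText.size() : dot;
    if (end - start > kMaxDnsLabelLength) {
      return true;
    }
    if (dot == std::string::npos) {
      return false;
    }
    start = dot + 1;
  }
}
```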
Attachment #543018 - Attachment is obsolete: true
Attachment #545359 - Flags: review?(jonas)
(Assignee)

Updated

6 years ago
Whiteboard: [parity-opera] → [parity-opera][needs review]
sicking: ping?
Comment on attachment 545359 [details] [diff] [review]
Patch v1

Sorry about the extreme slowness :(

Hot chocolate is on me.
Attachment #545359 - Flags: review?(jonas) → review+
(Assignee)

Updated

5 years ago
Flags: in-testsuite+
Whiteboard: [parity-opera][needs review] → [parity-opera]
Target Milestone: --- → mozilla13
(Assignee)

Updated

5 years ago
Attachment #545359 - Flags: checkin+
https://hg.mozilla.org/mozilla-central/rev/34d97151ab88
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED