Closed Bug 854812 Opened 7 years ago Closed 7 years ago

Do not allow email with sub-domains or tld starting or ending with a '-' in (<input type='email'>)

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: Ms2ger, Assigned: alice.lieutier)

Details

Attachments

(1 file, 2 obsolete files)

From what I understand, the change is basically that a domain or tld part cannot start or end with a dash.

Here are the regexps:
old one:
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/

new one:
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,253}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,253}[a-zA-Z0-9])?)*$/
Doesn't that limit the size of some parts of the email? I guess {0,253} is there for that, right?
Yes sorry. So, the domain part should be:
[a-zA-Z0-9]              // one alphanumeric
(?:                      // possibly followed by  
 [a-zA-Z0-9-]{0,253}       // 0 to 253 alphanumeric or dash
 [a-zA-Z0-9]               // and one alphanumeric
)?
(?:                      // possibly followed by
  \.                       // one dot
  [a-zA-Z0-9]              // one alphanumeric
  (?:                      // possibly followed by
    [a-zA-Z0-9-]{0,253}       // 0 to 253 alphanumeric or dash
    [a-zA-Z0-9])              // and one alphanumeric
  ?)

So the total length of the domain part would be (255 + 1 + 255) = 511 characters.

Incidentally, the minimum length would be one alphanumeric character.
The last bit being a star, you can actually have as many tld parts as you want, so it's more like:
[a-zA-Z0-9]              // one alphanumeric
(?:                      // possibly followed by  
 [a-zA-Z0-9-]{0,253}       // 0 to 253 alphanumeric or dash
 [a-zA-Z0-9]               // and one alphanumeric
)?
(?:                      // possibly followed by as many as you want of
  \.                       // one dot
  [a-zA-Z0-9]              // one alphanumeric
  (?:                      // possibly followed by
    [a-zA-Z0-9-]{0,253}       // 0 to 253 alphanumeric or dash
    [a-zA-Z0-9]               // and one alphanumeric
  )?
)*

So foo@f.f.f.f.f is permitted, as there is no actual limit to the total length. Just that a part between two dots should not be longer than 255.
Attached patch updated code and test file (obsolete) — Splinter Review
This patch should cover the new rules. I updated the testfile to add some nez testcases.
Attachment #734268 - Flags: review?(mounir)
Comment on attachment 734268 [details] [diff] [review]
updated code and test file

Ms2ger, would you do the honour? :)
Attachment #734268 - Flags: review?(mounir) → review?(Ms2ger)
Assignee: nobody → alice.lieutier
Status: NEW → ASSIGNED
Comment on attachment 734268 [details] [diff] [review]
updated code and test file

Review of attachment 734268 [details] [diff] [review]:
-----------------------------------------------------------------

In my opinion the spec changes that requires label size limitation is wrong. Punydecoding should already be doing that check (though, we ignore it) and in that case, the limit is 63 characters, not 255. I think we should fix the HTML specification and implement it another way.

I think we should avoid any size checking for the moment, see if Hixie agrees to fix the specification and then rely on punydecoding failures.

Alice, could you attach a new patch based on those comments?
Ms2ger, do you agree with that?

::: content/html/content/src/HTMLInputElement.cpp
@@ +5742,2 @@
>        // A dot can't follow a dot.
> +      // A dot can't follow a dash.

// A dot can't follow a dot or a dash.

::: content/html/content/test/forms/test_input_email.html
@@ +112,5 @@
>    // Long strings with UTF-8.
>    [ 'this.is.email.should.be.longer.than.sixty.four.characters.föö@mözillä.tld', true ],
>    [ 'this-is-email-should-be-longer-than-sixty-four-characters-föö@mözillä.tld', true, true ],
> +  // cover the spec change
> +  [ 'foo@f', true ],

This is already tested above (foo@b).

@@ +113,5 @@
>    [ 'this.is.email.should.be.longer.than.sixty.four.characters.föö@mözillä.tld', true ],
>    [ 'this-is-email-should-be-longer-than-sixty-four-characters-föö@mözillä.tld', true, true ],
> +  // cover the spec change
> +  [ 'foo@f', true ],
> +  [ 'foo@f.f.f', true ],

such as this (foo@tulip.foo.bar).
FWIW, I have open a bug against the specifications regarding the label size limitation:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=21617
Attached patch updated patch (obsolete) — Splinter Review
I updated the patch according to Mounir comments. 

We do not check for length anymore, so the only change is to check that any label of the domain part does not start or end with a dash.
Attachment #734268 - Attachment is obsolete: true
Attachment #734268 - Flags: review?(Ms2ger)
Attachment #735439 - Flags: review?(Ms2ger)
Comment on attachment 735439 [details] [diff] [review]
updated patch

Review of attachment 735439 [details] [diff] [review]:
-----------------------------------------------------------------

I'm sorry, I don't see myself having time to dive into this code long enough to understand it well enough to review your patch.
Attachment #735439 - Flags: review?(Ms2ger)
Comment on attachment 735439 [details] [diff] [review]
updated patch

Will do that.
Attachment #735439 - Flags: review?(mounir)
Comment on attachment 735439 [details] [diff] [review]
updated patch

Kyle, if you could double-check that patch, I wouldn't mind :)
Attachment #735439 - Flags: review?(khuey)
Comment on attachment 735439 [details] [diff] [review]
updated patch

Review of attachment 735439 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comments applied, also please change the description of the comment to something that describe better the content of the patch.

::: content/html/content/src/HTMLInputElement.cpp
@@ +5698,1 @@
>    // we know it's invalid.

Could you reformat the comment to have lines that a wrapped at 80 columns?

@@ +5719,1 @@
>    // that's not a valid email address.

ditto

::: content/html/content/test/forms/test_input_email.html
@@ +111,5 @@
>    [ 'foo@fu.cüm', true ],
>    // Long strings with UTF-8.
>    [ 'this.is.email.should.be.longer.than.sixty.four.characters.föö@mözillä.tld', true ],
>    [ 'this-is-email-should-be-longer-than-sixty-four-characters-föö@mözillä.tld', true, true ],
> +  // cover the spec change

I would put a comment more like:
"// The domains parts (sub-domains or tld) can't start or finish with a '-'."
Attachment #735439 - Flags: review?(mounir) → review+
Summary: Update email pattern to spec change → Do not allow email with sub-domains or tld starting or ending with a '-' in (<input type='email'>)
Comment on attachment 735439 [details] [diff] [review]
updated patch

Review of attachment 735439 [details] [diff] [review]:
-----------------------------------------------------------------

If we haven't settled the length question yet this looks fine.  r=me
Attachment #735439 - Flags: review?(khuey) → review+
Thanks for the review Kyle :)
I applied the review comments from Mounir.

I don't have sufficient rights to send this to Try, can one of you do it?
Attachment #735439 - Attachment is obsolete: true
https://hg.mozilla.org/integration/mozilla-inbound/rev/17f9958f5d99
Flags: in-testsuite+
Target Milestone: --- → mozilla23
https://hg.mozilla.org/mozilla-central/rev/17f9958f5d99
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
The size limitation bug is bug 884332.
You need to log in before you can comment on or make changes to this bug.