Closed
Bug 656009
Opened 13 years ago
Closed 13 years ago
Email address validation doesn't match HTML5 spec - presence of \
Categories
(Core :: DOM: Core & HTML, defect)
RESOLVED
INVALID
People
(Reporter: gerv, Unassigned)
Details
tl;dr:
- Usernames can include backslashes in our code, but not in HTML5.

http://mxr.mozilla.org/mozilla-central/source/content/html/content/src/nsHTMLInputElement.cpp#4018
vs.
http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#e-mail-state

HTML5 says:

"1*( atext / "." ) "@" ldh-str *( "." ldh-str )
where atext is defined in RFC 5322 section 3.2.3, and ldh-str is defined in RFC 1034 section 3.5."

Mining those RFCs:

atext = ALPHA / DIGIT /    ; Printable US-ASCII
        "!" / "#" /        ;  characters not including
        "$" / "%" /        ;  specials. Used for atoms.
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"

<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig-hyp> ::= <let-dig> | "-"
<let-dig> ::= <letter> | <digit>
<letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case
<digit> ::= any one of the ten digits 0 through 9

That leads to the following, if I'm not mistaken:

/^[a-z0-9.!#$%&'*+\-/=?\^_`{|}~]+@[a-z0-9-]+(\.[a-z0-9-]+)*$/i

which can be reduced to:

/^[\w.!#$%&'*+\-/=?\^`{|}~]+@[a-z0-9-]+(\.[a-z0-9-]+)*$/i

I note a discrepancy: usernames can include backslashes in our code, but not in HTML5.

Also, do we really not have access to a regex engine in this code? Surely that would be more efficient, even if it involved a call to JS?

Perhaps it's a different issue, but this is not very forwardly-compatible - isn't someone working on a spec for allowing Unicode characters in the local part?

Gerv
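To make the derived pattern easy to try out, here is a minimal sketch that wraps the first regex from the description in a function and runs it over a few inputs. The function name and the sample addresses are made up for illustration; only the pattern itself comes from the description, and nothing here is taken from the Gecko code.

```javascript
// Regex from the description, transcribed from the HTML5 e-mail grammar:
//   1*( atext / "." ) "@" ldh-str *( "." ldh-str )
const HTML5_EMAIL = /^[a-z0-9.!#$%&'*+\-/=?\^_`{|}~]+@[a-z0-9-]+(\.[a-z0-9-]+)*$/i;

// Hypothetical helper name, used only for this sketch.
function isValidEmailPerHtml5(value) {
  return HTML5_EMAIL.test(value);
}

// Sample inputs chosen for illustration. "foo\\bar" is the JS spelling of foo\bar.
const samples = [
  "gerv@example.org",
  "o'brien+tag@mail.example.com",
  "foo\\bar@mail.com",
  "foo@bar_baz.com",
  "no-at-sign",
];

for (const sample of samples) {
  console.log(sample, isValidEmailPerHtml5(sample));
}
```

With this pattern a backslash in the local part is rejected, as is an underscore in the domain, since neither character appears in the respective character class.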
Comment 1•13 years ago
Actually, there is no '\' allowed. What you thought was a '\' is actually a '\''. I had to escape the '.

You can easily test that with this data URL:
data:text/html,<style>:invalid { box-shadow: 0 0 1.5px 1px red; }</style><input type='email' value='foo\bar@mail.com'>

And I do not think that we need to use a regexp for the email address. We are using a regexp for the pattern attribute (through the JS engine), so it's technically doable, but it doesn't seem useful given that this code is readable right now.

For IDN, the spec requires punycode, so the validation isn't going to change AFAIUI.
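The escaping point can be shown in isolation. This is a small sketch in JavaScript rather than the C++ of the Gecko source, so treat it as an analogy: in both languages a single-quoted literal has to escape the apostrophe, which puts a backslash in the source text but not in the value. The variable name is made up for illustration.

```javascript
// The source text contains a backslash because the apostrophe must be
// escaped inside single quotes; the resulting value does not contain one.
const apostrophe = '\'';

console.log(apostrophe.length);          // number of characters in the value
console.log(apostrophe === "'");         // whether the value is the apostrophe itself
console.log(apostrophe.includes("\\"));  // whether the value contains a backslash
```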
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID
Updated•13 years ago
Component: Layout: Form Controls → DOM: Core & HTML
QA Contact: layout.form-controls → general
Version: unspecified → Trunk
Comment 2 (Reporter)•13 years ago
You are quite right - fair enough :-) Thanks.

My concern about using a regexp engine is that this code performs 2 function calls and 20 comparisons per character for the local part, and 2 function calls per character for the domain part, and the new definitions of moz-is-valid and so on do, in some cases, cause validity to be checked after each change to the input field. Perhaps it's still so tiny as to make no difference, I don't know. But it seems inefficient to me :-)

Gerv
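For readers who want to see what a hand-written, per-character check of this grammar looks like, here is a minimal sketch. It is not the Gecko implementation (that code is C++ and is only linked, not quoted, in this bug); all names below are hypothetical, and the logic simply follows the HTML5 grammar quoted in the description.

```javascript
// Characters allowed in the local part besides letters and digits:
// the non-alphanumeric atext characters from RFC 5322, plus ".".
const LOCAL_EXTRA = ".!#$%&'*+-/=?^_`{|}~";

// Hypothetical helper: true for ASCII letters and digits only.
function isAsciiAlnum(ch) {
  return (ch >= "a" && ch <= "z") || (ch >= "A" && ch <= "Z") || (ch >= "0" && ch <= "9");
}

// Hypothetical validator following 1*( atext / "." ) "@" ldh-str *( "." ldh-str ).
function isValidEmailByHand(value) {
  const at = value.indexOf("@");
  // Need a non-empty local part and something after the "@".
  if (at <= 0 || at === value.length - 1) return false;

  // Local part: every character must be alphanumeric or in LOCAL_EXTRA.
  for (let i = 0; i < at; i++) {
    const ch = value[i];
    if (!isAsciiAlnum(ch) && !LOCAL_EXTRA.includes(ch)) return false;
  }

  // Domain: non-empty labels of letters, digits, and hyphens, separated by dots.
  let labelLength = 0;
  for (let i = at + 1; i < value.length; i++) {
    const ch = value[i];
    if (ch === ".") {
      if (labelLength === 0) return false;
      labelLength = 0;
    } else if (isAsciiAlnum(ch) || ch === "-") {
      labelLength++;
    } else {
      return false;
    }
  }
  return labelLength > 0;
}

console.log(isValidEmailByHand("gerv@example.org"));
console.log(isValidEmailByHand("foo\\bar@mail.com"));
```

The extra-character set in this sketch has 20 entries, so a naive membership test can cost up to that many comparisons per local-part character, which is the kind of per-character work the comment above is asking about.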
Comment 3•13 years ago
I don't have any data on that, but I would bet that any regexp will be slower than this code, given that the code is tailored to exactly what we want.
Comment 4 (Reporter)•13 years ago
I'm no expert on regexps, but I know the regexp engine compiles them (presumably once, if you write it right) and then they are very fast. For a straightforward one like this, I'm sure it would be better than 20 comparisons and 2 function calls per character!

We could either ask Brendan or a JS person, or we could decide it's not performance-critical anyway. :-)

Gerv
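Since nobody in the thread has numbers, here is a rough sketch of what "compile once" versus "rebuild every time" looks like, with a crude timing loop. The function names, the sample input, and the iteration count are all made up for illustration. Engines may cache compiled patterns internally, so the gap can be small, and this says nothing about how a regexp would compare with the hand-written C++ check.

```javascript
// Compiled once, when the script is loaded (a regex literal at top level).
const EMAIL_RE = /^[a-z0-9.!#$%&'*+\-/=?\^_`{|}~]+@[a-z0-9-]+(\.[a-z0-9-]+)*$/i;

// Reuses the already-compiled pattern on every call.
function validateReusing(value) {
  return EMAIL_RE.test(value);
}

// Builds a fresh RegExp object on every call, for comparison only.
function validateRebuilding(value) {
  return new RegExp(EMAIL_RE.source, "i").test(value);
}

// Rough timing helper (hypothetical; not a rigorous benchmark).
function timeIt(fn, input, iterations) {
  const start = Date.now();
  let valid = 0;
  for (let i = 0; i < iterations; i++) {
    if (fn(input)) valid++;
  }
  return { ms: Date.now() - start, valid };
}

const input = "gerv@example.org";
console.log("reused: ", timeIt(validateReusing, input, 100000));
console.log("rebuilt:", timeIt(validateRebuilding, input, 100000));
```

The timings depend on the engine and machine, so they are only a starting point for the question of whether any of this is performance-critical.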