Bug 1887898 Comment 5 Edit History
I have already fixed this on my branch at https://github.com/hsivonen/rust-url/tree/icu4x as a side effect of reimplementing the internals of the `idna` crate and noticing that this (literally per spec, AFAICT!) behavior made no sense and didn't match Firefox and Safari. I have already reported this to the Unicode Consortium as a spec bug.

Pulling the code into Gecko is bug 1889536. Prerequisites include reviewing and landing the above-mentioned branch, and also getting https://github.com/unicode-org/icu4x/pull/4712 and https://github.com/unicode-org/icu/pull/2945 reviewed and landed (unless we want to take a lot of code into Gecko that's not upstreamed yet).

(In reply to Anne (:annevk) from comment #4)
> When I look at https://www.rfc-editor.org/rfc/rfc3492.txt section 6.2 it has this step:
>
> > {if n is a basic code point then fail}

How is this step relevant to the problem reported here? AFAICT, the problem here is that the loop guarded by "while the input is not exhausted do begin" simply never runs at all, and never running it is not treated as an error.

I think the best way to fix this is to check for the label ending with a hyphen once it has been established that the label starts with `xn--`. If the label is just "xn--", it ends with a hyphen and should be treated as an error. Likewise, "xn--test-" ends with a hyphen. That is, if there are no Punycode digits, the label ends with a hyphen, which needs to be treated as an error. (I asked for an ends-with-hyphen check to be added to UTS 46.)
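To make the proposed check concrete, here is a minimal sketch in Rust. It is not the code from the branch mentioned above, and the function names `split_punycode_payload`, `has_ace_prefix`, and `is_ace_label_without_digits` are hypothetical, chosen only for illustration. The first function mirrors how RFC 3492 section 6.2 splits the input at the last delimiter, which shows why the decode loop has nothing to consume for "xn--" and "xn--test-". The last function is the ends-with-hyphen check itself.

```rust
/// Hypothetical helper (not the `idna` crate's API): splits the part of a
/// label that follows `xn--` at the last delimiter, the way RFC 3492
/// section 6.2 does. Returns (basic code points, Punycode digits).
/// With no delimiter, everything is treated as Punycode digits.
fn split_punycode_payload(payload: &str) -> (&str, &str) {
    match payload.rfind('-') {
        Some(i) => (&payload[..i], &payload[i + 1..]),
        None => ("", payload),
    }
}

/// Hypothetical helper: ASCII case-insensitive test for the `xn--` prefix.
fn has_ace_prefix(label: &str) -> bool {
    let bytes = label.as_bytes();
    bytes.len() >= 4 && bytes[..4].eq_ignore_ascii_case(b"xn--")
}

/// Hypothetical helper: true when the label starts with `xn--` and ends
/// with a hyphen, i.e. when there are no Punycode digits for the decode
/// loop to consume. Per the comment above, such a label should be an error.
fn is_ace_label_without_digits(label: &str) -> bool {
    has_ace_prefix(label) && label.ends_with('-')
}
```

Under these assumptions, `split_punycode_payload("test-")` yields `("test", "")`, so the digit part is empty and the "while the input is not exhausted" loop never runs. `is_ace_label_without_digits` returns true for both "xn--" and "xn--test-", and false for a label such as "xn--a-b" that has something after the last hyphen.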