Closed Bug 933494 Opened 6 years ago Closed 6 years ago
Public Suffix List tests don't test internationalized domain names
The test set for the PSL (netwerk/test/unit/data/test_psl.txt) doesn't contain any tests of internationalized domain names. I don't think it needs anything terribly complicated, but I think there should be *something*, such as the attached patch (relying on the existence of 公司.cn). The proposed test assumes that if you put in UTF-8 you want to get UTF-8 back out, and if you put IDN-coded names you want IDN-coded back. If that's not considered the appropriate behavior I am happy to change it.
(I don't really know whether gerv is an appropriate reviewer here - feel free to reassign.)
Can you also check full IDNs (i.e. some of those, recently added, where the TLD is also IDN)? As to what the appropriate behaviour is, in a sense that's partly defined by what the checkPublicSuffix() function is intended to _do_ :-) (There are several possible APIs one could imagine for accessing the PSL.) The behaviour at the moment seems sane to me, but... Gerv
(In reply to Gervase Markham [:gerv] from comment #2) > Can you also check full IDNs (i.e. some of those, recently added, where the > TLD is also IDN)? Certainly. (And I suppose I should also verify that this test passes on our own code as well as the third-party code I was testing last night, huh. ;-) > As to what the appropriate behaviour is, in a sense that's partly defined by > what the checkPublicSuffix() function is intended to _do_ :-) (There are > several possible APIs one could imagine for accessing the PSL.) The > behaviour at the moment seems sane to me, but... I don't have any real opinion about what is more convenient for API users. For a test suite, though, I think "you get out what you put in" is an elegant way to ensure that both Unicode and IDNA input are properly handled.
Well, this was more of a rathole than I really wanted it to be, but it's done now. Turns out the eTLDService always gives back IDNA-encoded strings, regardless of what you put in; since I still think the argument above is compelling, I had to work around it in the test case. And then I tripped over a bug down in the guts of xpcshell (see bug 933885). The change to effective_tld_names.dat just strips all the trailing whitespace.
Comment on attachment 826036 [details] [diff] [review] 933494-psl-idn-tests.diff r=gerv if you replace "IDNA-encoded" with "Punycoded", because the former term is ambiguous. Gerv
Attachment #826036 - Flags: review?(gerv) → review+
(In reply to Gervase Markham [:gerv] from comment #5) > r=gerv if you replace "IDNA-encoded" with "Punycoded", because the former > term is ambiguous. Changed in my copy. Will land together with bug 933885 after that is fully reviewed.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
You need to log in before you can comment on or make changes to this bug.