Closed Bug 986241 Opened 10 years ago Closed 10 years ago

Wrong PSL entries: domains with spaces inside

Categories

(Core Graveyard :: Networking: Domain Lists, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox31 fixed)

RESOLVED FIXED
mozilla31
Tracking Status
firefox31 --- fixed

People

(Reporter: mozilla, Unassigned)

References

Details

(Whiteboard: [qa-])

Attachments

(1 file, 1 obsolete file)

The current Public Suffix List at http://publicsuffix.org/list/effective_tld_names.dat (sha1 00401fac245e7da20280e9bef889fd9a650b764a) contains two entries with a space inside the domain:
bergamo .it
potenza .it

These are read as the top-level domains "bergamo" and "potenza", each followed by the comment ".it".

Expected:
bergamo.it
potenza.it


Additionally, there are 361 other entries with a trailing space (all of them .it ones), which are harmless but probably unintended.
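Both kinds of entry described above can be flagged mechanically. The following is a minimal sketch, not part of the attached patch; the function name `classify_line` and its return labels are made up for illustration, and it assumes the list's convention that blank lines and lines starting with `//` carry no rule.

```python
import re


def classify_line(line: str) -> str:
    """Classify one raw line of effective_tld_names.dat.

    Returns one of (hypothetical labels):
      "skip"           - blank line or // comment
      "inner-space"    - whitespace inside the entry (e.g. "bergamo .it")
      "trailing-space" - entry followed only by whitespace
      "ok"             - nothing suspicious
    """
    text = line.rstrip("\n")
    stripped = text.strip()
    if not stripped or stripped.startswith("//"):
        return "skip"
    # Whitespace left after stripping both ends must be inside the entry.
    if re.search(r"\s", stripped):
        return "inner-space"
    if text != text.rstrip():
        return "trailing-space"
    return "ok"


def report(lines):
    """Yield (line_number, label, text) for every suspicious line."""
    for number, line in enumerate(lines, start=1):
        label = classify_line(line)
        if label in ("inner-space", "trailing-space"):
            yield number, label, line.rstrip("\n")
```

Running `report(open("effective_tld_names.dat", encoding="utf-8"))` over a local copy of the list would print nothing once both issues are fixed.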
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch bug986241.patch (obsolete)
The wrong behavior is easy to confirm from the prepare_tlds.py output:
$ python2 ./prepare_tlds.py effective_tld_names.dat | egrep "bergamo|potenza"
ETLD_ENTRY("bergamo", false, false)
ETLD_ENTRY("potenza", false, false)

I am attaching a patch covering the two issues (wrong entries and trailing whitespace) as two commits.


After applying the patch, the generated files differ, as expected, only in that the wrong domain entries are now correct:
--- a	2014-03-21 18:10:59.059821079 +0100
+++ b	2014-03-21 18:37:14.829816330 +0100
@@ -886,3 +886,3 @@
 ETLD_ENTRY("benevento.it", false, false)
-ETLD_ENTRY("bergamo", false, false)
+ETLD_ENTRY("bergamo.it", false, false)
 ETLD_ENTRY("bg.it", false, false)
@@ -1041,3 +1041,3 @@
 ETLD_ENTRY("pordenone.it", false, false)
-ETLD_ENTRY("potenza", false, false)
+ETLD_ENTRY("potenza.it", false, false)
 ETLD_ENTRY("pr.it", false, false)
Attachment #8394928 - Flags: review?(gerv)
Comment on attachment 8394928
bug986241.patch

Review of attachment 8394928:
-----------------------------------------------------------------

r=gerv. Although part of me thinks we should keep some trailing whitespace, just to make sure people's parsers can handle it.

Gerv
Attachment #8394928 - Flags: review?(gerv) → review+
I think it's more important to have an entry where a comment is separated from the domain by whitespace. If parsers get that right, trailing whitespace should be no problem.
Maybe add some of these cases under a test TLD, as suggested in bug 943800.
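To illustrate the point, here is a rough sketch of a parser that reads each line only up to the first whitespace, which is how the list format is documented on publicsuffix.org. It is not the code from prepare_tlds.py; `parse_rule` is a hypothetical name, and ignoring leading whitespace is a simplification of this sketch.

```python
from typing import Optional


def parse_rule(line: str) -> Optional[str]:
    """Return the rule on a Public Suffix List line, or None if there is none.

    Anything after the first run of whitespace is treated as a comment,
    so trailing whitespace is harmless, while a space inside a domain
    silently truncates the rule.
    """
    fields = line.split()  # splits on any whitespace, drops empty fields
    if not fields or fields[0].startswith("//"):
        return None  # blank line or full-line comment
    return fields[0]


for sample in ["potenza.it ", "bergamo .it", "// Italy", "bg.it extra comment"]:
    print(repr(sample), "->", parse_rule(sample))
```

A parser written this way handles trailing whitespace and whitespace-separated comments through the same code path, so a test entry for the latter also covers the former.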
Keywords: checkin-needed
Rebased patch with the appropriate r=gerv in the commit message.
Attachment #8394928 - Attachment is obsolete: true
Thanks Wes. Is it the interface or did it create a single commit instead of two?
https://hg.mozilla.org/mozilla-central/rev/212730695575
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Whiteboard: [qa-]
Product: Core → Core Graveyard