Closed
Bug 1451364
Opened 6 years ago
Closed 5 years ago
Issues resolving DNS records for wiki.mozilla.org
Categories
(Infrastructure & Operations :: SRE, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Usul, Unassigned)
<jcristau> hmm. did the wiki.m.o cname get switched back to something.appsvcs-generic.nubis.allizom.org?
<jcristau> i can't resolve it now
* Usul checks
<jcristau> wiki.mozilla.org. 59 IN CNAME www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
<jcristau> www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 173 IN CNAME wiki-prod-1394614349.us-west-2.elb.amazonaws.com.
<Usul> https://usul.pastebin.mozilla.org/9082147
<jcristau> my resolver doesn't like the second cname, likely dnssec-related, it already broke the other day and was worked around by having the cname directly to foo.elb.amazonaws.com

with VPN on

[ludovic@poney ~]$ dig wiki.mozilla.org

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> wiki.mozilla.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55060
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; ANSWER SECTION:
wiki.mozilla.org. 60 IN CNAME www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 190 IN CNAME wiki-prod-1394614349.us-west-2.elb.amazonaws.com.
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 60 IN A 52.89.171.193
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 60 IN A 54.68.243.126
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 60 IN A 34.213.203.225

;; AUTHORITY SECTION:
us-west-2.elb.amazonaws.com. 1534 IN NS ns-1475.awsdns-56.org.
us-west-2.elb.amazonaws.com. 1534 IN NS ns-1769.awsdns-29.co.uk.
us-west-2.elb.amazonaws.com. 1534 IN NS ns-332.awsdns-41.com.
us-west-2.elb.amazonaws.com. 1534 IN NS ns-560.awsdns-06.net.

;; Query time: 20 msec
;; SERVER: 62.210.16.6#53(62.210.16.6)
;; WHEN: mer. avril 04 15:06:51 2018
;; MSG SIZE rcvd: 362

[ludovic@poney ~]$
[ludovic@poney ~]$ dig wiki.mozilla.org @9.9.9.9

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> wiki.mozilla.org @9.9.9.9
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 47945
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; Query time: 455 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: mer. avril 04 15:08:28 2018
;; MSG SIZE rcvd: 45

[ludovic@poney ~]$ dig wiki.mozilla.org @8.8.8.8

; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.2 <<>> wiki.mozilla.org @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23114
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; ANSWER SECTION:
wiki.mozilla.org. 59 IN CNAME www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 299 IN CNAME wiki-prod-1394614349.us-west-2.elb.amazonaws.com.
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 52.89.171.193
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 34.213.203.225
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 54.68.243.126

;; Query time: 46 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: mer. avril 04 15:08:41 2018
;; MSG SIZE rcvd: 228
Reporter
Comment 1•6 years ago
[ludo@Oulanl ~]$ dig wiki.mozilla.org @127.0.0.1

; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> wiki.mozilla.org @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 53995
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; Query time: 2195 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: mer. avril 04 21:59:44 CEST 2018
;; MSG SIZE rcvd: 45

[ludo@Oulanl ~]$ dig wiki.mozilla.org @9.9.9.9

; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> wiki.mozilla.org @9.9.9.9
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 24268
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; Query time: 714 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: mer. avril 04 22:00:07 CEST 2018
;; MSG SIZE rcvd: 45

[ludo@Oulanl ~]$ dig wiki.mozilla.org @8.8.8.8

; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> wiki.mozilla.org @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41154
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; ANSWER SECTION:
wiki.mozilla.org. 59 IN CNAME www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 189 IN CNAME wiki-prod-1394614349.us-west-2.elb.amazonaws.com.
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 34.213.203.225
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 52.89.171.193
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 59 IN A 54.68.243.126

;; Query time: 41 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: mer. avril 04 22:00:15 CEST 2018
;; MSG SIZE rcvd: 228

[ludo@Oulanl ~]$ dig wiki.mozilla.org @1.1.1.1

; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> wiki.mozilla.org @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16482
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1536
;; QUESTION SECTION:
;wiki.mozilla.org. IN A

;; ANSWER SECTION:
wiki.mozilla.org. 41 IN CNAME www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 281 IN CNAME wiki-prod-1394614349.us-west-2.elb.amazonaws.com.
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 41 IN A 34.213.203.225
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 41 IN A 52.89.171.193
wiki-prod-1394614349.us-west-2.elb.amazonaws.com. 41 IN A 54.68.243.126

;; Query time: 77 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: mer. avril 04 22:00:22 CEST 2018
;; MSG SIZE rcvd: 228

[ludo@Oulanl ~]$
Reporter
Comment 2•6 years ago
[ludo@Oulanl ~]$ traceroute 9.9.9.9
traceroute to 9.9.9.9 (9.9.9.9), 30 hops max, 60 byte packets
 1 gateway (192.168.0.254) 2.996 ms 2.907 ms 2.862 ms
 2 gri82-1-78-212-88-254.fbx.proxad.net (78.212.88.254) 6.401 ms 6.596 ms 6.848 ms
 3 78.254.18.190 (78.254.18.190) 7.082 ms 7.307 ms 7.531 ms
 4 toulouse-6k-1-po3.intf.routers.proxad.net (212.27.57.141) 8.100 ms 8.765 ms 8.728 ms
 5 bzn-crs16-2-be1107.intf.routers.proxad.net (194.149.160.1) 19.646 ms 19.638 ms 19.598 ms
 6 londres-6k-1-po104.intf.routers.proxad.net (194.149.161.238) 24.498 ms 24.838 ms 25.210 ms
 7 195.66.225.238 (195.66.225.238) 25.203 ms 24.726 ms 24.902 ms
 8 dns.quad9.net (9.9.9.9) 23.368 ms !X 22.918 ms !X 23.219 ms !X
[ludo@Oulanl ~]$
Reporter
Comment 3•6 years ago
avril 04 21:59:34 Oulanl unbound[25177]: [25177:0] info: start of service (unbound 1.6.8).
avril 04 21:59:43 Oulanl unbound[25177]: [25177:0] info: validation failure core.us-west-2.appsvcs-generic.nubis.allizom.org. A IN
avril 04 21:59:44 Oulanl unbound[25177]: [25177:0] info: validation failure wiki.mozilla.org. A IN
Reporter
Comment 4•6 years ago
With logging enabled, unbound barfs on:

avril 04 22:31:41 Oulanl unbound[27049]: [27049:0] info: start of service (unbound 1.6.8).
avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure <appsvcs-generic.nubis.allizom.org. NS IN>: no NSEC3 closest encloser from 2600:1401:2::f0 for DS nubis.allizom.org. while building chain of trust
avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure <core.us-west-2.appsvcs-generic.nubis.allizom.org. A IN>: no NSEC3 closest encloser from 184.85.248.65 for DS nubis.allizom.org. while building chain of trust
avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure <wiki.mozilla.org. A IN>: key for validation nubis.allizom.org. is marked as invalid because of a previous validation failure <core.us-west-2.appsvcs-generic.nubis.allizom.org. A IN>: no NSEC3 closest encloser from 184.85.248.65 for DS nubis.allizom.org. while building chain of trust
Comment 5•6 years ago
It seems a little strange that the primary nameserver (MNAME) in our SOA record resolves to an RFC 1918 address and is not listed in an NS record. This should be "OK" as long as the nameservers can reach the SOA server at that address (unsure if that is the case).

$ dig @8.8.8.8 -t soa mozilla.org +short
infoblox1.private.mdc2.mozilla.com. sysadmins.mozilla.org. 2019040116 180 180 1209600 60
$ dig @8.8.8.8 -t ns mozilla.org +short
ns5-65.akam.net.
ns7-66.akam.net.
ns1-240.akam.net.
ns4-64.akam.net.
$ dig @8.8.8.8 infoblox1.private.mdc2.mozilla.com +short
10.50.75.120

Versus something like google.com:

$ dig @8.8.8.8 -t soa google.com +short
ns1.google.com. dns-admin.google.com. 191622961 900 900 1800 60
$ dig @8.8.8.8 -t ns google.com +short
ns3.google.com.
ns4.google.com.
ns2.google.com.
ns1.google.com.
$ dig @8.8.8.8 ns1.google.com +short
216.239.32.10
Comment 6•6 years ago
(In reply to Ludovic Hirlimann [:Usul] from comment #4)
> With logging enabled unbound barfs on :
> avril 04 22:31:41 Oulanl unbound[27049]: [27049:0] info: start of service
> (unbound 1.6.8).
> avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure
> <appsvcs-generic.nubis.allizom.org. NS IN>: no NSEC3 closest encloser from
> 2600:1401:2::f0 for DS nubis.allizom.org. while building chain of trust
> avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure
> <core.us-west-2.appsvcs-generic.nubis.allizom.org. A IN>: no NSEC3 closest
> encloser from 184.85.248.65 for DS nubis.allizom.org. while building chain
> of trust
> avril 04 22:32:10 Oulanl unbound[27049]: [27049:0] info: validation failure
> <wiki.mozilla.org. A IN>: key for validation nubis.allizom.org. is marked as
> invalid because of a previous validation failure
> <core.us-west-2.appsvcs-generic.nubis.allizom.org. A IN>: no NSEC3 closest
> encloser from 184.85.248.65 for DS nubis.allizom.org. while building chain
> of trust

If I followed correctly, this is expected (as described by :digi). Our DS record passes a flag that lower delegations will not pass DNSSEC validation once we shift over to the amazonaws delegation [1].

[1] - https://tools.ietf.org/html/rfc5155#section-6
Reporter
Comment 7•6 years ago
delegation is fine per https://zonemaster.net/test/39a314485b0424b8 Maybe we're hitting an unbound bug.
Comment 8•6 years ago
FWIW my log (after running "unbound-control verbosity 3" and "unbound-control flush_zone allizom.org") is http://paste.debian.net/hidden/efab4fde/ Running unbound 1.6.7. Unsurprisingly "unbound-control insecure_add appsvcs-generic.nubis.allizom.org" makes things work.
Comment 9•6 years ago
$ dig +dnssec nubis.allizom.org ds @ns7-66.akam.net.

; <<>> DiG 9.11.3-1-Debian <<>> +dnssec nubis.allizom.org ds @ns7-66.akam.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56552
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;nubis.allizom.org. IN DS

;; AUTHORITY SECTION:
r5ui8gqs780ip552dgk1nl5u5tcsf3o3.allizom.org. 3600 IN NSEC3 1 0 1 6367D67C2F1FDC27 R5VS339K4QTGAVS5QN5RFDFFMU3FT0CP CNAME RRSIG
r5ui8gqs780ip552dgk1nl5u5tcsf3o3.allizom.org. 3600 IN RRSIG NSEC3 7 3 3600 20180407125954 20180404115954 15617 allizom.org. YtuB1WpxXs0cthXwfqgNtYzvPs+tTQ9knoFRDbU1nMMWbTaE+CZfhLW+ tvH4RoxeMo9kcbE+4GnKqSJJ7Pfyiji1NSpWAsXTrrRnpfdZD9zcLLRA 8Xa4B2EnnBjb5WS1g+WEzqEpLDKrDli2UVLuDFPg+rfeZbTfDFUKjKG7 UAI=
allizom.org. 3600 IN SOA infoblox1.private.mdc2.mozilla.com. sysadmins.mozilla.org. 2018032724 180 180 1209600 3600
allizom.org. 3600 IN RRSIG SOA 7 2 3600 20180407125954 20180404115954 15617 allizom.org. UQ6bqng3XH49ngf5hxIWkoooeaKIhA7qR3o5o0+N18E7bNGAVdI8hc8b Uvpe84g5dqnoMev8VmwnretjXW2Vm69/f+HQPjJQFNCpAHfDyaRhtWUP 4mNuFAIHMnel4fpPqf2IU8vSGNGWLsXzvRAxfVEOLpT5zAEbpxFlMJvz tx8=

;; Query time: 14 msec
;; SERVER: 96.7.49.66#53(96.7.49.66)
;; WHEN: Thu Apr 05 11:24:53 CEST 2018
;; MSG SIZE rcvd: 563

That NSEC3 record does not seem to have the opt-out flag set, unless I'm missing something?
Reporter
Comment 10•6 years ago
Jd can you guys have a look, please? This breaks planet and wiki for anyone who's checking dnssec like France's 3 biggest ISP (free.fr).
Flags: needinfo?(jcrowe)
Comment 11•6 years ago
(In reply to Julien Cristau [:jcristau] from comment #9)
> That NSEC3 record does not seem to have the opt-out flag set, unless I'm
> missing something?

The opt-out flag can be observed with:

> $ dig +dnssec www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org @ns7-66.akam.net
> ;; AUTHORITY SECTION:
> appsvcs-generic.nubis.allizom.org. 3600 IN NS ns-1743.awsdns-25.co.uk.
> appsvcs-generic.nubis.allizom.org. 3600 IN NS ns-1109.awsdns-10.org.
> appsvcs-generic.nubis.allizom.org. 3600 IN NS ns-795.awsdns-35.net.
> appsvcs-generic.nubis.allizom.org. 3600 IN NS ns-426.awsdns-53.com.
> qjohc55d28rt0132v0vio8ikqd5s2n6o.allizom.org. 3600 IN NSEC3 1 0 1 B8B7DB93C1F45C6B QTJ5A1OVC94I6KTUE9VJEM2U8J7AD5DU NS
> qjohc55d28rt0132v0vio8ikqd5s2n6o.allizom.org. 3600 IN RRSIG NSEC3 7 3 3600 20180408125954 20180405115954 15617 allizom.org. F6DbE/S1gS2YECMjGB2+6+aNnoEE3BHVqtYgV6M+MdL0nX/7UQCqDbKc A9o8PdU1YQ/GF6pGFqtzyG5rlOWyHMA5typwIcqUmU4hvLYGxHowGfow wNUGRSWSeWeXkNZu3XahxbS4mk5dEEvjNUXUuhXXAPW/b6ZAQegp8LTj /O8=

This marks the first NS record that's returned from mozilla.org authorities - delegating the remainder of the query to route53.

I think dnssec is a red herring here. What's odd is a dig +trace of www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org shows an initial delegation of appsvcs-generic.nubis.allizom.org to route53, and a subsequent delegation of us-west-2.appsvcs-generic.nubis.allizom.org to a second set of route53 nameservers. Despite being unusual I don't think this is the cause.
Updated•6 years ago
Summary: wiki.mozilla.org doesn't resolve on quad9 → Issues resolving DNS records for wiki.mozilla.org
Comment 12•6 years ago
(In reply to Brian Hourigan [:digi] from comment #11) > qjohc55d28rt0132v0vio8ikqd5s2n6o.allizom.org. 3600 IN NSEC3 1 0 1 B8B7DB93C1F45C6B QTJ5A1OVC94I6KTUE9VJEM2U8J7AD5DU NS That also seems to have opt-out cleared? Unless I'm misreading, nsec3 rdata is hash algorithm, flags, iterations, salt, next hashed owner name, and a list of rr types.
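The field order described above can be checked mechanically. The following is a minimal Python sketch, not taken from any tool used in this bug: it splits the presentation form of an NSEC3 RDATA into the fields listed in the comment and reports whether the opt-out bit (the lowest bit of the flags field, per RFC 5155) is set. The names `Nsec3Rdata` and `parse_nsec3_rdata` are hypothetical.

```python
from typing import List, NamedTuple


class Nsec3Rdata(NamedTuple):
    """Presentation-form NSEC3 RDATA, in the order given in the comment."""
    hash_algorithm: int
    flags: int
    iterations: int
    salt: str
    next_hashed_owner: str
    types: List[str]

    @property
    def opt_out(self) -> bool:
        # RFC 5155 defines the least significant flag bit as Opt-Out.
        return bool(self.flags & 0x01)


def parse_nsec3_rdata(text: str) -> Nsec3Rdata:
    """Split 'alg flags iterations salt next-hash TYPE...' into fields."""
    fields = text.split()
    if len(fields) < 5:
        raise ValueError("expected at least five NSEC3 RDATA fields")
    alg, flags, iterations, salt, next_owner = fields[:5]
    return Nsec3Rdata(int(alg), int(flags), int(iterations),
                      salt, next_owner, fields[5:])


if __name__ == "__main__":
    # RDATA of the NSEC3 record quoted from comment #11.
    record = parse_nsec3_rdata(
        "1 0 1 B8B7DB93C1F45C6B QTJ5A1OVC94I6KTUE9VJEM2U8J7AD5DU NS")
    print("flags:", record.flags, "opt-out:", record.opt_out)
```

With the record quoted above, the flags field is 0, which is consistent with the reading that opt-out is cleared.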
Comment 13•6 years ago
Incidentally, a query for core.us-west-2.appsvcs-generic.nubis.allizom.org results in NXDOMAIN, which is wrong since that is non-terminal. Which is likely to cause further issues with qname minimization.
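To illustrate why an NXDOMAIN on an intermediate name matters for qname minimization, here is a deliberately simplified Python sketch. It assumes a resolver that asks for one additional label at a time and stops at the first NXDOMAIN; real resolvers only minimise below known zone cuts and differ in detail. Both function names are hypothetical.

```python
from typing import Iterable, List


def minimised_query_names(qname: str) -> List[str]:
    """Names a (simplified) minimising resolver would ask for, shortest first."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


def walk(qname: str, nxdomain_names: Iterable[str]) -> List[str]:
    """Return the names asked until the first one that answers NXDOMAIN."""
    broken = set(nxdomain_names)
    asked = []
    for name in minimised_query_names(qname):
        asked.append(name)
        if name in broken:
            # NXDOMAIN means nothing exists at or below this name,
            # so the resolver has no reason to continue.
            break
    return asked


if __name__ == "__main__":
    target = "www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org"
    ent = "core.us-west-2.appsvcs-generic.nubis.allizom.org."
    for name in walk(target, [ent]):
        print(name)
```

Under these assumptions the walk ends at core.us-west-2.appsvcs-generic.nubis.allizom.org. and never reaches the full name, even though a direct query for the full name would succeed.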
Reporter
Comment 14•6 years ago
from http://dnsviz.net/d/wiki.mozilla.org/dnssec/

amazonaws.com to us-west-2.elb.amazonaws.com: No SOA RR was returned with the NODATA response. (205.251.192.27, 205.251.195.199, 2600:9000:5300:1b00::1, 2600:9000:5303:c700::1, UDP_0_EDNS0_32768_4096)

amazonaws.com to us-west-2.elb.amazonaws.com: The Authoritative Answer (AA) flag was not set in the response. (205.251.192.27, 205.251.195.199, 2600:9000:5300:1b00::1, 2600:9000:5303:c700::1, UDP_0_EDNS0_32768_4096)

appsvcs-generic.nubis.allizom.org to us-west-2.appsvcs-generic.nubis.allizom.org: No SOA RR was returned with the NODATA response. (205.251.193.170, 205.251.195.27, 205.251.196.85, 205.251.198.207, 2600:9000:5301:aa00::1, 2600:9000:5303:1b00::1, 2600:9000:5304:5500::1, 2600:9000:5306:cf00::1, UDP_0_EDNS0_32768_4096)

appsvcs-generic.nubis.allizom.org to us-west-2.appsvcs-generic.nubis.allizom.org: The Authoritative Answer (AA) flag was not set in the response. (205.251.193.170, 205.251.195.27, 205.251.196.85, 205.251.198.207, 2600:9000:5301:aa00::1, 2600:9000:5303:1b00::1, 2600:9000:5304:5500::1, 2600:9000:5306:cf00::1, UDP_0_EDNS0_32768_4096)

www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org/CNAME: A query for www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org results in a NOERROR response, while a query for its ancestor, core.us-west-2.appsvcs-generic.nubis.allizom.org, returns a name error (NXDOMAIN), which indicates that subdomains of core.us-west-2.appsvcs-generic.nubis.allizom.org, including www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org, don't exist. (205.251.192.60, 205.251.194.207, 205.251.197.154, 205.251.198.90, 2600:9000:5300:3c00::1, 2600:9000:5302:cf00::1, 2600:9000:5305:9a00::1, 2600:9000:5306:5a00::1, UDP_0_EDNS0_32768_4096)
Comment 15•6 years ago
Can we get wiki/planet CNAMEs pointed directly to the amazonaws.com names until the bogusness is sorted out on the allizom.org side?
Comment 16•6 years ago
(In reply to Julien Cristau [:jcristau] from comment #13)
> Incidentally, a query for core.us-west-2.appsvcs-generic.nubis.allizom.org
> results in NXDOMAIN, which is wrong since that is non-terminal. Which is
> likely to cause further issues with qname minimization.

I'm with Julien. I sent an email out-of-band to some folks last night with my final thoughts. If you scrutinize the dnsviz.net output, you can see that .com properly has the opt-out bit set (just as a reference), but the NSEC3 record for nubis.allizom.org does not, and that is where my bind nameserver starts getting upset.

named[12300]: error (no valid RRSIG) resolving 'nubis.allizom.org/DS/IN'

$ dig com +noadditional +dnssec +multiline|grep NSEC3
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 893 IN RRSIG NSEC3 8 2 86400 (
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 893 IN NSEC3 1 1 0 - (
NS SOA RRSIG DNSKEY NSEC3PARAM )

$ dig nubis.allizom.org +noadditional +dnssec +multiline|grep NSEC3
tk97ggh9fv2362h7bvpqr44gb50v9a50.allizom.org. 3587 IN RRSIG NSEC3 7 3 3600 (
tk97ggh9fv2362h7bvpqr44gb50v9a50.allizom.org. 3587 IN NSEC3 1 0 1 FF54DABE98DC4522 (

(Clearing NI for :jd, nubis team has been consulted and there is no further info from them)
Flags: needinfo?(jcrowe)
Comment 17•6 years ago
Interesting, but nubis.allizom.org doesn't actually exist as a domain; in theory, it's the subdomains that will have NS records pointing to AWS / Route53.

However, why is allizom.org reporting a private host in its SOA record?

$> dig allizom.org soa

; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> allizom.org soa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37192
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;allizom.org. IN SOA

;; ANSWER SECTION:
allizom.org. 3580 IN SOA infoblox1.private.mdc2.mozilla.com. sysadmins.mozilla.org. 2018032733 180 180 1209600 3600

$> host infoblox1.private.mdc2.mozilla.com
infoblox1.private.mdc2.mozilla.com has address 10.50.75.120
Comment 19•6 years ago
Just a nudge that this still exists, and users notice it.

[04-20 13:48:46] <andrew> fauweh is wikimo down?
[04-20 13:48:58] <andrew> https://irccloud.mozilla.com/pastebin/ze5WxlAJ/
Comment 20•6 years ago
I don't think this is an issue with either DNSSEC or Akamai.
Assignee: infra → nobody
Component: Infrastructure: Other → Infrastructure: AWS
Comment 21•6 years ago
I'm out of my depth, but if comment 16 and <https://lutz.donnerhacke.de/Blog/Outsourcing-mit-Hindernissen> are correct, it's an issue with how the allizom.org nameservers respond to queries for nubis.allizom.org.

"dig nubis.allizom.org" certainly fails with every validating resolver I've checked. (It should be a valid negative response, not an error.) And that doesn't involve Route 53 in any way.

I don't know how your DNS stack works, but it seems like a bug with how the signer, or Akamai's authoritative DNS software, or Mozilla's authoritative DNS software (Infoblox?) handles signed empty non-terminals.

If so, perhaps the zone just needs to be re-signed, or the software needs to be fixed, or the software needs to be upgraded. And maybe it would be possible to work around it by adding a classic "Do Not Remove" TXT record for nubis.allizom.org.
Comment 22•6 years ago
(In reply to Matt Nordhoff (aka Peng on IRC & forums) from comment #21) > And maybe it would be possible to work around it by adding a classic "Do Not Remove" TXT record for nubis.allizom.org. Please try that rather soon.^^ Your dns server problem sounds a bit like https://github.com/bluejekyll/trust-dns/issues/53 (but not exactly the same).
Comment 23•6 years ago
As of 20 minutes ago: <http://dnsviz.net/d/nubis.allizom.org/WtqLog/dnssec/>

Queries for "nubis.allizom.org." returned an NSEC3 record with the hash for some other name. ("0ge0tg7g8mkoml9al6qcts98u20adkd7.allizom.org." instead of "up32ens15h11olfrlhq04p3b94rbsa88.allizom.org.".)

Since it was not proof of whether or not "nubis.allizom.org." existed, validators considered it bogus.

(A query for "appsvcs-generic.nubis.allizom.org." did seem to return its own correct NSEC3 record.)

(Opt-out is orthogonal to whether or not a zone has insecure delegations. A zone that doesn't use opt-out just needs to sign the referral properly.)

As of 5 minutes ago: <http://dnsviz.net/d/nubis.allizom.org/WtqPIQ/dnssec/>

The zone has been re-signed with a different salt and it's using the correct new hash ("3fg7kbvo3d99gfjalvnf3fls3tepqemr.allizom.org.").
Comment 24•6 years ago
(In reply to Matt Nordhoff (aka Peng on IRC & forums) from comment #23)
> Queries for "nubis.allizom.org." returned an NSEC3 record with the hash for
> some other name. ("0ge0tg7g8mkoml9al6qcts98u20adkd7.allizom.org." instead of
> "up32ens15h11olfrlhq04p3b94rbsa88.allizom.org.".)
>
> Since it was not proof of whether or not "nubis.allizom.org." existed,
> validators considered it bogus.

Thank you, this was very helpful. We outsource zone signing and we're working to resolve this with them.
Comment 25•6 years ago
This is broken again for me now.
Reporter
Comment 26•6 years ago
from #moc <jcristau> can't resolve planet or wiki again today...
Comment 27•6 years ago
http://dnsviz.net/d/wiki.mozilla.org/dnssec/ Down for me, too. (I'm so thankful that our PowerDNS just works out of the box.) P-256 or P-384 instead of RSA 1024 bit would be great. A CNAME to the unsecured *.amazonaws.com reduces your DNS authenticity to absurdity: Is there a way to directly set A/AAAA records? Thanks.
Comment 28•6 years ago
It's the same issue as before.

http://dnsviz.net/d/nubis.allizom.org/Wur4UA/dnssec/

$ dig @ns1-240.akam.net +dnssec +norecurse nubis.allizom.org
allizom.org. 3600 IN SOA infoblox1.private.mdc2.mozilla.com. sysadmins.mozilla.org. 2018032779 180 180 1209600 3600
allizom.org. 3600 IN RRSIG SOA 7 2 3600 20180505125954 20180502115954 42617 allizom.org. MM1PYs20A7cGAekOluAhD8I3H00sMPcmAlwMFMAwcNTHFGymSB3rdYwN vOkE6h1sInacER1v5iTI9Ysm3DcnNaffCFZIN1bTqN0ksKbvKOmSDRJW pvl7KUmXtEyBJAS4WW96itVw9KhMexQS2W0XrZt8cEnADOrh964K4kp6 QjQ=
ufbvu3fk9s21287d050gp8eohtue8g9b.allizom.org. 3600 IN NSEC3 1 0 1 1ECE85B1485ECC72 UIQPLPQM8DMCCMUMP782VMR3BGMOGVNL CNAME RRSIG
ufbvu3fk9s21287d050gp8eohtue8g9b.allizom.org. 3600 IN RRSIG NSEC3 7 3 3600 20180505125954 20180502115954 42617 allizom.org. BvZbg1MYoKaFi+uqkkFAqhyAuqBnY8DvBxVVMq4+YaxtiJ2ZO/nC4gxd fHAtdVVqwl1Bjb8ZyXoBaNWjlKb57G9r6LtVtmZMSIecneBDGNANeafi Zw/uYHI95Sjj+qBvUOXRNCGlOFocfntUa7JAT0PMCXPwYIhRTE+fEzd+ 1RE=

$ ldns-nsec3-hash -s 1ECE85B1485ECC72 -a 1 nubis.allizom.org.
j3t17gfviv9c9meh3jbg9norfg1g81vb.

j3t17gfviv9c9meh3jbg9norfg1g81vb != ufbvu3fk9s21287d050gp8eohtue8g9b
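For readers without ldns installed, the hash comparison above can be approximated in a few lines of Python. This is a sketch of the NSEC3 hash as defined in RFC 5155 (SHA-1 over the lowercased wire-format name plus salt, re-hashed `iterations` more times, then base32hex-encoded); it is not the ldns implementation, and the function names are hypothetical. It does not handle escaped characters in labels.

```python
import base64
import hashlib

# base32hex (RFC 4648 "extended hex" alphabet) derived from standard base32.
_TO_B32HEX = bytes.maketrans(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
                             b"0123456789ABCDEFGHIJKLMNOPQRSTUV")


def to_wire(name: str) -> bytes:
    """Encode a domain name in lowercase DNS wire format."""
    out = b""
    for label in name.lower().rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """Return the base32hex NSEC3 hash (algorithm 1, SHA-1) of a name."""
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(to_wire(name) + salt).digest()
    for _ in range(iterations):
        # Each additional iteration re-hashes the previous digest plus salt.
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).translate(_TO_B32HEX).decode("ascii").lower()


if __name__ == "__main__":
    # Salt and iteration count copied from the NSEC3 record quoted above.
    print(nsec3_hash("nubis.allizom.org.", "1ECE85B1485ECC72", 1))
```

The printed value can be compared against the owner label of the NSEC3 record returned by the authoritative server, in the same way as the ldns-nsec3-hash output above; a mismatch means the record does not cover the queried name.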
Comment 29•6 years ago
Due to user reports from both :jcristau and :Peng_, and a recommendation from :gozer, I've changed the DNS public/private CNAME records for wiki.mozilla.org from www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org to www.wiki.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org as a temporary fix
Comment 30•6 years ago
Yikes, hit enter too fast: changed *to* wiki-prod-1394614349.us-west-2.elb.amazonaws.com
Comment 31•6 years ago
Same for Web of Things Gateway? http://iot.mozilla.org/gateway
Comment 32•6 years ago
Changed DNS on OPENNIC. Gateway works again. Might be unrelated.
Comment 33•6 years ago
Yes, it's the same thing for iot.mozilla.org. It has an equivalent DNS setup.

iot.mozilla.org. 18 IN CNAME www.haul.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org.
www.haul.prod.core.us-west-2.appsvcs-generic.nubis.allizom.org. 258 IN CNAME haul-web-prod-1392505985.us-west-2.elb.amazonaws.com.

(DNSSEC validation disabled for that query.)

Your new DNS resolver probably doesn't support DNSSEC.
Comment 34•6 years ago
This is apparently also affecting voice.mozilla.org.
Comment 35•6 years ago
Hi there, I'm working on voice.mozilla.org. I'm not seeing this issue personally, but do we know how many people are seeing this issue? The reason I am asking is, we made a big announcement today, and are expecting an important influx of traffic. https://blog.mozilla.org/blog/2018/06/07/parlez-vous-deutsch-rhagor-o-leisiau-i-common-voice/
Comment 36•6 years ago
Filed new bug for voice.mozilla.org issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1467419
Comment 37•6 years ago
Just note, voice.mozilla.org appears to be working now for the affected users thanks to :rtucker in bug 1467419. Big thanks!
Comment 38•6 years ago
A bug related to empty non-terminal records has been discovered, with help from users and the general public, and it is currently being addressed. The permanent fix for this problem is scheduled to be released on June 25th. Workarounds should continue to be used until we can verify the fix.

Empty non-terminal records are "domain names that own no resource records, but have subdomains that do." For example, if a zone has a record a.b in the zone customer.com (full name a.b.customer.com), but doesn't have a b record (full name b.customer.com), an empty non-terminal record b.customer.com is created that returns NODATA/NOERROR.

The empty non-terminal creation is necessary for dnssec conformance. When a zone has dnssec, a question for an empty non-terminal should result in a NODATA/NOERROR response and the response should have a corresponding NSEC3 record. A validating resolver uses the NSEC3 record to authenticate the NODATA/NOERROR response. In some cases, the NSEC3 records returned for empty non-terminals were failing dnssec validation due to a bug in generating the NSEC3 records for empty non-terminals.
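The definition above can be turned into a small check. The following Python sketch, with a hypothetical function name and no claim to match the signer's actual logic, lists the empty non-terminals implied by a set of owner names in a zone: every name strictly between an owner name and the zone apex that owns no records itself.

```python
from typing import Iterable, List


def empty_non_terminals(zone: str, owner_names: Iterable[str]) -> List[str]:
    """List names below the zone apex that own no records but have descendants."""
    zone = zone.rstrip(".").lower()
    owners = {name.rstrip(".").lower() for name in owner_names}
    owners.add(zone)  # the apex always owns SOA/NS records
    found = set()
    for name in owners:
        if name == zone or not name.endswith("." + zone):
            continue  # skip the apex itself and out-of-zone names
        parent = name.split(".", 1)[1]
        while parent != zone:
            if parent not in owners:
                found.add(parent)
            parent = parent.split(".", 1)[1]
    return sorted(found)


if __name__ == "__main__":
    # The example from the comment: a.b.customer.com exists, b.customer.com does not.
    print(empty_non_terminals("customer.com", ["a.b.customer.com"]))
    # The delegation discussed in this bug.
    print(empty_non_terminals("allizom.org",
                              ["appsvcs-generic.nubis.allizom.org"]))
```

For the delegation in this bug, the sketch reports nubis.allizom.org as the empty non-terminal, which is the name whose NSEC3 record validators rejected.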
Comment 39•6 years ago
We have discussed and approved a change to add a TXT record under nubis.allizom.org to work around this bug. While larger sites have already received an alternate workaround, this change will extend a workaround to the remainder of Nubis service owners and users.
Updated•5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED