Bug 762342
Opened 12 years ago
Closed 12 years ago
DNS Issues on production
Categories
(Infrastructure & Operations :: Infrastructure: Other, task)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 762346
People
(Reporter: st3fan, Unassigned)
Details
This should never happen. If the DNS servers that we use are unreliable then we might want to maintain an /etc/hosts file.

[root@pancake-web4 supervisor]# curl -i http://pancake-elasticsearch1:9200/pancake
curl: (6) Couldn't resolve host 'pancake-elasticsearch1'
Comment 1•12 years ago
Can you paste /etc/resolv.conf from that host here?
Comment 2•12 years ago
[root@pancake-web4 ~]# cat /etc/resolv.conf
search labs.phx1.mozilla.com
nameserver 10.8.110.5
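For reference, a check along these lines could flag hosts whose resolver config lists only one nameserver. This is a minimal Python sketch, not something taken from the bug itself: `parse_resolv_conf` and `has_redundant_dns` are hypothetical helper names, the threshold of two nameservers is an assumption, and the sample text mirrors the file pasted above.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            # A later "search" line replaces an earlier one.
            search = fields[1:]
    return nameservers, search


def has_redundant_dns(text, minimum=2):
    """True when at least `minimum` distinct nameservers are configured."""
    nameservers, _ = parse_resolv_conf(text)
    return len(set(nameservers)) >= minimum


# Sample copied from the resolv.conf pasted in this bug.
SAMPLE = """search labs.phx1.mozilla.com
nameserver 10.8.110.5
"""

if __name__ == "__main__":
    servers, domains = parse_resolv_conf(SAMPLE)
    print("nameservers:", servers)
    print("search:", domains)
    print("redundant:", has_redundant_dns(SAMPLE))
```

On a real host the text would come from reading /etc/resolv.conf instead of the embedded sample.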
Comment 3•12 years ago
Seems to be working now, but why is DHCP only returning a single nameserver? Punting over to server ops in case they can shed some light. I'll venture a guess and say there was some DNS outage in PHX1.
Assignee: gozer → server-ops
Component: General → Server Operations
Product: Pancake → mozilla.org
QA Contact: general → phong
Target Milestone: M3 → ---
Version: unspecified → other
Comment 4•12 years ago
(In reply to Stefan Arentz [:st3fan] from comment #0)
> This should never happen. If the DNS servers that we use are unreliable then
> we might want to maintain an /etc/hosts file.
>
> [root@pancake-web4 supervisor]# curl -i
> http://pancake-elasticsearch1:9200/pancake
> curl: (6) Couldn't resolve host 'pancake-elasticsearch1'

Before you go off on reliability (and we don't know what happened here, yet), PLEASE use fqdns in your configs. If it's super critical, use IPs. Our DBs use IPs vs hostnames because:
1) It cuts down resolution, DBs don't move every day
2) It doesn't fail if there's a blip in DNS.

Punting over to the infra team, CC'ing rtucker to check about DHCP.
Assignee: server-ops → server-ops-infra
Component: Server Operations → Server Operations: Infrastructure
QA Contact: phong → jdow
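To illustrate what a short name such as pancake-elasticsearch1 costs the resolver compared with an FQDN, here is a simplified Python sketch of search-list expansion. It assumes glibc-style behaviour with the default `ndots` of 1 and ignores other resolver options; `candidate_names` is a hypothetical helper, not part of any library.

```python
def candidate_names(name, search_domains, ndots=1):
    """List the names a stub resolver would try, in order (simplified)."""
    # A trailing dot marks the name as absolute: no search list is applied.
    if name.endswith("."):
        return [name]
    absolute = name + "."
    searched = [f"{name}.{domain}." for domain in search_domains]
    # Names with at least `ndots` dots are tried as-is first;
    # shorter names go through the search list first.
    if name.count(".") >= ndots:
        return [absolute] + searched
    return searched + [absolute]


if __name__ == "__main__":
    search = ["labs.phx1.mozilla.com"]  # search line from comment 2
    for host in ("pancake-elasticsearch1",
                 "pancake-elasticsearch1.labs.phx1.mozilla.com."):
        print(host, "->", candidate_names(host, search))
```

Either form still needs a working nameserver; only an IP address or an /etc/hosts entry avoids the lookup entirely.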
Comment 5•12 years ago
Also, do you have a timeline? So we can narrow down the search? Gozer, do you *know* of a DNS outage?
Reporter
Comment 6•12 years ago
Here are some timestamps:

fxhome-lattice-server.stderr.log:[W 120514 09:35:42 elasticsearch:270] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:01:05 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:05:16 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:01:48 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[W 120514 13:34:53 elasticsearch:270] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:25:09 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:25:19 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log:[E 120606 19:26:24 elasticsearch:272] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log-20120408:[W 120404 16:20:56 elasticsearch:268] ElasticSearch Request Error 599
fxhome-lattice-server.stderr.log-20120408:[W 120404 16:20:59 elasticsearch:268] ElasticSearch Request Error 599
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
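To turn log lines like the ones in comment 6 into a time window that can be compared against DNS server logs, something like the Python sketch below may help. It assumes the bracketed prefix is a one-letter level, a yymmdd date, an hh:mm:ss time, and module:line; the time zone is not stated in the bug and is left naive here. `extract_events` and `error_window` are hypothetical helper names.

```python
import re
from datetime import datetime

# Matches prefixes such as "[E 120606 19:01:05 elasticsearch:272]".
PREFIX = re.compile(r"\[([A-Z]) (\d{6} \d{2}:\d{2}:\d{2}) [\w.]+:\d+\]")


def extract_events(lines):
    """Return sorted (timestamp, level) pairs for lines with a matching prefix."""
    events = []
    for line in lines:
        match = PREFIX.search(line)
        if match:
            when = datetime.strptime(match.group(2), "%y%m%d %H:%M:%S")
            events.append((when, match.group(1)))
    return sorted(events)


def error_window(events, level="E"):
    """Return (first, last) timestamp for the given level, or None if absent."""
    times = [when for when, lvl in events if lvl == level]
    return (min(times), max(times)) if times else None


if __name__ == "__main__":
    # Three of the lines quoted in comment 6.
    sample = [
        "fxhome-lattice-server.stderr.log:[E 120606 19:01:05 elasticsearch:272] ElasticSearch Request Error 599",
        "fxhome-lattice-server.stderr.log:[E 120606 19:26:24 elasticsearch:272] ElasticSearch Request Error 599",
        "fxhome-lattice-server.stderr.log:[W 120514 09:35:42 elasticsearch:270] ElasticSearch Request Error 599",
    ]
    print(error_window(extract_events(sample)))
```

Sorting matters because the pasted lines are not in chronological order.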
Reporter
Comment 7•12 years ago
Curious how fqdns will help. If the DNS is unreachable then those will also fail, no?
Comment 8•12 years ago
True, but it's the right way™ to go. It helps keep things sane: looking at pancake-elasticsearch, I have no idea which datacenter that's in. I only know it's phx1 because 10.8 is phx1. We're also starting to have cross-DC ES instances (over in IT, not Labs), and in those cases not using FQDNs can cause issues.
Reporter
Comment 9•12 years ago
Sorry, this should not have been marked as fixed. We still need to improve the DNS config. We will configure full names, but I would also like to make this more resilient by configuring at least 2 nameservers. Is there a pair that we can use?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•12 years ago
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → DUPLICATE
Comment 11•12 years ago
Bug 762346 has the correct nameservers you can use.
Updated•11 years ago
Component: Server Operations: Infrastructure → Infrastructure: Other
Product: mozilla.org → Infrastructure & Operations