External DNS died at 7:12am

RESOLVED FIXED

Status

mozilla.org Graveyard
Server Operations
--
minor
RESOLVED FIXED
9 years ago
3 years ago

People

(Reporter: justdave, Assigned: justdave)

Tracking

Details

------------------------------------------------------------------------
r2641 | justdave@mozilla.com | 2009-01-15 07:27:43 -0800 (Thu, 15 Jan 2009) | 2 lines

fix missing quote

------------------------------------------------------------------------
r2640 | justdave@mozilla.com | 2009-01-15 07:11:44 -0800 (Thu, 15 Jan 2009) | 2 lines

Add mozilla-europe.de

======================

The missing quote caused the DNS servers to fail to restart after picking up the config change.  The fix was manually applied to mradm01 at roughly 7:20 in order to get things up and running, giving a total external outage time of about 8 minutes.

All DNS servers were fixed within 2 to 3 minutes of committing the fix to SVN (I had to add svn.mozilla.org to the hosts files on the DNS servers for them to pick it up though)

Internal DNS should not have been affected unless you were using 10.2.74.123 for your DNS server (which is not what DHCP is handing out).
In addition to the config file syntax fix, the auto-update scripts on the DNS servers have also been fixed to syntax-check the config file before restarting the server, and refuse to restart if it's broken, which should prevent this situation from happening again in the future.
Assignee: server-ops → justdave
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
The nagios script we already had that checks for zone file syntax errors has also been updated to also check the main config file.
This seems to have caused some builds to fail - we've star'd the failing builds, but still unclear why this outage hit internal build machines.
Product: mozilla.org → mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.