Closed Bug 659238 Opened 14 years ago Closed 14 years ago

DNS errors resolving hosts in build network

Categories

(Release Engineering :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bear, Unassigned)

Details

At 0300 the devs and I noticed that some hosts were not resolving:

test-master01.b.m.o
cruncher.b.m.o

Then I started seeing error emails from buildbot-master04.build.scl1.m.c:

Exception in /builds/buildbot/tests1/master/twistd.log:
2011-05-23 23:59:03-0700 [-] Unhandled Error
Traceback (most recent call last):
Failure: twisted.internet.error.DNSLookupError: DNS lookup failed: address 'mail.build.mozilla.org' not found: [Errno -2] Name or service not known.
--------------------------------------------------------------------------------
Exception in /builds/buildbot/tests1/master/twistd.log:
2011-05-23 23:59:09-0700 [-] Unhandled Error
Traceback (most recent call last):
Failure: twisted.internet.error.DNSLookupError: DNS lookup failed: address 'mail.build.mozilla.org' not found: [Errno -2] Name or service not known.

After that I found that I couldn't get tbpl to load more than 50%. I closed the tree at 0330 until it has recovered.
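For anyone wanting to reproduce the symptom from a build host: the `[Errno -2] Name or service not known` text in the report looks like the string form of Python's `socket.gaierror`, so a quick resolution check can be sketched with the standard library alone. This is only an illustrative sketch and not a tool used in this bug. The function name `check_hosts` and its `resolver` parameter are hypothetical, and the exact error number and message depend on the platform's resolver.

```python
import socket


def check_hosts(hostnames, resolver=socket.getaddrinfo):
    """Try to resolve each hostname.

    Returns a dict mapping hostname -> None if it resolved,
    or the resolver's error string if it did not.
    `resolver` is injectable so the check can be exercised without a network.
    """
    results = {}
    for name in hostnames:
        try:
            # Port is None because only name resolution matters here.
            resolver(name, None)
            results[name] = None
        except socket.gaierror as exc:
            # str(exc) has the "[Errno N] message" shape seen in twistd.log.
            results[name] = str(exc)
    return results
```

Calling `check_hosts(["mail.build.mozilla.org"])` on an affected host would be expected to return an error string for that name while the outage lasts, and `None` once resolution is restored.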
Severity: major → blocker
Severity: blocker → major
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Reopening just to give a public view of the issue. I'll clean it up when the IT bug is closed.
Assignee: network-operations → nobody
Severity: major → normal
Status: RESOLVED → REOPENED
Component: Server Operations: Netops → Release Engineering
QA Contact: mrz → release
Resolution: DUPLICATE → ---
Severity: normal → blocker
Do not alter Importance as it pages oncall. This is obviously being worked on so doing this is counterproductive.
Severity: blocker → normal
DNS is now resolving. I will close this bug as a dup of the IT bug when the tree is back open.
In order to complete the migration of mozilla.org to full DNSSEC signing we had to roll build.mozilla.org from being a SOA to traditional $INCLUDE. Some of the required bits were left which gave a false positive that things were functional. After correcting the configuration we then encountered a DNSSEC signing exception that needed to be corrected. During this time hosts *.build.mozilla.org (inclusive) failed to resolve at various times as the config changes were pushed to all locations. Configs are back to a stable and operational state with everything under the single mozilla.org signing key.
I have reopened the tree; the current situation seems fine. If something breaks again, I will report back.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering