Closed Bug 765033 Opened 12 years ago Closed 12 years ago

transient failure to resolve ftp.mozilla.org

Categories

(Infrastructure & Operations Graveyard :: NetOps, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: cransom)

Details

Several tests just failed at around the same time on various machines, all with failures to resolve ftp.mozilla.org.

e.g.
https://tbpl.mozilla.org/php/getParsedLog.php?id=12672390&tree=Mozilla-Inbound&full=1
--2012-06-14 13:59:58--  http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux-debug/1339705710/firefox-16.0a1.en-US.linux-i686.tar.bz2
Resolving ftp.mozilla.org... failed: Temporary failure in name resolution.
wget: unable to resolve host address `ftp.mozilla.org'

https://tbpl.mozilla.org/php/getParsedLog.php?id=12672406&tree=Mozilla-Inbound&full=1
https://tbpl.mozilla.org/php/getParsedLog.php?id=12672330&tree=Mozilla-Inbound&full=1

Can someone take a look at the logs on build.scl1.mozilla.com (which is our nameserver) to see if something happened around 14:00?
Sorry, the nameserver is ns-vip.build.scl1.mozilla.com.
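For reference, the "Temporary failure in name resolution" message in the wget output above corresponds to the resolver's EAI_AGAIN error. A minimal sketch of how a download step could ride out a short blip like this one is below. It is not what the test harness actually does; the function name `resolve_with_retry` and its parameters are made up for illustration, and the resolver and sleep functions are injectable only so the sketch can be exercised without a network.

```python
import socket
import time


def resolve_with_retry(host, attempts=3, delay=2.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `host`, retrying only on temporary resolver failures.

    Hypothetical helper: the name and defaults are assumptions, not part
    of any existing harness.
    """
    for attempt in range(1, attempts + 1):
        try:
            return resolver(host, None)
        except socket.gaierror as exc:
            # EAI_AGAIN is the "temporary failure" case; anything else
            # (e.g. the name does not exist) is re-raised immediately.
            if exc.errno != socket.EAI_AGAIN or attempt == attempts:
                raise
            sleep(delay)
```

Whether retrying is appropriate depends on the job; it would hide a transient failure like this one but would not help with a longer outage.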
Moving to ServerOps, per Phong's suggestion in Tuesday's meeting about on-call response.
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations
QA Contact: arich → phong
There are a lot of errors in ns-vip.build.scl1.mozilla.com:/var/named/chroot/var/log/named/lameservers, including a bunch of 'network unreachable' entries around the time of the name resolution failure (13:59 on 6/14/12). Passing to netops to see if they know of any network blips around that time.
Assignee: server-ops → network-operations
Component: Server Operations → Server Operations: Netops
QA Contact: phong → ravi
Assignee: network-operations → cransom
This looks like a 3crowd outage: the failure messages all point to a single lame 3crowd host, over a span of about a minute:
14-Jun-2012 14:00:16.262 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:16.308 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:16.314 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:59.789 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'ftp.3crowd.mozilla.net/A/IN': 173.249.32.30#53
etc.
The 'network unreachable' events only pertain to IPv6 and do not apply here.
There were no network blips detected (smokeping confirms).
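The check described above (do the errors point at one upstream server, and over what time span?) can be sketched as a small script. This is only an illustration under stated assumptions: the regular expression is written against the four lame-servers lines quoted in this bug and may not match every line in the real log, and the names `LINE_RE`, `SAMPLE`, and `summarize` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Pattern written against the lines quoted in this bug; other entries in
# the real lameservers log may be shaped differently and are skipped.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"lame-servers: info: error \((?P<reason>[^)]*)\) "
    r"resolving '(?P<query>[^']+)': (?P<server>[0-9a-fA-F.:]+)#\d+$"
)

SAMPLE = """\
14-Jun-2012 14:00:16.262 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:16.308 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:16.314 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'bugzilla.3crowd.mozilla.net/A/IN': 173.249.32.30#53
14-Jun-2012 14:00:59.789 lame-servers: info: error (unexpected RCODE SERVFAIL) resolving 'ftp.3crowd.mozilla.net/A/IN': 173.249.32.30#53
"""


def summarize(lines):
    """Group matching log lines by upstream server address."""
    by_server = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a line this sketch understands
        ts = datetime.strptime(m["ts"], "%d-%b-%Y %H:%M:%S.%f")
        by_server[m["server"]].append((ts, m["reason"], m["query"]))

    summary = {}
    for server, events in by_server.items():
        times = [ts for ts, _, _ in events]
        summary[server] = {
            "count": len(events),
            "first": min(times),
            "last": max(times),
            "span_seconds": (max(times) - min(times)).total_seconds(),
            "reasons": sorted({reason for _, reason, _ in events}),
        }
    return summary


if __name__ == "__main__":
    for server, info in summarize(SAMPLE.splitlines()).items():
        print(server, info["count"], info["span_seconds"], info["reasons"])
```

Run against the four quoted lines, this reports a single server, 173.249.32.30, with four SERVFAIL entries spread over roughly 43.5 seconds.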
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard