Closed Bug 789441 Opened 12 years ago Closed 12 years ago

addons2.stage.db.phx1 can't reach puppet1.private.phx1

Categories

(mozilla.org Graveyard :: Server Operations, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dumitru)

Details

The other addons*.stage.db hosts seem fine, but on addons2:

[root@addons2.stage.db.phx1 ~]# host puppet1.private.phx1.mozilla.com
puppet1.private.phx1.mozilla.com has address 10.8.75.10
[root@addons2.stage.db.phx1 ~]# nc -vz puppet1.private.phx1.mozilla.com 8140

The connection times out.

Looks like it started yesterday:

Sep  6 19:42:54 addons2 puppet-agent[1168]: (/Stage[main]/Ldap_users::Groups::Admin/Ldap_users::Dotfiles[ckolos]/File[/home/ckolos/]) Failed to generate additional resources using 'eval_generate: Connection timed out - connect(2)

I'm guessing this is a flow problem, since

[root@addons2.stage.db.phx1 ~]# nc -vz puppet1.private.scl3.mozilla.com 8140
Connection to puppet1.private.scl3.mozilla.com 8140 port [tcp/*] succeeded!
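
For anyone repeating the comparison above from a script, here is a minimal sketch of the same check as `nc -vz`. The function names `tcp_reachable` and `report` and the 5-second timeout are assumptions for illustration; only the hostnames and port 8140 come from the commands above.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections, and DNS failures all end up here.
        return False


def report(hosts, port, timeout=5.0):
    """Print one line per host, roughly like the output of `nc -vz`."""
    for host in hosts:
        state = "succeeded" if tcp_reachable(host, port, timeout) else "failed"
        print(f"Connection to {host} {port} {state}")
```

Calling `report(["puppet1.private.phx1.mozilla.com", "puppet1.private.scl3.mozilla.com"], 8140)` from addons2 would show both masters side by side. A failure here does not by itself distinguish a closed flow from a host-side problem.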

Can you check the flows, and if nothing's amiss bounce back to server ops?
I've pulled it from the zeus pool, even though mysql seemed to be OK.
We've also had ridiculous amounts of Nagios flapping; here's the history for yesterday:

http://nagios1.private.phx1.mozilla.com/phx1/cgi-bin/history.cgi?host=addons2.stage.db.phx1.mozilla.com&type=0&statetype=0&archive=1

(and today is similar).

Setting it to server operations - if nc works, then the flow is open and there's not really much network operations can do.
Assignee: network-operations → server-ops
Component: Server Operations: ACL Request → Server Operations
QA Contact: ravi → jdow
nc's working now, but my paste above where it didn't work was real, so the connection is definitely flapping. That suggests it's not a flow problem.
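
The reasoning here is that a closed flow would fail every time, while flapping alternates between success and failure. A rough sketch of that distinction, given an ordered list of probe results collected over time (the function names and the `min_transitions` threshold are assumptions, not anything Nagios or the thread defines):

```python
def count_flaps(samples):
    """Count up/down transitions in an ordered list of boolean probe results."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def is_flapping(samples, min_transitions=2):
    """Treat the link as flapping once it has changed state at least min_transitions times.

    The default of 2 is an arbitrary illustrative threshold.
    """
    return count_flaps(samples) >= min_transitions
```

A run of all-False samples gives zero transitions, which would point back toward a flow or ACL problem; a mix of True and False with several transitions points at the host or its NIC instead.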
Replication is broken, because:

                Last_IO_Error: error reconnecting to master 'slave_user@10.8.70.139:3306' - retry-time: 60  retries: 86400
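
To watch for this from a script, the `SHOW SLAVE STATUS\G` output can be split into fields. This is only a sketch: the function names are made up for illustration, and while `Slave_IO_Running` is a standard field in that output, its value on addons2 was not pasted in this bug.

```python
def parse_slave_status(text):
    """Parse `SHOW SLAVE STATUS\\G` style output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as
        # Last_IO_Error contain colons of their own.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # row separators and blank lines have no colon
        fields[key.strip()] = value.strip()
    return fields


def replication_io_healthy(fields):
    """Assume the IO thread is healthy only if it is running with no recorded error."""
    return fields.get("Slave_IO_Running") == "Yes" and not fields.get("Last_IO_Error")
```

With the line above as input, `parse_slave_status` keeps the whole error message, including the master address and retry settings, under the `Last_IO_Error` key.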


There are network-related problems here.
I did a yum update, and the new kernel came with a newer be2net driver for the NIC.
old: 4.0.160r
new (current): 4.1.307r

Let's keep an eye on that.
Assignee: server-ops → dgherman
addons2.stage.db hasn't been flapping since, so I'm going to mark this resolved.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard