addons2.stage.db.phx1 can't reach puppet1.private.phx1


Status Graveyard
Server Operations
5 years ago
3 years ago


(Reporter: dustin, Assigned: dumitru)





5 years ago
The other addons*.stage.db hosts seem fine, but on this one:

[root@addons2.stage.db.phx1 ~]# host has address
[root@addons2.stage.db.phx1 ~]# nc -vz 8140

times out

Looks like it started yesterday:

Sep  6 19:42:54 addons2 puppet-agent[1168]: (/Stage[main]/Ldap_users::Groups::Admin/Ldap_users::Dotfiles[ckolos]/File[/home/ckolos/]) Failed to generate additional resources using 'eval_generate: Connection timed out - connect(2)

I'm guessing this is a flow problem, since

[root@addons2.stage.db.phx1 ~]# nc -vz 8140
Connection to 8140 port [tcp/*] succeeded!

Can you check the flows, and if nothing's amiss bounce back to server ops?

Comment 1

5 years ago
I've pulled it from the zeus pool, even though mysql seemed to be OK.
We've also had ridiculous amounts of nagios flapping, here's the link to yesterday:

(and today is similar).

Setting it to server operations - if nc works, then the flow is open and there's not really much network operations can do.
Assignee: network-operations → server-ops
Component: Server Operations: ACL Request → Server Operations
QA Contact: ravi → jdow

Comment 3

5 years ago
nc is working now, but the failure in my paste above was real, so this is definitely flapping. An intermittent failure suggests it's not a flow problem, though.
Replication is broken, because:

                Last_IO_Error: error reconnecting to master 'slave_user@' - retry-time: 60  retries: 86400

There are network-related problems here.
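
A single nc run only shows the state at that instant, so flapping like this is easier to confirm by probing repeatedly and counting the up/down transitions. Below is a minimal sketch of that idea in Python. The function names, the interval, and the attempt count are illustrative assumptions, and the host and port are whatever you pass in (the pastes above use port 8140).

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


def count_flaps(results):
    """Count up/down transitions in a sequence of probe results."""
    results = list(results)
    return sum(1 for a, b in zip(results, results[1:]) if a != b)


def watch(host, port, attempts=20, interval=5.0, timeout=3.0):
    """Probe host:port repeatedly, print each result, and return the list."""
    results = []
    for i in range(attempts):
        ok = probe(host, port, timeout)
        results.append(ok)
        print(time.strftime("%H:%M:%S"), "ok" if ok else "FAILED")
        if i < attempts - 1:
            time.sleep(interval)
    return results
```

Running `count_flaps(watch(host, 8140))` from the affected box gives a rough flap count. A result of zero with every probe failing points at a closed flow, while a nonzero count points at something intermittent on the host or the path.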

Comment 5

5 years ago
I did a yum update, and the new kernel came with a newer be2net driver for the NIC.
old: 4.0.160r
new (current): 4.1.307r

Let's keep an eye on that.
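
To check which driver version a box is actually running, `ethtool -i <interface>` prints `key: value` lines that include `driver` and `version`. The sketch below parses that output. The function names are illustrative, and the interface name is an assumption you need to supply, since this bug doesn't say which interface the be2net NIC is on.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI bus
        # addresses that contain colons stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_info(interface):
    """Run `ethtool -i` on the given interface and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ethtool_info(result.stdout)
```

Comparing `driver_info(iface)["version"]` across the addons*.stage.db hosts would show whether any of them is still on the old 4.0.160r driver.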
Assignee: server-ops → dgherman
addons2.stage.db hasn't been flapping, so I'm going to mark this resolved.
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: → Graveyard