Closed Bug 536398 Opened 15 years ago Closed 15 years ago

Put staging build machines on Build-VPN, and remove from Mozilla-MPT

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86
All
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: mrz)

Please make the following machines accessible on Build-VPN, and remove them from Mozilla-MPT.

staging-1.9-master
staging-master
staging-nightly-updates 
staging-opsi 
staging-puppet
staging-stage
staging-try-master
talos-staging-master
test-linslave
test-mgmt
moz2-darwin9-slave03
moz2-darwin9-slave04
moz2-darwin9-slave08
moz2-linux-slave03
moz2-linux-slave04
moz2-win32-slave03 
moz2-win32-slave04
moz2-win32-slave19
moz2-win32-slave20
moz2-win32-slave21

(Assuming we find no surprises, I'll file another bug later to do the same with the production machines.)
Correction: moz2-win32-slave19, 20, 21 should *not* be moved to Build-VPN yet. I've also fixed inventory, which incorrectly showed some production slaves as staging.
To clarify, we're not moving anything.  We're only adding deny rules to the firewall.
On this one:

sm-staging-try-master.mozilla.org has address 10.2.76.35

We can filter access to that host from the VPN, but there are other non-build hosts on 10.2.76.0/24 whose access I can't filter. That means someone on another host on 10.2.76.0/24 could still reach 10.2.76.35.

I'd exclude that host.
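
A minimal sketch of the subnet concern above, assuming the usual arrangement where a routing firewall only sees traffic that crosses subnets, so two hosts on the same /24 can typically reach each other directly. The neighbour address 10.2.76.50 is hypothetical; only 10.2.76.35, 10.2.76.0/24 and 10.2.71.208 appear in this bug.

```python
import ipaddress


def same_subnet(host_a: str, host_b: str, network: str) -> bool:
    """Return True if both hosts sit inside the given network."""
    net = ipaddress.ip_network(network)
    return (ipaddress.ip_address(host_a) in net
            and ipaddress.ip_address(host_b) in net)


SUBNET = "10.2.76.0/24"
STAGING_TRY_MASTER = "10.2.76.35"
NEIGHBOUR = "10.2.76.50"    # hypothetical non-build host on the same subnet
BUILD_HOST = "10.2.71.208"  # staging-master, on a different subnet

# Same subnet: traffic would not normally pass through the firewall.
print(same_subnet(NEIGHBOUR, STAGING_TRY_MASTER, SUBNET))   # True
# Different subnet: traffic is routed, so a firewall rule can filter it.
print(same_subnet(BUILD_HOST, STAGING_TRY_MASTER, SUBNET))  # False
```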
sh-3.2$ host staging-1.9-master.build.mozilla.org
Host staging-1.9-master.build.mozilla.org not found: 3(NXDOMAIN)
(In reply to comment #4)
> sh-3.2$ host staging-1.9-master.build.mozilla.org
> Host staging-1.9-master.build.mozilla.org not found: 3(NXDOMAIN)

Oh, this was being removed as part of bug#534329. Ignore this machine, it's going away.
moz2-win32-slave17 should also be on the Build-VPN list. I've fixed inventory to list it as a staging machine.
No, the staging slaves for moz2 are 
 moz2-linux-slave03,04,17
 moz2-win32-slave03,04,21
I'm confused.  Could you perhaps attach an updated list with IP addresses?
staging-master.build.mozilla.org        10.2.71.208
moz2-darwin9-slave03.build.mozilla.org  10.2.71.141
moz2-darwin9-slave04.build.mozilla.org  10.2.71.142
moz2-darwin9-slave08.build.mozilla.org  10.2.71.163
moz2-linux-slave03.build.mozilla.org    10.2.71.105
moz2-linux-slave04.build.mozilla.org    10.2.71.18
moz2-linux-slave17.build.mozilla.org    10.2.71.218
win32-slave03.build.mozilla.org         10.2.71.119
win32-slave04.build.mozilla.org         10.2.71.22
win32-slave21.build.mozilla.org         10.2.71.222

staging-opsi.build.mozilla.org          10.2.71.216
staging-puppet.build.mozilla.org        10.2.71.91
staging-stage.build.mozilla.org         10.2.71.82

Alice can advise what to do about talos masters and slaves, given they're split between MPT and Castro.
(In reply to comment #9)
> staging-master.build.mozilla.org        10.2.71.208
> moz2-darwin9-slave03.build.mozilla.org  10.2.71.141
> moz2-darwin9-slave04.build.mozilla.org  10.2.71.142
> moz2-darwin9-slave08.build.mozilla.org  10.2.71.163
> moz2-linux-slave03.build.mozilla.org    10.2.71.105
> moz2-linux-slave04.build.mozilla.org    10.2.71.18
> moz2-linux-slave17.build.mozilla.org    10.2.71.218
> win32-slave03.build.mozilla.org         10.2.71.119
> win32-slave04.build.mozilla.org         10.2.71.22
> win32-slave21.build.mozilla.org         10.2.71.222
> 
> staging-opsi.build.mozilla.org          10.2.71.216
> staging-puppet.build.mozilla.org        10.2.71.91
> staging-stage.build.mozilla.org         10.2.71.82

Thanks Nick.

> Alice can advise what to do about talos masters and slaves, given they're split
> between MPT and Castro.
No need. We're leaving the Talos machines (on the QA network) and the try machines (separately sandboxed) unchanged for now. We'll deal with moving the Talos machines onto Build-VPN next month, as part of the upcoming Talos hardware work and network consolidation.
Assignee: server-ops → dmoore
When can I push these rules into production?

object-group network build-lockdown
 network-object host 10.2.71.141
 network-object host 10.2.71.142
 network-object host 10.2.71.163
 network-object host 10.2.71.105
 network-object host 10.2.71.18
 network-object host 10.2.71.218
 network-object host 10.2.71.119
 network-object host 10.2.71.22
 network-object host 10.2.71.222
 network-object host 10.2.71.216
 network-object host 10.2.71.91
 
 
access-list corp-outbound line 39 deny ip any object-group build-lockdown

(Line 39 puts this right before the blanket permit).
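
Why the line number matters can be sketched with a simplified first-match model of access-list evaluation. This is an illustration under stated assumptions, not the firewall's actual implementation: it only looks at destination addresses, and the test address 192.0.2.10 is hypothetical.

```python
import ipaddress

# Destination hosts from the build-lockdown object-group above.
LOCKDOWN = [
    "10.2.71.141", "10.2.71.142", "10.2.71.163", "10.2.71.105",
    "10.2.71.18", "10.2.71.218", "10.2.71.119", "10.2.71.22",
    "10.2.71.222", "10.2.71.216", "10.2.71.91",
]


def first_match(rules, dst):
    """Return the action of the first rule whose network contains dst."""
    addr = ipaddress.ip_address(dst)
    for action, network in rules:
        if addr in ipaddress.ip_network(network):
            return action
    return "deny"  # assumed implicit deny when nothing matches


deny_rules = [("deny", host + "/32") for host in LOCKDOWN]
blanket_permit = [("permit", "0.0.0.0/0")]

deny_first = deny_rules + blanket_permit    # deny placed before the permit
permit_first = blanket_permit + deny_rules  # deny placed after the permit

print(first_match(deny_first, "10.2.71.141"))    # deny
print(first_match(permit_first, "10.2.71.141"))  # permit (deny never reached)
print(first_match(deny_first, "192.0.2.10"))     # permit (hypothetical host)
```

With the deny entry after the blanket permit, every packet would match the permit first and the lockdown would have no effect.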
Assignee: dmoore → mrz
Flags: needs-downtime+
Whiteboard: 12/29/2009 @ 7pm
Rules in place.  Will sit on this for a day before closing.

fcore1(config)# access-list co
Access Rules Download Complete: Memory Utilization: 2%
fcore1(config)# q
fcore1# q  

Logoff

Connection to fw1 closed.
mrz@boris [~/] 43> date
Mon Dec 28 14:54:20 PST 2009
Missed two:

object-group network build-lockdown
 network-object host 10.2.71.208
 network-object host 10.2.71.82
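
A quick cross-check of this kind can be sketched as a comparison between the host list from comment #9 and the addresses first pushed into the object-group. The hostnames and addresses are copied from this bug; the helper name `missing_from_acl` is made up for illustration.

```python
# Hosts and addresses as listed in comment #9.
REQUESTED = {
    "staging-master.build.mozilla.org": "10.2.71.208",
    "moz2-darwin9-slave03.build.mozilla.org": "10.2.71.141",
    "moz2-darwin9-slave04.build.mozilla.org": "10.2.71.142",
    "moz2-darwin9-slave08.build.mozilla.org": "10.2.71.163",
    "moz2-linux-slave03.build.mozilla.org": "10.2.71.105",
    "moz2-linux-slave04.build.mozilla.org": "10.2.71.18",
    "moz2-linux-slave17.build.mozilla.org": "10.2.71.218",
    "win32-slave03.build.mozilla.org": "10.2.71.119",
    "win32-slave04.build.mozilla.org": "10.2.71.22",
    "win32-slave21.build.mozilla.org": "10.2.71.222",
    "staging-opsi.build.mozilla.org": "10.2.71.216",
    "staging-puppet.build.mozilla.org": "10.2.71.91",
    "staging-stage.build.mozilla.org": "10.2.71.82",
}

# Addresses in the first version of the build-lockdown object-group.
ACL_HOSTS = {
    "10.2.71.141", "10.2.71.142", "10.2.71.163", "10.2.71.105",
    "10.2.71.18", "10.2.71.218", "10.2.71.119", "10.2.71.22",
    "10.2.71.222", "10.2.71.216", "10.2.71.91",
}


def missing_from_acl(requested, acl_hosts):
    """Return the requested hosts whose address is not in the ACL set."""
    return {name: ip for name, ip in requested.items() if ip not in acl_hosts}


for name, ip in missing_from_acl(REQUESTED, ACL_HOSTS).items():
    print(name, ip)
# staging-master.build.mozilla.org 10.2.71.208
# staging-stage.build.mozilla.org 10.2.71.82
```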
Armen and I are unable to connect to the build-vpn using the config that joduinn provided. Is it set up for remote access? I'm using the same keys/certs as for MPT.

My connection fails at the TLS handshake. Here's the output from the Viscosity log:

Tue Dec 29 10:15:00 2009: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Tue Dec 29 10:15:00 2009: TLS Error: TLS handshake failed
Tue Dec 29 10:15:00 2009: SIGUSR1[soft,tls-error] received, process restarting
Severity: normal → major
I am raising the severity since I can't get any meaningful amount of work done like this.
Severity: major → critical
Debugging.

Keys all match on the server side:

[root@cm-vpn01 keys]# md5sum *
b72198f5b4f7fdf6a54fae9a2a0355aa  database.txt
048e7e2f6694886ce27f048724e9a9a7  dh1024.pem
57cfae2e09f8c8934543e04006ddbf9a  dh2048.pem
2af787f8dc585b881532651b17f2904a  dmvpn01.crt
f89303e1103a365909374523aba8fc77  dmvpn01.key
ee58943bcfa5c878786a35da3201382f  moz-ca.crt
7c42ebfd2675df8cd320345857502d06  ta.key

b72198f5b4f7fdf6a54fae9a2a0355aa  database.txt
048e7e2f6694886ce27f048724e9a9a7  dh1024.pem
57cfae2e09f8c8934543e04006ddbf9a  dh2048.pem
2af787f8dc585b881532651b17f2904a  dmvpn01.crt
f89303e1103a365909374523aba8fc77  dmvpn01.key
ee58943bcfa5c878786a35da3201382f  moz-ca.crt
7c42ebfd2675df8cd320345857502d06  ta.key
Clock was a little off.

[root@bm-vpn01 keys]# date
Thu Feb 25 00:14:54 PST 2010
[root@bm-vpn01 keys]# ntpdate 10.2.71.5
29 Dec 10:01:48 ntpdate[18379]: step time server 10.2.71.5 offset -4976018.439527 sec

(still debugging)
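
To put the size of that offset in perspective, here is a small sketch that subtracts the two clock readings shown in the transcript. Both timestamps are taken from the output above and treated as naive PST values; the function name is only illustrative. The two readings were taken about half a minute apart, so the result differs slightly from the offset ntpdate reported.

```python
from datetime import datetime


def clock_skew_seconds(local: datetime, reference: datetime) -> float:
    """Return how far the local clock is ahead of the reference, in seconds."""
    return (local - reference).total_seconds()


vm_clock = datetime(2010, 2, 25, 0, 14, 54)   # `date` output on bm-vpn01
ntp_time = datetime(2009, 12, 29, 10, 1, 48)  # time after the ntpdate step

skew = clock_skew_seconds(vm_clock, ntp_time)
print(f"{skew:.0f} s, about {skew / 86400:.1f} days")
# 4975986 s, about 57.6 days
```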
Authentication WFM.  Didn't test before the date fix.

Dec 29 10:40:26 bm-vpn01 openvpn[1591]: 63.245.220.240:42313 [mzeier@mozilla.com] Peer Connection Initiated with 63.245.220.240:42313


I'm seeing errors from 99.254.8.58

Dec 29 10:38:36 bm-vpn01 openvpn[1591]: TLS Error: incoming packet authentication failed from 99.254.8.58:59497
Dec 29 10:38:45 bm-vpn01 openvpn[1591]: Authenticate/Decrypt packet error: packet HMAC authentication failed
Dec 29 10:38:45 bm-vpn01 openvpn[1591]: TLS Error: incoming packet authentication failed from 99.254.8.58:59497
Dec 29 10:40:25 bm-vpn01 openvpn[1591]: 63.245.220.240:42313 Re-using SSL/TLS context
Dec 29 10:40:25 bm-vpn01 openvpn[1591]: 63.245.220.240:42313 LZO compression initialized
Dec 29 10:40:26 bm-vpn01 openvpn[1591]: 63.245.220.240:42313 [mzeier@mozilla.com] Peer Connection Initiated with 63.245.220.240:42313
I thought we tested this VPN before doing this switch. iptables isn't set up correctly on this at all. Working to fix that.
The NAT rules were missing. For posterity:

[root@bm-vpn01 openvpn]# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.3.5 on Sun Dec 13 23:58:11 2009
*filter
:INPUT ACCEPT [7865:1054856]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [1499:266315]
-A FORWARD -s 10.2.171.0/255.255.255.0 -j ACCEPT 
-A FORWARD -d 10.2.171.0/255.255.255.0 -j ACCEPT 
-A FORWARD -i eth0 -j ACCEPT
-A FORWARD -j LOG --log-prefix "FWD DENY " --log-level 6 
-A FORWARD -j DROP 
COMMIT
# Completed on Fri Dec 11 15:19:26 2009
# Generated by iptables-save v1.3.5 on Fri Dec 11 15:19:26 2009
*nat
:PREROUTING ACCEPT [441464:84291121]
:POSTROUTING ACCEPT [9914:768421]
:OUTPUT ACCEPT [9914:768421]
-A POSTROUTING -s 10.2.171.0/255.255.255.0 -o eth0 -j MASQUERADE 
COMMIT
# Completed on Sun Dec 13 23:58:11 2009
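
A check for this kind of omission could be sketched as follows. It is a rough parser for iptables-save style text, not a replacement for inspecting the live ruleset, and it assumes 10.2.171.0/24 is the VPN client pool (the bug does not say so explicitly). The two sample rulesets are abbreviated from the dump above.

```python
def rules_by_table(saved: str) -> dict:
    """Group the '-A' rules of an iptables-save dump by table name."""
    tables = {}
    current = None
    for raw in saved.splitlines():
        line = raw.strip()
        if line.startswith("*"):  # e.g. "*filter" or "*nat"
            current = line[1:]
            tables[current] = []
        elif line.startswith("-A") and current is not None:
            tables[current].append(line.split())
    return tables


def has_masquerade(saved: str, source: str) -> bool:
    """True if the nat table masquerades traffic from the given source."""
    for tokens in rules_by_table(saved).get("nat", []):
        if (tokens[1] == "POSTROUTING" and "MASQUERADE" in tokens
                and source in tokens):
            return True
    return False


VPN_POOL = "10.2.171.0/255.255.255.0"  # assumed to be the VPN client range

WITHOUT_NAT = """*filter
-A FORWARD -s 10.2.171.0/255.255.255.0 -j ACCEPT
-A FORWARD -j DROP
COMMIT
"""

WITH_NAT = WITHOUT_NAT + """*nat
-A POSTROUTING -s 10.2.171.0/255.255.255.0 -o eth0 -j MASQUERADE
COMMIT
"""

print(has_masquerade(WITHOUT_NAT, VPN_POOL))  # False
print(has_masquerade(WITH_NAT, VPN_POOL))     # True
```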
Working for me now. Copied my existing MPT config and simply substituted the new host IP.
Severity: critical → normal
great, calling fixed.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Did the OpenVPN process get halted? I'm getting the same TLS error that nthomas did in https://bugzilla.mozilla.org/show_bug.cgi?id=506405#c30:
Mon Jan  4 08:59:19 2010: WARNING: No server certificate verification method has been enabled.  See http://openvpn.net/howto.html#mitm for more info.
Mon Jan  4 08:59:19 2010: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Mon Jan  4 08:59:19 2010: Control Channel Authentication: using 'ta.key' as a OpenVPN static key file
Mon Jan  4 08:59:19 2010: LZO compression initialized
Mon Jan  4 08:59:19 2010: UDPv4 link local: [undef]
Mon Jan  4 08:59:19 2010: UDPv4 link remote: 63.245.208.227:1194
Mon Jan  4 09:00:19 2010: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Mon Jan  4 09:00:19 2010: TLS Error: TLS handshake failed
Mon Jan  4 09:00:19 2010: SIGUSR1[soft
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
OpenVPN is okay, time on the VM was off again. I see errors from your IP though.
My issue ended up being a configuration problem. Shyam helped me fix it.
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard