Need AWS instances for Nagios hosts

RESOLVED FIXED

Status

Infrastructure & Operations
CIDuty
RESOLVED FIXED
5 years ago
2 months ago

People

(Reporter: hwine, Assigned: rail)

Tracking

Details

(Whiteboard: [reit-nagios])

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
Create 3 AWS masters (1 per region) to support Nagios monitoring [extracted from bug 896812]:

nagios1.private.releng.usw1.mozilla.com
nagios1.private.releng.usw2.mozilla.com
nagios1.private.releng.use1.mozilla.com 

Specs from Ashish Vijayaram [:ashish] in bug 896812 comment #9:
> (In reply to Hal Wine [:hwine] from bug 896812 comment #8)
> > Based on engops meeting this morning, ideally these will be:
> >  - size small instances 
> >  - RHEL or CentOS based
> >  - our choice of IP based on the subnets identified in bug 901784 comment 1
> > 
> 
> - Yes, small size instances with about 1GB RAM should suffice
> - They need to be RHEL and in fact, they will be puppetized by Infra Puppet
> - Ideally private, which is where the other Nagios instances lie, will also
> help replicating ACLs
> 
> FWIW we currently have nagios1.private.euw1 running on EC2. I'll reuse that
> setup here as far as possible. Thanks!



They can have ashish's keys to start with; once we have flows, they can be puppetized.
Just to give an idea of the timeline here:

It would be awesome to have these in place by the beginning of next week, at the latest :) Once they're up, Ashish will have to get flows opened with Netops and then finish his setup in time to test all this during the Aug 24th treeclosing maint window.

Thanks!
(Reporter)

Comment 2

5 years ago
:ashish, can you add a pointer to your SSH public key and/or attach it, please?
Flags: needinfo?(ashish)
Created attachment 787893 [details]
ashish-pubkey

Ashish's pubkeys from LDAP
Flags: needinfo?(ashish)
Product: mozilla.org → Release Engineering
(Assignee)

Comment 4

5 years ago
Sorry for the delay, we were busy with the releases. I'll create the machines today.
Assignee: nobody → rail
(Assignee)

Comment 5

5 years ago
> nagios1.private.releng.usw1.mozilla.com

We don't have any instances in the us-west-1 region. Do we need a nagios server in this region?
Flags: needinfo?(shyam)
(Reporter)

Comment 6

5 years ago
At one point us-west-1 was a disaster recovery region that RelEng would deploy to if another region went down for a significant time. In that case, we'd want the monitoring to already be in place.

If us-west-1 is no longer part of a disaster recovery plan, then we don't need monitoring.
Flags: needinfo?(shyam)
(Assignee)

Comment 7

5 years ago
ashish, can you verify that the following configuration is OK for you? The instances are based on RHEL 6.4.

use1: ec2-user@10.134.75.31
usw2: ec2-user@10.132.75.28

If they are OK we can add them to the inventory and DNS.
Flags: needinfo?(ashish)
:rail Thanks for setting these up. I was able to log in to these from nagios1.releng.scl3. They will also need to be reachable from puppet1.private.scl3. Is that something you could set up as well?
Flags: needinfo?(ashish)
(Assignee)

Comment 9

5 years ago
I think it's related to the net flows. I can ssh to this host from the VPN, but not from scl3 servers.
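
One way to tell a blocked flow apart from an sshd problem is to test plain TCP reachability on port 22 from each source network and compare the results. A minimal sketch, assuming Python is available on the source hosts; `port_open` is a hypothetical helper name, not an existing tool:

```python
import socket


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example (run once from a VPN host and once from an scl3 host):
#   port_open("10.134.75.31")   # use1 instance from comment 7
#   port_open("10.132.75.28")   # usw2 instance from comment 7
```

If the call returns True from the VPN and False from scl3, the difference is on the network path rather than on the instance itself.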
(Reporter)

Comment 10

5 years ago
(In reply to Rail Aliiev [:rail] from comment #5)
> > nagios1.private.releng.usw1.mozilla.com
> 
> We don't have any instances in the us-west-1 region. Do we need a nagios
> server in this region?

Yes -- I checked with :catlee on comment 6, and us-west-1 is part of our disaster recovery plan. So we do need one there. The current 2 are more urgent.
(In reply to Rail Aliiev [:rail] from comment #9)
> I think it's related to the net flows. I can ssh to this host from vpn, but
> not from scl3 servers.

Thanks :rail! I have no further questions; these can be added to Inventory and DNS. I will work with Netops to have flows opened from the appropriate puppetmaster.
(Assignee)

Comment 12

5 years ago
np, I'll provide a us-west-1 based one as well, but I need to fix the subnets there first.
(Assignee)

Comment 13

5 years ago
ec2-user@10.130.75.84 for usw1

I'll add them to the inventory.
(Assignee)

Comment 14

5 years ago
~ ❯ host nagios1.private.releng.usw1.mozilla.com
nagios1.private.releng.usw1.mozilla.com has address 10.130.75.84

rail@magma  [17:26:08] 
~ ❯ host 10.130.75.84
84.75.130.10.in-addr.arpa domain name pointer nagios1.private.releng.usw1.mozilla.com.

rail@magma  [17:26:13] 
~ ❯ host nagios1.private.releng.usw2.mozilla.com
nagios1.private.releng.usw2.mozilla.com has address 10.132.75.28

rail@magma  [17:26:39] 
~ ❯ host 10.132.75.28
28.75.132.10.in-addr.arpa domain name pointer nagios1.private.releng.usw2.mozilla.com.

rail@magma  [17:26:42] 
~ ❯ host nagios1.private.releng.use1.mozilla.com  
nagios1.private.releng.use1.mozilla.com has address 10.134.75.31

rail@magma  [17:26:50] 
~ ❯ host 10.134.75.31
31.75.134.10.in-addr.arpa domain name pointer nagios1.private.releng.use1.mozilla.com.
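
The lookups above show that each forward (A) record has a matching reverse (PTR) record. The same consistency check can be sketched in Python. This is only an illustration, not what was actually run: `expected_ptr_name` and `mismatches` are hypothetical helper names, and the record data is copied from the `host` output above rather than queried live.

```python
import ipaddress

# Forward records copied from the host output above.
FORWARD = {
    "nagios1.private.releng.usw1.mozilla.com": "10.130.75.84",
    "nagios1.private.releng.usw2.mozilla.com": "10.132.75.28",
    "nagios1.private.releng.use1.mozilla.com": "10.134.75.31",
}

# Reverse records copied from the same output (trailing dot dropped).
REVERSE = {
    "84.75.130.10.in-addr.arpa": "nagios1.private.releng.usw1.mozilla.com",
    "28.75.132.10.in-addr.arpa": "nagios1.private.releng.usw2.mozilla.com",
    "31.75.134.10.in-addr.arpa": "nagios1.private.releng.use1.mozilla.com",
}


def expected_ptr_name(address: str) -> str:
    """Return the in-addr.arpa name under which the PTR record should live."""
    return ipaddress.ip_address(address).reverse_pointer


def mismatches(forward: dict, reverse: dict) -> list:
    """Return hostnames whose PTR record is missing or points elsewhere."""
    bad = []
    for name, address in forward.items():
        if reverse.get(expected_ptr_name(address)) != name:
            bad.append(name)
    return bad


print(mismatches(FORWARD, REVERSE))
```

An empty list means every address resolves back to the hostname that points at it.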
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Updated

2 months ago
Product: Release Engineering → Infrastructure & Operations