Closed Bug 903224 Opened 11 years ago Closed 11 years ago

Need AWS instances for Nagios hosts

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hwine, Assigned: rail)

References

Details

(Whiteboard: [reit-nagios])

Attachments

(1 file)

Create 3 AWS masters (1 per region) to support Nagios monitoring [extracted from bug 896812]:

nagios1.private.releng.usw1.mozilla.com
nagios1.private.releng.usw2.mozilla.com
nagios1.private.releng.use1.mozilla.com 

Specs (from Ashish Vijayaram [:ashish] in bug 896812 comment #9):
> (In reply to Hal Wine [:hwine] from bug 896812 comment #8)
> > Based on engops meeting this morning, ideally these will be:
> >  - size small instances 
> >  - RHEL or CentOS based
> >  - our choice of IP based on the subnets identified in bug 901784 comment 1
> > 
> 
> - Yes, small size instances with about 1GB RAM should suffice
> - They need to be RHEL and in fact, they will be puppetized by Infra Puppet
> - Ideally private, which is where the other Nagios instances lie, will also
> help replicating ACLs
> 
> FWIW we currently have nagios1.private.euw1 running on EC2. I'll reuse that
> setup here as far as possible. Thanks!



They can have Ashish's keys to start with, and once we have flows they can be puppetized.
Just to give an idea of the timeline here:

It would be awesome to have these in place by the beginning of next week, at the latest :) Once they're up, Ashish will have to get flows opened with Netops and then finish his setup in time to test all this during the Aug 24th treeclosing maint window.

Thanks!
:ashish can you add a pointer to your ssh pub key and/or attach it please?
Flags: needinfo?(ashish)
Attached file ashish-pubkey
Ashish's pubkeys from LDAP
Flags: needinfo?(ashish)
Product: mozilla.org → Release Engineering
Sorry for the delay, we were busy with the releases. I'll create the machines today.
Assignee: nobody → rail
> nagios1.private.releng.usw1.mozilla.com

We don't have any instances in the us-west-1 region. Do we need a nagios server in this region?
Flags: needinfo?(shyam)
At one point us-west-1 was a disaster recovery region that RelEng would deploy to if another region went down for a significant time. In that case, we'd want the monitoring to already be in place.

If us-west-1 is no longer part of a disaster recovery plan, then we don't need monitoring.
Flags: needinfo?(shyam)
ashish, can you verify whether the following configuration is OK for you? They are based on RHEL 6.4.

use1: ec2-user@10.134.75.31
usw2: ec2-user@10.132.75.28

If they are OK we can add them to the inventory and DNS.
Flags: needinfo?(ashish)
:rail Thanks for setting these up. I was able to log in to these from nagios1.releng.scl3. These will also need to be reachable from puppet1.private.scl3. Is that something you could set up as well?
Flags: needinfo?(ashish)
I think it's related to the net flows. I can ssh to this host from vpn, but not from scl3 servers.
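One quick way to tell a missing flow from a host-side problem is to try a plain TCP connect to the SSH port from each source network (vpn, scl3) and compare. A minimal Python sketch, assuming SSH listens on port 22 and a 3 second timeout is acceptable; `can_connect` is a hypothetical helper for illustration, not part of any existing RelEng tooling:

```python
import socket


def can_connect(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_hosts(hosts, port=22, timeout=3.0):
    """Map each host to True/False depending on whether the port is reachable."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Running `check_hosts(["10.134.75.31", "10.132.75.28"])` once from a vpn client and once from an scl3 server should show the difference if the flows are the cause.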
(In reply to Rail Aliiev [:rail] from comment #5)
> > nagios1.private.releng.usw1.mozilla.com
> 
> We don't have any instances in the us-west-1 region. Do we need a nagios
> server in this region?

Yes -- I checked with :catlee on comment 6, and us-west-1 is part of our disaster recovery plan. So we do need one there. The current 2 are more urgent.
(In reply to Rail Aliiev [:rail] from comment #9)
> I think it's related to the net flows. I can ssh to this host from vpn, but
> not from scl3 servers.

Thanks :rail! I have no further questions about these being added to Inventory and DNS. I will work with Netops to have flows opened from the appropriate puppetmaster.
np, I'll provide a us-west-1 based one as well, but I need to fix the subnets there first.
ec2-user@10.130.75.84 for usw1

I'll add them to the inventory.
~ ❯ host nagios1.private.releng.usw1.mozilla.com
nagios1.private.releng.usw1.mozilla.com has address 10.130.75.84

rail@magma  [17:26:08] 
~ ❯ host 10.130.75.84
84.75.130.10.in-addr.arpa domain name pointer nagios1.private.releng.usw1.mozilla.com.

rail@magma  [17:26:13] 
~ ❯ host nagios1.private.releng.usw2.mozilla.com
nagios1.private.releng.usw2.mozilla.com has address 10.132.75.28

rail@magma  [17:26:39] 
~ ❯ host 10.132.75.28
28.75.132.10.in-addr.arpa domain name pointer nagios1.private.releng.usw2.mozilla.com.

rail@magma  [17:26:42] 
~ ❯ host nagios1.private.releng.use1.mozilla.com  
nagios1.private.releng.use1.mozilla.com has address 10.134.75.31

rail@magma  [17:26:50] 
~ ❯ host 10.134.75.31
31.75.134.10.in-addr.arpa domain name pointer nagios1.private.releng.use1.mozilla.com.
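
The same forward and reverse checks can be scripted if they need to be repeated. A minimal Python sketch, assuming the three name/address pairs shown in the `host` output above; `ptr_name` and `dns_matches` are hypothetical helpers, and the resolver functions are passed in as parameters so the logic can be exercised away from the private DNS view:

```python
import ipaddress
import socket

# Name -> address pairs copied from the `host` output above.
EXPECTED = {
    "nagios1.private.releng.usw1.mozilla.com": "10.130.75.84",
    "nagios1.private.releng.usw2.mozilla.com": "10.132.75.28",
    "nagios1.private.releng.use1.mozilla.com": "10.134.75.31",
}


def ptr_name(ip):
    """Return the in-addr.arpa name that a PTR lookup for ip would query."""
    return ipaddress.ip_address(ip).reverse_pointer


def dns_matches(name, ip,
                forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """Check that name resolves to ip and that ip maps back to name.

    forward and reverse default to the system resolver but can be
    replaced with stubs for testing.
    """
    if forward(name) != ip:
        return False
    # gethostbyaddr returns (primary_name, aliases, addresses).
    primary, _aliases, _addrs = reverse(ip)
    return primary.rstrip(".").lower() == name.lower()
```

From a host that can see the private zone, `all(dns_matches(n, ip) for n, ip in EXPECTED.items())` would be expected to hold as long as the records stay as shown.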
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard