1155044 - Setup robust logging service, using ELK or heka, for new AWS infrastructure

Assignee

Description

9 years ago

Set up either an SQS-based or a syslog-based logging service for AWS.  I'm leaning toward ELK, but am open to thoughts!

After the service is stood up, we'll need to:
* set up log shipping on our staging infrastructure
* set up logstash inputs to accommodate incoming info
* determine whether we also want system, puppet, auth, and syslog logs routed
* make all of the dashboards

JP Schneider [:jp]

Assignee

Updated

9 years ago

Blocks: 1123833

JP Schneider [:jp]

Assignee

Comment 1

9 years ago

I have got this started.  What I've done:

1) Created an m3.2xlarge Elasticsearch cluster, templated in us-west-2 and us-east-1, from these directions (https://jdotpz.github.io/#gh-weblog-1402498716135).  It can autoscale horizontally, with some care.  Made this into an autoscaling group: prod__loggins_elasticsearch-as
2) Created a loggins-elasticsearch-ec2-sg security group so that like servers can auto-discover each other with the AWS discovery plugin.
3) Created a prod__loggins_ec2 security group, more in line with naming standards, for the server that will host syslog collection over UDP 1514 and allow an ELB to access it via https/http.  This server proxies the Elasticsearch cluster, so the cluster itself will not be open to the public.
4) Created an ELB to sit in front of Kibana: prod--elb-for-logstash, along with an associated prod__loggins_elb_sg security group allowing 80/443.
5) Created a logstash/kibana combo server, templated in us-west-2, which will process logs over UDP syslog and serve kibana traffic through an nginx proxy protected by SSL and basic auth.  Made this into an autoscaling group that registers its servers with the prod--elb-for-logstash ELB.  Put this in a prod__loggins_ec2_sg security group, allowing UDP/TCP 1514 from anywhere (for now) and 80/443 from the associated ELB.
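For anyone trying to picture who can reach what, here is a rough Python sketch of the ingress rules described in the steps above. It is only a model for reasoning, not the real AWS configuration: the group names are copied from this comment (using the prod__loggins_ec2_sg spelling from step 5), the Elasticsearch ports 9200/9300 are the stock defaults and are an assumption, and `Rule`, `INGRESS` and `is_allowed` are made-up names. Source matching is deliberately naive (exact match or "anywhere"), with no CIDR arithmetic.

```python
from dataclasses import dataclass

ANYWHERE = "0.0.0.0/0"


@dataclass(frozen=True)
class Rule:
    protocol: str  # "tcp" or "udp"
    port: int
    source: str    # a CIDR string or the name of another security group


# Ingress rules per security group, as described in the comment.
INGRESS = {
    "prod__loggins_elb_sg": [
        Rule("tcp", 80, ANYWHERE),
        Rule("tcp", 443, ANYWHERE),
    ],
    "prod__loggins_ec2_sg": [
        Rule("udp", 1514, ANYWHERE),  # syslog, open "for now"
        Rule("tcp", 1514, ANYWHERE),
        Rule("tcp", 80, "prod__loggins_elb_sg"),
        Rule("tcp", 443, "prod__loggins_elb_sg"),
    ],
    # ASSUMPTION: 9300 (node-to-node) and 9200 (HTTP) are Elasticsearch
    # defaults; the comment does not say which ports were actually opened.
    "loggins-elasticsearch-ec2-sg": [
        Rule("tcp", 9300, "loggins-elasticsearch-ec2-sg"),
        Rule("tcp", 9200, "prod__loggins_ec2_sg"),
    ],
}


def is_allowed(group, protocol, port, source):
    """Return True if `source` may reach members of `group` on protocol/port.

    A rule matches when its source equals `source` exactly or is open to
    anywhere. Real security groups also do CIDR containment; this does not.
    """
    return any(
        rule.protocol == protocol
        and rule.port == port
        and rule.source in (source, ANYWHERE)
        for rule in INGRESS.get(group, [])
    )
```

With this model, `is_allowed("prod__loggins_ec2_sg", "udp", 1514, "203.0.113.7/32")` is true, while a request from the open internet to the Elasticsearch group on 9200 is not, which matches the intent that only the proxying server talks to the cluster.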

Jeez, I probably did more than that, but that's as much as I can remember to document. :)

Assignee: nobody → jschneider

JP Schneider [:jp]

Assignee

Comment 2

9 years ago

DNS entries associated with this:

loggins-es.mocotoolsprod.net -- Elasticsearch master server, CNAMEd to the EC2 instance.
loggins.mocotoolsprod.net -- The end-user endpoint for viewing Kibana and reports.  Pointed to the ELB for Kibana.
logshipper.mocotoolsprod.net -- Where we'll point rsyslog for shipment of the logs.  Pointed to the EC2 address of the logstash/kibana instance.

JP Schneider [:jp]

Assignee

Comment 3

9 years ago

Alright, got the node all configured.

* http://loggins.mocotoolsprod.net/#/dashboard/file/default.json is up and happy with a default dashboard.
* I've redone the AMI for the logstash/kibana combo server after I fixed the ES config in /etc/logstash/logstash_syslog.conf and /var/www/config.js.  I've applied that AMI to the autoscaling group.
* I smoked the hell out of a cigarette.

JP Schneider [:jp]

Assignee

Comment 4

9 years ago

Next steps:  We should send it logs from nginx, and potentially syslog.

Rsyslog endpoint will be:  logshipper.mocotoolsprod.net on UDP 1514.  We can also use GELF if we want to be fancy; I'd just have to open up the ports.
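For a quick smoke test of the endpoint without touching rsyslog, something like the Python sketch below should do. It uses only the standard library's `logging.handlers.SysLogHandler`; the host and port come from this comment, while the function name `make_syslog_logger` and the logger name are placeholders.

```python
import logging
import logging.handlers
import socket


def make_syslog_logger(host, port=1514, name="loggins-demo"):
    """Return a logger that ships each record to a syslog endpoint over UDP.

    Note: calling this twice with the same `name` attaches a second handler,
    so each record would then be sent twice.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep test messages out of the root logger
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_DGRAM,  # UDP, matching the endpoint above
    )
    # Prefix each message with the logger name so it can be filtered on.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

Usage would be along the lines of `make_syslog_logger("logshipper.mocotoolsprod.net").info("hello from staging")`. UDP is fire-and-forget, so a clean return only means the datagram was sent, not that logstash received or indexed it.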

JP Schneider [:jp]

Assignee

Updated

9 years ago

Depends on: 1155071

Daniel Maher [:phrawzty]

Updated

9 years ago

Blocks: 1118288
No longer blocks: 1123833

JP Schneider [:jp]

Assignee

Comment 5

9 years ago

We're currently logging all via syslog to our loggins.mocotoolsprod.net server (use https, let me know if you need the u/n).  

Next steps / Down the road
1) We want to set up logstash forwarder to enable better tagging/parsing of disparate log types
2) Set up good dashboards based on that tagged/parsed log stream
3) Set up alerting, possibly with hooks into Datadog
4) Look at scaling / hosted options for these logs/services.
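To make the "tagging/parsing" item a bit more concrete, here is a small Python sketch of the kind of transformation meant: turning a raw nginx access line into named fields with a type tag, which is roughly what a logstash filter would do for us. It assumes nginx's default "combined" log format; the function name `tag_line` and the `type` values are invented for illustration and are not anything configured on the server.

```python
import re

# nginx "combined" access log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def tag_line(line):
    """Parse one log line into a dict of fields plus a 'type' tag.

    Lines that do not look like nginx combined format are passed through
    untouched under type 'unparsed' rather than dropped.
    """
    match = COMBINED.match(line)
    if match is None:
        return {"type": "unparsed", "message": line}
    event = match.groupdict()
    event["type"] = "nginx_access"
    # Numeric fields are converted so dashboards can aggregate on them.
    event["status"] = int(event["status"])
    event["body_bytes_sent"] = int(event["body_bytes_sent"])
    return event
```

Once lines carry fields like `status` and `type`, the dashboards in item 2 can filter and aggregate on them instead of doing free-text searches over raw syslog messages.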

Daniel Maher [:phrawzty]

Comment 6

9 years ago

(In reply to JP Schneider [:jp] from comment #5)
> We're currently logging all via syslog to our loggins.mocotoolsprod.net
> server (use https, let me know if you need the u/n).  

Please add these credentials to our shared LastPass.

Flags: needinfo?(jschneider)

JP Schneider [:jp]

Assignee

Comment 7

9 years ago

I'm gonna change those creds (they came with my AMI) and then I shall.

Flags: needinfo?(jschneider)

JP Schneider [:jp]

Assignee

Comment 8

9 years ago

Shared in LastPass!
I simultaneously upsized the node since we were exercising that "demo" node a bit hard.

Robert Helmer [:rhelmer]

Comment 9

9 years ago

What's left before we can go live?

Flags: needinfo?(jschneider)

JP Schneider [:jp]

Assignee

Comment 10

9 years ago

Closing this bug.

Status: NEW → RESOLVED

Closed: 9 years ago

Flags: needinfo?(jschneider)

Resolution: --- → FIXED

Categories

(Socorro :: Infra, task)

Tracking

(Not tracked)

People

(Reporter: jschneider, Assigned: jschneider)

Security

(public)
