Add Elasticsearch plugin to MDN New Relic

RESOLVED FIXED

Status

Infrastructure & Operations
WebOps: Community Platform
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: jezdez, Assigned: cyliang)

Details

(Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/538] )

(Reporter)

Description

4 years ago
New Relic supports plugins, such as the ones we already use for MySQL and Memcache. There is also a plugin that fetches general data from the Elasticsearch cluster, which would be useful when debugging changes: https://rpm.newrelic.com/accounts/263620/plugins/directory/141

This seems like a quick win as it's free and doesn't require maintenance on our side.

Updated

4 years ago
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/538]
(Assignee)

Updated

4 years ago
Assignee: server-ops-webops → cliang
(Assignee)

Comment 1

4 years ago
I've enabled the New Relic plugin agent on the stage (mdn_stage) and production (mdn_prod) clusters.  Would you like the New Relic plugin agent enabled in the dev environment as well?  

I've also enabled collectd to gather cluster-wide Elasticsearch stats for both clusters [1]. This should make those values available for graphs and dashboards in Graphite.


A few notes:

* The start of the mdn_prod New Relic graph looks a little odd because the New Relic agent was slow to deploy, so at first not all of the ES nodes were sending data to New Relic.

* I'm not sure how the New Relic plugin is coming up with the node information listed in the "cluster" tab. The stage cluster has three data nodes and the production cluster has five:

$ curl -XGET 'http://developer-elasticsearch-stage-zlb.webapp.scl3.mozilla.com:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "mdn_stage",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 25,
  "active_shards" : 50,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
$ curl -XGET 'http://developer-elasticsearch-zlb.webapp.scl3.mozilla.com:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "mdn_prod",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 5,
  "active_primary_shards" : 10,
  "active_shards" : 50,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
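
The same check can be scripted. The following is only a minimal sketch in Python: the helper names fetch_health and check_node_count are hypothetical, not part of any existing tooling, and the sample string is the mdn_stage response above trimmed to the fields the check reads.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url, timeout=10):
    """Return the raw JSON text of GET <base_url>/_cluster/health."""
    url = base_url.rstrip("/") + "/_cluster/health"
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def check_node_count(health_json, expected_data_nodes):
    """Parse a _cluster/health response and compare its data-node count."""
    health = json.loads(health_json)
    data_nodes = health["number_of_data_nodes"]
    return {
        "cluster": health["cluster_name"],
        "status": health["status"],
        "data_nodes": data_nodes,
        "matches_expected": data_nodes == expected_data_nodes,
    }


# Sample trimmed from the mdn_stage output above, so no network is needed.
STAGE_SAMPLE = (
    '{"cluster_name": "mdn_stage", "status": "green", '
    '"number_of_nodes": 3, "number_of_data_nodes": 3}'
)

print(check_node_count(STAGE_SAMPLE, expected_data_nodes=3))
```

Against a live cluster, the output of fetch_health("http://<es-host>:9200") would be passed to check_node_count in place of the sample string.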



[1] https://github.com/phobos182/collectd-elasticsearch
(Assignee)

Comment 2

4 years ago
:jezdez  - 

From what I can see, the data being pulled in by New Relic roughly corresponds to the data I'm seeing in Graphite. The main exception is the node information, which is still wrong. (It looks like New Relic is totaling the number of nodes reported by each node in the cluster.)

Otherwise, is the data being pulled into the New Relic Elasticsearch plugin good enough?
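
The double-counting guess in this comment can be illustrated with a small sketch. It assumes the guess is right, i.e. that every node reports the cluster-wide number_of_nodes and the plugin adds those reports together. The function names are hypothetical, and the resulting figures are what that assumption would predict, not values read from the New Relic "cluster" tab.

```python
def summed_node_count(per_node_reports):
    """Add up the cluster size reported by every node (the suspected behavior)."""
    return sum(per_node_reports)


def actual_node_count(per_node_reports):
    """Each node already reports the cluster-wide total, so take one value."""
    return max(per_node_reports)


# One entry per node; each node reports the size of the whole cluster.
stage_reports = [3, 3, 3]          # mdn_stage: 3 nodes
prod_reports = [5, 5, 5, 5, 5]     # mdn_prod: 5 nodes

for name, reports in (("mdn_stage", stage_reports), ("mdn_prod", prod_reports)):
    print(name, "summed:", summed_node_count(reports),
          "actual:", actual_node_count(reports))
```

Under this assumption, the summed figure grows with the square of the cluster size, which would explain a node count in New Relic that is much larger than the one returned by _cluster/health.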
Flags: needinfo?(jezdez)
(Reporter)

Comment 3

4 years ago
:cyliang Yeah, this looks great, thank you!
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Flags: needinfo?(jezdez)
Resolution: --- → FIXED