Closed Bug 1125945 Opened 10 years ago Closed 10 years ago

Verify the number of CPU that we need for New Relic for APM monitoring

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: schan, Assigned: nmaul)

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/1158] )

New Relic has a new pricing model that is CPU-based. We need to validate the number of CPUs that we have in our general account as one of the data inputs for the licensing discussion.
Assignee: server-ops-webops → nmaul
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/350]
I wasn't able to see in the New Relic graphs how to count the number of CPUs. However, it's very easy to count *physical* CPU packages (sockets) on a server:

    cat /sys/devices/system/cpu/cpu*/topology/physical_package_id | sort -u | wc -l

Or logical CPUs (everything the kernel sees as a CPU, including hyperthreads and virtual cores):

    find /sys/devices/system/cpu/ -mindepth 1 -maxdepth 1 -type d -name 'cpu[0-9]*' | wc -l

mig found ~407 hosts that have 'newrelic' in their applied puppet class names somewhere, and an experimental mig search found ~1493 CPU cores (including hyperthreading and virtual cores) amongst those servers.

That should be approximately sufficient for now to provide the number of CPUs that we're reporting to New Relic, at least for business purposes, AS LONG AS they're counting all cores, not just physical cores on non-VM hosts.

Closing as FIXED for now, but please feel free to reopen if more data is required.

Technical notes for the curious:

    export GPG_AGENT_INFO=$HOME/.gnupg/S.gpg-agent # OS X bugfix for mig

    ./mig-cmd file -t "tags->>'operator'='IT'" -path /var/lib/puppet/classes.txt -content "newrelic"

    ./mig-cmd file -show all -t "id IN (select agentid from commands, json_array_elements(commands.results) as r where commands.actionid = 6124806970875082752 AND r#>>'{foundanything}' = 'true')" -path "/sys/devices/system/cpu/" -name "physical_package_id" -content "^" > results-2.txt

    grep topology results-2.txt | cut -d' ' -f1-2 | perl -pe '($fqdn, $cpu) = split(" "); ($found{$fqdn}) = ($cpu =~ /cpu(\d+)(?=\D)/); next LINE; END { print "$k ".(1+$v)."\n" while ($k,$v) = each %found }' | grep -v topology > results-3.txt

    cat results-3.txt | cut -d' ' -f2 | perl -pe '$a += $_; END { warn $a }'
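For anyone re-running the count: the perl one-liner in the technical notes keeps whichever cpuN index it sees last for each host, so the result depends on the order of the lines in results-2.txt. A minimal sketch of an order-independent alternative follows. It assumes each input line has the form "<fqdn> <sysfs path>" (the same two fields the `cut -f1-2` step keeps) and that cpuN indexes on a host start at 0 with no gaps. The function name `sum_cpus` is made up for illustration and is not part of mig.

```shell
# Sum logical CPUs across hosts from lines of the form "<fqdn> <sysfs path>".
# Assumption: the path contains ".../cpuN/..." and N runs from 0 upward on each host,
# so the CPU count for a host is its highest N plus one.
sum_cpus() {
    awk '
        match($2, /\/cpu[0-9]+\//) {
            # Pull N out of "/cpuN/": skip the 4 chars of "/cpu", drop the trailing "/".
            n = substr($2, RSTART + 4, RLENGTH - 5) + 0
            if (!($1 in max) || n > max[$1]) max[$1] = n
        }
        END {
            for (h in max) total += max[h] + 1
            print total + 0
        }
    ' "$@"
}
```

Something like `grep topology results-2.txt | cut -d' ' -f1-2 | sum_cpus` would then print a single total, provided the mig output really does match the assumed format.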
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/350] → [kanban:https://webops.kanbanize.com/ctrl_board/2/1158]
I would guess that atoll's numbers make a good high-end estimate. Reason being, one of our classes is "newrelic::sysmond", the system monitoring part of NR. That's free, and we use it on a lot of nodes: anywhere it might possibly be useful, really (databases, memcache nodes, ZLBs, webapp admin nodes, etc.). The only thing that counts towards our license usage is nodes that report data for an app, which is pretty much just web/celery nodes. Excluding nodes that have *only* the sysmond class might help, but runs the risk of undercounting: some things that report to NR do so via packages installed in a virtualenv, not from puppet. Still, that might make for a reasonable lower-end estimate.
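A rough sketch of how that lower-end filter could be expressed per host, assuming /var/lib/puppet/classes.txt lists one class name per line with no surrounding whitespace. The function name `has_app_newrelic_class` is hypothetical, and as noted above this would miss agents installed from a virtualenv rather than from puppet.

```shell
# Succeed (exit 0) if the given classes.txt lists at least one newrelic class
# other than newrelic::sysmond; fail otherwise.
# Assumption: one class name per line, exactly as puppet applied it.
has_app_newrelic_class() {
    # First grep keeps newrelic-related lines; second drops exact sysmond lines
    # and succeeds only if something is left.
    grep 'newrelic' "$1" | grep -qvx 'newrelic::sysmond'
}
```

Hosts where this check fails would be left out of the lower-end count; hosts where it succeeds would be kept.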
Closing this out on the theory that comment 1 is sufficient.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard