Closed Bug 751626 Opened 13 years ago Closed 13 years ago

install ganglia on cruncher

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Unassigned)

Details

Please modify cruncher's puppet config to install ganglia (same config as other .srv.releng.scl3.mozilla.com hosts).
I installed it on cruncher, and on relengweb1 as well (in the new RelengWeb cluster). https://ganglia-scl3.mozilla.org/ganglia/ is not working, so I can't tell if it worked. IIRC, ganglia in SCL3 isn't set up yet, so I'll call that good.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Cruncher:
- appears at http://ganglia1.srv.releng.scl3.mozilla.com/ganglia/?p=2&c=RelEngSCL3Srv&h=cruncher.srv.releng.scl3.mozilla.com
- nagios check result is still UNKNOWN
Relengweb1:
- does not appear in web interface
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I changed relengweb1 to use a different puppet class, but it doesn't seem to have made any difference. I'm still getting

May 6 03:16:49 ganglia1 /usr/sbin/gmetad[31034]: data_thread() got no answer from any [RelengWeb] datasource

and

May 7 11:46:46 relengweb1 /usr/sbin/gmond[12605]: [PYTHON] Can't call the metric handler function for [tcp_unknown] in the python module [tcpconn].#012

The mana pages are outdated and not much use. I'm basically stuck here. Infra folks, can you help out? I have a number of other hosts to add to ganglia, so please let me know what I'm doing wrong so I don't repeat it!
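One way to narrow down a "got no answer from any datasource" message is to check whether gmond on the target host answers at all, and which cluster name it reports. The following is only a minimal sketch, not the check used on these hosts: it assumes gmond is listening on its default TCP accept port (8649) and that the port is reachable from wherever the script runs. The function names are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: gmond's default tcp_accept_channel port. The real port is
# whatever gmond.conf on the host specifies.
GMOND_PORT = 8649


def fetch_gmond_xml(host, port=GMOND_PORT, timeout=5.0):
    """Connect to gmond and read the XML dump it sends before closing the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Return {cluster name: sorted host names} parsed from a gmond XML dump."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }
```

Calling hosts_by_cluster(fetch_gmond_xml("relengweb1")) from the gmetad host would show whether gmond responds and what cluster and host names it advertises. A connection error or an empty result points at gmond or the network rather than at gmetad.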
Assignee: dustin → server-ops-infra
Component: Server Operations: RelEng → Server Operations: Infrastructure
QA Contact: arich → jdow
I copied the check_ganglia script from the old cruncher to the new, so that the nagios check still passes. How does that usually get installed by puppet? It looks like the ganglia-new module does it, which I'll admit I didn't try (I only tried ganglia, ganglia2, and then ganglia-client, on digi's advice).
OK, I got some help in IRC. A big typo didn't help. And ganglia2's the one to use. Thanks!
Assignee: server-ops-infra → server-ops-releng
Status: REOPENED → RESOLVED
Closed: 13 years ago
Component: Server Operations: Infrastructure → Server Operations: RelEng
QA Contact: jdow → arich
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations