Closed Bug 1089699 Opened 11 years ago Closed 10 years ago

socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Last Reports Update is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found

Categories

(Socorro :: Backend, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rwatson, Assigned: selenamarie)

Details

[16:10:36] nagios-phx1 Mon 09:10:36 PDT [1331] socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Last Reports Update is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found (http://m.mozilla.org/Ganglia+-+PostgreSQL+Last+Reports+Update)
[16:10:38] nagios-phx1 Mon 09:10:38 PDT [1333] socorro1.stage.db.phx1.mozilla.com:Ganglia - pgBouncer connections is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found (http://m.mozilla.org/Ganglia+-+pgBouncer+connections)
[16:10:39] nagios-phx1 Mon 09:10:39 PDT [1336] socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Connections is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found (http://m.mozilla.org/Ganglia+-+PostgreSQL+Connections)
[16:10:41] nagios-phx1 Mon 09:10:41 PDT [1339] socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Performance Test Query is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found (http://m.mozilla.org/Ganglia+-+PostgreSQL+Performance+Test+Query)

I have been receiving these alerts. There are no docs, so I am unsure what to do. As this is stage, I have downtimed the checks for 24h.
Initial investigation: These checks are also running against prod hosts successfully. I checked svn log and didn't see anything recent that might have affected this. Found this in the logs:

2014-10-27 08:17:05.235 23447 LOG File descriptor limit: 1024 (H:4096), max_client_conn: 400, max fds possible: 850
2014-10-27 08:17:05.235 23447 FATAL @src/main.c:754 in function main(): unix socket is in use, cannot continue

I'm unsure whether that corresponds with the first nagios alert I saw in channel:

08:00 < nagios-phx1> | Mon 08:00:37 PDT [1310] socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Connections is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found (http://m.mozilla.org/Ganglia+-+PostgreSQL+Connections)
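To help judge whether the FATAL log entry and the first alert are related, the two timestamps can be compared directly. The following is a minimal Python sketch, not a tool used in this bug. It assumes the log timestamp is in the same timezone as the nagios message (PDT) and that the alert was raised on the same calendar day as the log entry. The function names `log_time` and `nagios_time` are hypothetical.

```python
import re
from datetime import date, datetime

# Lines copied from the comment above.
LOG_LINE = (
    "2014-10-27 08:17:05.235 23447 FATAL @src/main.c:754 in function main(): "
    "unix socket is in use, cannot continue"
)
NAGIOS_LINE = (
    "08:00 < nagios-phx1> | Mon 08:00:37 PDT [1310] "
    "socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Connections is UNKNOWN: "
    "CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found"
)


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    m = re.match(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})", line)
    if m is None:
        raise ValueError("no log timestamp found")
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")


def nagios_time(line, day):
    """Parse the 'HH:MM:SS PDT' time of a nagios message.

    The message carries only a weekday, so the calendar date is supplied
    by the caller (assumption: same day as the log entry).
    """
    m = re.search(r"(\d{2}:\d{2}:\d{2}) PDT", line)
    if m is None:
        raise ValueError("no nagios timestamp found")
    t = datetime.strptime(m.group(1), "%H:%M:%S").time()
    return datetime.combine(day, t)


fatal_at = log_time(LOG_LINE)
alert_at = nagios_time(NAGIOS_LINE, date(2014, 10, 27))

# Positive gap means the FATAL entry was logged after the alert fired.
gap = fatal_at - alert_at
print("FATAL logged", gap, "after the first alert")
```

Under those assumptions the FATAL entry comes roughly 16 minutes after the 08:00:37 alert, so it cannot by itself be the event that triggered that first alert, although both could still share an earlier cause.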
Assignee: nobody → sdeckelmann
Flags: needinfo?(rbryce)
This is still alerting:

Wed 09:35:31 PDT [1875] socorro1.stage.db.phx1.mozilla.com:Ganglia - PostgreSQL Last Reports Update is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while getting value Host/value not found
(In reply to Ryan Watson [:w0ts0n] from comment #3)
> This is still alerting..
> Wed 09:35:31 PDT [1875] socorro1.stage.db.phx1.mozilla.com:Ganglia -
> PostgreSQL Last Reports Update is UNKNOWN: CHECKGANGLIA UNKNOWN: Error while
> getting value Host/value not found

I believe this error is an internal ganglia/nagios error, based on https://bugzilla.mozilla.org/show_bug.cgi?id=916274. Can someone from IT confirm?
This is no longer an issue now that socorro is in ze cloud.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(rbryce)
Resolution: --- → WONTFIX