Multiple newrelic agents not reporting

RESOLVED FIXED

Status

Priority: P1
Severity: normal
Status: RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: emorley, Assigned: fubar)

(Reporter)

Description

4 years ago
https://rpm.newrelic.com/accounts/677903/servers?showHidden=true

treeherder-rabbitmq1.stage.private.scl3.mozilla.com
treeherder-processor1.stage.private.scl3.mozilla.com
treeherder-processor2.stage.private.scl3.mozilla.com
treeherder-processor3.stage.private.scl3.mozilla.com
treeherder-etl1.stage.private.scl3.mozilla.com
treeherder-etl2.stage.private.scl3.mozilla.com

No data reporting for this server since Jan 29, 2015 3:03 PM.
Flags: needinfo?(klibby)
(Assignee)

Comment 1

4 years ago
It's the datacenter proxy again; will go re-open that bug once I find it.
Flags: needinfo?(klibby)
(Assignee)

Updated

4 years ago
Depends on: 1114611
(Reporter)

Comment 2

4 years ago
Ah thank you :-)
(Reporter)

Comment 3

4 years ago
Comment 0 was about the nrsysmond agent not reporting for stage.

In addition, we're now seeing the python agents on multiple nodes on stage/prod stop reporting too, i.e. the lists of nodes are shrinking here:
https://rpm.newrelic.com/accounts/677903/applications/4180461/environment
https://rpm.newrelic.com/accounts/677903/applications/5585473/environment
Summary: Multiple stage node newrelic agents not reporting → Multiple newrelic agents not reporting
(Reporter)

Comment 4

4 years ago
Created attachment 8563354
run_celery_worker_buildapi-supervisor.log

From treeherder-etl1.private.scl3.

The "UserWarning: Cannot load extension u'celerymon.bin.celerymon:MonitorDelegate'" appears earlier in the log, so it seems unrelated.

However what is noticeably absent from the log after the last process restart is _any_ newrelic agent output at all. It's like we're not even running it.
(Reporter)

Comment 5

4 years ago
Yeah, if you look at the exception tracebacks (since they reveal the python version), the newrelic agent output seems to stop immediately after the process restart where python 2.7 was deployed.
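
One quick way to test that theory is to ask the interpreter the workers actually run under whether it can import the agent at all. This is a minimal sketch, not what was run on the nodes: `agent_status` is a hypothetical helper name, and it assumes the agent is installed as the `newrelic` package. It is written for a current Python; on 2.7 itself, a plain `import newrelic` in a try/except does the same job.

```python
import importlib.util
import sys


def agent_status(module_name="newrelic"):
    """Report the running interpreter and whether module_name can be imported from it.

    The default module name "newrelic" is an assumption about how the agent is packaged.
    """
    try:
        # find_spec returns None when a top-level module is not installed.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "executable": sys.executable,
        "importable": spec is not None,
    }


if __name__ == "__main__":
    print(agent_status())
```

If `importable` comes back false under the interpreter named in `executable`, the agent was never installed into that python's site-packages, which would match the silence in the log.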
(Reporter)

Comment 7

4 years ago
(In reply to Ed Morley [:edmorley] from comment #5)
> Yeah if you look at the exception tracebacks (since they reveal the python
> version), the newrelic agent output seems to stop immediately after the
> process restart where python 2.7 was deployed.

We're getting no reporting at all now on prod (https://rpm.newrelic.com/accounts/677903/applications/4180461). Could you take a look? :-)
Flags: needinfo?(klibby)
(Assignee)

Comment 8

4 years ago
It's on infra/opsec to fix, in bug 1114611.
Flags: needinfo?(klibby)
(Assignee)

Comment 9

4 years ago
newrelic agents added to python27 and jobs restarted. Looks like things are showing up in New Relic again. It even made the warning banner about mismatched versions go away.
Thanks :fubar!
(Reporter)

Updated

4 years ago
Assignee: nobody → klibby
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED