Closed Bug 1296320 Opened 9 years ago Closed 9 years ago

treeherder-rabbitmq log parser queue backlog due to slow transit

Categories

(Tree Management :: Treeherder: Infrastructure, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlaz, Unassigned)

Nagios alert:
07:46:36 <@nagios-scl3> (IRC) Thu 07:46:36 PDT [5137] treeherder-rabbitmq2.private.scl3.mozilla.com:Rabbit Unread Messages is WARNING: RABBITMQ_OVERVIEW WARNING - messages WARNING(2839) messages_ready WARNING (2721), messages_unacknowledged OK (118) (http://m.mozilla.org/Rabbit+Unread+Messages)
It looks like the log_parser queue is backlogged. It was mentioned that the cause is archive.m.o's CDN traffic being routed via Asia. Filing this bug to track the issue.
Group: infra
Happened again; logging it here for tracking:
10:56 <@nagios-scl3> (IRC) Wed 02:56:19 PDT [5070] treeherder-rabbitmq2.private.scl3.mozilla.com:Rabbit Unread Messages is CRITICAL: RABBITMQ_OVERVIEW CRITICAL - messages CRITICAL (20370) messages_ready CRITICAL (20308), messages_unacknowledged OK (62) (http://m.mozilla.org/Rabbit+Unread+Messages)
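For anyone comparing the two alerts in this bug, here is a rough sketch that pulls the three counters out of the RABBITMQ_OVERVIEW text and reports what share of the backlog is still waiting for a consumer. The helper names (`parse_overview`, `ready_share`) are made up for illustration and are not part of the Nagios check or of Treeherder. The regex is an assumption based only on the two alert lines pasted here. In both alerts nearly all messages are in messages_ready and messages_unacknowledged stays OK, which would be consistent with the consumers draining the queue slowly rather than holding many messages unacknowledged.

```python
import re

# Field layout inferred from the two alerts in this bug; spacing before "(" varies.
FIELD_RE = re.compile(
    r"\b(messages(?:_ready|_unacknowledged)?)\s+(OK|WARNING|CRITICAL)\s*\((\d+)\)"
)

def parse_overview(alert):
    """Return {field: (status, count)} for each RABBITMQ_OVERVIEW field in the text."""
    return {
        name: (status, int(count))
        for name, status, count in FIELD_RE.findall(alert)
    }

def ready_share(fields):
    """Fraction of all queued messages that are still waiting for a consumer."""
    return fields["messages_ready"][1] / fields["messages"][1]

# Alert bodies copied from the comments in this bug (host and URL parts trimmed).
WARNING_ALERT = (
    "RABBITMQ_OVERVIEW WARNING - messages WARNING(2839) "
    "messages_ready WARNING (2721), messages_unacknowledged OK (118)"
)
CRITICAL_ALERT = (
    "RABBITMQ_OVERVIEW CRITICAL - messages CRITICAL (20370) "
    "messages_ready CRITICAL (20308), messages_unacknowledged OK (62)"
)

for alert in (WARNING_ALERT, CRITICAL_ALERT):
    fields = parse_overview(alert)
    print(fields, round(ready_share(fields), 3))
```

In both samples messages equals messages_ready plus messages_unacknowledged, so the ready share is a quick way to see how much of the total is unconsumed backlog.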
See: https://rpm.newrelic.com/accounts/677903/applications/4180461/externals
In this latest case the slowdown is also affecting requests to queue.taskcluster.net, which is served by CloudFront.
There are still problems today: jobs have finished and, 50 minutes later, no perf data has been ingested.
Fixed by moving to Heroku.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED