Closed Bug 1206343 Opened 9 years ago Closed 9 years ago

treeherder-rabbitmq1.private.scl3.mozilla.com:Rabbit Unread Messages is CRITICAL

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P1)

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jlaz, Unassigned)

Details

21:15:01 <@nagios-scl3> (IRC) Fri 21:15:01 PDT [5473] treeherder-rabbitmq1.private.scl3.mozilla.com:Rabbit Unread Messages is CRITICAL: RABBITMQ_OVERVIEW CRITICAL - messages CRITICAL (2609) messages_ready CRITICAL (2490), messages_unacknowledged OK (119) (http://m.mozilla.org/Rabbit+Unread+Messages)

We've been coordinating with fubar and philor in #treeherder, but wanted to keep a bug open to track progress. So far we've restarted the celery workers, and they seem to be chipping away at the backlog slowly.
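For anyone reading the alert above, here is a minimal sketch of pulling the per-counter states out of the RABBITMQ_OVERVIEW text. The function name and the regular expression are illustrative assumptions based only on the one alert line quoted in this bug; this is not the Nagios plugin's own code, and other alerts may be formatted differently.

```python
import re

# The alert text quoted in this bug (status portion only).
ALERT = (
    "RABBITMQ_OVERVIEW CRITICAL - messages CRITICAL (2609) "
    "messages_ready CRITICAL (2490), messages_unacknowledged OK (119)"
)

# Hypothetical pattern: "<counter> <STATE> (<count>)", inferred from the line above.
COUNTER_RE = re.compile(r"(\w+) (OK|WARNING|CRITICAL|UNKNOWN) \((\d+)\)")


def parse_overview_alert(text):
    """Return {counter_name: (state, count)} for each counter found in text."""
    return {
        name: (state, int(count))
        for name, state, count in COUNTER_RE.findall(text)
    }


counters = parse_overview_alert(ALERT)
# Only the ready (undelivered) messages are over threshold; the unacknowledged
# count is OK, which fits a backlog that the workers are not keeping up with.
critical = sorted(name for name, (state, _) in counters.items() if state == "CRITICAL")
```

In this alert the total (2609) is the sum of messages_ready (2490) and messages_unacknowledged (119), so the ready queue accounts for almost all of the backlog.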
Group: infra
Thanks for the heads up, but treeherder isn't a relops service (though releng relies on it). I suspect this belongs in the dev-services queue. fubar?
Moving to Treeherder's infra component (https://mana.mozilla.org/wiki/display/websites/treeherder.mozilla.org#treeherder.mozilla.org-AdminContacts).

Opening the bug since there's nothing confidential in here :-)
Assignee: relops → nobody
Group: infra
Component: RelOps → Treeherder: Infrastructure
Priority: -- → P1
Product: Infrastructure & Operations → Tree Management
QA Contact: arich → laura
Version: other → ---
The queues are now back to zero: https://rpm.newrelic.com/accounts/677903/plugins/16138

The backlog appears to have been caused by Amazon S3 having issues, so Treeherder was unable to fetch the logs:

requests.exceptions:ConnectTimeout: HTTPConnectionPool(host='mozilla-releng-blobs.s3.amazonaws.com', port=80): Max retries exceeded with url: /blobs/b2g-inbound/sha512/538b00eb193068d6256e02112680050c39c5bedfdf04de88154499c9aa0a6902b93911801debd3d008f5f9afc4e1903deb49d73cb4657b0cdfcec3fd6dfdfc56 (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x7fc29d5d14d0>, 'Connection to mozilla-releng-blobs.s3.amazonaws.com timed out. (connect timeout=30)'))
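As a rough illustration of how a log-fetching task could ride out a short upstream outage like this one, here is a sketch of a bounded retry with exponential backoff. Everything in it is hypothetical: `fetch_with_backoff`, `TransientFetchError` and the injected `fetch` callable are stand-ins for illustration and are not Treeherder's actual code or retry policy.

```python
import time


class TransientFetchError(Exception):
    """Stand-in for a timeout such as requests.exceptions.ConnectTimeout."""


def fetch_with_backoff(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying transient failures with exponential backoff.

    fetch is any callable that returns the body or raises TransientFetchError.
    The final failure is re-raised so the caller (for example a task queue)
    can decide whether to reschedule the job.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except TransientFetchError:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

With a 30 second connect timeout, as in the traceback above, each failed attempt still holds a worker for that long, so retries of this kind only help if the outage is brief; during a longer outage the queue will still back up.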
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME