Bug 922276
Opened 12 years ago
Closed 12 years ago
Elastic Search Request Timeouts
Categories
(Infrastructure & Operations Graveyard :: WebOps: Engagement, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: davidwalsh, Assigned: bburton)
Details
Starting around midnight last night (Sep 30th), I began receiving these emails frequently:
Title:
[mdn] [celery@developer-celery1.webapp.scl3.mozilla.com] Error: Task elasticutils.contrib.django.tasks.index_objects (fa2dd8a4-451b-4a86-90a3-60e9e784fc3d): <MaybeEncodingError: Error sending result: '<ExceptionInfo: UnpickleableExceptionWrapper('requests.exceptions', 'Timeout', (TimeoutError("HTTPConnectionPool(host='elasticsearch-zlb.webapp.scl3.mozilla.com', port=9200): Request timed out. (timeout=5)", ), ), 'Timeout(TimeoutError("HTTPConnectionPool(host=\'elasticsearch-zlb.webapp.scl3.mozilla.com\', port=9200): Request timed out. (timeout=5)", ), )')>'. Reason: ''TimeoutError' object has no attribute 'url''.>
Content:
Task elasticutils.contrib.django.tasks.index_objects with id fa2dd8a4-451b-4a86-90a3-60e9e784fc3d raised exception:
'<MaybeEncodingError: Error sending result: \'<ExceptionInfo: UnpickleableExceptionWrapper(\'requests.exceptions\', \'Timeout\', (TimeoutError("HTTPConnectionPool(host=\'elasticsearch-zlb.webapp.scl3.mozilla.com\', port=9200): Request timed out. (timeout=5)",),), \'Timeout(TimeoutError("HTTPConnectionPool(host=\\\'elasticsearch-zlb.webapp.scl3.mozilla.com\\\', port=9200): Request timed out. (timeout=5)",),)\')>\'. Reason: \'\'TimeoutError\' object has no attribute \'url\'\'.>'
Task was called with args: (<class 'wiki.models.DocumentType'>, [7718L]) kwargs: {}.
The contents of the full traceback was:
Traceback (most recent call last):
File "/data/www/developer.mozilla.org/kuma/vendor/packages/celery/celery/concurrency/processes/pool.py", line 215, in worker
put((READY, (job, i, result)))
File "/usr/lib64/python2.6/multiprocessing/queues.py", line 366, in put
return send(obj)
File "/data/www/developer.mozilla.org/kuma/vendor/src/requests/requests/packages/urllib3/exceptions.py", line 23, in __reduce__
return self.__class__, (None, self.url)
MaybeEncodingError: Error sending result: '<ExceptionInfo: UnpickleableExceptionWrapper('requests.exceptions', 'Timeout', (TimeoutError("HTTPConnectionPool(host='elasticsearch-zlb.webapp.scl3.mozilla.com', port=9200): Request timed out. (timeout=5)",),), 'Timeout(TimeoutError("HTTPConnectionPool(host=\'elasticsearch-zlb.webapp.scl3.mozilla.com\', port=9200): Request timed out. (timeout=5)",),)')>'. Reason: ''TimeoutError' object has no attribute 'url''.
-- Just to let you know, celeryd at developer-celery1.webapp.scl3.mozilla.com.
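The traceback shows two failures stacked on each other: the request to ElasticSearch timed out, and the worker then failed to pickle that exception for the result queue because `__reduce__` reads a `url` attribute the exception does not have. A minimal sketch of that second failure follows. The class names `PoolTimeout` and `PicklablePoolTimeout` and the helper `try_pickle` are hypothetical stand-ins, not the vendored urllib3 code, and the sketch targets current Python 3 rather than the Python 2.6 seen in the traceback.

```python
import pickle


class PoolTimeout(Exception):
    """Hypothetical stand-in for the vendored urllib3 TimeoutError."""

    def __init__(self, pool, message):
        self.pool = pool
        super().__init__("%s: %s" % (pool, message))

    def __reduce__(self):
        # Same shape as the line in the traceback: it assumes a `url`
        # attribute that this class never sets.
        return self.__class__, (None, self.url)


class PicklablePoolTimeout(PoolTimeout):
    """Variant whose __reduce__ only uses data the instance really has."""

    def __init__(self, pool, message):
        self.message = message
        super().__init__(pool, message)

    def __reduce__(self):
        return self.__class__, (None, self.message)


def try_pickle(exc):
    """Return (ok, detail): whether exc survives a pickle round trip."""
    try:
        restored = pickle.loads(pickle.dumps(exc))
    except AttributeError as err:
        # Raised from inside __reduce__ when the attribute is missing.
        return False, str(err)
    return True, str(restored)


broken = try_pickle(PoolTimeout("pool", "Request timed out. (timeout=5)"))
fixed = try_pickle(PicklablePoolTimeout("pool", "Request timed out. (timeout=5)"))
```

In this sketch `broken` reports the missing `url` attribute, while `fixed` survives the round trip. Either way the pickling error is only a symptom; the underlying problem is the 5-second timeout against the cluster.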
I worked with :jakem to resolve the issue; he may have more information.
Updated•12 years ago
Assignee: server-ops-webops → bburton
Status: NEW → ASSIGNED
Comment 1•12 years ago
The production ElasticSearch cluster experienced a network partition during yesterday's core network maintenance. Unfortunately, it did not recover on its own.
Additionally, a monitoring misconfiguration, which was fixed in bug 922267, meant the cluster's bad state did not generate an alert.
At this time the cluster has been restored to proper health and the monitoring is in place.
Let me know if there are any questions.
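For readers wondering what a check for this kind of bad state could look like, here is a minimal sketch built on ElasticSearch's `_cluster/health` endpoint. It is not the monitoring referenced in bug 922267. The host and port come from the error emails above; `EXPECTED_NODES`, `evaluate_health`, and `fetch_health` are hypothetical names and values chosen for illustration.

```python
import json
from urllib.request import urlopen

# Host and port as they appear in the error emails.
HEALTH_URL = "http://elasticsearch-zlb.webapp.scl3.mozilla.com:9200/_cluster/health"
# Hypothetical value: the real cluster size is not stated in this bug.
EXPECTED_NODES = 3


def evaluate_health(health, expected_nodes):
    """Return a list of problems found in a _cluster/health response dict."""
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append("cluster status is %r" % status)
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        # After a partition, some nodes may fail to rejoin the cluster.
        problems.append("only %d of %d nodes joined" % (nodes, expected_nodes))
    return problems


def fetch_health(url=HEALTH_URL, timeout=5):
    """Fetch and decode the cluster health document (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A check script could call `evaluate_health(fetch_health(), EXPECTED_NODES)` and raise an alert whenever the returned list is non-empty. Comparing the node count as well as the status is a design choice here: a partitioned node can see a smaller cluster of its own, so status alone may not reveal the split.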
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•9 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard