Closed
Bug 1156329
Opened 10 years ago
Closed 10 years ago
Public Bugzilla Elasticsearch Cluster is Stale (has ETL stopped?)
Categories
(bugzilla.mozilla.org :: Infrastructure, defect)
RESOLVED
FIXED
People
(Reporter: ekyle, Assigned: fubar)
References
(Blocks 1 open bug)
Details
The public cluster has stale data. The ETL may have stopped!
Comment 1•10 years ago
hung process from the 12th. killed it and will keep an eye on the next few cron runs.
Assignee: nobody → klibby
Comment 2•10 years ago
still happy. looking back at the logs...
the 2015-04-13 03:00 run is where things first get weird. looks like it does a normal run, but at 03:01 we start getting 'Waiting on thread "etl"' notices, which continue until 04:27 when we get:
2015-04-13 04:27:19.344993 - WARNING: problem logging to es
at File /data/www/Bugzilla-ETL/bzETL/util/env/log_usingElasticSearch.py, line 85, in time_delta_pusher
at File /data/www/Bugzilla-ETL/bzETL/util/thread/threads.py, line 252, in _run
caused by
ERROR: problem
at File /data/www/Bugzilla-ETL/bzETL/util/env/elasticsearch.py, line 259, in extend
at File /data/www/Bugzilla-ETL/bzETL/util/env/log_usingElasticSearch.py, line 82, in time_delta_pusher
at File /data/www/Bugzilla-ETL/bzETL/util/thread/threads.py, line 252, in _run
caused by
ERROR: Problem with call to http://elasticsearch4.bugs.scl3.mozilla.com:9200/debug/public_etl/_bulk
{"index":{"_id": "B72B955C2820C8475DDA94347E7D5D521D52A2AF"}}
{"timestamp": 1428924439000, "params":
at File /data/www/Bugzilla-ETL/bzETL/util/env/elasticsearch.py, line 339, in post
at File /data/www/Bugzilla-ETL/bzETL/util/env/elasticsearch.py, line 243, in extend
at File /data/www/Bugzilla-ETL/bzETL/util/env/log_usingElasticSearch.py, line 82, in time_delta_pusher
at File /data/www/Bugzilla-ETL/bzETL/util/thread/threads.py, line 252, in _run
caused by
ERROR: HTTPConnectionPool(host='elasticsearch4.bugs.scl3.mozilla.com', port=9200): Max retries exceeded with url: /debug/public_etl/_bulk (Caused by <class '_socket.gaierror'>: [Errno -3] Temporary failure in name resolution)
at File /usr/lib64/pypy-2.2.1/site-packages/requests/adapters.py, line 382, in send
at File /usr/lib64/pypy-2.2.1/site-packages/requests/sessions.py, line 486, in send
at File /usr/lib64/pypy-2.2.1/site-packages/requests/sessions.py, line 383, in request
at File /usr/lib64/pypy-2.2.1/site-packages/requests/api.py, line 44, in request
at File /usr/lib64/pypy-2.2.1/site-packages/requests/api.py, line 88, in post
at File /data/www/Bugzilla-ETL/bzETL/util/env/elasticsearch.py, line 321, in post
given the time span and the number of 'Temporary failure in name resolution' errors (5200+), I think the "temporary" part is a lie. There's nothing useful in the elasticsearch logs, but I've come to accept that as normal for ES.
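For anyone wanting to repeat the count and time span quoted in the comment above, here is a minimal sketch in Python. It assumes the ETL log uses the line shape seen in the excerpt (a timestamp followed by " - LEVEL: message"), and the function name `summarize_dns_failures` is hypothetical, not part of Bugzilla-ETL.

```python
import re
from datetime import datetime

# Assumed line shape, based on the excerpt: "2015-04-13 04:27:19.344993 - WARNING: ..."
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ - ")
NEEDLE = "Temporary failure in name resolution"


def summarize_dns_failures(lines):
    """Return (count, first_seen, last_seen) for name-resolution errors.

    The gaierror text sits inside a multi-line traceback, so each hit is
    attributed to the most recent timestamped line that preceded it.
    """
    count = 0
    first = last = None
    current = None  # timestamp of the log entry currently being read
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if NEEDLE in line:
            count += 1
            if current is not None:
                if first is None:
                    first = current
                last = current
    return count, first, last
```

Passing an open log file (any iterable of lines) returns the number of failures and the first and last timestamps they were seen at, which is enough to judge whether the failure was really transient.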
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED