Bug 911095
Opened 12 years ago
Closed 12 years ago
OrangeFactor not showing failures since 27-08-2013
Categories
(Tree Management Graveyard :: OrangeFactor, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: emorley, Unassigned)
Details
Guessing the logparser/pulse needs a kick?
To what machine would I need to request access to be able to do this myself?
Flags: needinfo?(jgriffin)
Comment 1•12 years ago
As mentioned in IRC, you want access to orangefactor1.dmz.phx1.mozilla.com.
The logparser is running and consuming messages from pulse. In the error log, I see this repeated over and over:
2013-08-30 08:37:02,329 - BuildLogMonitor - ERROR - Max retries exceeded for url: /logs/builds/_count
Traceback (most recent call last):
  File "/home/webtools/apps/logparser/src/logparser/logparser/savelogs.py", line 60, in parse
    lp.parseFiles()
  File "/home/webtools/apps/logparser/src/logparser/logparser/logparser.py", line 106, in parseFiles
    self.postResultsToElasticSearch(testdata)
  File "/home/webtools/apps/logparser/src/logparser/logparser/logparser.py", line 196, in postResultsToElasticSearch
    self._post_testgroup_to_elasticsearch(data)
  File "/home/webtools/apps/logparser/src/logparser/logparser/logparser.py", line 183, in _post_testgroup_to_elasticsearch
    testgroup.submit()
  File "/home/webtools/apps/logparser/src/mozautolog/mozautolog/esautolog.py", line 80, in submit
    self._generate_testrun()
  File "/home/webtools/apps/logparser/src/mozautolog/mozautolog/esautolog.py", line 58, in _generate_testrun
    doc_type = [self.doc_type])
  File "/home/webtools/apps/logparser/src/mozautoeslib/mozautoeslib/eslib.py", line 186, in query
    doc_types=self.doc_type)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 793, in count
    return self._query_call("_count", query, indexes, doc_types, **query_params)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 246, in _query_call
    response = self._send_request('GET', path, body, querystring_args)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 205, in _send_request
    response = self.connection.execute(request)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/connection_http.py", line 167, in _client_call
    return getattr(conn.client, attr)(*args, **kwargs)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/connection_http.py", line 59, in execute
    response = self.client.urlopen(Method._VALUES_TO_NAMES[request.method], uri, body=request.body, headers=request.headers)
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 294, in urlopen
    return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 294, in urlopen
    return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 294, in urlopen
    return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 294, in urlopen
    return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
  File "/home/webtools/apps/logparser/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 255, in urlopen
    raise MaxRetryError("Max retries exceeded for url: %s" % url)
MaxRetryError: Max retries exceeded for url: /logs/builds/_count
I can access the ES cluster from this machine. Restarting the logparser didn't help. I am curious about that URL; that's not a full URL, but I'm not sure if it's only printing the path for some reason. I don't know what would have changed to cause an error like this, though.
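On the question of the path-only URL: the final frame of the traceback formats the message from the `url` argument alone, and that argument is whatever was handed to `urlopen`. The sketch below is a hypothetical stand-in (the class `HostBoundPool` and the `connect` callable are invented for illustration, not pyes or urllib3 code) showing how a pool bound to one host would produce exactly this kind of message without ever mentioning the host.

```python
class MaxRetryError(Exception):
    """Stand-in for the exception named in the traceback."""


class HostBoundPool(object):
    """Hypothetical pool tied to a single host; callers pass only a path."""

    def __init__(self, host, port, connect):
        self.host = host
        self.port = port
        # connect(host, port, method, path) is an assumed callable that
        # raises IOError when the host cannot be reached.
        self._connect = connect

    def urlopen(self, method, path, retries=3):
        if retries < 0:
            # Same format string as the raise in the traceback: only the
            # path is interpolated, so the host never appears.
            raise MaxRetryError("Max retries exceeded for url: %s" % path)
        try:
            return self._connect(self.host, self.port, method, path)
        except IOError:
            return self.urlopen(method, path, retries - 1)  # Try again
```

If the real pool behaves like this, the message cannot tell you which ES host failed, so the host list in the config is the place to look.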
Comment 2•12 years ago
Also, since the logparser has been consuming messages even though it's not outputting anything to ES, we'll have to use the back-fill script to go over past logs after we fix this.
Comment 3•12 years ago
The logparser is trying to write to both the dev and production instances of ES, but the dev instance is unreachable:
[webtools@orangefactor1.dmz.phx1 bin]$ curl http://elasticsearch-zlb.webapp.scl3.mozilla.com:9200/
{
  "ok" : true,
  "status" : 200,
  "name" : "elasticsearch5_scl3",
  "version" : {
    "number" : "0.20.5",
    "snapshot_build" : false
  },
  "tagline" : "You Know, for Search"
}
[webtools@orangefactor1.dmz.phx1 bin]$ curl http://elasticsearch-zlb.dev.vlan81.phx.mozilla.com:9200/
curl: (6) Couldn't resolve host 'elasticsearch-zlb.dev.vlan81.phx.mozilla.com'
[webtools@orangefactor1.dmz.phx1 bin]$
I'm going to take the dev server out of the list for now.
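The same check can be scripted across every configured host rather than run by hand with curl. This is a minimal sketch, not part of logparser: the function name `unresolvable_hosts` and the injectable `resolve` parameter are assumptions made for illustration, and it only tests DNS resolution, not whether ES is actually answering on port 9200.

```python
import socket


def unresolvable_hosts(hosts, resolve=socket.gethostbyname):
    """Return the hosts from `hosts` that fail DNS resolution."""
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:
            # socket.gaierror (raised on lookup failure) is an OSError subclass.
            failed.append(host)
    return failed
```

Called with the two hostnames from the curl session above, this would be expected to report only the dev host while its name stays unresolvable.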
Flags: needinfo?(jgriffin)
Comment 4•12 years ago
logparser is running again; I need to patch it to make it more resilient to ES failures like this.
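One possible shape for that patch, sketched under assumptions: the function `post_to_all` and the `post(server, doc)` callable are hypothetical and do not reflect the actual logparser or mozautolog API. The idea is simply that a failure on one ES server is logged and skipped so the remaining servers still receive the document.

```python
import logging

log = logging.getLogger("logparser.sketch")  # hypothetical logger name


def post_to_all(servers, doc, post):
    """Post `doc` to every server; return (succeeded, failed) lists.

    `post(server, doc)` is an assumed callable that raises on failure.
    """
    succeeded, failed = [], []
    for server in servers:
        try:
            post(server, doc)
        except Exception as exc:
            # Record the failure and keep going instead of aborting the run.
            log.error("Skipping %s: %s", server, exc)
            failed.append(server)
        else:
            succeeded.append(server)
    return succeeded, failed
```

Returning the failed list lets the caller decide whether to retry later or alert, rather than silently dropping data for the unreachable server.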
Comment 5•12 years ago
Hmm, so the ES address of the dev server in the logparser config was wrong.
was: elasticsearch-zlb.dev.vlan81.phx.mozilla.com
should be: elasticsearch-zlb.dev.vlan81.phx1.mozilla.com
(i.e., phx1 not phx)
I corrected this, and it's happy again. But, since this was always wrong, I'm not actually sure it's the reason why OF isn't updating.
Comment 6 (Reporter)•12 years ago
How do we go about backfilling the missing data? :-)
Flags: needinfo?(jgriffin)
Comment 7•12 years ago
We don't have a good way. We used to have a scraper that would scrape the FTP site and populate data based on logs it found, but that's completely bitrotted and would take some work to get running again. We'd also have to guard against writing duplicate data to ES, which the scraper is not currently smart enough to do.
The scraper was removed in this changeset, for being completely obsolete: http://hg.mozilla.org/automation/logparser/rev/5556a69ce358
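For reference, a common way to guard against duplicate writes is to derive each document's ID from something stable about the log, so re-processing the same log cannot create a second document. The sketch below assumes the log URL is unique per log and uses a plain dict in place of the ES index; `doc_id` and `backfill` are hypothetical names, not the removed scraper's code.

```python
import hashlib


def doc_id(log_url):
    """Derive a stable document ID from the log URL (assumed unique per log)."""
    return hashlib.sha1(log_url.encode("utf-8")).hexdigest()


def backfill(log_urls, index):
    """Write one document per log, skipping logs already in `index`.

    `index` is a dict standing in for the ES index, keyed by document ID.
    Returns the number of documents newly written.
    """
    written = 0
    for url in log_urls:
        key = doc_id(url)
        if key in index:
            continue  # already indexed; do not write a duplicate
        index[key] = {"log": url}
        written += 1
    return written
```

With deterministic IDs, running the backfill twice over the same logs leaves the index unchanged on the second pass.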
Flags: needinfo?(jgriffin)
Comment 8 (Reporter)•12 years ago
That's unfortunate, but can't be helped. Thank you anyway!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Updated (Assignee)•11 years ago
Product: Testing → Tree Management
Updated•5 years ago
Product: Tree Management → Tree Management Graveyard